Latest revision as of 13:55, 12 April 2021

Home * People * Shalabh Bhatnagar

Shalabh Bhatnagar ^[1]

Shalabh Bhatnagar,
an Indian computer scientist, Professor at Department of Computer Science and Automation, Indian Institute of Science, Bangalore. His research interests are in control/optimization in stochastic dynamic systems, in particular reinforcement learning and simulation optimization.

Selected Publications

^[2]

2005 ...

Shalabh Bhatnagar, Sandeep Kumar (2005). A reinforcement learning based algorithm for Markov decision processes. ICISIP 2005
Mohammed Shahid Abdulla, Shalabh Bhatnagar (2007). Reinforcement Learning Based Algorithms for Average Cost Markov Decision Processes. Discrete Event Dynamic Systems, Vol.17, No.1 » SPSA
Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Doina Precup, David Silver, Richard Sutton (2009). Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation. NIPS 2009, pdf
Richard Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, Eric Wiewiora. (2009). Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation. ICML 2009

2010 ...

Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Richard Sutton (2010). Toward Off-Policy Learning Control with Function Approximation. ICML 2010, pdf
Debarghya Ghoshdastidar, Ambedkar Dukkipati, Shalabh Bhatnagar (2012). Smoothed Functional Algorithms for Stochastic Optimization using q-Gaussian Distributions. arXiv:1206.4832
Shalabh Bhatnagar, H.L. Prasad, L.A. Prashanth (2013). Stochastic Recursive Algorithms for Optimization: Simultaneous Perturbation Methods. Lecture Notes in Control and Information Sciences, Vol. 434, Springer » SPSA
Debarghya Ghoshdastidar, Ambedkar Dukkipati, Shalabh Bhatnagar (2013). Newton based Stochastic Optimization using q-Gaussian Smoothed Functional Algorithms. arXiv:1311.2296
H.L. Prasad, L.A. Prashanth, Shalabh Bhatnagar (2014). Actor-Critic Algorithms for Learning Nash Equilibria in N-player General-Sum Games. arXiv:1401.2086

2015 ...

Vinayaka Yaji, Shalabh Bhatnagar (2015). A bi-convex optimization problem to compute Nash equilibrium in n-player games and an algorithm. arXiv:1504.06828
H.L. Prasad, Shalabh Bhatnagar (2015). A Study of Gradient Descent Schemes for General-Sum Stochastic Games. arXiv:1507.00093
Ajin George Joseph, Shalabh Bhatnagar (2016). A Cross Entropy based Stochastic Approximation Algorithm for Reinforcement Learning with Linear Function Approximation. arXiv:1609.09449
Jayvant Anantpur, Nagendra Gulur Dwarakanath, Shivaram Kalyanakrishnan, Shalabh Bhatnagar, R. Govindarajan (2017). RLWS: A Reinforcement Learning based GPU Warp Scheduler. arXiv:1712.04303
Ajin George Joseph, Shalabh Bhatnagar (2018). An Online Prediction Algorithm for Reinforcement Learning with Linear Function Approximation using Cross Entropy Method. arXiv:1806.06720
Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar (2019). Successive Over Relaxation Q-Learning. arXiv:1903.03812
Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar (2019). Second Order Value Iteration in Reinforcement Learning. arXiv:1905.03927
Raghuram Bharadwaj Diddigi, Chandramouli Kamanchi, Shalabh Bhatnagar (2019). Solution of Two-Player Zero-Sum Game by Successive Relaxation. arXiv:1906.06659

2020 ...

Indu John, Chandramouli Kamanchi, Shalabh Bhatnagar (2020). Generalized Speedy Q-Learning. IEEE Control Systems Letters, Vol. 4, No. 3, arXiv:1911.00397

External Links

References

Up one level

[1] Shalabh Bhatnagar

[2] : Shalabh Bhatnagar

[1]

[2]

@@ Line 12: / Line 12: @@
 * [[Shalabh Bhatnagar]], [https://dblp.org/pid/06/5441.html Sandeep Kumar] ('''2005'''). ''[https://ieeexplore.ieee.org/document/1529448 A reinforcement learning based algorithm for Markov decision processes]''. [https://www.computer.org/csdl/proceedings-article/icisip/2005/01619447/12OmNC943G2 ICISIP 2005]
 * [https://dblp.org/pid/70/382.html Mohammed Shahid Abdulla], [[Shalabh Bhatnagar]] ('''2007'''). ''[https://link.springer.com/article/10.1007/s10626-006-0003-y Reinforcement Learning Based Algorithms for Average Cost Markov Decision Processes]''. [https://www.springer.com/journal/10626 Discrete Event Dynamic Systems], Vol.17, No.1  » [[SPSA]]
-* [[Hamid Reza Maei]], [[Csaba Szepesvári]], [[Shalabh Bhatnagar]], [[Doina Precup]], [[David Silver]], [[Richard Sutton]] ('''2009'''). ''[http://papers.nips.cc/paper/3809-convergent-temporal-difference-learning-with-arbitrary-smooth-function-approximation Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation]''. [https://dblp.org/db/conf/nips/nips2009.html#MaeiSBPSS09 NIPS 2009]
+* [[Hamid Reza Maei]], [[Csaba Szepesvári]], [[Shalabh Bhatnagar]], [[Doina Precup]], [[David Silver]], [[Richard Sutton]] ('''2009'''). ''Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation''. [https://dblp.uni-trier.de/db/conf/nips/nips2009.html#MaeiSBPSS09 NIPS 2009], [https://papers.nips.cc/paper/2009/file/3a15c7d0bbe60300a39f76f8a5ba6896-Paper.pdf pdf]
-* [[Richard Sutton]], [[Hamid Reza Maei]], [[Doina Precup]], [[Shalabh Bhatnagar]], [[David Silver]], [[Csaba Szepesvári]], [[Eric Wiewiora]]. ('''2009'''). ''[https://dl.acm.org/doi/10.1145/1553374.1553501 Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation]''. [https://dblp.org/db/conf/icml/icml2009.html#SuttonMPBSSW09 ICML 2009], [http://cseweb.ucsd.edu/~gary/190-RL/SMPBSSW-09.pdf pdf]
+* [[Richard Sutton]], [[Hamid Reza Maei]], [[Doina Precup]], [[Shalabh Bhatnagar]], [[David Silver]], [[Csaba Szepesvári]], [[Eric Wiewiora]]. ('''2009'''). ''[https://dl.acm.org/doi/10.1145/1553374.1553501 Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation]''. [https://dblp.uni-trier.de/db/conf/icml/icml2009.html#SuttonMPBSSW09 ICML 2009]
 ==2010 ...==
+* [[Hamid Reza Maei]], [[Csaba Szepesvári]], [[Shalabh Bhatnagar]], [[Richard Sutton]] ('''2010'''). ''Toward Off-Policy Learning Control with Function Approximation''. [https://dblp.uni-trier.de/db/conf/icml/icml2010.html#MaeiSBS10 ICML 2010], [https://icml.cc/Conferences/2010/papers/627.pdf pdf]
 * [https://scholar.google.de/citations?user=Kp-enVQAAAAJ&hl=en Debarghya Ghoshdastidar], [https://scholar.google.com/citations?user=0y2aAvgAAAAJ&hl=en Ambedkar Dukkipati], [[Shalabh Bhatnagar]] ('''2012'''). ''Smoothed Functional Algorithms for Stochastic Optimization using q-Gaussian Distributions''. [https://arxiv.org/abs/1206.4832 arXiv:1206.4832]
 * [[Shalabh Bhatnagar]], [https://dblp.org/pid/31/10493.html H.L. Prasad], [https://scholar.google.co.in/citations?user=Q1YXWpoAAAAJ&hl=en L.A. Prashanth] ('''2013'''). ''[https://link.springer.com/book/10.1007/978-1-4471-4285-0 Stochastic Recursive Algorithms for Optimization: Simultaneous Perturbation Methods]''. [https://www.springer.com/series/642 Lecture Notes in Control and Information Sciences], Vol. 434, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer] » [[SPSA]]

Difference between revisions of "Shalabh Bhatnagar"

Latest revision as of 13:55, 12 April 2021

Contents

Selected Publications

2005 ...

2010 ...

2015 ...

2020 ...

External Links

References

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Navigation

Tools