Latest revision as of 13:55, 12 April 2021
Shalabh Bhatnagar,
an Indian computer scientist and professor at the Department of Computer Science and Automation, Indian Institute of Science, Bangalore.
His research interests are in control/optimization in stochastic dynamic systems, in particular reinforcement learning and simulation optimization.
Selected Publications
2005 ...
- Shalabh Bhatnagar, Sandeep Kumar (2005). A reinforcement learning based algorithm for Markov decision processes. ICISIP 2005
- Mohammed Shahid Abdulla, Shalabh Bhatnagar (2007). Reinforcement Learning Based Algorithms for Average Cost Markov Decision Processes. Discrete Event Dynamic Systems, Vol.17, No.1 » SPSA
- Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Doina Precup, David Silver, Richard Sutton (2009). Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation. NIPS 2009, pdf
- Richard Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, Eric Wiewiora (2009). Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation. ICML 2009
2010 ...
- Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Richard Sutton (2010). Toward Off-Policy Learning Control with Function Approximation. ICML 2010, pdf
- Debarghya Ghoshdastidar, Ambedkar Dukkipati, Shalabh Bhatnagar (2012). Smoothed Functional Algorithms for Stochastic Optimization using q-Gaussian Distributions. arXiv:1206.4832
- Shalabh Bhatnagar, H.L. Prasad, L.A. Prashanth (2013). Stochastic Recursive Algorithms for Optimization: Simultaneous Perturbation Methods. Lecture Notes in Control and Information Sciences, Vol. 434, Springer » SPSA
- Debarghya Ghoshdastidar, Ambedkar Dukkipati, Shalabh Bhatnagar (2013). Newton based Stochastic Optimization using q-Gaussian Smoothed Functional Algorithms. arXiv:1311.2296
- H.L. Prasad, L.A. Prashanth, Shalabh Bhatnagar (2014). Actor-Critic Algorithms for Learning Nash Equilibria in N-player General-Sum Games. arXiv:1401.2086
2015 ...
- Vinayaka Yaji, Shalabh Bhatnagar (2015). A bi-convex optimization problem to compute Nash equilibrium in n-player games and an algorithm. arXiv:1504.06828
- H.L. Prasad, Shalabh Bhatnagar (2015). A Study of Gradient Descent Schemes for General-Sum Stochastic Games. arXiv:1507.00093
- Ajin George Joseph, Shalabh Bhatnagar (2016). A Cross Entropy based Stochastic Approximation Algorithm for Reinforcement Learning with Linear Function Approximation. arXiv:1609.09449
- Jayvant Anantpur, Nagendra Gulur Dwarakanath, Shivaram Kalyanakrishnan, Shalabh Bhatnagar, R. Govindarajan (2017). RLWS: A Reinforcement Learning based GPU Warp Scheduler. arXiv:1712.04303
- Ajin George Joseph, Shalabh Bhatnagar (2018). An Online Prediction Algorithm for Reinforcement Learning with Linear Function Approximation using Cross Entropy Method. arXiv:1806.06720
- Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar (2019). Successive Over Relaxation Q-Learning. arXiv:1903.03812
- Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar (2019). Second Order Value Iteration in Reinforcement Learning. arXiv:1905.03927
- Raghuram Bharadwaj Diddigi, Chandramouli Kamanchi, Shalabh Bhatnagar (2019). Solution of Two-Player Zero-Sum Game by Successive Relaxation. arXiv:1906.06659
2020 ...
- Indu John, Chandramouli Kamanchi, Shalabh Bhatnagar (2020). Generalized Speedy Q-Learning. IEEE Control Systems Letters, Vol. 4, No. 3, arXiv:1911.00397