Difference between revisions of "Shalabh Bhatnagar"
Revision as of 18:55, 5 September 2020
Shalabh Bhatnagar,
an Indian computer scientist and professor at the Department of Computer Science and Automation, Indian Institute of Science, Bangalore.
His research interests include control and optimization of stochastic dynamic systems, in particular reinforcement learning and simulation optimization.
Selected Publications
2005 ...
- Shalabh Bhatnagar, Sandeep Kumar (2005). A reinforcement learning based algorithm for Markov decision processes. ICISIP 2005
- Mohammed Shahid Abdulla, Shalabh Bhatnagar (2007). Reinforcement Learning Based Algorithms for Average Cost Markov Decision Processes. Discrete Event Dynamic Systems, Vol. 17, No. 1 » SPSA
- Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Doina Precup, David Silver, Richard Sutton (2009). Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation. NIPS 2009
- Richard Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, Eric Wiewiora (2009). Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation. ICML 2009
2010 ...
- Debarghya Ghoshdastidar, Ambedkar Dukkipati, Shalabh Bhatnagar (2012). Smoothed Functional Algorithms for Stochastic Optimization using q-Gaussian Distributions. arXiv:1206.4832
- Shalabh Bhatnagar, H.L. Prasad, L.A. Prashanth (2013). Stochastic Recursive Algorithms for Optimization: Simultaneous Perturbation Methods. Lecture Notes in Control and Information Sciences, Vol. 434, Springer » SPSA
- Debarghya Ghoshdastidar, Ambedkar Dukkipati, Shalabh Bhatnagar (2013). Newton based Stochastic Optimization using q-Gaussian Smoothed Functional Algorithms. arXiv:1311.2296
- H.L. Prasad, L.A. Prashanth, Shalabh Bhatnagar (2014). Actor-Critic Algorithms for Learning Nash Equilibria in N-player General-Sum Games. arXiv:1401.2086
2015 ...
- Vinayaka Yaji, Shalabh Bhatnagar (2015). A bi-convex optimization problem to compute Nash equilibrium in n-player games and an algorithm. arXiv:1504.06828
- H.L. Prasad, Shalabh Bhatnagar (2015). A Study of Gradient Descent Schemes for General-Sum Stochastic Games. arXiv:1507.00093
- Ajin George Joseph, Shalabh Bhatnagar (2016). A Cross Entropy based Stochastic Approximation Algorithm for Reinforcement Learning with Linear Function Approximation. arXiv:1609.09449
- Jayvant Anantpur, Nagendra Gulur Dwarakanath, Shivaram Kalyanakrishnan, Shalabh Bhatnagar, R. Govindarajan (2017). RLWS: A Reinforcement Learning based GPU Warp Scheduler. arXiv:1712.04303
- Ajin George Joseph, Shalabh Bhatnagar (2018). An Online Prediction Algorithm for Reinforcement Learning with Linear Function Approximation using Cross Entropy Method. arXiv:1806.06720
- Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar (2019). Successive Over Relaxation Q-Learning. arXiv:1903.03812
- Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar (2019). Second Order Value Iteration in Reinforcement Learning. arXiv:1905.03927
- Raghuram Bharadwaj Diddigi, Chandramouli Kamanchi, Shalabh Bhatnagar (2019). Solution of Two-Player Zero-Sum Game by Successive Relaxation. arXiv:1906.06659
2020 ...
- Indu John, Chandramouli Kamanchi, Shalabh Bhatnagar (2020). Generalized Speedy Q-Learning. IEEE Control Systems Letters, Vol. 4, No. 3, arXiv:1911.00397