Shalabh Bhatnagar
Shalabh Bhatnagar,
an Indian computer scientist and Professor at the Department of Computer Science and Automation, Indian Institute of Science, Bangalore.
His research interests are in control/optimization in stochastic dynamic systems, in particular reinforcement learning and simulation optimization.
Selected Publications
2005 ...
- Shalabh Bhatnagar, Sandeep Kumar (2005). A reinforcement learning based algorithm for Markov decision processes. ICISIP 2005
- Mohammed Shahid Abdulla, Shalabh Bhatnagar (2007). Reinforcement Learning Based Algorithms for Average Cost Markov Decision Processes. Discrete Event Dynamic Systems, Vol.17, No.1 » SPSA
- Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Doina Precup, David Silver, Richard Sutton (2009). Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation. NIPS 2009
- Richard Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, Eric Wiewiora (2009). Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation. ICML 2009
2010 ...
- Debarghya Ghoshdastidar, Ambedkar Dukkipati, Shalabh Bhatnagar (2012). Smoothed Functional Algorithms for Stochastic Optimization using q-Gaussian Distributions. arXiv:1206.4832
- Shalabh Bhatnagar, H.L. Prasad, L.A. Prashanth (2013). Stochastic Recursive Algorithms for Optimization: Simultaneous Perturbation Methods. Lecture Notes in Control and Information Sciences, Vol. 434, Springer » SPSA
- Debarghya Ghoshdastidar, Ambedkar Dukkipati, Shalabh Bhatnagar (2013). Newton based Stochastic Optimization using q-Gaussian Smoothed Functional Algorithms. arXiv:1311.2296
- H.L. Prasad, L.A. Prashanth, Shalabh Bhatnagar (2014). Actor-Critic Algorithms for Learning Nash Equilibria in N-player General-Sum Games. arXiv:1401.2086
2015 ...
- Vinayaka Yaji, Shalabh Bhatnagar (2015). A bi-convex optimization problem to compute Nash equilibrium in n-player games and an algorithm. arXiv:1504.06828
- H.L. Prasad, Shalabh Bhatnagar (2015). A Study of Gradient Descent Schemes for General-Sum Stochastic Games. arXiv:1507.00093
- Ajin George Joseph, Shalabh Bhatnagar (2016). A Cross Entropy based Stochastic Approximation Algorithm for Reinforcement Learning with Linear Function Approximation. arXiv:1609.09449
- Jayvant Anantpur, Nagendra Gulur Dwarakanath, Shivaram Kalyanakrishnan, Shalabh Bhatnagar, R. Govindarajan (2017). RLWS: A Reinforcement Learning based GPU Warp Scheduler. arXiv:1712.04303
- Ajin George Joseph, Shalabh Bhatnagar (2018). An Online Prediction Algorithm for Reinforcement Learning with Linear Function Approximation using Cross Entropy Method. arXiv:1806.06720
- Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar (2019). Successive Over Relaxation Q-Learning. arXiv:1903.03812
- Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar (2019). Second Order Value Iteration in Reinforcement Learning. arXiv:1905.03927
- Raghuram Bharadwaj Diddigi, Chandramouli Kamanchi, Shalabh Bhatnagar (2019). Solution of Two-Player Zero-Sum Game by Successive Relaxation. arXiv:1906.06659
2020 ...
- Indu John, Chandramouli Kamanchi, Shalabh Bhatnagar (2020). Generalized Speedy Q-Learning. IEEE Control Systems Letters, Vol. 4, No. 3, arXiv:1911.00397