Shalabh Bhatnagar


Shalabh Bhatnagar is an Indian computer scientist and professor at the Department of Computer Science and Automation, Indian Institute of Science, Bangalore. His research interests are in control and optimization of stochastic dynamic systems, in particular reinforcement learning and simulation optimization.

=Selected Publications=

2005 ...

 * Shalabh Bhatnagar, Sandeep Kumar (2005). A reinforcement learning based algorithm for Markov decision processes. ICISIP 2005
 * Mohammed Shahid Abdulla, Shalabh Bhatnagar (2007). Reinforcement Learning Based Algorithms for Average Cost Markov Decision Processes. Discrete Event Dynamic Systems, Vol.17, No.1 » SPSA
 * Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Doina Precup, David Silver, Richard Sutton (2009). Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation. NIPS 2009
 * Richard Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, Eric Wiewiora (2009). Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation. ICML 2009

2010 ...

 * Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Richard Sutton (2010). Toward Off-Policy Learning Control with Function Approximation. ICML 2010
 * Debarghya Ghoshdastidar, Ambedkar Dukkipati, Shalabh Bhatnagar (2012). Smoothed Functional Algorithms for Stochastic Optimization using q-Gaussian Distributions. arXiv:1206.4832
 * Shalabh Bhatnagar, H.L. Prasad, L.A. Prashanth (2013). Stochastic Recursive Algorithms for Optimization: Simultaneous Perturbation Methods. Lecture Notes in Control and Information Sciences, Vol. 434, Springer » SPSA
 * Debarghya Ghoshdastidar, Ambedkar Dukkipati, Shalabh Bhatnagar (2013). Newton based Stochastic Optimization using q-Gaussian Smoothed Functional Algorithms. arXiv:1311.2296
 * H.L. Prasad, L.A. Prashanth, Shalabh Bhatnagar (2014). Actor-Critic Algorithms for Learning Nash Equilibria in N-player General-Sum Games. arXiv:1401.2086

2015 ...

 * Vinayaka Yaji, Shalabh Bhatnagar (2015). A bi-convex optimization problem to compute Nash equilibrium in n-player games and an algorithm. arXiv:1504.06828
 * H.L. Prasad, Shalabh Bhatnagar (2015). A Study of Gradient Descent Schemes for General-Sum Stochastic Games. arXiv:1507.00093
 * Ajin George Joseph, Shalabh Bhatnagar (2016). A Cross Entropy based Stochastic Approximation Algorithm for Reinforcement Learning with Linear Function Approximation. arXiv:1609.09449
 * Jayvant Anantpur, Nagendra Gulur Dwarakanath, Shivaram Kalyanakrishnan, Shalabh Bhatnagar, R. Govindarajan (2017). RLWS: A Reinforcement Learning based GPU Warp Scheduler. arXiv:1712.04303
 * Ajin George Joseph, Shalabh Bhatnagar (2018). An Online Prediction Algorithm for Reinforcement Learning with Linear Function Approximation using Cross Entropy Method. arXiv:1806.06720
 * Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar (2019). Successive Over Relaxation Q-Learning. arXiv:1903.03812
 * Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar (2019). Second Order Value Iteration in Reinforcement Learning. arXiv:1905.03927
 * Raghuram Bharadwaj Diddigi, Chandramouli Kamanchi, Shalabh Bhatnagar (2019). Solution of Two-Player Zero-Sum Game by Successive Relaxation. arXiv:1906.06659

2020 ...

 * Indu John, Chandramouli Kamanchi, Shalabh Bhatnagar (2020). Generalized Speedy Q-Learning. IEEE Control Systems Letters, Vol. 4, No. 3, arXiv:1911.00397

=External Links=
 * Shalabh Bhatnagar
 * Shalabh Bhatnagar - Google Scholar
