

Shalabh Bhatnagar

6,793 bytes added, 18:42, 5 September 2020
'''[[Main Page|Home]] * [[People]] * Shalabh Bhatnagar'''

[[FILE:ShalabhBhatnagar.jpg|border|right|thumb|link=https://www.csa.iisc.ac.in/~shalabh/| Shalabh Bhatnagar <ref>[https://www.csa.iisc.ac.in/~shalabh/ Shalabh Bhatnagar]</ref> ]]

'''Shalabh Bhatnagar''',<br/>
an Indian computer scientist and professor at the ''Department of Computer Science and Automation'', [https://en.wikipedia.org/wiki/Indian_Institute_of_Science Indian Institute of Science], [https://en.wikipedia.org/wiki/Bangalore Bangalore].
His research interests include control and optimization of stochastic dynamic systems, in particular [[Reinforcement Learning|reinforcement learning]] and [https://en.wikipedia.org/wiki/Simulation-based_optimization simulation optimization].

=Selected Publications=
<ref>[https://dblp.org/pid/71/2542.html dblp: Shalabh Bhatnagar]</ref>
==2005 ...==
* [[Shalabh Bhatnagar]], [https://dblp.org/pid/06/5441.html Sandeep Kumar] ('''2005'''). ''[https://ieeexplore.ieee.org/document/1529448 A reinforcement learning based algorithm for Markov decision processes]''. [https://www.computer.org/csdl/proceedings-article/icisip/2005/01619447/12OmNC943G2 ICISIP 2005]
* [https://dblp.org/pid/70/382.html Mohammed Shahid Abdulla], [[Shalabh Bhatnagar]] ('''2007'''). ''[https://link.springer.com/article/10.1007/s10626-006-0003-y Reinforcement Learning Based Algorithms for Average Cost Markov Decision Processes]''. [https://www.springer.com/journal/10626 Discrete Event Dynamic Systems], Vol.17, No.1 » [[SPSA]]
* [[Hamid Reza Maei]], [[Csaba Szepesvári]], [[Shalabh Bhatnagar]], [[Doina Precup]], [[David Silver]], [[Richard Sutton]] ('''2009'''). ''[http://papers.nips.cc/paper/3809-convergent-temporal-difference-learning-with-arbitrary-smooth-function-approximation Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation]''. [https://dblp.org/db/conf/nips/nips2009.html#MaeiSBPSS09 NIPS 2009]
* [[Richard Sutton]], [[Hamid Reza Maei]], [[Doina Precup]], [[Shalabh Bhatnagar]], [[David Silver]], [[Csaba Szepesvári]], [[Eric Wiewiora]] ('''2009'''). ''[https://dl.acm.org/doi/10.1145/1553374.1553501 Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation]''. [https://dblp.org/db/conf/icml/icml2009.html#SuttonMPBSSW09 ICML 2009], [http://cseweb.ucsd.edu/~gary/190-RL/SMPBSSW-09.pdf pdf]
==2010 ...==
* [https://scholar.google.de/citations?user=Kp-enVQAAAAJ&hl=en Debarghya Ghoshdastidar], [https://scholar.google.com/citations?user=0y2aAvgAAAAJ&hl=en Ambedkar Dukkipati], [[Shalabh Bhatnagar]] ('''2012'''). ''Smoothed Functional Algorithms for Stochastic Optimization using q-Gaussian Distributions''. [https://arxiv.org/abs/1206.4832 arXiv:1206.4832]
* [[Shalabh Bhatnagar]], [https://dblp.org/pid/31/10493.html H.L. Prasad], [https://scholar.google.co.in/citations?user=Q1YXWpoAAAAJ&hl=en L.A. Prashanth] ('''2013'''). ''[https://link.springer.com/book/10.1007/978-1-4471-4285-0 Stochastic Recursive Algorithms for Optimization: Simultaneous Perturbation Methods]''. [https://www.springer.com/series/642 Lecture Notes in Control and Information Sciences], Vol. 434, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer] » [[SPSA]]
* [https://scholar.google.de/citations?user=Kp-enVQAAAAJ&hl=en Debarghya Ghoshdastidar], [https://scholar.google.com/citations?user=0y2aAvgAAAAJ&hl=en Ambedkar Dukkipati], [[Shalabh Bhatnagar]] ('''2013'''). ''Newton based Stochastic Optimization using q-Gaussian Smoothed Functional Algorithms''. [https://arxiv.org/abs/1311.2296 arXiv:1311.2296]
* [https://dblp.org/pid/31/10493.html H.L. Prasad], [https://scholar.google.co.in/citations?user=Q1YXWpoAAAAJ&hl=en L.A. Prashanth], [[Shalabh Bhatnagar]] ('''2014'''). ''Actor-Critic Algorithms for Learning Nash Equilibria in N-player General-Sum Games''. [https://arxiv.org/abs/1401.2086 arXiv:1401.2086]
==2015 ...==
* [https://dblp.org/pid/161/9906.html Vinayaka Yaji], [[Shalabh Bhatnagar]] ('''2015'''). ''A bi-convex optimization problem to compute Nash equilibrium in n-player games and an algorithm''. [https://arxiv.org/abs/1504.06828 arXiv:1504.06828]
* [https://dblp.org/pid/31/10493.html H.L. Prasad], [[Shalabh Bhatnagar]] ('''2015'''). ''A Study of Gradient Descent Schemes for General-Sum Stochastic Games''. [https://arxiv.org/abs/1507.00093 arXiv:1507.00093]
* [https://dblp.org/pid/171/3116.html Ajin George Joseph], [[Shalabh Bhatnagar]] ('''2016'''). ''A Cross Entropy based Stochastic Approximation Algorithm for Reinforcement Learning with Linear Function Approximation''. [https://arxiv.org/abs/1609.09449 arXiv:1609.09449]
* [https://scholar.google.com/citations?user=zLksndkAAAAJ&hl=en Jayvant Anantpur], [https://dblp.org/pid/09/10702.html Nagendra Gulur Dwarakanath], [https://dblp.org/pid/16/4410.html Shivaram Kalyanakrishnan], [[Shalabh Bhatnagar]], [https://dblp.org/pid/45/3592.html R. Govindarajan] ('''2017'''). ''RLWS: A Reinforcement Learning based GPU Warp Scheduler''. [https://arxiv.org/abs/1712.04303 arXiv:1712.04303]
* [https://dblp.org/pid/171/3116.html Ajin George Joseph], [[Shalabh Bhatnagar]] ('''2018'''). ''An Online Prediction Algorithm for Reinforcement Learning with Linear Function Approximation using Cross Entropy Method''. [https://arxiv.org/abs/1806.06720 arXiv:1806.06720]
* [https://scholar.google.co.in/citations?user=1QlrvHkAAAAJ&hl=en Chandramouli Kamanchi], [https://scholar.google.co.in/citations?user=nx4NlpsAAAAJ&hl=en Raghuram Bharadwaj Diddigi], [[Shalabh Bhatnagar]] ('''2019'''). ''Successive Over Relaxation Q-Learning''. [https://arxiv.org/abs/1903.03812 arXiv:1903.03812]
* [https://scholar.google.co.in/citations?user=1QlrvHkAAAAJ&hl=en Chandramouli Kamanchi], [https://scholar.google.co.in/citations?user=nx4NlpsAAAAJ&hl=en Raghuram Bharadwaj Diddigi], [[Shalabh Bhatnagar]] ('''2019'''). ''Second Order Value Iteration in Reinforcement Learning''. [https://arxiv.org/abs/1905.03927 arXiv:1905.03927]
* [https://scholar.google.co.in/citations?user=nx4NlpsAAAAJ&hl=en Raghuram Bharadwaj Diddigi], [https://scholar.google.co.in/citations?user=1QlrvHkAAAAJ&hl=en Chandramouli Kamanchi], [[Shalabh Bhatnagar]] ('''2019'''). ''Solution of Two-Player Zero-Sum Game by Successive Relaxation''. [https://arxiv.org/abs/1906.06659 arXiv:1906.06659]
==2020 ...==
* [https://dblp.org/pid/233/8144.html Indu John], [https://scholar.google.co.in/citations?user=1QlrvHkAAAAJ&hl=en Chandramouli Kamanchi], [[Shalabh Bhatnagar]] ('''2020'''). ''Generalized Speedy Q-Learning''.

=External Links=
* [https://www.csa.iisc.ac.in/~shalabh/ Shalabh Bhatnagar]
* [https://scholar.google.com/citations?user=cj3fJJsbjAoC&hl=th Shalabh Bhatnagar - Google Scholar]

=References=
<references />
'''[[People|Up one level]]'''
[[Category:Researcher|Bhatnagar]]
