Difference between revisions of "Shalabh Bhatnagar"

From Chessprogramming wiki
 
Line 12:
 
* [[Shalabh Bhatnagar]], [https://dblp.org/pid/06/5441.html Sandeep Kumar] ('''2005'''). ''[https://ieeexplore.ieee.org/document/1529448 A reinforcement learning based algorithm for Markov decision processes]''. [https://www.computer.org/csdl/proceedings-article/icisip/2005/01619447/12OmNC943G2 ICISIP 2005]
 
 
* [https://dblp.org/pid/70/382.html Mohammed Shahid Abdulla], [[Shalabh Bhatnagar]] ('''2007'''). ''[https://link.springer.com/article/10.1007/s10626-006-0003-y Reinforcement Learning Based Algorithms for Average Cost Markov Decision Processes]''. [https://www.springer.com/journal/10626 Discrete Event Dynamic Systems], Vol.17, No.1  » [[SPSA]]
 
* [[Hamid Reza Maei]], [[Csaba Szepesvári]], [[Shalabh Bhatnagar]], [[Doina Precup]], [[David Silver]], [[Richard Sutton]] ('''2009'''). ''[http://papers.nips.cc/paper/3809-convergent-temporal-difference-learning-with-arbitrary-smooth-function-approximation Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation]''. [https://dblp.org/db/conf/nips/nips2009.html#MaeiSBPSS09 NIPS 2009]
+
* [[Hamid Reza Maei]], [[Csaba Szepesvári]], [[Shalabh Bhatnagar]], [[Doina Precup]], [[David Silver]], [[Richard Sutton]] ('''2009'''). ''Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation''. [https://dblp.uni-trier.de/db/conf/nips/nips2009.html#MaeiSBPSS09 NIPS 2009], [https://papers.nips.cc/paper/2009/file/3a15c7d0bbe60300a39f76f8a5ba6896-Paper.pdf pdf]
* [[Richard Sutton]], [[Hamid Reza Maei]], [[Doina Precup]], [[Shalabh Bhatnagar]], [[David Silver]], [[Csaba Szepesvári]], [[Eric Wiewiora]]. ('''2009'''). ''[https://dl.acm.org/doi/10.1145/1553374.1553501 Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation]''. [https://dblp.org/db/conf/icml/icml2009.html#SuttonMPBSSW09 ICML 2009], [http://cseweb.ucsd.edu/~gary/190-RL/SMPBSSW-09.pdf pdf]
+
* [[Richard Sutton]], [[Hamid Reza Maei]], [[Doina Precup]], [[Shalabh Bhatnagar]], [[David Silver]], [[Csaba Szepesvári]], [[Eric Wiewiora]]. ('''2009'''). ''[https://dl.acm.org/doi/10.1145/1553374.1553501 Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation]''. [https://dblp.uni-trier.de/db/conf/icml/icml2009.html#SuttonMPBSSW09 ICML 2009]
 
==2010 ...==
 
 +
* [[Hamid Reza Maei]], [[Csaba Szepesvári]], [[Shalabh Bhatnagar]], [[Richard Sutton]] ('''2010'''). ''Toward Off-Policy Learning Control with Function Approximation''. [https://dblp.uni-trier.de/db/conf/icml/icml2010.html#MaeiSBS10 ICML 2010], [https://icml.cc/Conferences/2010/papers/627.pdf pdf]
 
* [https://scholar.google.de/citations?user=Kp-enVQAAAAJ&hl=en Debarghya Ghoshdastidar], [https://scholar.google.com/citations?user=0y2aAvgAAAAJ&hl=en Ambedkar Dukkipati], [[Shalabh Bhatnagar]] ('''2012'''). ''Smoothed Functional Algorithms for Stochastic Optimization using q-Gaussian Distributions''. [https://arxiv.org/abs/1206.4832 arXiv:1206.4832]
 
 
* [[Shalabh Bhatnagar]], [https://dblp.org/pid/31/10493.html H.L. Prasad], [https://scholar.google.co.in/citations?user=Q1YXWpoAAAAJ&hl=en L.A. Prashanth] ('''2013'''). ''[https://link.springer.com/book/10.1007/978-1-4471-4285-0 Stochastic Recursive Algorithms for Optimization: Simultaneous Perturbation Methods]''. [https://www.springer.com/series/642 Lecture Notes in Control and Information Sciences], Vol. 434, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer] » [[SPSA]]
 

Latest revision as of 13:55, 12 April 2021


Shalabh Bhatnagar [1]

Shalabh Bhatnagar,
an Indian computer scientist and professor at the Department of Computer Science and Automation, Indian Institute of Science, Bangalore. His research interests are in control and optimization of stochastic dynamic systems, in particular reinforcement learning and simulation optimization.

Selected Publications

[2]

2005 ...

2010 ...

2015 ...

2020 ...

External Links

References
