Changes


David Silver

705 bytes added, 11:24, 16 April 2021
'''2009'''
* [[David Silver]], [[Gerald Tesauro]] ('''2009'''). ''Monte-Carlo Simulation Balancing''. In Proceedings of the 26th International Conference on Machine Learning (ICML-09).
* [[Hamid Reza Maei]], [[Csaba Szepesvári]], [[Shalabh Bhatnagar]], [[Doina Precup]], [[David Silver]], [[Richard Sutton]] ('''2009'''). ''Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation''. [https://dblp.uni-trier.de/db/conf/nips/nips2009.html#MaeiSBPSS09 NIPS 2009], [https://papers.nips.cc/paper/2009/file/3a15c7d0bbe60300a39f76f8a5ba6896-Paper.pdf pdf]
* [[Richard Sutton]], [[Hamid Reza Maei]], [[Doina Precup]], [[Shalabh Bhatnagar]], [[David Silver]], [[Csaba Szepesvári]], [[Eric Wiewiora]] ('''2009'''). ''[https://dl.acm.org/doi/10.1145/1553374.1553501 Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation]''. [https://dblp.uni-trier.de/db/conf/icml/icml2009.html#SuttonMPBSSW09 ICML 2009]
* [[Joel Veness]], [[David Silver]], [[William Uther]], [[Alan Blair]] ('''2009'''). ''[http://papers.nips.cc/paper/3722-bootstrapping-from-game-tree-search Bootstrapping from Game Tree Search]''. [http://webdocs.cs.ualberta.ca/~silver/David_Silver/Applications_files/bootstrapping.pdf pdf]
* [[David Silver]] ('''2009'''). ''Reinforcement Learning and Simulation-Based Search''. Ph.D. thesis, [[University of Alberta]], [http://www0.cs.ucl.ac.uk/staff/D.Silver/web/Applications_files/thesis.pdf pdf]
* [[Johannes Heinrich]], [[Marc Lanctot]], [[David Silver]] ('''2015'''). ''Fictitious Self-Play in Extensive-Form Games''. [http://proceedings.mlr.press/v37/ JMLR: W&CP, Vol. 37], [http://proceedings.mlr.press/v37/heinrich15.pdf pdf]
* [[Johannes Heinrich]], [[David Silver]] ('''2015'''). ''Smooth UCT Search in Computer Poker''. [[Conferences#IJCA2015|IJCAI 2015]], [http://www0.cs.ucl.ac.uk/staff/d.silver/web/Publications_files/smooth_uct.pdf pdf]
* [[Volodymyr Mnih]], [[Koray Kavukcuoglu]], [[David Silver]], [[Mathematician#AARusu|Andrei A. Rusu]], [[Joel Veness]], [[Marc G. Bellemare]], [[Alex Graves]], [[Martin Riedmiller]], [[Andreas K. Fidjeland]], [[Georg Ostrovski]], [[Stig Petersen]], [[Charles Beattie]], [[Amir Sadik]], [[Ioannis Antonoglou]], [[Helen King]], [[Dharshan Kumaran]], [[Daan Wierstra]], [[Shane Legg]], [[Demis Hassabis]] ('''2015'''). ''[http://www.nature.com/nature/journal/v518/n7540/abs/nature14236.html Human-level control through deep reinforcement learning]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 518
* [[Arun Nair]], [[Praveen Srinivasan]], [[Sam Blackwell]], [[Cagdas Alcicek]], [[Rory Fearon]], [[Alessandro De Maria]], [[Veda Panneershelvam]], [[Mustafa Suleyman]], [[Charles Beattie]], [[Stig Petersen]], [[Shane Legg]], [[Volodymyr Mnih]], [[Koray Kavukcuoglu]], [[David Silver]] ('''2015'''). ''Massively Parallel Methods for Deep Reinforcement Learning''. [http://arxiv.org/abs/1507.04296 arXiv:1507.04296]
* [[Timothy Lillicrap]], [[Jonathan J. Hunt]], [[Alexander Pritzel]], [[Nicolas Heess]], [[Tom Erez]], [[Yuval Tassa]], [[David Silver]], [[Daan Wierstra]] ('''2015'''). ''Continuous Control with Deep Reinforcement Learning''. [https://arxiv.org/abs/1509.02971 arXiv:1509.02971]
'''2019'''
* [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Thomas Hubert]], [[Karen Simonyan]], [[Laurent Sifre]], [[Simon Schmitt]], [[Arthur Guez]], [[Edward Lockhart]], [[Demis Hassabis]], [[Thore Graepel]], [[Timothy Lillicrap]], [[David Silver]] ('''2019'''). ''Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model''. [https://arxiv.org/abs/1911.08265 arXiv:1911.08265]
==2020 ...==
* [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Thomas Hubert]], [[Karen Simonyan]], [[Laurent Sifre]], [[Simon Schmitt]], [[Arthur Guez]], [[Edward Lockhart]], [[Demis Hassabis]], [[Thore Graepel]], [[Timothy Lillicrap]], [[David Silver]] ('''2020'''). ''[https://www.nature.com/articles/s41586-020-03051-4 Mastering Atari, Go, chess and shogi by planning with a learned model]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 588 <ref>[https://deepmind.com/blog/article/muzero-mastering-go-chess-shogi-and-atari-without-rules?fbclid=IwAR3mSwrn1YXDKr9uuGm2GlFKh76wBilex7f8QvBiQecwiVmAvD6Bkyjx-rE MuZero: Mastering Go, chess, shogi and Atari without rules]</ref>
=External Links=
