Revision as of 13:57, 12 April 2021

David Silver ^[1]

David Silver,
a British computer scientist at Google DeepMind and co-author of AlphaGo and AlphaZero. Earlier, he was a researcher at University College London (from 2010), a postdoc at Massachusetts Institute of Technology ^[2], a Ph.D. student and postdoc at the University of Alberta, and CTO for Elixir Studios and lead programmer on the PC strategy game Republic: The Revolution ^[3]. His research interests cover simulation-based search, reinforcement learning, and cooperative pathfinding.

Selected Publications

^[4] ^[5] ^[6]

2006 ...

David Silver (2006). Cooperative Pathfinding. In AI Game Programming Wisdom 3, pages 99–111. Charles River Media, pdf

2007

David Silver, Richard Sutton, Martin Müller (2007). Reinforcement learning of local shape in the game of Go. 20th IJCAI, pdf
Sylvain Gelly, David Silver (2007). Combining Online and Offline Knowledge in UCT. pdf

2008

David Silver, Richard Sutton, Martin Müller (2008). Sample-Based Learning and Search with Permanent and Transient Memories. In Proceedings of the 25th International Conference on Machine Learning, pdf
Sylvain Gelly, David Silver (2008). Achieving Master Level Play in 9 x 9 Computer Go. pdf

2009

Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Doina Precup, David Silver, Richard Sutton (2009). Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation. NIPS 2009, pdf
Richard Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, Eric Wiewiora (2009). Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation. ICML 2009
Joel Veness, David Silver, William Uther, Alan Blair (2009). Bootstrapping from Game Tree Search. pdf
David Silver (2009). Reinforcement Learning and Simulation-Based Search. Ph.D. thesis, University of Alberta, pdf
Joel Veness, Kee Siong Ng, Marcus Hutter, David Silver (2009). A Monte Carlo AIXI Approximation, pdf
David Silver, Gerald Tesauro (2009). Monte-Carlo Simulation Balancing. ICML 2009, pdf

2010 ...

Joel Veness, Kee Siong Ng, Marcus Hutter, David Silver (2010). Reinforcement Learning via AIXI Approximation. Association for the Advancement of Artificial Intelligence (AAAI), pdf

2011

Sylvain Gelly, David Silver (2011). Monte-Carlo tree search and rapid action value estimation in computer Go. Artificial Intelligence, Vol. 175, No. 11
Joel Veness, Kee Siong Ng, Marcus Hutter, William Uther, David Silver (2011). A Monte-Carlo AIXI Approximation. JAIR, Vol. 40, pdf

2012

Sylvain Gelly, Marc Schoenauer, Michèle Sebag, Olivier Teytaud, Levente Kocsis, David Silver, Csaba Szepesvári (2012). The Grand Challenge of Computer Go: Monte Carlo Tree Search and Extensions. Communications of the ACM, Vol. 55, No. 3, pdf preprint
Arthur Guez, David Silver, Peter Dayan (2012). Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search. NIPS 2012, pdf

2013

Arthur Guez, David Silver, Peter Dayan (2013). Scalable and Efficient Bayes-Adaptive Reinforcement Learning Based on Monte-Carlo Tree Search. Journal of Artificial Intelligence Research, Vol. 48, pdf
David Silver, Richard Sutton, Martin Mueller (2013). Temporal-Difference Search in Computer Go. Proceedings of the ICAPS-13 Workshop on Planning and Learning, pdf
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller (2013). Playing Atari with Deep Reinforcement Learning. arXiv:1312.5602 ^[7]

2014

Tom Schaul, Ioannis Antonoglou, David Silver (2014). Unit Tests for Stochastic Optimization. arXiv:1312.6055v3 ^[8]
Arthur Guez, David Silver, Peter Dayan (2014). Better Optimism By Bayes: Adaptive Planning with Rich Models. arXiv:1402.1958v1
Arthur Guez, Nicolas Heess, David Silver, Peter Dayan (2014). Bayes-Adaptive Simulation-based Search with Value Function Approximation. NIPS 2014, pdf
Chris J. Maddison, Aja Huang, Ilya Sutskever, David Silver (2014). Move Evaluation in Go Using Deep Convolutional Neural Networks. arXiv:1412.6564v1 ^[9] » DCNN in Go
Johannes Heinrich, David Silver (2014). Self-Play Monte-Carlo Tree Search in Computer Poker. AAAI-14 Workshop

2015 ...

Johannes Heinrich, Marc Lanctot, David Silver (2015). Fictitious Self-Play in Extensive-Form Games. JMLR: W&CP, Vol. 37, pdf
Johannes Heinrich, David Silver (2015). Smooth UCT Search in Computer Poker. IJCAI 2015, pdf
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis (2015). Human-level control through deep reinforcement learning. Nature, Vol. 518
Arun Nair, Praveen Srinivasan, Sam Blackwell, Cagdas Alcicek, Rory Fearon, Alessandro De Maria, Veda Panneershelvam, Mustafa Suleyman, Charles Beattie, Stig Petersen, Shane Legg, Volodymyr Mnih, Koray Kavukcuoglu, David Silver (2015). Massively Parallel Methods for Deep Reinforcement Learning. arXiv:1507.04296
Timothy Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra (2015). Continuous Control with Deep Reinforcement Learning. arXiv:1509.02971
Hado van Hasselt, Arthur Guez, David Silver (2015). Deep Reinforcement Learning with Double Q-learning. arXiv:1509.06461
Tom Schaul, John Quan, Ioannis Antonoglou, David Silver (2015). Prioritized Experience Replay. arXiv:1511.05952
Nicolas Heess, Jonathan J. Hunt, Timothy Lillicrap, David Silver (2015). Memory-based control with recurrent neural networks. arXiv:1512.04455

2016

David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, Demis Hassabis (2016). Mastering the game of Go with deep neural networks and tree search. Nature, Vol. 529 » AlphaGo
Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu (2016). Asynchronous Methods for Deep Reinforcement Learning. arXiv:1602.01783v2
Max Jaderberg, Volodymyr Mnih, Wojciech Marian Czarnecki, Tom Schaul, Joel Z. Leibo, David Silver, Koray Kavukcuoglu (2016). Reinforcement Learning with Unsupervised Auxiliary Tasks. arXiv:1611.05397v1
Hado van Hasselt, Arthur Guez, Matteo Hessel, Volodymyr Mnih, David Silver (2016). Learning values across many orders of magnitude. arXiv:1602.07714v2, NIPS 2016
Johannes Heinrich, David Silver (2016). Deep Reinforcement Learning from Self-Play in Imperfect-Information Games. arXiv:1603.01121

2017

David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, Yutian Chen, Timothy Lillicrap, Fan Hui, Laurent Sifre, George van den Driessche, Thore Graepel, Demis Hassabis (2017). Mastering the game of Go without human knowledge. Nature, Vol. 550 ^[10]
Marc Lanctot, Vinícius Flores Zambaldi, Audrunas Gruslys, Angeliki Lazaridou, Karl Tuyls, Julien Pérolat, David Silver, Thore Graepel (2017). A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning. arXiv:1711.00832
David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis (2017). Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. arXiv:1712.01815 » AlphaZero

2018

Arthur Guez, Théophane Weber, Ioannis Antonoglou, Karen Simonyan, Oriol Vinyals, Daan Wierstra, Rémi Munos, David Silver (2018). Learning to Search with MCTSnets. arXiv:1802.04697
David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis (2018). A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science, Vol. 362, No. 6419 ^[11]

2019

Julian Schrittwieser, Ioannis Antonoglou, Thomas Hubert, Karen Simonyan, Laurent Sifre, Simon Schmitt, Arthur Guez, Edward Lockhart, Demis Hassabis, Thore Graepel, Timothy Lillicrap, David Silver (2019). Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model. arXiv:1911.08265

2020 ...

Julian Schrittwieser, Ioannis Antonoglou, Thomas Hubert, Karen Simonyan, Laurent Sifre, Simon Schmitt, Arthur Guez, Edward Lockhart, Demis Hassabis, Thore Graepel, Timothy Lillicrap, David Silver (2020). Mastering Atari, Go, chess and shogi by planning with a learned model. Nature, Vol. 588 ^[12]

External Links

David Silver homepage
David Silver - Google Scholar Citation
David Silver - The Mathematics Genealogy Project
Advanced Topics: RL by David Silver
Monte-Carlo Simulation Balancing - videolectures.net ^[13]
AlphaGo's next move by Demis Hassabis and David Silver, DeepMind, May 27, 2017
AlphaGo Zero: Learning from scratch by Demis Hassabis and David Silver, DeepMind, October 18, 2017

AlphaGo Zero: Discovering new knowledge by David Silver, YouTube Video

References

[1] David Silver
[2] Research Staff > David Silver
[3] David Silver - Applications
[4] David Silver - Publications
[5] David M. Silver from Microsoft Academic Search
[6] Joel Veness - Publications
[7] Demystifying Deep Reinforcement Learning by Tambet Matiisen, Nervana, December 21, 2015
[8] GitHub - IoannisAntonoglou/optimBench: Benchmark testbed for assessing the performance of optimisation algorithms
[9] Move Evaluation in Go Using Deep Convolutional Neural Networks by Aja Huang, The Computer-go Archives, December 19, 2014
[10] AlphaGo Zero: Learning from scratch by Demis Hassabis and David Silver, DeepMind, October 18, 2017
[11] AlphaZero: Shedding new light on the grand games of chess, shogi and Go by David Silver, Thomas Hubert, Julian Schrittwieser and Demis Hassabis, DeepMind, December 03, 2018
[12] MuZero: Mastering Go, chess, shogi and Atari without rules
[13] David Silver, Gerald Tesauro (2009). Monte-Carlo Simulation Balancing. ICML 2009, pdf


@@ Line 23: / Line 23: @@
 '''2009'''
 * [[David Silver]], [[Gerald Tesauro]] ('''2009'''). ''Monte-Carlo Simulation Balancing''. In Proceedings of the 26th International Conference on Machine Learning (ICML-09).
-* [[Richard Sutton]], [[Hamid Reza Maei]], [[Doina Precup]], [[Shalabh Bhatnagar]], [[David Silver]], [[Csaba Szepesvári]], [[Eric Wiewiora]]. ('''2009'''). ''Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation''. In Proceedings of the 26th International Conference on Machine Learning (ICML-09). [http://www.sztaki.hu/~szcsaba/papers/GTD-ICML09.pdf pdf]
+* [[Hamid Reza Maei]], [[Csaba Szepesvári]], [[Shalabh Bhatnagar]], [[Doina Precup]], [[David Silver]], [[Richard Sutton]] ('''2009'''). ''Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation''. [https://dblp.uni-trier.de/db/conf/nips/nips2009.html#MaeiSBPSS09 NIPS 2009], [https://papers.nips.cc/paper/2009/file/3a15c7d0bbe60300a39f76f8a5ba6896-Paper.pdf pdf]
-* [[Hamid Reza Maei]], [[Csaba Szepesvári]], [[Shalabh Bhatnagar]], [[Doina Precup]], [[David Silver]], [[Richard Sutton]] ('''2009'''). ''Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation.'' Accepted in Advances in Neural Information Processing Systems 22, Vancouver, BC. December 2009. MIT Press. [http://books.nips.cc/papers/files/nips22/NIPS2009_1121.pdf pdf]
+* [[Richard Sutton]], [[Hamid Reza Maei]], [[Doina Precup]], [[Shalabh Bhatnagar]], [[David Silver]], [[Csaba Szepesvári]], [[Eric Wiewiora]]. ('''2009'''). ''[https://dl.acm.org/doi/10.1145/1553374.1553501 Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation]''. [https://dblp.uni-trier.de/db/conf/icml/icml2009.html#SuttonMPBSSW09 ICML 2009]
 * [[Joel Veness]], [[David Silver]], [[William Uther]], [[Alan Blair]] ('''2009'''). ''[http://papers.nips.cc/paper/3722-bootstrapping-from-game-tree-search Bootstrapping from Game Tree Search]''. [http://webdocs.cs.ualberta.ca/~silver/David_Silver/Applications_files/bootstrapping.pdf pdf]
 * [[David Silver]] ('''2009'''). ''Reinforcement Learning and Simulation-Based Search''. Ph.D. thesis, [[University of Alberta]], [http://www0.cs.ucl.ac.uk/staff/D.Silver/web/Applications_files/thesis.pdf pdf]

Difference between revisions of "David Silver"
