Difference between revisions of "Monte-Carlo Tree Search"

From Chessprogramming wiki
Jump to: navigation, search
Line 4: Line 4:
  
 
'''Monte Carlo Tree Search''', (Monte-Carlo Tree Search, MCTS)<br/>
 
'''Monte Carlo Tree Search''', (Monte-Carlo Tree Search, MCTS)<br/>
is a [[Best-First|Best-First search]] algorithm based on [https://en.wikipedia.org/wiki/Randomness random] playouts. In conjunction with [[UCT]] ('''U'''pper '''C'''onfidence bounds applied to '''T'''rees) Monte-Carlo Tree Search has yielded in a breakthrough in [[Go|Computer Go]] <ref>[[Rémi Coulom]] ('''2009'''). ''The Monte-Carlo Revolution in Go''. JFFoS'2008: Japanese-French Frontiers of Science Symposium, [http://remi.coulom.free.fr/JFFoS/JFFoS.pdf slides as pdf]</ref>, and is also successful in [[Amazons]] <ref>[[Julien Kloetzer]], [[Hiroyuki Iida]], [[Bruno Bouzy]] ('''2007'''). ''The Monte-Carlo approach in Amazons''. [[CGW 2007]]</ref> <ref>[[Richard J. Lorentz]] ('''2008'''). ''[http://link.springer.com/chapter/10.1007/978-3-540-87608-3_2 Amazons Discover Monte-Carlo]''. [[CG 2008]]</ref>, [[Lines of Action]] <ref>[[Mark Winands]] and [[Yngvi Björnsson]] ('''2009'''). ''Evaluation Function Based Monte-Carlo LOA''. [http://www.ru.is/faculty/yngvi/pdf/WinandsB09.pdf pdf]</ref>, [[Havannah]] <ref>[[Richard J. Lorentz]] ('''2010'''). ''[http://www.springerlink.com/content/p4x16832317r1214/ Improving Monte-Carlo Tree Search in Havannah]''. [[CG 2010]]</ref>, [[Hex]] <ref>[[Broderick Arneson]], [[Ryan Hayward]], [[Philip Henderson]] ('''2010'''). ''Monte Carlo Tree Search in Hex''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. 2, No. 4, [http://webdocs.cs.ualberta.ca/~hayward/papers/mcts-hex.pdf pdf]</ref>, [[Checkers]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=38554 UCT surprise for checkers !] by [[Daniel Shawul]], [[CCC]], March 25, 2011</ref> and other [[Games]] with some difficulties in position evaluation, but until December 2017, when a  [[Google]] [[DeepMind]] team reported on [[AlphaZero]] <ref>[[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2017'''). ''Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm''. [https://arxiv.org/abs/1712.01815 arXiv:1712.01815]</ref>, not for [[Chess]] <ref>[[Raghuram Ramanujan]], [[Ashish Sabharwal]], [[Bart Selman]] ('''2010'''). ''[http://www.aaai.org/ocs/index.php/ICAPS/ICAPS10/paper/view/1458 On Adversarial Search Spaces and Sampling-Based Planning]''. [http://www.aaai.org/Press/Proceedings/icaps10.php ICAPS 2010]</ref> <ref>[[Oleg Arenz]] ('''2012'''). ''[http://www.ke.tu-darmstadt.de/bibtex/publications/show/2321 Monte Carlo Chess]''.  B.Sc. thesis, [[Darmstadt University of Technology]], advisor [[Johannes Fürnkranz]], [http://www.ke.tu-darmstadt.de/lehre/arbeiten/bachelor/2012/Arenz_Oleg.pdf pdf]</ref>.  
+
is a [[Best-First|Best-First search]] algorithm based on [https://en.wikipedia.org/wiki/Randomness random] playouts. In conjunction with [[UCT]] ('''U'''pper '''C'''onfidence bounds applied to '''T'''rees) Monte-Carlo Tree Search has yielded in a breakthrough in [[Go|Computer Go]] <ref>[[Rémi Coulom]] ('''2009'''). ''The Monte-Carlo Revolution in Go''. JFFoS'2008: Japanese-French Frontiers of Science Symposium, [http://remi.coulom.free.fr/JFFoS/JFFoS.pdf slides as pdf]</ref>, and is also successful in [[Amazons]] <ref>[[Julien Kloetzer]], [[Hiroyuki Iida]], [[Bruno Bouzy]] ('''2007'''). ''The Monte-Carlo approach in Amazons''. [[CGW 2007]]</ref> <ref>[[Richard J. Lorentz]] ('''2008'''). ''[http://link.springer.com/chapter/10.1007/978-3-540-87608-3_2 Amazons Discover Monte-Carlo]''. [[CG 2008]]</ref>, [[Lines of Action]] <ref>[[Mark Winands]], [[Yngvi Björnsson]] ('''2009'''). ''Evaluation Function Based Monte-Carlo LOA''. [http://www.ru.is/faculty/yngvi/pdf/WinandsB09.pdf pdf]</ref>, [[Havannah]] <ref>[[Richard J. Lorentz]] ('''2010'''). ''[http://www.springerlink.com/content/p4x16832317r1214/ Improving Monte-Carlo Tree Search in Havannah]''. [[CG 2010]]</ref>, [[Hex]] <ref>[[Broderick Arneson]], [[Ryan Hayward]], [[Philip Henderson]] ('''2010'''). ''Monte Carlo Tree Search in Hex''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. 2, No. 4, [http://webdocs.cs.ualberta.ca/~hayward/papers/mcts-hex.pdf pdf]</ref>, [[Checkers]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=38554 UCT surprise for checkers !] by [[Daniel Shawul]], [[CCC]], March 25, 2011</ref> and other [[Games]] with some difficulties in position evaluation, but until December 2017, when a  [[Google]] [[DeepMind]] team reported on [[AlphaZero]] <ref>[[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2017'''). ''Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm''. [https://arxiv.org/abs/1712.01815 arXiv:1712.01815]</ref>, not for [[Chess]] <ref>[[Raghuram Ramanujan]], [[Ashish Sabharwal]], [[Bart Selman]] ('''2010'''). ''[http://www.aaai.org/ocs/index.php/ICAPS/ICAPS10/paper/view/1458 On Adversarial Search Spaces and Sampling-Based Planning]''. [http://www.aaai.org/Press/Proceedings/icaps10.php ICAPS 2010]</ref> <ref>[[Oleg Arenz]] ('''2012'''). ''[http://www.ke.tu-darmstadt.de/bibtex/publications/show/2321 Monte Carlo Chess]''.  B.Sc. thesis, [[Darmstadt University of Technology]], advisor [[Johannes Fürnkranz]], [http://www.ke.tu-darmstadt.de/lehre/arbeiten/bachelor/2012/Arenz_Oleg.pdf pdf]</ref>.  
  
 
MCTS is based on randomized explorations of the [[Search Space|search space]]. Using the results of previous explorations, the algorithm gradually grows a [[Search Tree|game tree]] in [[Memory|memory]], and successively becomes better at accurately estimating the values of the most promising moves <ref>[[Guillaume Chaslot]], [[Mark Winands]], [[Jaap van den Herik]] ('''2008'''). ''[http://link.springer.com/chapter/10.1007/978-3-540-87608-3_6 Parallel Monte-Carlo Tree Search]''. [[CG 2008]], [https://dke.maastrichtuniversity.nl/m.winands/documents/multithreadedMCTS2.pdf pdf]</ref>.   
 
MCTS is based on randomized explorations of the [[Search Space|search space]]. Using the results of previous explorations, the algorithm gradually grows a [[Search Tree|game tree]] in [[Memory|memory]], and successively becomes better at accurately estimating the values of the most promising moves <ref>[[Guillaume Chaslot]], [[Mark Winands]], [[Jaap van den Herik]] ('''2008'''). ''[http://link.springer.com/chapter/10.1007/978-3-540-87608-3_6 Parallel Monte-Carlo Tree Search]''. [[CG 2008]], [https://dke.maastrichtuniversity.nl/m.winands/documents/multithreadedMCTS2.pdf pdf]</ref>.   
Line 75: Line 75:
 
* [[Guillaume Chaslot]], [[Louis Chatriot]], [[Christophe Fiter]], [[Sylvain Gelly]], [[Jean-Baptiste Hoock]], [[Julien Pérez]], [[Arpad Rimmel]], [[Olivier Teytaud]] ('''2008'''). ''Combining expert, offline, transient and online knowledge in Monte-Carlo exploration''. [http://www.lri.fr/%7Eteytaud/eg.pdf pdf]
 
* [[Guillaume Chaslot]], [[Louis Chatriot]], [[Christophe Fiter]], [[Sylvain Gelly]], [[Jean-Baptiste Hoock]], [[Julien Pérez]], [[Arpad Rimmel]], [[Olivier Teytaud]] ('''2008'''). ''Combining expert, offline, transient and online knowledge in Monte-Carlo exploration''. [http://www.lri.fr/%7Eteytaud/eg.pdf pdf]
 
* [[Guillaume Chaslot]], [[Mark Winands]], [[Jos Uiterwijk]], [[Jaap van den Herik]], [[Bruno Bouzy]] ('''2008'''). ''Progressive Strategies for Monte-Carlo Tree Search''. [http://www.worldscinet.com/nmnc/nmnc.shtml New Mathematics and Natural Computation], Vol. 4, No. 3, [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.106.3015&rep=rep1&type=pdf pdf] <ref>[[Jeff Rollason]] ('''2015'''). ''[http://www.aifactory.co.uk/newsletter/2015_02_mixing_immiscible.htm Mixing the Immiscible - MCTS and evaluation]''. [[AI Factory]], October 2015</ref>
 
* [[Guillaume Chaslot]], [[Mark Winands]], [[Jos Uiterwijk]], [[Jaap van den Herik]], [[Bruno Bouzy]] ('''2008'''). ''Progressive Strategies for Monte-Carlo Tree Search''. [http://www.worldscinet.com/nmnc/nmnc.shtml New Mathematics and Natural Computation], Vol. 4, No. 3, [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.106.3015&rep=rep1&type=pdf pdf] <ref>[[Jeff Rollason]] ('''2015'''). ''[http://www.aifactory.co.uk/newsletter/2015_02_mixing_immiscible.htm Mixing the Immiscible - MCTS and evaluation]''. [[AI Factory]], October 2015</ref>
* [[Guillaume Chaslot]], [[Sander Bakkes]], [[István Szita]] and [[Pieter Spronck]] ('''2008'''). ''Monte-Carlo Tree Search: A New Framework for Game AI''. [http://sander.landofsand.com/publications/AIIDE08_Chaslot.pdf pdf]
+
* [[Guillaume Chaslot]], [[Sander Bakkes]], [[István Szita]], [[Pieter Spronck]] ('''2008'''). ''Monte-Carlo Tree Search: A New Framework for Game AI''. [http://sander.landofsand.com/publications/AIIDE08_Chaslot.pdf pdf]
* [[Guillaume Chaslot]], [[Mark Winands]], [[István Szita]], and [[Jaap van den Herik]]. ('''2008'''). ''Cross-entropy for Monte-Carlo Tree Search''. [[ICGA Journal#31_3|ICGA Journal, Vol. 31, No. 3]], [http://www.personeel.unimaas.nl/m-winands/documents/crossmc.pdf pdf]
+
* [[Guillaume Chaslot]], [[Mark Winands]], [[István Szita]], [[Jaap van den Herik]]. ('''2008'''). ''Cross-entropy for Monte-Carlo Tree Search''. [[ICGA Journal#31_3|ICGA Journal, Vol. 31, No. 3]], [http://www.personeel.unimaas.nl/m-winands/documents/crossmc.pdf pdf]
 
* [[Maarten Schadd]], [[Mark Winands]], [[Jaap van den Herik]], [[Huib Aldewereld]] ('''2008'''). ''Addressing NP-Complete Puzzles with Monte-Carlo Methods''. In Volume 9: Proceedings of the AISB 2008 Symposium on Logic and the Simulation of Interaction and Reasoning, pages 55-61, Brighton, UK, 2008. The Society for the study of Artificial Intelligence and Simulation of Behaviour. [http://www.personeel.unimaas.nl/Maarten-Schadd/Papers/2008SameGameAISB.pdf pdf]
 
* [[Maarten Schadd]], [[Mark Winands]], [[Jaap van den Herik]], [[Huib Aldewereld]] ('''2008'''). ''Addressing NP-Complete Puzzles with Monte-Carlo Methods''. In Volume 9: Proceedings of the AISB 2008 Symposium on Logic and the Simulation of Interaction and Reasoning, pages 55-61, Brighton, UK, 2008. The Society for the study of Artificial Intelligence and Simulation of Behaviour. [http://www.personeel.unimaas.nl/Maarten-Schadd/Papers/2008SameGameAISB.pdf pdf]
 
* [[Maarten Schadd]], [[Mark Winands]], [[Jaap van den Herik]], [[Guillaume Chaslot]], [[Jos Uiterwijk]] ('''2008'''). ''[http://link.springer.com/chapter/10.1007/978-3-540-87608-3_1 Single-Player Monte-Carlo Tree Search]''. [[CG 2008]], [http://www.personeel.unimaas.nl/Maarten-Schadd/Papers/2008SameGameCG.pdf pdf]
 
* [[Maarten Schadd]], [[Mark Winands]], [[Jaap van den Herik]], [[Guillaume Chaslot]], [[Jos Uiterwijk]] ('''2008'''). ''[http://link.springer.com/chapter/10.1007/978-3-540-87608-3_1 Single-Player Monte-Carlo Tree Search]''. [[CG 2008]], [http://www.personeel.unimaas.nl/Maarten-Schadd/Papers/2008SameGameCG.pdf pdf]
Line 86: Line 86:
 
* [[Keh-Hsun Chen|Ken Chen]], [[Dawei Du]], [[Peigang Zhang]] ('''2008'''). ''[http://link.springer.com/chapter/10.1007/978-3-540-87608-3_9 A Fast Indexing Method for Monte-Carlo Go]''. [[CG 2008]]
 
* [[Keh-Hsun Chen|Ken Chen]], [[Dawei Du]], [[Peigang Zhang]] ('''2008'''). ''[http://link.springer.com/chapter/10.1007/978-3-540-87608-3_9 A Fast Indexing Method for Monte-Carlo Go]''. [[CG 2008]]
 
* [[Yizao Wang]], [[Jean-Yves Audibert]], [[Rémi Munos]] ('''2008'''). ''Algorithms for Infinitely Many-Armed Bandits''. Advances in Neural Information Processing Systems, [http://www.stat.lsa.umich.edu/%7Eyizwang/publications/wang08algorithms.pdf pdf], Supplemental material - [http://www.stat.lsa.umich.edu/%7Eyizwang/publications/wang08algorithmsSupp.pdf pdf]
 
* [[Yizao Wang]], [[Jean-Yves Audibert]], [[Rémi Munos]] ('''2008'''). ''Algorithms for Infinitely Many-Armed Bandits''. Advances in Neural Information Processing Systems, [http://www.stat.lsa.umich.edu/%7Eyizwang/publications/wang08algorithms.pdf pdf], Supplemental material - [http://www.stat.lsa.umich.edu/%7Eyizwang/publications/wang08algorithmsSupp.pdf pdf]
* [http://en.scientificcommons.org/james_h_brodeur James H. Brodeur], [http://en.scientificcommons.org/benjamin_e_childs Benjamin E. Childs] and [[Levente Kocsis]] ('''2008'''). ''Transpositions and Move Groups in Monte Carlo Tree Search.'' [http://eprints.pascal-network.org/archive/00004571/01/8057.pdf pdf]
+
* [http://en.scientificcommons.org/james_h_brodeur James H. Brodeur], [http://en.scientificcommons.org/benjamin_e_childs Benjamin E. Childs], [[Levente Kocsis]] ('''2008'''). ''Transpositions and Move Groups in Monte Carlo Tree Search.'' [http://eprints.pascal-network.org/archive/00004571/01/8057.pdf pdf]
* [[Hilmar Finnsson]] and [[Yngvi Björnsson]]. ('''2008'''). ''Simulation-Based Approach to General Game Playing.'' In The Twenty-Third AAAI Conference on Artificial Intelligence, [[AAAI]] Press, 2008. Accepted. [http://www.ru.is/faculty/yngvi/pdf/FinnssonB08a.pdf pdf], [http://www.aaai.org/Papers/AAAI/2008/AAAI08-041.pdf pdf] » [[General Game Playing]]
+
* [[Hilmar Finnsson]], [[Yngvi Björnsson]]. ('''2008'''). ''Simulation-Based Approach to General Game Playing.'' In The Twenty-Third AAAI Conference on Artificial Intelligence, [[AAAI]] Press, 2008. Accepted. [http://www.ru.is/faculty/yngvi/pdf/FinnssonB08a.pdf pdf], [http://www.aaai.org/Papers/AAAI/2008/AAAI08-041.pdf pdf] » [[General Game Playing]]
 
* [[Jean Méhat]], [[Tristan Cazenave]] ('''2008'''). ''Monte-Carlo Tree Search for General Game Playing''. [http://www.lamsade.dauphine.fr/~cazenave/papers/ggp2008.pdf pdf] » [[General Game Playing]]
 
* [[Jean Méhat]], [[Tristan Cazenave]] ('''2008'''). ''Monte-Carlo Tree Search for General Game Playing''. [http://www.lamsade.dauphine.fr/~cazenave/papers/ggp2008.pdf pdf] » [[General Game Playing]]
 
* [[Tristan Cazenave]], [[Nicolas Jouandeau]] ('''2008'''). ''A Parallel Monte-Carlo Tree Search Algorithm''. [http://www.lamsade.dauphine.fr/%7Ecazenave/papers/parallelMCTS.pdf pdf]
 
* [[Tristan Cazenave]], [[Nicolas Jouandeau]] ('''2008'''). ''A Parallel Monte-Carlo Tree Search Algorithm''. [http://www.lamsade.dauphine.fr/%7Ecazenave/papers/parallelMCTS.pdf pdf]
Line 97: Line 97:
 
* [[Guillaume Chaslot]], [[Christophe Fiter]], [[Jean-Baptiste Hoock]], [[Arpad Rimmel]], [[Olivier Teytaud]] ('''2009'''). ''Adding Expert Knowledge and Exploration in Monte-Carlo Tree Search''. [[Advances in Computer Games 12]], [http://www.personeel.unimaas.nl/g-chaslot/papers/acg09.pdf pdf], [http://www.lri.fr/%7Erimmel/publi/EK_explo.pdf pdf]
 
* [[Guillaume Chaslot]], [[Christophe Fiter]], [[Jean-Baptiste Hoock]], [[Arpad Rimmel]], [[Olivier Teytaud]] ('''2009'''). ''Adding Expert Knowledge and Exploration in Monte-Carlo Tree Search''. [[Advances in Computer Games 12]], [http://www.personeel.unimaas.nl/g-chaslot/papers/acg09.pdf pdf], [http://www.lri.fr/%7Erimmel/publi/EK_explo.pdf pdf]
 
* [[Guillaume Chaslot]], [[Jean-Baptiste Hoock]], [[Julien Pérez]], [[Arpad Rimmel]], [[Olivier Teytaud]], [[Mark Winands]] ('''2009'''). ''Meta Monte-Carlo Tree Search for Automatic Opening Book Generation''. [http://www.personeel.unimaas.nl/m-winands/documents/ouvertures9x9.pdf pdf]
 
* [[Guillaume Chaslot]], [[Jean-Baptiste Hoock]], [[Julien Pérez]], [[Arpad Rimmel]], [[Olivier Teytaud]], [[Mark Winands]] ('''2009'''). ''Meta Monte-Carlo Tree Search for Automatic Opening Book Generation''. [http://www.personeel.unimaas.nl/m-winands/documents/ouvertures9x9.pdf pdf]
* [[Markus Enzenberger]] and [[Martin Müller]] ('''2009'''). ''A lock-free multithreaded Monte-Carlo tree search algorithm'', [[Advances in Computer Games 12]], [http://webdocs.cs.ualberta.ca/%7Emmueller/ps/enzenberger-mueller-acg12.pdf pdf]
+
* [[Markus Enzenberger]], [[Martin Müller]] ('''2009'''). ''A lock-free multithreaded Monte-Carlo tree search algorithm'', [[Advances in Computer Games 12]], [http://webdocs.cs.ualberta.ca/%7Emmueller/ps/enzenberger-mueller-acg12.pdf pdf]
* [[Markus Enzenberger]] and [[Martin Müller]] ('''2009'''). ''[http://www.cs.ualberta.ca/research/theses-publications/technical-reports/2009/tr09-08 Fuego - An Open-source Framework for Board Games and Go Engine Based on Monte-Carlo Tree Search]'', [http://www.cs.ualberta.ca/system/files/tech_report/2009/TR09-08.pdf pdf]
+
* [[Markus Enzenberger]], [[Martin Müller]] ('''2009'''). ''[http://www.cs.ualberta.ca/research/theses-publications/technical-reports/2009/tr09-08 Fuego - An Open-source Framework for Board Games and Go Engine Based on Monte-Carlo Tree Search]'', [http://www.cs.ualberta.ca/system/files/tech_report/2009/TR09-08.pdf pdf]
 
* [[Rémi Coulom]] ('''2009'''). ''The Monte-Carlo Revolution in Go''. JFFoS'2008: Japanese-French Frontiers of Science Symposium, [http://remi.coulom.free.fr/JFFoS/JFFoS.pdf slides as pdf]
 
* [[Rémi Coulom]] ('''2009'''). ''The Monte-Carlo Revolution in Go''. JFFoS'2008: Japanese-French Frontiers of Science Symposium, [http://remi.coulom.free.fr/JFFoS/JFFoS.pdf slides as pdf]
* [[Mark Winands]] and [[Yngvi Björnsson]] ('''2009'''). ''Evaluation Function Based Monte-Carlo LOA''. [http://www.ru.is/faculty/yngvi/pdf/WinandsB09.pdf pdf] <ref>[http://www.open-chess.org/viewtopic.php?f=5&t=886 Monte Carlo in LOA] by [[Mark Watkins]], [[Computer Chess Forums|Open Chess Forum]], December 30, 2010</ref>  
+
* [[Mark Winands]], [[Yngvi Björnsson]] ('''2009'''). ''Evaluation Function Based Monte-Carlo LOA''. [http://www.ru.is/faculty/yngvi/pdf/WinandsB09.pdf pdf] <ref>[http://www.open-chess.org/viewtopic.php?f=5&t=886 Monte Carlo in LOA] by [[Mark Watkins]], [[Computer Chess Forums|Open Chess Forum]], December 30, 2010</ref>  
 
* [[Tristan Cazenave]] ('''2009'''). ''Nested Monte-Carlo Search''. [http://www.informatik.uni-trier.de/~ley/db/conf/ijcai/ijcai2009.html#Cazenave09 IJCAI 2009], [http://www.lamsade.dauphine.fr/~cazenave/papers/nested.pdf pdf]
 
* [[Tristan Cazenave]] ('''2009'''). ''Nested Monte-Carlo Search''. [http://www.informatik.uni-trier.de/~ley/db/conf/ijcai/ijcai2009.html#Cazenave09 IJCAI 2009], [http://www.lamsade.dauphine.fr/~cazenave/papers/nested.pdf pdf]
 
* [[Paolo Ciancarini]], [[Gian Piero Favini]] ('''2009'''). ''Monte Carlo Tree Search Techniques in the Game of Kriegspiel''. [http://www.informatik.uni-trier.de/~ley/db/conf/ijcai/ijcai2009.html IJCAI 2009], [http://ijcai.org/papers09/Papers/IJCAI09-086.pdf pdf] » [[KriegSpiel]]
 
* [[Paolo Ciancarini]], [[Gian Piero Favini]] ('''2009'''). ''Monte Carlo Tree Search Techniques in the Game of Kriegspiel''. [http://www.informatik.uni-trier.de/~ley/db/conf/ijcai/ijcai2009.html IJCAI 2009], [http://ijcai.org/papers09/Papers/IJCAI09-086.pdf pdf] » [[KriegSpiel]]
Line 114: Line 114:
 
==2010 ...==  
 
==2010 ...==  
 
* [[Julien Kloetzer]] ('''2010'''). ''[https://dspace.jaist.ac.jp/dspace/handle/10119/8867?locale=en Monte-Carlo Techniques: Applications to the Game of the Amazons]''. Ph.D. thesis, [[JAIST]]
 
* [[Julien Kloetzer]] ('''2010'''). ''[https://dspace.jaist.ac.jp/dspace/handle/10119/8867?locale=en Monte-Carlo Techniques: Applications to the Game of the Amazons]''. Ph.D. thesis, [[JAIST]]
* [[Yoshikuni Sato]], [[Daisuke Takahashi]] and [[Reijer Grimbergen]] ('''2010'''). ''A Shogi Program based on Monte-Carlo Tree Search''. [[ICGA Journal#33_2|ICGA Journal, Vol. 33, No. 2]]
+
* [[Yoshikuni Sato]], [[Daisuke Takahashi]], [[Reijer Grimbergen]] ('''2010'''). ''A Shogi Program based on Monte-Carlo Tree Search''. [[ICGA Journal#33_2|ICGA Journal, Vol. 33, No. 2]]
 
* [[Richard J. Lorentz]] ('''2010'''). ''[http://www.springerlink.com/content/p4x16832317r1214/ Improving Monte-Carlo Tree Search in Havannah]''. [[CG 2010]]
 
* [[Richard J. Lorentz]] ('''2010'''). ''[http://www.springerlink.com/content/p4x16832317r1214/ Improving Monte-Carlo Tree Search in Havannah]''. [[CG 2010]]
 
* [[Amine Bourki]], [[Guillaume Chaslot]], [[Matthieu Coulm]], [[Vincent Danjean]], [[Hassen Doghmen]], [[Thomas Hérault]], [[Jean-Baptiste Hoock]], [[Arpad Rimmel]], [[Fabien Teytaud]], [[Olivier Teytaud]], [[Paul Vayssière]], [[Ziqin Yu]] ('''2010'''). ''[http://hal.inria.fr/inria-00512854/en/ Scalability and Parallelization of Monte-Carlo Tree Search]''. [[CG 2010]], [http://hal.inria.fr/docs/00/51/28/54/PDF/newcluster.pdf pdf]
 
* [[Amine Bourki]], [[Guillaume Chaslot]], [[Matthieu Coulm]], [[Vincent Danjean]], [[Hassen Doghmen]], [[Thomas Hérault]], [[Jean-Baptiste Hoock]], [[Arpad Rimmel]], [[Fabien Teytaud]], [[Olivier Teytaud]], [[Paul Vayssière]], [[Ziqin Yu]] ('''2010'''). ''[http://hal.inria.fr/inria-00512854/en/ Scalability and Parallelization of Monte-Carlo Tree Search]''. [[CG 2010]], [http://hal.inria.fr/docs/00/51/28/54/PDF/newcluster.pdf pdf]

Revision as of 23:01, 6 March 2019

Home * Search * Monte-Carlo Tree Search

Edvard Munch - At the Roulette Table in Monte Carlo [1]

Monte Carlo Tree Search, (Monte-Carlo Tree Search, MCTS)
is a Best-First search algorithm based on random playouts. In conjunction with UCT (Upper Confidence bounds applied to Trees) Monte-Carlo Tree Search has yielded in a breakthrough in Computer Go [2], and is also successful in Amazons [3] [4], Lines of Action [5], Havannah [6], Hex [7], Checkers [8] and other Games with some difficulties in position evaluation, but until December 2017, when a Google DeepMind team reported on AlphaZero [9], not for Chess [10] [11].

MCTS is based on randomized explorations of the search space. Using the results of previous explorations, the algorithm gradually grows a game tree in memory, and successively becomes better at accurately estimating the values of the most promising moves [12].

Four Phases

MCTS consists of four strategic phases, repeated as long as there is time left [13] :

  1. In the Selection phase the tree is traversed from the root node until it selects a leaf node that is not added to the tree yet
  2. The Expansion strategy adds the leaf node to the tree
  3. The Simulation strategy plays moves in self-play until the end of the game. The result is either 1, 0 ,-1
  4. The Backpropagation strategy propagates the results through the tree
MCTS (English).svg

Steps of Monte Carlo Tree Search [14]

Pure Monte-Carlo search

Pure Monte-Carlo search with parameter T means that for each feasible move T random games are generated. The move with the best average score is played. A game is called “Monte Carlo perfect” when this procedure converges to perfect play for each position, when T goes to infinity. However, with limited time per move, increasing T does not guarantee to find a better move [15].

UCT

UCT (Upper Confidence bounds applied to Trees) deals with the flaw of Monte-Carlo Tree Search, when a program may favor a losing move with only one or a few forced refutations, but due to the vast majority of other moves provides a better random playout score than other, better moves [16].

See also

Publications

1987

1990 ...

2000 ...

2005 ...

2006

2007

2008

2009

2010 ...

2011

2012

2013

2014

2015 ...

2016

2017

2018

2019

Forum Posts

2010 ...

2015 ...

2017

2018

2019

Re: Training the trainer: how is it done for Stockfish? by Graham Jones, CCC, March 03, 2019 » Leela Chess Zero

External Links

Monte Carlo Tree Search

GitHub - suragnair/alpha-zero-general: A clean and simple implementation of a self-play learning algorithm based on AlphaGo Zero (any game, any framework!)

Monte Carlo Misc

Monte Carlo algorithm
Monte Carlo method
Monte Carlo
Monte Carlo Casino
feat. Raúl Midón, Roy Hargrove and the Monte-Carlo Philharmonic Orchestra, November 29, 2008, Monte-Carlo Jazz Festival

References

  1. Edvard Munch, At the Roulette Table in Monte Carlo, 1893, Edvard Munch from Wikipedia
  2. Rémi Coulom (2009). The Monte-Carlo Revolution in Go. JFFoS'2008: Japanese-French Frontiers of Science Symposium, slides as pdf
  3. Julien Kloetzer, Hiroyuki Iida, Bruno Bouzy (2007). The Monte-Carlo approach in Amazons. CGW 2007
  4. Richard J. Lorentz (2008). Amazons Discover Monte-Carlo. CG 2008
  5. Mark Winands, Yngvi Björnsson (2009). Evaluation Function Based Monte-Carlo LOA. pdf
  6. Richard J. Lorentz (2010). Improving Monte-Carlo Tree Search in Havannah. CG 2010
  7. Broderick Arneson, Ryan Hayward, Philip Henderson (2010). Monte Carlo Tree Search in Hex. IEEE Transactions on Computational Intelligence and AI in Games, Vol. 2, No. 4, pdf
  8. UCT surprise for checkers ! by Daniel Shawul, CCC, March 25, 2011
  9. David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis (2017). Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. arXiv:1712.01815
  10. Raghuram Ramanujan, Ashish Sabharwal, Bart Selman (2010). On Adversarial Search Spaces and Sampling-Based Planning. ICAPS 2010
  11. Oleg Arenz (2012). Monte Carlo Chess. B.Sc. thesis, Darmstadt University of Technology, advisor Johannes Fürnkranz, pdf
  12. Guillaume Chaslot, Mark Winands, Jaap van den Herik (2008). Parallel Monte-Carlo Tree Search. CG 2008, pdf
  13. Guillaume Chaslot, Mark Winands, Jaap van den Herik (2008). Parallel Monte-Carlo Tree Search. CG 2008, pdf
  14. Steps of Monte-Carlo Tree Search by Mciura, March 31, 2013, Wikimedia Commons, Monte Carlo tree search from Wikipedia
  15. Ingo Althöfer (2011). On Board-Filling Games with Random-Turn Order and Monte Carlo Perfectness. Advances in Computer Games 13
  16. Levente Kocsis, Csaba Szepesvári (2006). Bandit based Monte-Carlo Planning ECML-06, LNCS/LNAI 4212, pp. 282-293. introducing UCT, pdf
  17. Jeff Rollason (2015). Mixing the Immiscible - MCTS and evaluation. AI Factory, October 2015
  18. Monte Carlo in LOA by Mark Watkins, Open Chess Forum, December 30, 2010
  19. The Settlers of Catan from Wikipedia
  20. Blokus Duo from Wikipedia
  21. Search traps in MCTS and chess by Daniel Shawul, CCC, December 25, 2017
  22. NoGo (ICGA Tournaments)
  23. Ms. Pac-Man from Wikipedia
  24. Scotland Yard (board game) from Wikipedia
  25. Morpion solitaire from Wikipedia
  26. Morpion Solitaire - Record Grids (5T game)
  27. Re: MC methods by Daniel Shawul, CCC, April 13, 2013
  28. Crossings from Wikipedia
  29. Epaminondas from Wikipedia
  30. 7 Wonders (board game) from WIkipedia
  31. Guillaume Chaslot, Mark Winands, Jos Uiterwijk, Jaap van den Herik, Bruno Bouzy (2008). Progressive Strategies for Monte-Carlo Tree Search. New Mathematics and Natural Computation, Vol. 4, No. 3, pdf
  32. Dap Hartmann (2017). Let's Catch the Train to Monte-Carlo. ICGA Journal, Vol. 39, No. 1
  33. Re: Minmax backup operator for MCTS by Brahim Hamadicharef, CCC, December 30, 2017
  34. Re: Announcing lczero by Daniel Shawul, CCC, January 21, 2018 » Leela Chess Zero
  35. GitHub - suragnair/alpha-zero-general: A clean and simple implementation of a self-play learning algorithm based on AlphaGo Zero (any game, any framework!)
  36. GitHub - dkappe/leela_lite: A toolkit for experimenting with UCT and Leela Chess nets in Python by Dietrich Kappe
  37. A Simple Alpha(Go) Zero Tutorial by Oliver Roese, CCC, December 30, 2017
  38. David Silver, Gerald Tesauro (2009). Monte-Carlo Simulation Balancing. ICML 2009, pdf

Up one level