Changes


Monte-Carlo Tree Search

185 bytes added, 16:36, 1 December 2021

no edit summary

=Playouts by NN=

Historically, MCTS relied on random, noisy playouts, and many of them were needed to evaluate a state accurately. Since [[AlphaGo]] and [[AlphaZero]], this is no longer the case: strong policies and evaluations are now provided by [[Neural Networks|neural networks]] trained with [[Reinforcement Learning]]. In AlphaGo and its descendants, the network policy serves as a prior in the [[Christopher D. Rosin#PUCT|PUCT]] bandit, so that the moves it rates most promising are explored first, and the network evaluations replace the playouts <ref>[[Quentin Cohen-Solal]], [[Tristan Cazenave]] ('''2020'''). ''Minimax Strikes Back''. [https://arxiv.org/abs/2012.10700 arXiv:2012.10700]</ref>.
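The selection rule described above can be sketched as follows. This is a minimal illustration and not the code of any particular engine: the <code>Node</code> class, the helper names and the constant <code>c_puct = 1.5</code> are assumptions made for the example, and the score uses the commonly cited AlphaZero-style form Q + c_puct · P · sqrt(N_parent) / (1 + N_child). The network value of the leaf is backed up directly, taking the place of a random playout.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node; the names are illustrative only."""
    prior: float                  # P(s, a): move probability from the policy network
    visits: int = 0               # N(s, a): number of simulations through this node
    value_sum: float = 0.0        # sum of backed-up values, seen by the player who moved here
    children: dict = field(default_factory=dict)

    def q(self) -> float:
        # Mean backed-up value; 0.0 for an unvisited node (an assumed default).
        return self.value_sum / self.visits if self.visits else 0.0


def puct_score(parent: Node, child: Node, c_puct: float = 1.5) -> float:
    # The exploration bonus is large for high-prior, rarely visited moves.
    bonus = c_puct * child.prior * math.sqrt(parent.visits) / (1 + child.visits)
    return child.q() + bonus


def select_child(parent: Node, c_puct: float = 1.5):
    # Return the (move, child) pair with the highest PUCT score.
    return max(parent.children.items(),
               key=lambda kv: puct_score(parent, kv[1], c_puct))


def backup(path, leaf_value: float) -> None:
    # leaf_value is the network evaluation for the player to move at the leaf.
    # It replaces a playout result; the sign flips at every ply on the way up.
    value = leaf_value
    for node in reversed(path):
        value = -value            # perspective of the player who moved into this node
        node.visits += 1
        node.value_sum += value
```

With a root that has already been visited once and two children with priors 0.8 and 0.2, <code>select_child</code> picks the high-prior move first. After a backup in which the network rates the resulting position as good for the opponent, the exploration bonus of the other move outweighs it and that move is tried next.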

=See also=

* [[Tobias Joppen]], [[Johannes Fürnkranz]] ('''2019'''). ''[https://www.groundai.com/project/ordinal-monte-carlo-tree-search/ Ordinal Monte Carlo Tree Search]''. [[Darmstadt University of Technology|TU Darmstadt]], [https://arxiv.org/abs/1901.04274 arXiv:1901.04274]

* [[Herilalaina Rakotoarison]], [[Marc Schoenauer]], [[Michèle Sebag]] ('''2019'''). ''Automated Machine Learning with Monte-Carlo Tree Search''. [https://arxiv.org/abs/1906.00170 arXiv:1906.00170]

* [[Aline Hufschmitt]], [[Jean-Noël Vittaut]], [[Nicolas Jouandeau]] ('''2019'''). ''Exploiting Game Decompositions in Monte Carlo Tree Search''. [[Advances in Computer Games 16]]

==2020 ...==

* [[Johannes Czech]], [[Patrick Korus]], [[Kristian Kersting]] ('''2020'''). ''Monte-Carlo Graph Search for AlphaZero''. [https://arxiv.org/abs/2012.11045 arXiv:2012.11045] » [[AlphaZero]], [[CrazyAra]]

GerdIsenberg (Bureaucrats, Administrators; 25,161 edits)
