TAAI 2018

* [[Kiminori Matsuzaki]] ('''2018'''). ''Empirical Analysis of PUCT Algorithm with Evaluation Functions of Different Quality''. [[TAAI 2018]]
: [[Monte-Carlo Tree Search|Monte-Carlo tree search]] (MCTS) algorithms play an important role in developing computer players for many games. The performance of MCTS players is often improved by combining them with offline knowledge, i.e., evaluation functions. In particular, [[AlphaGo]] and [[AlphaGo Zero]] recently achieved great success in developing a strong computer [[Go]] player by combining evaluation functions consisting of [[Deep Learning|deep neural networks]] with a variant of [[Christopher D. Rosin#PUCT|PUCT]] (Predictor + UCB applied to trees). The effect of evaluation functions on the strength of MCTS algorithms, however, has not been well investigated, especially in terms of the quality of the evaluation functions. In this study, we address this issue and empirically analyze AlphaGo's PUCT algorithm, using [[Othello]] (Reversi) as the target game. We investigate the strength of PUCT players using variants of an existing evaluation function of a champion-level computer player. From intensive experiments, we found that the PUCT algorithm works very well, especially with a good evaluation function, and that the value function has more importance than the policy function in the PUCT algorithm.
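: For reference, the PUCT selection rule mentioned in the abstract picks, at each tree node, the move maximizing Q(s,a) + c<sub>puct</sub> · P(s,a) · √N(s) / (1 + N(s,a)), where the prior P(s,a) comes from the policy function and Q(s,a) is estimated from backed-up value-function evaluations. The following is a minimal Python sketch of that rule; the Node layout and the c_puct constant are illustrative assumptions, not code from the paper.
<pre>
import math

class Node:
    """One search-tree node; fields follow AlphaGo-style notation (assumed layout)."""
    def __init__(self, prior):
        self.prior = prior        # P(s,a): move probability from the policy function
        self.visit_count = 0      # N(s,a): number of simulations through this node
        self.value_sum = 0.0      # W(s,a): sum of backed-up value-function estimates
        self.children = []

def puct_select(node, c_puct=1.5):
    """Return the child maximizing Q(s,a) + c_puct * P(s,a) * sqrt(N(s)) / (1 + N(s,a))."""
    parent_visits = sum(c.visit_count for c in node.children)
    best, best_score = None, -math.inf
    for child in node.children:
        # Exploitation term: mean backed-up value (0 for unvisited children).
        q = child.value_sum / child.visit_count if child.visit_count else 0.0
        # Exploration term: policy prior, decayed as the child accumulates visits.
        u = c_puct * child.prior * math.sqrt(parent_visits) / (1 + child.visit_count)
        if q + u > best_score:
            best, best_score = child, q + u
    return best
</pre>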
=External Links=
