Changes

AlphaZero

11 bytes added, 22:20, 9 December 2018

no edit summary

was won by AlphaZero, using a single machine with 4 [https://en.wikipedia.org/wiki/Tensor_processing_unit#First_generation first-generation TPUs], with a score of +28=72-0; 10 games were published. Despite a possible hardware advantage for AlphaZero and criticized playing conditions <ref>[http://www.open-chess.org/viewtopic.php?f=5&t=3153 Alpha Zero] by [[Mark Watkins|BB+]], [[Computer Chess Forums|OpenChess Forum]], December 06, 2017</ref>, this is a tremendous achievement.
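The result notation +28=72-0 reads as 28 wins, 72 draws and 0 losses. As a small illustration, the following Python sketch parses such a string and computes the number of games and the points scored, counting a draw as half a point. The function name <code>parse_score</code> is hypothetical and chosen only for this example.

```python
import re

def parse_score(notation: str) -> tuple[int, int, int]:
    """Parse a '+W=D-L' result string into (wins, draws, losses)."""
    match = re.fullmatch(r"\+(\d+)=(\d+)-(\d+)", notation)
    if match is None:
        raise ValueError(f"not a +W=D-L score: {notation!r}")
    wins, draws, losses = (int(group) for group in match.groups())
    return wins, draws, losses

wins, draws, losses = parse_score("+28=72-0")
games = wins + draws + losses   # total number of games played
points = wins + draws / 2       # a win counts 1 point, a draw 1/2
```

For the match above this gives 100 games and 64 points for AlphaZero.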

In the final [https://en.wikipedia.org/wiki/Peer_review peer-reviewed] paper, published in [https://en.wikipedia.org/wiki/Science_(journal) Science magazine] in December 2018 <ref>[[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2018'''). ''[http://science.sciencemag.org/content/362/6419/1140 A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play]''. [https://en.wikipedia.org/wiki/Science_(journal) Science], Vol. 362, No. 6419</ref> along with supplementary materials <ref>[http://science.sciencemag.org/content/suppl/2018/12/05/362.6419.1140.DC1 Supplementary Materials]</ref>, a 1000-game match was reported, with about 200 games published. The opponents were the most recent Stockfish versions available at the time of the matches: Stockfish 8, a development version as of January 13, 2018 (close to Stockfish 9), [[Brainfish]] with the [[Cerebellum]] book, and Stockfish 9. In total, AlphaZero won 155 games and lost 6.

Stockfish was configured according to its [[TCEC Season 9#Superfinal|2016 TCEC Season 9 superfinal]] settings: 44 threads on 44 cores (two 2.2 GHz [[Intel]] [https://en.wikipedia.org/wiki/Xeon#E3-12xx_v4_series_%22Broadwell-WS%22 Xeon Broadwell] [[x86-64]] CPUs with 22 cores each, running [[Linux]]), a transposition table size of 32 GiB, and 6-men [[Syzygy Bases|Syzygy bases]]. The time control was 3 hours per side per game plus a 15-second increment per move. AlphaZero used a simple time control strategy: it thought for 1/20th of its remaining time on each move and selected moves greedily with respect to the root visit count. Each MCTS was executed on a single machine with 4 [https://en.wikipedia.org/wiki/Tensor_processing_unit#First_generation first-generation TPUs].
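The time control strategy and the greedy move selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not DeepMind's implementation: the function names, the dictionary of visit counts with its sample moves, and the rule that the increment is credited after each move are all hypothetical.

```python
def think_time(remaining_seconds: float, divisor: int = 20) -> float:
    """Time budget for the next move: 1/20th of the remaining clock time."""
    return remaining_seconds / divisor

def select_move(root_visit_counts: dict[str, int]) -> str:
    """Greedy selection: the root move with the highest visit count.

    Ties go to the first move encountered (an assumption of this sketch).
    """
    return max(root_visit_counts, key=root_visit_counts.get)

def clock_after_move(remaining_seconds: float, spent_seconds: float,
                     increment: float = 15.0) -> float:
    """Update the clock; crediting the increment after the move is assumed."""
    return remaining_seconds - spent_seconds + increment

# Start of a game at 3 hours per side.
clock = 3 * 60 * 60                      # 10800 seconds
budget = think_time(clock)               # 1/20th of the remaining time
visits = {"e2e4": 310, "d2d4": 420, "g1f3": 70}   # made-up visit counts
move = select_move(visits)
clock = clock_after_move(clock, budget)
```

With a full 3-hour clock the first move gets a budget of 540 seconds (9 minutes), and the budget then shrinks with the remaining time, offset slightly by the 15-second increment.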

GerdIsenberg

