AlphaZero
AlphaZero is a chess, shogi, and Go playing entity by Google DeepMind, based on a general reinforcement learning algorithm of the same name. On December 5, 2017 [1], the DeepMind team led by David Silver, Thomas Hubert, and Julian Schrittwieser, along with former Giraffe author Matthew Lai, reported on their generalized algorithm, which combines deep learning with Monte-Carlo Tree Search (MCTS) [2]. The final peer-reviewed paper, with various clarifications, was published almost one year later in the journal Science under the title A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play [3].
„Zero ist die Stille. Zero ist der Anfang. Zero ist rund. Zero dreht sich. Zero ist der Mond. Die Sonne ist Zero. Zero ist weiss. Die Wüste Zero. Der Himmel über Zero. Die Nacht –, Zero fließt. Das Auge Zero. Nabel. Mund. Kuß. Die Milch ist rund. Die Blume Zero der Vogel. Schweigend. Schwebend. Ich esse Zero, ich trinke Zero, ich schlafe Zero, ich wache Zero, ich liebe Zero. Zero ist schön, dynamo, dynamo, dynamo. Die Bäume im Frühling, der Schnee, Feuer, Wasser, Meer. Rot orange gelb grün indigo blau violett Zero Zero Regenbogen. 4 3 2 1 Zero. Gold und Silber, Schall und Rauch. Wanderzirkus Zero. Zero ist die Stille. Zero ist der Anfang. Zero ist rund. Zero ist Zero.“ [5]
Network Architecture
The deep neural network consists of a “body” with input and hidden layers of spatial NxN planes, 8x8 board arrays for chess, followed by both policy and value “heads” [6] [7]. Each square of the input plane holds 6x2 piece-type and color bits of the current chess position from the current player's point of view, plus two bits of a repetition counter related to the draw rule. To address graph history and path-dependency issues, these 14 bits are provided eight times, for the current position and up to seven predecessor positions, so that en passant and some sense of progress are implicitly encoded. Seven additional input bits encode the castling rights, total move count, side to move, and the no-progress count, yielding 119 bits per square cell for chess.
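The following minimal Python sketch illustrates how such an input tensor could be assembled, assuming the 8x8x119 layout described above (14 feature planes per history position plus 7 auxiliary planes); the exact plane ordering and the helper data structures are illustrative, not DeepMind's actual encoding.

```python
import numpy as np

N_SQUARES = 8                 # board is 8x8
HISTORY = 8                   # current position plus up to seven predecessors
PLANES_PER_POSITION = 14      # 6 piece types x 2 colours + 2 repetition planes
AUX_PLANES = 7                # colour, total move count, 2x2 castling rights, no-progress count
TOTAL_PLANES = HISTORY * PLANES_PER_POSITION + AUX_PLANES   # = 119

def encode_position(history, side_to_move, move_count, castling, no_progress):
    """Illustrative encoding of a chess position into a 119x8x8 tensor.

    history   : list of up to 8 (piece_planes, repetitions) pairs, most recent first,
                each piece_planes being a 12x8x8 0/1 array oriented from the
                side to move's point of view.
    castling  : 4 booleans (own king-side, own queen-side, opponent king-side, opponent queen-side).
    """
    x = np.zeros((TOTAL_PLANES, N_SQUARES, N_SQUARES), dtype=np.float32)
    for t, (piece_planes, repetitions) in enumerate(history[:HISTORY]):
        base = t * PLANES_PER_POSITION
        x[base:base + 12] = piece_planes                  # 12 one-hot piece planes
        x[base + 12] = float(repetitions >= 1)            # position already seen once
        x[base + 13] = float(repetitions >= 2)            # position already seen twice
    aux = HISTORY * PLANES_PER_POSITION
    x[aux + 0] = float(side_to_move)                      # constant plane: colour to move
    x[aux + 1] = move_count                               # constant plane: total move count
    for i, right in enumerate(castling):
        x[aux + 2 + i] = float(right)                     # constant planes: castling rights
    x[aux + 6] = no_progress                              # constant plane: no-progress counter
    return x
```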
The body consists of a rectified, batch-normalized convolutional layer followed by 19 residual blocks. Each such block consists of two rectified, batch-normalized convolutional layers with a skip connection [8] [9]. Each convolution applies 256 filters (shared weight vectors) of kernel size 3x3 with stride 1. Consecutive convolutions connect the pieces on different squares to each other: each cell of a layer is connected to the corresponding 3x3 receptive field of the previous layer, so that after four convolutions every square is connected to every other cell of the original input layer [10].
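As an illustration, here is a minimal PyTorch-style sketch of the body and one residual block, assuming the 256-filter, 3x3, stride-1 configuration described above; class names and padding choices are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """One AlphaZero-style residual block: two rectified, batch-normalized
    3x3 convolutions plus a skip connection adding the block input to its output."""
    def __init__(self, channels=256):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        y = F.relu(self.bn1(self.conv1(x)))
        y = self.bn2(self.conv2(y))
        return F.relu(y + x)          # skip connection, then rectification

class Body(nn.Module):
    """Initial rectified, batch-normalized convolution followed by 19 residual blocks."""
    def __init__(self, in_planes=119, channels=256, blocks=19):
        super().__init__()
        self.conv = nn.Conv2d(in_planes, channels, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(channels)
        self.blocks = nn.Sequential(*[ResidualBlock(channels) for _ in range(blocks)])

    def forward(self, x):
        return self.blocks(F.relu(self.bn(self.conv(x))))
```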
The policy head applies an additional rectified, batch-normalized convolutional layer, followed by a final convolution of 73 filters for chess. The final policy output is again represented as an 8x8 board array, with up to 73 target-square possibilities for every origin square (NRayDirs x MaxRayLength + NKnightDirs + NPawnDirs * NMinorPromotions), encoding a probability distribution over 64x73 = 4,672 possible moves. Illegal moves are masked out by setting their probabilities to zero and re-normalising the probabilities of the remaining moves. The value head applies an additional rectified, batch-normalized convolution of 1 filter of kernel size 1x1 with stride 1, followed by a rectified linear layer of size 256 and a tanh-linear layer of size 1.
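Continuing the sketch above, the two heads might look as follows; the filter counts and layer sizes follow the description in the text, while the kernel sizes inside the policy head are an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PolicyHead(nn.Module):
    """Maps the 256-plane body output to logits over 64x73 = 4,672 candidate moves.
    Illegal moves are masked afterwards by zeroing their probabilities and re-normalising."""
    def __init__(self, channels=256, move_planes=73):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, move_planes, kernel_size=3, stride=1, padding=1)

    def forward(self, x):
        y = F.relu(self.bn1(self.conv1(x)))
        return self.conv2(y).flatten(1)       # shape (batch, 73*8*8) = (batch, 4672)

class ValueHead(nn.Module):
    """Maps the body output to a scalar evaluation in (-1, 1)."""
    def __init__(self, channels=256, hidden=256, board_squares=64):
        super().__init__()
        self.conv = nn.Conv2d(channels, 1, kernel_size=1, stride=1, bias=False)
        self.bn = nn.BatchNorm2d(1)
        self.fc1 = nn.Linear(board_squares, hidden)
        self.fc2 = nn.Linear(hidden, 1)

    def forward(self, x):
        y = F.relu(self.bn(self.conv(x))).flatten(1)   # 1x8x8 plane -> 64 features
        y = F.relu(self.fc1(y))
        return torch.tanh(self.fc2(y))
```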
Training
AlphaZero was trained in 700,000 steps, i.e. mini-batches of size 4096 each, starting from randomly initialized parameters, using 5,000 first-generation TPUs [11] to generate self-play games and 64 second-generation TPUs [12] [13] [14] to train the neural networks [15].
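The training objective reported in the referenced papers combines a mean-squared value error, a policy cross-entropy against the MCTS visit distribution, and L2 weight regularization. Below is a minimal sketch of one such update step; the network interface (returning policy logits and a value) and the hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def training_step(net, optimizer, batch, l2_coeff=1e-4):
    """One gradient step on a mini-batch of self-play data (sketch).

    batch: (positions, target_policies, outcomes) where
      positions       - float tensor of shape (B, 119, 8, 8)
      target_policies - MCTS visit-count distributions, shape (B, 4672)
      outcomes        - game results z in {-1, 0, +1} from the mover's view, shape (B, 1)
    """
    positions, target_policies, outcomes = batch
    policy_logits, value = net(positions)

    value_loss = F.mse_loss(value, outcomes)                               # (z - v)^2
    policy_loss = -(target_policies * F.log_softmax(policy_logits, dim=1)).sum(dim=1).mean()
    l2_loss = l2_coeff * sum((p * p).sum() for p in net.parameters())      # c * ||theta||^2

    loss = value_loss + policy_loss + l2_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```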
Stockfish Match
As mentioned in the December 2017 paper [16], a 100-game match versus Stockfish 8, running with 64 threads and a transposition table size of 1 GiB, was won by AlphaZero with +28=72-0, playing on a single machine with 4 first-generation TPUs; 10 of the games were published. Despite a possible hardware advantage for AlphaZero and criticized playing conditions [17], this was a tremendous achievement.
In the final peer-reviewed paper, published in Science in December 2018 [18] along with supplementary materials [19], a 1000-game match was reported, with about 200 games published. The opponents were the most recent Stockfish versions available at the time of the matches, that is Stockfish 8, a development version as of January 13, 2018 close to Stockfish 9, Brainfish with the Cerebellum book, and Stockfish 9. In total, AlphaZero won 155 games and lost 6.
Stockfish was configured according to its 2016 TCEC Season 9 superfinal settings: 44 threads on 44 cores (two 2.2 GHz Intel Xeon Broadwell x86-64 CPUs with 22 cores each, running Linux), a transposition table size of 32 GiB, and 6-men Syzygy bases. Time control was 3 hours per side and game plus 15 seconds increment per move. AlphaZero used a simple time-control strategy, thinking for 1/20th of the remaining time, and selected moves greedily with respect to the root visit count. Each MCTS was executed on a single machine with 4 first-generation TPUs.
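A tiny sketch of this time-control and move-selection rule; the node attributes (children, visit_count, move) are assumed to come from whatever MCTS implementation is in use.

```python
def think_time(remaining_seconds):
    """AlphaZero's reported rule of thumb: spend 1/20th of the remaining clock on the next move."""
    return remaining_seconds / 20.0

def select_move(root):
    """Greedy selection with respect to root visit count:
    play the child of the search root that received the most MCTS visits."""
    return max(root.children, key=lambda child: child.visit_count).move
```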
Neither AlphaZero nor Stockfish (except Brainfish) used an opening book. Instead, 12 common human opening positions as well as the 2016 TCEC Season 9 superfinal positions, originally selected by Jeroen Noomen [20], were played. To ensure diversity against opponents with a deterministic opening book (Brainfish), AlphaZero used a small amount of randomization in its opening moves. This avoided duplicate games but also resulted in more losses by AlphaZero.
See also
Publications
2017
- David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, Yutian Chen, Timothy Lillicrap, Fan Hui, Laurent Sifre, George van den Driessche, Thore Graepel, Demis Hassabis (2017). Mastering the game of Go without human knowledge. Nature, Vol. 550, pdf
- David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis (2017). Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. arXiv:1712.01815
2018
- George Rajna (2018). AlphaZero Just Playing. viXra:1802.0330
- Vinton G. Cerf (2018). On Neural Networks. Communications of the ACM, Vol. 61, No. 7
- Hermann Kaindl (2018). Comment - Lookahead Search for Computer Chess. Communications of the ACM, Vol. 61, No. 12
- Garry Kasparov (2018). Chess, a Drosophila of reasoning. Science, Vol. 362, No. 6419
- Murray Campbell (2018). Mastering board games. Science, Vol. 362, No. 6419
- Chu-Hsuan Hsueh, I-Chen Wu, Jr-Chang Chen, Tsan-sheng Hsu (2018). AlphaZero for a Non-Deterministic Game. TAAI 2018 » Chinese Dark Chess
- Nai-Yuan Chang, Chih-Hung Chen, Shun-Shii Lin, Surag Nair (2018). The Big Win Strategy on Multi-Value Network: An Improvement over AlphaZero Approach for 6x6 Othello. MLMI2018
- Yen-Chi Chen, Chih-Hung Chen, Shun-Shii Lin (2018). Exact-Win Strategy for Overcoming AlphaZero. CIIS 2018 [21]
- David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis (2018). A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science, Vol. 362, No. 6419 [22]
2019
- Matthew Sadler, Natasha Regan (2019). Game Changer: AlphaZero's Groundbreaking Chess Strategies and the Promise of AI. New In Chess
- Marc Lanctot, Edward Lockhart, Jean-Baptiste Lespiau, Vinícius Flores Zambaldi, Satyaki Upadhyay, Julien Pérolat, Sriram Srinivasan, Finbarr Timbers, Karl Tuyls, Shayegan Omidshafiei, Daniel Hennes, Dustin Morrill, Paul Muller, Timo Ewalds, Ryan Faulkner, János Kramár, Bart De Vylder, Brennan Saeta, James Bradbury, David Ding, Sebastian Borgeaud, Matthew Lai, Julian Schrittwieser, Thomas Anthony, Edward Hughes, Ivo Danihelka, Jonah Ryan-Davis (2019). OpenSpiel: A Framework for Reinforcement Learning in Games. arXiv:1908.09453 [23]
2020 ...
- Nenad Tomašev, Ulrich Paquet, Demis Hassabis, Vladimir Kramnik (2020). Assessing Game Balance with AlphaZero: Exploring Alternative Rule Sets in Chess. arXiv:2009.04374
- Johannes Czech, Patrick Korus, Kristian Kersting (2020). Monte-Carlo Graph Search for AlphaZero. arXiv:2012.11045 » CrazyAra
- Johannes Czech, Patrick Korus, Kristian Kersting (2021). Improving AlphaZero Using Monte-Carlo Graph Search. Proceedings of the Thirty-First International Conference on Automated Planning and Scheduling, Vol. 31, pdf
- Dominik Klein (2021). Neural Networks For Chess. Release Version 1.1 · GitHub [24]
- Thomas McGrath, Andrei Kapishnikov, Nenad Tomašev, Adam Pearce, Demis Hassabis, Been Kim, Ulrich Paquet, Vladimir Kramnik (2021). Acquisition of Chess Knowledge in AlphaZero. arXiv:2111.09259 [25]
- Nenad Tomašev, Ulrich Paquet, Demis Hassabis, Vladimir Kramnik (2022). Reimagining Chess with AlphaZero. Communications of the ACM, Vol. 65, No. 2
Forum Posts
2017
- Google's AlphaGo team has been working on chess by Peter Kappler, CCC, December 06, 2017
- Historic Milestone: AlphaZero by Miguel Castanuela, CCC, December 06, 2017
- AlphaZero beats AlphaGo Zero, Stockfish, and Elmo by Carl Lumma, CCC, December 06, 2017
- AlphaZero vs Stockfish by Bigler, CCC, December 06, 2017
- Deepmind drops the bomb by Leebot, FishCooking, December 06, 2017
- AlphaZero beats Stockfish 8 by 64-36 by Venator, Rybka Forum, December 06, 2017
- Alpha Zero by BB+, OpenChess Forum, December 06, 2017
- AlphaGo Zero And AlphaZero, RomiChess done better by Michael Sherwin, CCC, December 07, 2017 » RomiChess
- BBC News; 'Google's ... DeepMind AI claims chess crown' by pennine22, Hiarcs Forum, December 07, 2017
- Press Release Stockfish vs. AlphaZero by Michael Whiteley, FishCooking, December 08, 2017
- AlphaZero reinvents mobility and romanticism by Chris Whittington, Rybka Forum, December 08, 2017 » Alpha Zero's "Immortal Zugzwang Game"
- Reactions about AlphaZero from top GMs... by Norman Schmidt, CCC, December 08, 2017 » Reactions From Top GMs, Stockfish Author
- AlphaZero is not like other chess programs by Dann Corbit, CCC, December 08, 2017
- Re: AlphaZero is not like other chess programs by Rein Halbersma, CCC, December 09, 2017
- Photo of Google Cloud TPU cluster by Norman Schmidt, CCC, December 09, 2017
- Cerebellum analysis of the AlphaZero - Stockfish Games by Thomas Zipproth, CCC, December 11, 2017 » Cerebellum
- Open letter to Google DeepMind by Michael Stembera, FishCooking, December 12, 2017
- recent article on alphazero ... 12/11/2017 ... by Dan Ellwein, CCC, December 14, 2017
- An AlphaZero inspired project by Truls Edvard Stokke, CCC, December 14, 2017 » ZeroFish
- AlphaZero - Tactical Abilities by David Rasmussen, CCC, December 16, 2017
- In chess,AlphaZero outperformed Stockfish after just 4 hours by Ed Schroder, CCC, December 18, 2017
- AlphaZero - Youtube Videos by Christoph Fieberg, CSS Forum, December 18, 2017
- AlphaZero Chess is not that strong ... by Vincent Lejeune, CCC, December 19, 2017
- David Silver (Deepmind) inaccuracies by Ed Schroder, CCC, December 21, 2017
- AZ vs SF - game 99 by Rebel, Rybka Forum, December 23, 2017
- AlphaZero performance by Martin Sedlak, CCC, December 25, 2017
- A Simple Alpha(Go) Zero Tutorial by Oliver Roese, CCC, December 30, 2017
- AlphaZero: The 10 Top Shots by Walter Eigenmann, CCC, December 30, 2017
2018
- SF was more seriously handicapped than I thought by Kai Laskos, CCC, January 02, 2018
- Chess World to Google Deep Mind..Prove You beat Stockfish 8! by AA Ross, CCC, January 11, 2018
- Article:"How Alpha Zero Sees/Wins" by AA Ross, CCC, January 17, 2018 » How AlphaZero Wins
- Connect 4 AlphaZero implemented using Python... by Steve Maughan, CCC, January 29, 2018 » Connect Four, Python
- Seeing Alphazero in perspective ... by Dan Ellwein, CCC, February 10, 2018
- Matthew Sadler analysis of A0 vs SF [Edit: A0 published in Science?] by trulses, CCC, December 06, 2018
- Alphazero news by arunsoorya1309, CCC, December 06, 2018
- Re: Alphazero news by Matthew Lai, CCC, December 07, 2018
- Re: Alphazero news by Matthew Lai, CCC, December 07, 2018
- Re: Alphazero news by Larry Kaufman, CCC, December 07, 2018
- Re: Alphazero news by Kai Laskos, CCC, December 07, 2018
- Re: Alphazero news by Matthew Lai, CCC, December 07, 2018
- Re: Alphazero news by crem, CCC, December 07, 2018
- Re: Alphazero news by Matthew Lai, CCC, December 07, 2018
- Re: Alphazero news by crem, CCC, December 07, 2018
- Re: Alphazero news by Matthew Lai, CCC, December 07, 2018
- Re: Alphazero news by Gian-Carlo Pascutto, CCC, December 07, 2018 » Leela Chess Zero
- Re: Alphazero news by Matthew Lai, CCC, December 07, 2018
- Re: Alphazero news by crem, CCC, December 07, 2018
- Re: Alphazero news by Matthew Lai, CCC, December 07, 2018
- Re: Alphazero news by Matthew Lai, CCC, December 07, 2018 » Giraffe
- Re: Alphazero news by Matthew Lai, CCC, December 08, 2018
- Re: Alphazero news by Jonathan Rosenthal, CCC, December 11, 2018
- Re: Alphazero news by Matthew Lai, CCC, December 11, 2018
- Re: Alphazero news by Matthew Lai, CCC, December 11, 2018 » Stockfish Match
- Re: Alphazero news by Milos, CCC, December 11, 2018
- Re: Alphazero news by Gian-Carlo Pascutto, CCC, December 11, 2018
- Re: Alphazero news by Matthew Lai, CCC, December 11, 2018
- Re: Alphazero news by Kai Laskos, CCC, December 12, 2018 » Stockfish Match
- Policy training in Alpha Zero, LC0 .. by Chris Whittington, CCC, December 18, 2018 » LC0
2019
- A0 policy head ambiguity by Daniel Shawul, CCC, January 21, 2019
- AlphaZero No Castling Chess by Javier Ros, CCC, December 03, 2019
2020 ...
- AlphaZero by Pawel Wojcik, CCC, April 26, 2020
- Chess variants made with help from alpha zero article by jmartus, CCC, September 10, 2020
Blog Posts
- Lessons From Implementing AlphaZero by Aditya Prasad, Oracle Blog, June 05, 2018
- Lessons from AlphaZero: Connect Four by Aditya Prasad, Oracle Blog, June 13, 2018
- Lessons from AlphaZero (part 3): Parameter Tweaking by Aditya Prasad, Oracle Blog, June 20, 2018
- Lessons From AlphaZero (part 4): Improving the Training Target by Vish Abrams, Oracle Blog, June 27, 2018
- Lessons From Alpha Zero (part 5): Performance Optimization by Anthony Young, Oracle Blog, July 03, 2018
- Lessons From Alpha Zero (part 6) — Hyperparameter Tuning by Anthony Young, Oracle Blog, July 11, 2018
- AlphaZero: Shedding new light on the grand games of chess, shogi and Go by David Silver, Thomas Hubert, Julian Schrittwieser and Demis Hassabis, DeepMind, December 03, 2018
- AlphaZero paper, and Lc0 v0.19.1 by crem, LCZero blog, December 07, 2018
External Links
- AlphaZero from Wikipedia
- AlphaGo Zero - AlphaZero from Wikipedia
- Keynote David Silver NIPS 2017 Deep Reinforcement Learning Symposium AlphaZero, December 06, 2017, YouTube Video [26]
- A Simple Alpha(Go) Zero Tutorial by Surag Nair, Stanford University, December 29, 2017 [27]
- AlphaZero: Shedding new light on the grand games of chess, shogi and Go by David Silver, Thomas Hubert, Julian Schrittwieser and Demis Hassabis, DeepMind, December 03, 2018
- AlphaZero: Shedding new light on the grand games of chess, shogi and Go, December 06, 2018, YouTube Video
OpenSpiel
- GitHub - deepmind/open_spiel: OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games [28]
Reports
2017
- DeepMind’s AI became a superhuman chess player in a few hours, just for fun by James Vincent, The Verge, December 06, 2017
- Entire human chess knowledge learned and surpassed by DeepMind's AlphaZero in four hours by Sarah Knapton and Leon Watson, The Telegraph, December 06, 2017
- Google's 'superhuman' DeepMind AI claims chess crown, BBC News, December 06, 2017 [29]
- DeepMind’s AlphaZero crushes chess by Colin McGourty, chess24, December 06, 2017
- One Small Step for Computers, One Giant Leap for Mankind by Dana Mackenzie, Dana Blogs Chess, December 06, 2017
- Google's AlphaZero Destroys Stockfish In 100-Game Match by Mike Klein, Chess.com, December 06, 2017
- The future is here – AlphaZero learns chess by Albert Silver, ChessBase News, December 06, 2017
- AlphaZero: Reactions From Top GMs, Stockfish Author by Peter Doggers, Chess.com, December 08, 2017 » Stockfish, Tord Romstad [30]
- Is AlphaZero really a scientific breakthrough in AI? by Jose Camacho Collados, Medium, December 11, 2017 [31]
- Alpha Zero: Comparing "Orangutans and Apples" by André Schulz, ChessBase News, December 13, 2017
- Kasparov on Deep Learning in chess by Frederic Friedel, ChessBase News, December 13, 2017
2018 ...
- AlphaZero really is that good by Colin McGourty, chess24, December 06, 2018
- Inside the (deep) mind of AlphaZero by Albert Silver, ChessBase News, December 07, 2018
- Standing on the shoulders of giants by Albert Silver, ChessBase News, September 18, 2019
- Kramnik And AlphaZero: How To Rethink Chess, Chess.com, December 02, 2019 [32]
Stockfish Match
Round 1
- The chess games of AlphaZero (Computer) from chessgames.com
- Cerebellum AlphaZero Analysis » Cerebellum [33]
- Deep Mind Alpha Zero's "Immortal Zugzwang Game" against Stockfish by Antonio Radic, December 07, 2017, YouTube Video [34] [35] » Zugzwang
- Deep Mind AI Alpha Zero Dismantles Stockfish's French Defense by Antonio Radic, December 08, 2017, YouTube Video
- How AlphaZero Wins by Dana Mackenzie, Dana Blogs Chess, December 15, 2017 [36]
Round 2, 3
- AlphaZero vs. Stockfish from chess24
- AlphaZero's Attacking Chess by Anna Rudolf, December 06, 2018, YouTube Video [37]
- "Exactly How to Attack" | DeepMind's AlphaZero vs. Stockfish by Matthew Sadler, December 06, 2018, YouTube Video
- "Bold Sir Lancelot" | DeepMind's AlphaZero vs. Stockfish by Matthew Sadler, December 06, 2018, YouTube Video
- "All-in Defence" | DeepMind's AlphaZero vs. Stockfish by Matthew Sadler, December 06, 2018, YouTube Video
- "Long-term Sacrifice" | DeepMind's AlphaZero vs. Stockfish by Matthew Sadler, December 06, 2018, YouTube Video
- "Endgame Class" | DeepMind's AlphaZero vs. Stockfish by Matthew Sadler, December 06, 2018, YouTube Video
Misc
- How to build your own AlphaZero AI using Python and Keras by David Foster, January 26, 2018 » Connect Four, Python [38]
- Can - Halleluwah, from Tago Mago 1971, YouTube Video
References
- ↑ "5th of December - The Krampus has come", suggested by Michael Scheidl in AlphaZero by Peter Martan, CSS Forum, December 06, 2017, with further comments by Ingo Althöfer
- ↑ David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis (2017). Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. arXiv:1712.01815
- ↑ David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis (2018). A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science, Vol. 362, No. 6419
- ↑ AlphaGo Zero: Learning from scratch by Demis Hassabis and David Silver, DeepMind, October 18, 2017
- ↑ Zero Manifesto by Günther Uecker, Heinz Mack and Otto Piene of the ZERO Art group 1963, Translation by Google Translate
"Zero is silence. Zero is the beginning. Zero is round. Zero turns. Zero is the moon. The sun is zero. Zero is white. The desert zero. The sky over zero. The night -, Zero flows. The eye zero. Navel. Mouth. Kiss. The milk is round. The flower zero the bird. Silently. Pending. I eat Zero, I drink Zero, I sleep Zero, I watch Zero, I love Zero. Zero is beautiful, dynamo, dynamo, dynamo. The trees in spring, the snow, fire, water, sea. Red orange yellow green indigo blue violet zero zero rainbow. 4 3 2 1 Zero. Gold and silver, noise and smoke. Zero circus. Zero is silence. Zero is the beginning. Zero is round. Zero is zero. "
Zero the new Idealism - ↑ David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis (2018). A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science, Vol. 362, No. 6419, Supplementary Materials - Architecture
- ↑ Re: Alphazero news by Matthew Lai, CCC, December 08, 2018
- ↑ The principle of residual nets is to add the input of the layer to the output of each layer. With this simple modification training is faster and enables deeper networks, see Tristan Cazenave (2017). Residual Networks for Computer Go. IEEE Transactions on Computational Intelligence and AI in Games, Vol. PP, No. 99, pdf
- ↑ Residual Networks for Computer Go by Brahim Hamadicharef, CCC, December 07, 2017
- ↑ Re: AlphaZero is not like other chess programs by Rein Halbersma, CCC, December 09, 2017
- ↑ First In-Depth Look at Google’s TPU Architecture by Nicole Hemsoth, The Next Platform, April 05, 2017
- ↑ Photo of Google Cloud TPU cluster by Norman Schmidt, CCC, December 09, 2017
- ↑ First In-Depth Look at Google’s New Second-Generation TPU by Nicole Hemsoth, The Next Platform, May 17, 2017
- ↑ Under The Hood Of Google’s TPU2 Machine Learning Clusters by Paul Teich, The Next Platform, May 22, 2017
- ↑ David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis (2017). Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. arXiv:1712.01815
- ↑ David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis (2017). Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. arXiv:1712.01815
- ↑ Alpha Zero by BB+, OpenChess Forum, December 06, 2017
- ↑ David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis (2018). A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science, Vol. 362, No. 6419
- ↑ Supplementary Materials
- ↑ Supplementary Materials S4
- ↑ "Exact-Win Strategy for Overcoming AlphaZero" · Issue #799 · LeelaChessZero/lc0 · GitHub
- ↑ AlphaZero: Shedding new light on the grand games of chess, shogi and Go by David Silver, Thomas Hubert, Julian Schrittwieser and Demis Hassabis, DeepMind, December 03, 2018
- ↑ open_spiel/contributing.md at master · deepmind/open_spiel · GitHub
- ↑ Book about Neural Networks for Chess by dkl, CCC, September 29, 2021
- ↑ Acquisition of Chess Knowledge in AlphaZero, ChessBase News, November 18, 2021
- ↑ AlphaZero explained by one creator by Mario Carbonell Martinez, CCC, December 19, 2017
- ↑ A Simple Alpha(Go) Zero Tutorial by Oliver Roese, CCC, December 30, 2017
- ↑ Marc Lanctot, Edward Lockhart, Jean-Baptiste Lespiau, Vinícius Flores Zambaldi, Satyaki Upadhyay, Julien Pérolat, Sriram Srinivasan, Finbarr Timbers, Karl Tuyls, Shayegan Omidshafiei, Daniel Hennes, Dustin Morrill, Paul Muller, Timo Ewalds, Ryan Faulkner, János Kramár, Bart De Vylder, Brennan Saeta, James Bradbury, David Ding, Sebastian Borgeaud, Matthew Lai, Julian Schrittwieser, Thomas Anthony, Edward Hughes, Ivo Danihelka, Jonah Ryan-Davis (2019). OpenSpiel: A Framework for Reinforcement Learning in Games. arXiv:1908.09453
- ↑ BBC News; 'Google's ... DeepMind AI claims chess crown' by pennine22, Hiarcs Forum, December 07, 2017
- ↑ Reactions about AlphaZero from top GMs... by Norman Schmidt, CCC, December 08, 2017
- ↑ recent article on alphazero ... 12/11/2017 ... by Dan Ellwein, CCC, December 14, 2017
- ↑ AlphaZero No Castling Chess by Javier Ros, CCC, December 03, 2019
- ↑ Cerebellum analysis of the AlphaZero - Stockfish Games by Thomas Zipproth, CCC, December 11, 2017
- ↑ AlphaZero reinvents mobility and romanticism by Chris Whittington, Rybka Forum, December 08, 2017
- ↑ Immortal Zugzwang Game from Wikipedia
- ↑ Article:"How Alpha Zero Sees/Wins" by AA Ross, CCC, January 17, 2018
- ↑ Anna Rudolf analyzes a game of AlphaZero's by Stuart Cracraft, CCC, December 07, 2018
- ↑ Connect 4 AlphaZero implemented using Python... by Steve Maughan, CCC, January 29, 2018