AlphaZero

'''[[Main Page|Home]] * [[Engines]] * AlphaZero'''
{|
|- style="vertical-align:top;"
|
'''AlphaZero''',<br/>
a chess- and [[Go]]-playing entity by [[Google]] [[DeepMind]] based on a general [[Reinforcement Learning|reinforcement learning]] algorithm of the same name. On [https://en.wikipedia.org/wiki/December_5#Holidays_and_observances December 5], [https://en.wikipedia.org/wiki/Portal:Current_events/2017_December_5 2017] <ref>"5th of December - The [https://en.wikipedia.org/wiki/Krampus Krampus] has come", suggested by [[Michael Scheidl]] in [http://forum.computerschach.de/cgi-bin/mwf/topic_show.pl?tid=9635 AlphaZero] by Peter Martan, [[Computer Chess Forums|CSS Forum]], December 06, 2017, with further comments by [[Ingo Althöfer]]</ref>, the DeepMind team around [[David Silver]], [[Thomas Hubert]], and [[Julian Schrittwieser]] along with former [[Giraffe]] author [[Matthew Lai]], reported on their generalized algorithm, combining [[Deep Learning|deep learning]] with [[Monte-Carlo Tree Search]] (MCTS) <ref>[[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2017'''). ''Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm''. [https://arxiv.org/abs/1712.01815 arXiv:1712.01815]</ref>. The final [https://en.wikipedia.org/wiki/Peer_review peer-reviewed] paper with various clarifications was published almost one year later in the journal [https://en.wikipedia.org/wiki/Science_(journal) Science] under the title ''A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play'' <ref>[[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2018'''). ''[http://science.sciencemag.org/content/362/6419/1140 A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play]''. [https://en.wikipedia.org/wiki/Science_(journal) Science], Vol. 362, No. 6419</ref>.
=Description=
Starting from random play, and given no domain knowledge except the game rules, AlphaZero achieved a superhuman level of play in the games of chess and [[Shogi]] as well as in [[Go]]. The algorithm is a more generic version of the [[AlphaGo#Zero|AlphaGo Zero]] algorithm that was first introduced in the domain of Go <ref>[https://deepmind.com/blog/alphago-zero-learning-scratch/ AlphaGo Zero: Learning from scratch] by [[Demis Hassabis]] and [[David Silver]], [[DeepMind]], October 18, 2017</ref>. AlphaZero [[Evaluation|evaluates]] [[Chess Position|positions]] using non-linear function approximation based on a [[Neural Networks|deep neural network]], rather than the [[Evaluation#Linear|linear function approximation]] as used in classical chess programs.
This neural network takes the board position as input and outputs a vector of move probabilities (policy) and a position evaluation (value). Once trained, this network is combined with a [[Monte-Carlo Tree Search]] (MCTS), which uses the policy to narrow down the search to high-probability moves and the value to evaluate positions in the tree, in place of the random rollouts of classical MCTS. The selection is done by a variation of [[Christopher D. Rosin|Rosin's]] [[UCT]] improvement dubbed [[Christopher D. Rosin#PUCT|PUCT]].
 
| style="width: 30%" | <span style="display: block; text-align: center;"><span style="font-family: Comic Sans MS,cursive; font-size: 100%;">„Zero<br/>ist die Stille. Zero ist der<br/>Anfang. Zero ist rund. Zero dreht sich.<br/>Zero ist der Mond. Die Sonne ist Zero.<br/>Zero ist weiss. Die Wüste Zero. Der Himmel<br/>über Zero. Die Nacht –, Zero fließt. Das Auge<br/>Zero. Nabel. Mund. Kuß. Die Milch ist rund. Die<br/>Blume Zero der Vogel. Schweigend. Schwebend. Ich<br/>esse Zero, ich trinke Zero, ich schlafe Zero, ich wache<br/>Zero, ich liebe Zero. Zero ist schön, dynamo, dynamo,<br/>dynamo. Die Bäume im Frühling, der Schnee, Feuer,<br/>Wasser, Meer. Rot orange gelb grün indigo blau violett<br/>Zero Zero Regenbogen. 4 3 2 1 Zero. Gold und<br/>Silber, Schall und Rauch. Wanderzirkus Zero.<br/>Zero ist die Stille. Zero ist der Anfang.<br/>Zero ist rund. Zero ist<br/>Zero.“ </span></span> <ref>[https://de.wikipedia.org/wiki/ZERO#Manifest Zero Manifesto] by [https://en.wikipedia.org/wiki/G%C3%BCnther_Uecker Günther Uecker], [https://en.wikipedia.org/wiki/Heinz_Mack Heinz Mack] and [https://en.wikipedia.org/wiki/Otto_Piene Otto Piene] of the [https://en.wikipedia.org/wiki/Zero_(art) ZERO] [https://en.wikipedia.org/wiki/Art_movement Art group] 1963, Translation by [https://en.wikipedia.org/wiki/Google_Translate Google Translate]<br/>"Zero is silence. Zero is the beginning. Zero is round. Zero turns. Zero is the moon. The sun is zero. Zero is white. The desert zero. The sky over zero. The night -, Zero flows. The eye zero. Navel. Mouth. Kiss. The milk is round. The flower zero the bird. Silently. Pending. I eat Zero, I drink Zero, I sleep Zero, I watch Zero, I love Zero. Zero is beautiful, dynamo, dynamo, dynamo. The trees in spring, the snow, fire, water, sea. Red orange yellow green indigo blue violet zero zero rainbow. 4 3 2 1 Zero. Gold and silver, noise and smoke. Zero circus. Zero is silence. Zero is the beginning. Zero is round. Zero is zero. 
"<br/>Zero the new Idealism</ref>
|}
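The PUCT selection mentioned in the description can be illustrated with a minimal Python sketch. It is not DeepMind's implementation: the <code>Edge</code> class, the function names, the move strings, the constant <code>c_puct = 1.5</code> and the <code>max(total, 1)</code> guard for an unvisited node are all assumptions made for illustration. Each edge stores the prior from the policy output, a visit count and the sum of backed-up values; selection picks the move maximizing the mean value plus an exploration bonus that is proportional to the prior and shrinks with visits.

```python
import math
from dataclasses import dataclass
from typing import Dict


@dataclass
class Edge:
    """Statistics for one move (edge) at a search node. Hypothetical layout."""
    prior: float            # P(s, a): move probability from the policy output
    visits: int = 0         # N(s, a): how often the search went through this move
    value_sum: float = 0.0  # W(s, a): sum of values backed up through this move

    @property
    def q(self) -> float:
        # Mean backed-up value Q(s, a); 0.0 for an unvisited move (assumption).
        return self.value_sum / self.visits if self.visits else 0.0


def puct_select(edges: Dict[str, Edge], c_puct: float = 1.5) -> str:
    """Return the move maximizing Q + U, with U = c * P * sqrt(sum N) / (1 + N)."""
    total = sum(e.visits for e in edges.values())
    # Guard (assumption of this sketch): treat an unvisited node as having one
    # visit, so that the priors alone decide the first selection.
    sqrt_total = math.sqrt(max(total, 1))

    def score(e: Edge) -> float:
        u = c_puct * e.prior * sqrt_total / (1 + e.visits)
        return e.q + u

    return max(edges, key=lambda move: score(edges[move]))


def backup(edge: Edge, value: float) -> None:
    """Record one evaluation (from the value output) for the chosen move."""
    edge.visits += 1
    edge.value_sum += value
```

With priors of 0.6, 0.3 and 0.1, the first selection follows the highest prior. After that move receives a poor backed-up value, its mean value drops and its exploration bonus is halved, so the search turns to the next candidate. This is how the policy narrows the search while the value estimates can still override it.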
==Network Architecture==
