Changes


ShashChess

562 bytes added, 21:48, 5 June 2021


=Q-Learning=

A [https://en.wikipedia.org/wiki/Rote_learning rote learning] technique inspired by [[Reinforcement Learning#Q-Learning|Q-learning]], worked out and introduced by [[Kelly Kinyama]]

<ref>[https://groups.google.com/g/fishcooking/c/fhX7dFAsyew/m/NSd0-OJjBwAJ Re: Self-Learning stockfish upgraded] by [[Kelly Kinyama]], [[Computer Chess Forums|FishCooking]], May 28, 2019</ref>

<ref>[https://groups.google.com/g/fishcooking/c/6IzmiSCB8lg/m/sFeSq9ykAQAJ A new reinforcement learning implementation of Q learning algorithm for alphabeta engines to automatically tune the evaluation of chess positions] by [[Kelly Kinyama]], [[Computer Chess Forums|FishCooking]], June 29, 2020</ref>

and also employed in [[BrainLearn]] 9.0 <ref>[https://github.com/amchess/BrainLearn/releases/tag/9.0 Release BrainLearn 9.0 · amchess/BrainLearn · GitHub]</ref>,

has been applied in ShashChess since version 12.0 <ref>[https://groups.google.com/g/fishcooking/c/GLag32ARtKo/m/3Zoaq3-rAwAJ ShashChess 12.0] by [[Andrea Manzo]], [[Computer Chess Forums|FishCooking]], June 28, 2020</ref>.

After the end of a decisive selfplay game, the [[Move List|list of moves]] (ml) and associated [[Score|scores]] are merged into the learn table from end to start, with the score of timestep t adjusted as a weighted average with the future reward of timestep t+1, using a [https://en.wikipedia.org/wiki/Q-learning#Learning_Rate learning rate] α of 0.5 and a [https://en.wikipedia.org/wiki/Q-learning#Discount_factor discount factor] γ of 0.99<ref>[https://github.com/amchess/ShashChess/blob/master/src/All/search.cpp#L2625 ShashChess/search.cpp at master · amchess/ShashChess · GitHub]</ref>:

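<pre>
// Walk the game backwards; t is signed so the loop ends after t = 0.
for (int t = (int)ml.size() - 2; t >= 0; t--) {
  // Body reconstructed from the description above (α = 0.5, γ = 0.99):
  // weighted average of the stored score and the discounted successor score.
  ml[t].score = (1 - α) * ml[t].score + α * γ * ml[t+1].score;
}
</pre>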

During repeated selfplay games that follow the best line learned so far, decreasing score adjustments stimulate exploration of alternative sibling moves, while increasing score adjustments correspond to exploitation of the best move.

=Forum Posts=

==2018 ...==

* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68093 ShashChess] by [[Andrea Manzo]], [[CCC]], July 28, 2018

: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68093&start=103 Re: ShashChess] (11.0) by [[Andrea Manzo]], [[CCC]], March 06, 2020

: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68093&start=272 Re: ShashChess] (17.1) by [[Andrea Manzo]], [[CCC]], June 01, 2021

* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=68120&p=769919 Build ShashChess for Android] by [[Andrea Manzo]], [[CCC]], August 01, 2018

==2020 ...==

* [https://groups.google.com/g/fishcooking/c/GLag32ARtKo/m/3Zoaq3-rAwAJ ShashChess 12.0] by [[Andrea Manzo]], [[Computer Chess Forums|FishCooking]], June 28, 2020

* [https://groups.google.com/g/fishcooking/c/6IzmiSCB8lg/m/sFeSq9ykAQAJ A new reinforcement learning implementation of Q learning algorithm for alphabeta engines to automatically tune the evaluation of chess positions] by [[Kelly Kinyama]], [[Computer Chess Forums|FishCooking]], June 29, 2020

* [https://groups.google.com/d/msg/fishcooking/yWtpz_FY5_Y/RMTG56fkAAAJ ShashChess NNUE 1.0] by [[Andrea Manzo]], [[Computer Chess Forums|FishCooking]], July 25, 2020

* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76394 Shashchess which executable to use] by Andrew Bernasrd, [[CCC]], January 23, 2021

* [https://groups.google.com/g/fishcooking/c/Iy1AlEZJWc8 New BrainLearn and ShashChess] by [[Andrea Manzo]], [[Computer Chess Forums|FishCooking]], May 19, 2021

=External Links=

GerdIsenberg

