Changes

ShashChess

171 bytes added, 21:48, 5 June 2021
no edit summary
and also employed in [[BrainLearn]] 9.0 <ref>[https://github.com/amchess/BrainLearn/releases/tag/9.0 Release BrainLearn 9.0 · amchess/BrainLearn · GitHub]</ref>,
has been applied in ShashChess since version 12.0 <ref>[https://groups.google.com/g/fishcooking/c/GLag32ARtKo/m/3Zoaq3-rAwAJ ShashChess 12.0] by [[Andrea Manzo]], [[Computer Chess Forums|FishCooking]], June 28, 2020</ref>.
After the end of a decisive selfplay game with appropriate [[Depth|search depth]], the [[Move List|list of moves]] (ml) and associated [[Score|scores]] are merged into the learn table from end to start, with replacements preferring depth and score,
the score of timestep t being adjusted as a weighted average with the future reward of timestep t+1, using a [https://en.wikipedia.org/wiki/Q-learning#Learning_Rate learning rate] α of 0.5 and a [https://en.wikipedia.org/wiki/Q-learning#Discount_factor discount factor] γ of 0.99 <ref>[https://github.com/amchess/ShashChess/blob/master/src/All/search.cpp#L2625 ShashChess/search.cpp at master · amchess/ShashChess · GitHub]</ref>:
<pre>
}
</pre>
During repeated selfplay games that subsequently play along the best line learned so far, decreasing score adjustments stimulate exploration of alternative siblings, while increasing score adjustments correspond to exploitation of the best move.
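The backward score adjustment described above can be sketched as follows. This is a minimal illustration and not the ShashChess source: the struct <code>LearnedMove</code>, the function <code>backUpScores</code>, and the assumption that all scores in the line are expressed from one fixed point of view are hypothetical. Only the values α = 0.5 and γ = 0.99 are taken from the text. The update is assumed to be score(t) ← (1 − α)·score(t) + α·γ·score(t+1), applied from the end of the game towards the start.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical entry of the game line; not the ShashChess data structure.
struct LearnedMove {
    int depth;     // search depth at which the score was obtained
    double score;  // score from one fixed point of view (assumption)
};

// Walks the game line from end to start and replaces each score by a
// weighted average of itself and the discounted score of the next timestep:
//   score[t] = (1 - alpha) * score[t] + alpha * gamma * score[t + 1]
// The last entry has no successor and is left unchanged.
void backUpScores(std::vector<LearnedMove>& line,
                  double alpha = 0.5,   // learning rate from the text
                  double gamma = 0.99)  // discount factor from the text
{
    if (line.size() < 2) {
        return;  // nothing to propagate
    }
    // t runs over size-2, size-3, ..., 0.
    for (std::size_t t = line.size() - 1; t-- > 0; ) {
        line[t].score = (1.0 - alpha) * line[t].score
                      + alpha * gamma * line[t + 1].score;
    }
}
```

For a three-move line with scores 0, 0 and 100, the last score stays at 100, the middle one becomes 0.5 · 0.99 · 100 = 49.5, and the first becomes 0.5 · 0.99 · 49.5 = 24.5025, so the outcome of the game fades gradually towards the opening moves.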
=Forum Posts=
