ShashChess

152 bytes added, 09:03, 4 June 2021
was applied in ShashChess since version 12.0 <ref>[https://groups.google.com/g/fishcooking/c/GLag32ARtKo/m/3Zoaq3-rAwAJ ShashChess 12.0] by [[Andrea Manzo]], [[Computer Chess Forums|FishCooking]], June 28, 2020</ref>.
After a decisive self-play game played with an appropriate [[Depth|search depth]], the [[Move List|list of moves]] (ml) and the associated [[Score|scores]] are merged into the learn table from the end of the game to the start, with replacements preferring depth and score.
The score of timestep t is adjusted as a weighted average with the future reward of timestep t+1, using a [https://en.wikipedia.org/wiki/Q-learning#Learning_Rate learning rate] α of 0.5 and a [https://en.wikipedia.org/wiki/Q-learning#Discount_factor discount factor] γ of 0.99<ref>[https://github.com/amchess/ShashChess/blob/master/src/All/search.cpp#L2625 ShashChess/search.cpp at master · amchess/ShashChess · GitHub]</ref>:
<pre>
for (t = ml.size() - 2; t >= 0; t--) {
