Eval Tuning in Deep Thought

'''Eval Tuning in Deep Thought''',<br/>
this page is a formatted reprint of [[Andreas Nowatzyk|Andreas Nowatzyk's]] explanations of the [[Automated Tuning|Eval Tuning]] source code <ref>[http://www.tim-mann.org/DT_eval_tune.tar.gz Source code to tune Deep Thought's evaluation] in tar.gz format</ref> of [[Deep Thought]] from January 2000 <ref>[http://www.tim-mann.org/DT_eval_tune.txt Andreas Nowatzyk's explanations of the source code], January 2000</ref>, hosted by [[Tim Mann]] <ref>[http://www.tim-mann.org/deepthought.html Deep Thought at Tim Mann's Chess Pages]</ref>:
[[Andreas Nowatzyk]] was one of the contributors to the [[Deep Thought]] project while he was in grad school. A few years ago, when he and I were both working for Compaq's research labs in Palo Alto, Andreas sent me a copy of Deep Thought's evaluation-function tuning program and asked me to put it on the Web for him, since he no longer has an interest in computer chess.
via [https://en.wikipedia.org/wiki/Partial_derivative partial differentiation] of this expression for each parameter <Ai>. This leads to a [https://en.wikipedia.org/wiki/System_of_linear_equations linear equation system] with one equation for each unknown parameter of DT's evaluation function. If the positions are sufficiently varied (they usually were), then this equation system can be solved and out come the best values for our evaluation parameters.
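The procedure can be sketched in a few lines of NumPy. This is not the original tuning code; the feature matrix, parameter count, and random data are made up for illustration. With a linear evaluation E(P) = Σ Ai·fi(P), setting the partial derivative of the squared error to zero for each Ai yields the normal equations, which a standard least-squares solver handles:

```python
import numpy as np

# Hypothetical setup: one feature vector f(P) per position,
# oracle scores E(P) that are linear in ~100 parameters (as in DT).
rng = np.random.default_rng(0)
n_positions, n_params = 500, 100
F = rng.normal(size=(n_positions, n_params))  # feature matrix, one row per position
A_true = rng.normal(size=n_params)            # the oracle's (unknown) parameters
E_oracle = F @ A_true                         # oracle evaluation of each position

# d/dA_i of sum_P (E_oracle(P) - A.f(P))^2 = 0  gives  (F^T F) A = F^T E;
# lstsq solves these normal equations in a numerically stable way.
A_fit, *_ = np.linalg.lstsq(F, E_oracle, rcond=None)

# If the positions are sufficiently varied (F has full column rank),
# the solution is unique and recovers the oracle's parameters.
recovered = np.allclose(A_fit, A_true)
```

Note that recovering the oracle's own parameters exactly is the "identity" sanity check described below: feeding the fit its own evaluation must return the same parameters it was given.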
<br/><br/>
The trouble was, we did not have such an oracle. So the next best thing we had is the evaluation of DT. Murray made some initial guesses for each parameter <Ai> and we used that as a starting point. Obviously, if we use our own <E(P)> as an oracle, we get the same parameters out of the least square fit as we put in. So this is just a cumbersome way to compute the identity: New(Ai) = Old(Ai) for all <i=1..100>. However, this was a great debugging tool to see that we got the mathematics right.<br/>
<br/><br/>
In the tuning case, we did not just take the top-level evaluation; rather, we let DT [[Search|search]] shallow 3-[[Ply|ply]] trees with [[Quiescence Search|quiescence extensions]]. The evaluation function is then computed symbolically: rather than plugging in values, we propagated the feature vector of the best leaf node to the top. The search itself was controlled by the current best guess of the evaluation parameters. These were full [[Minimax|min/max]] searches, rather than [[Alpha-Beta|alpha/beta]] searches. The tuning program cannot actually search these trees because it does not know what a [[Legal Move|legal chess move]] is. Instead, the actual DT was used to pre-search these trees and the results were stored in a database (dbf_all). The tuning program merely traverses these trees.
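The symbolic propagation can be sketched as a plain min/max traversal that backs up feature vectors instead of scalar scores, comparing them under the current parameter guess. The tree layout here (nested lists with NumPy arrays at the leaves) is an assumption for illustration; the real trees were pre-searched by DT and read from the dbf_all database:

```python
import numpy as np

def propagate(node, params, maximizing=True):
    """Return the feature vector of the principal leaf under `node`.

    A node is either a leaf feature vector (np.ndarray) or a list of
    child nodes. Scalar scores are only formed transiently, via the
    dot product with the current parameter guess; what is backed up
    to the root is the winning leaf's feature vector itself.
    """
    if isinstance(node, np.ndarray):      # leaf: feature vector f(P)
        return node
    best = None
    for child in node:
        f = propagate(child, params, not maximizing)
        if best is None:
            better = True
        elif maximizing:
            better = params @ f > params @ best
        else:
            better = params @ f < params @ best
        if better:
            best = f
    return best

# Two-ply example: the root maximizes over two minimizing children.
params = np.array([1.0, 0.0])             # current best parameter guess
tree = [[np.array([3.0, 0.0]), np.array([1.0, 0.0])],
        [np.array([2.0, 0.0]), np.array([5.0, 0.0])]]
leaf_features = propagate(tree, params)   # a feature vector, not a score
```

Keeping the feature vector symbolic is what lets the least-squares fit differentiate the backed-up evaluation with respect to each parameter.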
# DT's evaluation concurs, that is: E(P0) > E(Pi) for all i = 1..19
# DT evaluated some other move as best.
<br/><br/>
So the objective of our tuning procedure was to maximize the first case and minimize the second case.
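A minimal sketch of that objective, under an assumed data layout (the function name and the `samples` structure are hypothetical, not from DT's code): for each training position, compare the score of the position after the grandmaster's move, f0, against the positions reached by every alternative move, and count how often the grandmaster's move ranks first:

```python
import numpy as np

def agreement_rate(params, samples):
    """Fraction of positions where the GM's move scores strictly highest.

    `samples` is a list of (f0, alternatives) pairs: f0 is the feature
    vector after the grandmaster's move, `alternatives` the feature
    vectors after every other candidate move (case 2 when one of them
    wins). Tuning tries to drive this fraction up.
    """
    hits = 0
    for f0, alternatives in samples:
        if all(params @ f0 > params @ f for f in alternatives):
            hits += 1
    return hits / len(samples)

# Toy data: the first position agrees with the GM move, the second does not.
samples = [(np.array([2.0, 0.0]), [np.array([1.0, 0.0]), np.array([0.0, 0.0])]),
           (np.array([0.0, 0.0]), [np.array([1.0, 0.0])])]
rate = agreement_rate(np.array([1.0, 0.0]), samples)
```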
<br/><br/>
