Changes

Eval Tuning in Deep Thought

10 bytes removed, 22:41, 17 May 2018
m
no edit summary
via [https://en.wikipedia.org/wiki/Partial_derivative partial differentiation] of this expression for each parameter <Ai>. This leads to a [https://en.wikipedia.org/wiki/System_of_linear_equations linear equation system] with one equation for each unknown parameter of DT's evaluation function. If the positions are sufficiently varied (they usually were), then this equation system can be solved and out come the best values for our evaluation parameters.
<br/><br/>
The trouble was, we did not have such an oracle. So the next best thing we had was the evaluation of DT itself. Murray made some initial guesses for each parameter <Ai> and we used those as a starting point. Obviously, if we use our own <E(P)> as an oracle, we get the same parameters out of the least-squares fit as we put in. So this is just a cumbersome way to compute the identity: New(Ai) = Old(Ai) for all <i=1..100>. However, this was a great debugging tool to see that we got the mathematics right.
<br/><br/>
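The identity check described above can be illustrated with a small sketch. It assumes, since the fitted expression itself is not shown here, that the evaluation is a linear sum E(P) = sum of Ai * Fi(P) over feature values Fi, and that the fit minimizes the summed squared difference between oracle and evaluation. All names, feature vectors, and parameter values below are hypothetical and not taken from DT.

```python
# Hypothetical sketch of a least-squares fit via the normal equations.
# Assumes a linear evaluation E(P) = sum(A_i * F_i(P)); not DT's actual code.

def evaluate(params, features):
    """Linear evaluation: weighted sum of the feature values."""
    return sum(a * f for a, f in zip(params, features))

def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs stay untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: positions are not varied enough")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def least_squares_fit(feature_vectors, oracle_values):
    """Setting each partial derivative of the squared error to zero
    gives one linear equation per parameter (the normal equations)."""
    n = len(feature_vectors[0])
    lhs = [[sum(f[j] * f[i] for f in feature_vectors) for i in range(n)]
           for j in range(n)]
    rhs = [sum(f[j] * o for f, o in zip(feature_vectors, oracle_values))
           for j in range(n)]
    return solve_linear_system(lhs, rhs)

# Made-up feature vectors for five positions and three parameters.
positions = [[1, 0, 2], [0, 1, -1], [2, -1, 0], [1, 1, 1], [-1, 2, 3]]
old_params = [1.0, 3.0, 0.5]

# Using our own evaluation as the oracle must return the same parameters.
oracle = [evaluate(old_params, f) for f in positions]
new_params = least_squares_fit(positions, oracle)
```

Here `new_params` should match `old_params` up to floating-point rounding, which is the debugging property the text describes. If the positions were not varied enough, the elimination step would hit a zero pivot and the system could not be solved.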
In the tuning case, we did not just take the top-level evaluation, rather we let DT [[Search|search]] shallow 3-[[Ply|ply]] trees with [[Quiescence Search|quiescence extensions]]. The evaluation function is then computed symbolically: rather than plugging in values, we propagated the feature vector of the best leaf node to the top. The search itself was controlled by the current best guess of the evaluation parameters. These were full [[Minimax|min/max]] searches, rather than [[Alpha-Beta|alpha/beta]] searches. The tuning program cannot actually search these trees because it does not know what a [[Legal Move|legal chess move]] is. Instead, the actual DT was used to pre-search these trees and the results were stored in a database (dbf_all). The tuning program merely traverses these trees.
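The symbolic propagation can be sketched as a tree traversal that returns a feature vector instead of a number. This is a minimal illustration, assuming a linear evaluation and a pre-built tree in which leaves are tuples of feature values and inner nodes are lists of children; this representation is hypothetical, is not the format of dbf_all, and leaves out quiescence extensions.

```python
# Hypothetical sketch: full min/max traversal that propagates the feature
# vector of the best leaf to the root. Tree layout is an assumption.

def evaluate(params, features):
    """Linear evaluation: weighted sum of the feature values."""
    return sum(a * f for a, f in zip(params, features))

def best_leaf_features(node, params, maximizing=True):
    """Return the feature vector of the leaf that min/max selects
    under the current parameter guess."""
    if isinstance(node, tuple):  # leaf: already a feature vector
        return node
    candidates = [best_leaf_features(child, params, not maximizing)
                  for child in node]
    pick = max if maximizing else min
    # The current parameters only decide WHICH leaf is chosen;
    # the value handed upward stays a symbolic feature vector.
    return pick(candidates, key=lambda f: evaluate(params, f))

# A made-up 2-ply tree: the root maximizes, its children minimize.
tree = [[(2, 0), (0, 1)], [(1, 1), (5, 0)]]

root_a = best_leaf_features(tree, [1.0, 3.0])
root_b = best_leaf_features(tree, [-1.0, 3.0])
```

With the first parameter guess the root receives the leaf `(1, 1)`, with the second it receives `(2, 0)`, which shows how the search is controlled by the current parameters. Because the root holds a feature vector, its evaluation remains a linear expression in the parameters and can be fed into a fit.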
# DT's evaluation concurs, that is: E(P0) > E(P1...19)
# DT evaluated some other move as best.
<br/><br/>
So the objective of our tuning procedure was to maximize the number of positions that fall into the first case and minimize the number that fall into the second.
<br/><br/>
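The two cases can be counted with a short sketch. It reads P0 as a reference position and P1 onward as its alternatives, which is an assumption because the introduction of the list is not shown here. The linear evaluation, the function names, and the sample data are likewise hypothetical.

```python
# Hypothetical sketch: count the positions where the evaluation concurs,
# i.e. E(P0) is greater than E(Pi) for every alternative Pi.

def evaluate(params, features):
    """Linear evaluation: weighted sum of the feature values."""
    return sum(a * f for a, f in zip(params, features))

def concurs(params, reference, alternatives):
    """First case: the reference scores strictly above all alternatives."""
    ref_score = evaluate(params, reference)
    return all(ref_score > evaluate(params, alt) for alt in alternatives)

def agreement_count(params, samples):
    """Number of samples that fall into the first case."""
    return sum(1 for ref, alts in samples if concurs(params, ref, alts))

# Each sample pairs the features of P0 with those of its alternatives.
samples = [
    ((1, 1), [(2, 0), (0, 1)]),
    ((0, 0), [(1, 0), (0, -1)]),
]
params = [1.0, 3.0]

agreed = agreement_count(params, samples)
disagreed = len(samples) - agreed
```

With these made-up numbers the first sample falls into the first case and the second sample into the second, so `agreed` and `disagreed` are both 1. A tuning run would try to raise `agreed` across the whole set of positions.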
