Difference between revisions of "RuyTune"

From Chessprogramming wiki
(Created page with "'''Home * Automated Tuning * RuyTune''' FILE:ruytunetanh.jpg|border|right|thumb|link=http://www.wolframalpha.com/input/?i=tanh(0.43s)+,+s%3D-10+to+10| Ruy...")
 
Line 1: Line 1:
 
'''[[Main Page|Home]] * [[Automated Tuning]] * RuyTune'''
 
'''[[Main Page|Home]] * [[Automated Tuning]] * RuyTune'''
  
[[FILE:ruytunetanh.jpg|border|right|thumb|link=http://www.wolframalpha.com/input/?i=tanh(0.43s)+,+s%3D-10+to+10| RuyTune's [https://en.wikipedia.org/wiki/Hyperbolic_function hyperbolic tanh] based Sigmoid ]]  
+
[[FILE:ruytunetanh.jpg|border|right|thumb|link=http://www.wolframalpha.com/input/?i=tanh(0.43s)+,+s%3D-10+to+10| RuyTune's [https://en.wikipedia.org/wiki/Hyperbolic_function hyperbolic tanh] based Sigmoid <ref>[http://www.wolframalpha.com/input/?i=tanh(0.43s)+,+s%3D-10+to+10 tanh(0.43s) , s=-10 to 10] pawnunit plot by [https://en.wikipedia.org/wiki/Wolfram_Alpha Wolfram Alpha]</ref> ]]  
  
 
'''RuyTune''',<br/>
 
'''RuyTune''',<br/>
Line 11: Line 11:
 
where:
 
where:
 
* N is the number of test positions.
 
* N is the number of test positions.
* Ri is the result of the game corresponding to position i; '''-1'''* for black win, '''0''' for draw and '''+1''' for white win.
+
* R<span style="vertical-align: sub;">i</span> is the result of the game corresponding to position i; '''-1'''* for black win, '''0''' for draw and '''+1''' for white win.
* qi is corresponding to position i, the [[Score|value]] returned by the chess engine evaluation function. (Computing the gradient on the [[Quiescence Search|QS]] is a waste of time - it is much faster to run the QS saving the [[Principal variation|PV]] and then compute the gradient using the evaluation function of the end-of-PV position - and not worry too much about the fact that tweaking the evaluation function could result in a different position being picked <ref>[http://www.talkchess.com/forum/viewtopic.php?t=64189&start=36 Re: Texel tuning method question] by [[Álvaro Begué]], [[CCC]], June 07, 2017</ref>).
+
* q<span style="vertical-align: sub;">i</span> is corresponding to position i, the [[Score|value]] returned by the chess engine evaluation function. (Computing the gradient on the [[Quiescence Search|QS]] is a waste of time - it is much faster to run the QS saving the [[Principal variation|PV]] and then compute the gradient using the evaluation function of the end-of-PV position - and not worry too much about the fact that tweaking the evaluation function could result in a different position being picked <ref>[http://www.talkchess.com/forum/viewtopic.php?t=64189&start=36 Re: Texel tuning method question] by [[Álvaro Begué]], [[CCC]], June 07, 2017</ref>).
* [https://en.wikipedia.org/wiki/Sigmoid_function Sigmoid] is implemented by [https://en.wikipedia.org/wiki/Hyperbolic_function hyperbolic tangent] to convert [[Centipawns|centipawn scores]] into an expected result in [-1,1] <ref>[http://www.wolframalpha.com/input/?i=tanh(0.43s)+,+s%3D-10+to+10 tanh(0.43s) , s=-10 to 10] pawnunit plot by [https://en.wikipedia.org/wiki/Wolfram_Alpha Wolfram Alpha]</ref>.  
+
* [https://en.wikipedia.org/wiki/Sigmoid_function Sigmoid] is implemented by [https://en.wikipedia.org/wiki/Hyperbolic_function hyperbolic tangent] to convert [[Centipawns|centipawn scores]] into an expected result in [-1,1].  
 
  <span style="font-size:120%;">Sigmoid(s) = tanh(0.0043s)</span>
 
  <span style="font-size:120%;">Sigmoid(s) = tanh(0.0043s)</span>
  

Revision as of 14:41, 25 October 2018


RuyTune's hyperbolic tanh based Sigmoid [1]

RuyTune,
an open source framework for tuning evaluation function parameters, written by Álvaro Begué in C++ and released on Bitbucket [2], as introduced in November 2016 [3]. RuyTune applies logistic regression using limited-memory BFGS (L-BFGS), a quasi-Newton method that approximates the Broyden–Fletcher–Goldfarb–Shanno algorithm using a limited amount of memory. It uses the libLBFGS library [4] along with reverse-mode automatic differentiation. It requires that the evaluation function be converted to a C++ template function in which the score type is a template parameter, as well as a database of quiescent positions with associated results [5].
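The template requirement can be illustrated with a minimal sketch. Everything named here (`Features`, `evaluate`, `Dual`, the three weights) is hypothetical and not taken from RuyTune's sources. The `Dual` type is a forward-mode dual number, used only as a small stand-in for a differentiable score type; RuyTune itself is described as using reverse-mode automatic differentiation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical per-position features: material differences (white minus black).
struct Features {
    std::array<int, 3> counts;  // e.g. pawn, knight, bishop differences
};

// Evaluation written once, generic over the score type.
template <typename Score>
Score evaluate(const Features& f, const std::array<Score, 3>& weights) {
    Score total = Score(0);
    for (std::size_t k = 0; k < weights.size(); ++k) {
        total = total + weights[k] * Score(f.counts[k]);
    }
    return total;
}

// Stand-in differentiable score: forward-mode dual number (value, derivative).
// This is an illustrative assumption, not RuyTune's reverse-mode type.
struct Dual {
    double value;
    double deriv;
    Dual(double v = 0.0, double d = 0.0) : value(v), deriv(d) {}
};
inline Dual operator+(Dual a, Dual b) {
    return Dual(a.value + b.value, a.deriv + b.deriv);
}
inline Dual operator*(Dual a, Dual b) {
    // Product rule: (ab)' = a'b + ab'
    return Dual(a.value * b.value, a.deriv * b.value + a.value * b.deriv);
}

// Usage sketch: the same evaluate() serves play (int) and tuning (Dual).
inline void usage_example() {
    Features f{{{2, -1, 1}}};
    std::array<int, 3> play_weights = {100, 320, 330};
    assert(evaluate<int>(f, play_weights) == 210);

    // Seed the derivative of the first weight to get d(score)/d(weight 0).
    std::array<Dual, 3> tune_weights = {Dual(100.0, 1.0), Dual(320.0), Dual(330.0)};
    Dual score = evaluate<Dual>(f, tune_weights);
    assert(score.value == 210.0 && score.deriv == 2.0);
}
```

Because the score type is a template parameter, the engine can instantiate the function with a plain integer for play, while a tuner instantiates the same code with a type that records derivatives.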

Method

The function to minimize is the mean squared error of the prediction:

E = (1/N) · Σ_{i=1}^{N} (R_i − Sigmoid(q_i))²

where:

  • N is the number of test positions.
  • R_i is the result of the game corresponding to position i; -1 for black win, 0 for draw and +1 for white win.
  • q_i is the value returned by the chess engine's evaluation function for position i. (Computing the gradient through the QS is a waste of time: it is much faster to run the QS once, save the PV, and then compute the gradient using the evaluation function at the end-of-PV position, without worrying too much that tweaking the evaluation function could result in a different position being picked [6].)
  • Sigmoid is implemented by hyperbolic tangent to convert centipawn scores into an expected result in [-1,1].
Sigmoid(s) = tanh(0.0043s)
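
The definitions above can be sketched in C++ as follows. This is an illustration, not RuyTune's code: the function names are hypothetical, and the error is assumed to have the usual mean squared form E = (1/N) · Σ (R_i − Sigmoid(q_i))², with results in {-1, 0, +1} and scores in centipawns. The last function gives the analytic derivative of one squared-error term with respect to the score, which is the quantity a gradient-based optimizer such as L-BFGS would propagate back into the evaluation parameters.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Scaling constant from the formula above: Sigmoid(s) = tanh(0.0043 s).
constexpr double kScale = 0.0043;

// Maps a centipawn score to an expected result in [-1, 1].
inline double sigmoid(double centipawns) {
    return std::tanh(kScale * centipawns);
}

// Assumed error form: E = (1/N) * sum_i (R_i - Sigmoid(q_i))^2.
inline double mean_squared_error(const std::vector<double>& results,
                                 const std::vector<double>& scores) {
    assert(results.size() == scores.size() && !results.empty());
    double sum = 0.0;
    for (std::size_t i = 0; i < results.size(); ++i) {
        const double diff = results[i] - sigmoid(scores[i]);
        sum += diff * diff;
    }
    return sum / static_cast<double>(results.size());
}

// Derivative of a single term (R - tanh(k q))^2 with respect to q:
//   -2 (R - t) k (1 - t^2), where t = tanh(k q).
inline double squared_error_derivative(double result, double score) {
    const double t = sigmoid(score);
    return -2.0 * (result - t) * kScale * (1.0 - t * t);
}
```

For example, two positions with a score of 0 centipawns, one won by white and one won by black, each predict a draw and are each off by 1, so the mean squared error is 1.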

See also

Forum Posts

Re: Texel tuning method question by Álvaro Begué, CCC, June 07, 2017

External Links

References
