Revision as of 22:13, 15 November 2019


Supervised Learning is learning from examples provided by a knowledgeable external supervisor. In machine learning, supervised learning is a technique for deducing a function from training data. The training data consist of pairs of input objects and desired outputs ^[1]. In computer games and chess, supervised learning techniques have been used in automated tuning and to train neural networks for game and chess programs. The input objects are chess positions. The desired output is either the supervisor's move choice in that position (move adaption) or a score provided by an oracle (value adaption).

Move Adaption

Move adaption can be implemented with linear regression, minimizing a cost function that considers the rank of the desired move in a move list ordered by score ^[2].
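As a minimal sketch of a rank-based cost, the Python below computes the rank of the supervisor's move in a scored move list and averages the rank excess over a training set. The function names, the data layout (a list of scores plus the index of the desired move), and the choice of "rank minus one" as the per-position cost are assumptions made for illustration; they are not taken from the cited paper.

```python
def desired_move_rank(scores, desired_index):
    """Return the 1-based rank of the desired move when moves are ordered
    by score, best first. Ties are resolved in favour of the desired move
    (an illustrative assumption)."""
    desired_score = scores[desired_index]
    # Count the moves that the engine scores strictly higher.
    return 1 + sum(1 for s in scores if s > desired_score)


def rank_cost(training_set):
    """Average rank excess over all training positions.

    training_set: list of (scores, desired_index) pairs, one per position.
    The cost is 0 when the desired move is ranked first everywhere.
    """
    total = sum(desired_move_rank(scores, idx) - 1 for scores, idx in training_set)
    return total / len(training_set)


if __name__ == "__main__":
    # Two hypothetical positions with three legal moves each.
    positions = [
        ([30, 50, 10], 1),  # desired move already ranked first
        ([30, 50, 10], 2),  # desired move ranked last
    ]
    print(rank_cost(positions))
```

A tuner would adjust the evaluation weights that produce the scores so that this cost decreases; because the rank itself is a step function of the weights, practical methods usually minimize a smooth surrogate instead.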

Value Adaption

One common way to provide an oracle for supervised value adaption is to use the win/draw/loss outcome of a finished game as the desired value for all training positions selected from that game. Discrete desired values in {-1, 0, +1} or {0, ½, 1} are the domain of logistic regression: evaluation scores are mapped from pawn advantage to winning probabilities with the sigmoid function, and the mean squared error between those probabilities and the game results is the cost function to minimize, as demonstrated by Texel's Tuning Method.
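The mapping and cost described above can be sketched in Python as follows. The base-10 sigmoid with a divisor of 400 centipawns and a scaling constant `k` is the form commonly quoted for Texel's Tuning Method; the function names, the default `k = 1.0`, and the sample data are assumptions for illustration only.

```python
def win_probability(score_cp, k=1.0):
    """Map an evaluation score in centipawns to an expected game result
    in [0, 1] using a base-10 sigmoid. k is a scaling constant that would
    normally be fitted to the training data (assumed 1.0 here)."""
    return 1.0 / (1.0 + 10.0 ** (-k * score_cp / 400.0))


def mean_squared_error(scores_cp, results, k=1.0):
    """Mean squared error between game results in {0, 0.5, 1} and the
    winning probabilities predicted from the evaluation scores."""
    if len(scores_cp) != len(results):
        raise ValueError("scores and results must have the same length")
    total = sum(
        (result - win_probability(score, k)) ** 2
        for score, result in zip(scores_cp, results)
    )
    return total / len(scores_cp)


if __name__ == "__main__":
    # Hypothetical training positions: evaluation score from White's
    # point of view and the final result of the game they came from.
    scores = [0, 150, -300]
    results = [0.5, 1.0, 0.0]
    print(mean_squared_error(scores, results))
```

Tuning then amounts to changing the evaluation parameters that produce `scores_cp` so that this error decreases over a large set of positions.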

Selected Publications

1960 ...

Arthur Samuel (1967). Some Studies in Machine Learning Using the Game of Checkers. II-Recent Progress. pdf

1980 ...

Thomas Nitsche (1982). A Learning Chess Program. Advances in Computer Chess 3
Tony Marsland (1985). Evaluation-Function Factors. ICCA Journal, Vol. 8, No. 2, pdf
Eric B. Baum, Frank Wilczek (1987). Supervised Learning of Probability Distributions by Neural Networks. NIPS 1987
Maarten van der Meulen (1989). Weight Assessment in Evaluation Functions. Advances in Computer Chess 5

1990 ...

Michèle Sebag (1990). A symbolic-numerical approach for supervised learning from examples and rules. Ph.D. thesis, Paris Dauphine University
Feng-hsiung Hsu, Thomas Anantharaman, Murray Campbell, Andreas Nowatzyk (1990). A Grandmaster Chess Machine. Scientific American, Vol. 263, No. 4
Thomas Anantharaman (1997). Evaluation Tuning for Computer Chess: Linear Discriminant Methods. ICCA Journal, Vol. 20, No. 4

2000 ...

Michael Buro (2002). Improving Mini-max Search by Supervised Learning. Artificial Intelligence, Vol. 134, No. 1, pdf
Dave Gomboc, Michael Buro, Tony Marsland (2005). Tuning Evaluation Functions by Maximizing Concordance. Theoretical Computer Science, Vol. 349, No. 2, pdf

2010 ...

Tor Lattimore, Marcus Hutter (2011). No Free Lunch versus Occam's Razor in Supervised Learning. Solomonoff Memorial, Lecture Notes in Computer Science, Springer, arXiv:1111.3846 ^[3] ^[4]
Wen-Jie Tseng, Jr-Chang Chen, I-Chen Wu, Ching-Hua Kuo, Bo-Han Lin (2013). A Supervised Learning Method for Chinese Chess Programs. JSAI2013
Kunihito Hoki, Tomoyuki Kaneko (2014). Large-Scale Optimization for Evaluation Functions with Minimax Search. JAIR Vol. 49, pdf
Christopher Clark, Amos Storkey (2014). Teaching Deep Convolutional Neural Networks to Play Go. arXiv:1412.3409
Wen-Jie Tseng, Jr-Chang Chen, I-Chen Wu, Tinghan Wei (2018). Comparison Training for Computer Chinese Chess. arXiv:1801.07411

Forum Posts

Re: Insanity... or Tal style? by Miguel A. Ballicora, CCC, April 02, 2009 » Gaviota
Re: How Do You Automatically Tune Your Evaluation Tables by Álvaro Begué, CCC, January 08, 2014
The texel evaluation function optimization algorithm by Peter Österlund, CCC, January 31, 2014 » Texel's Tuning Method
SL vs RL by Chris Whittington, CCC, April 28, 2019

External Links

References

[1] Supervised learning from Wikipedia
[2] Tony Marsland (1985). Evaluation-Function Factors. ICCA Journal, Vol. 8, No. 2, pdf
[3] No free lunch in search and optimization from Wikipedia
[4] Occam's razor from Wikipedia
