'''[[Main Page|Home]] * [[Learning]] * Supervised Learning'''

'''Supervised Learning''',<br/>
is learning from examples provided by a knowledgeable external [https://en.wikipedia.org/wiki/Supervisor supervisor].
In machine learning, supervised learning is a technique for deducing a function from [https://en.wikipedia.org/wiki/Training,_validation,_and_test_sets training data]. The training data consist of pairs of input objects and desired outputs <ref>[https://en.wikipedia.org/wiki/Supervised_learning Supervised learning from Wikipedia]</ref>. In computer games and chess, supervised learning techniques have been used in [[Automated Tuning|automated tuning]] and to train [[Neural Networks|neural network]] game and chess programs. Input objects are [[Chess Position|chess positions]]. The desired output is either the supervisor's move choice in that position ([[Automated Tuning#MoveAdaption|move adaption]]) or a [[Score|score]] provided by an [[Oracle|oracle]] ([[Automated Tuning#ValueAdaption|value adaption]]).

=Move Adaption=
[[Automated Tuning#MoveAdaption|Move adaption]] applies [[Automated Tuning#LinearRegression|linear regression]] using a [https://en.wikipedia.org/wiki/Loss_function cost function]
to minimize the rank-number of the desired move in a [[Move List|move list]] ordered by score <ref>[[Tony Marsland]] ('''1985'''). ''Evaluation-Function Factors''. [[ICGA Journal#8_2|ICCA Journal, Vol. 8, No. 2]], [http://webdocs.cs.ualberta.ca/~tony/OldPapers/evaluation.pdf pdf]</ref>.
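A minimal sketch of such a cost function, assuming a hypothetical representation where each training position pairs the engine's scored move list with the supervisor's choice (names like <code>move_rank_cost</code> are illustrative, not from any particular program):

```python
# Hypothetical move-adaption cost: for each training position, the cost is
# the rank of the supervisor's move in the move list ordered by the engine's
# evaluation score (rank 0 means the engine already prefers that move).

def move_rank_cost(positions):
    """positions: list of (scored_moves, supervisor_move) pairs, where
    scored_moves maps each legal move to the engine's evaluation score."""
    total = 0
    for scored_moves, supervisor_move in positions:
        # Order the moves best-first by the engine's score.
        ordered = sorted(scored_moves, key=scored_moves.get, reverse=True)
        total += ordered.index(supervisor_move)
    return total / len(positions)
```

Tuning then adjusts the evaluation weights so that this average rank approaches zero, i.e. the supervisor's moves rise to the top of the ordered move lists.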

=Value Adaption=
One common idea to provide an [[Oracle|oracle]] for supervised [[Automated Tuning#ValueAdaption|value adaption]] is to use the win/draw/loss outcome of finished games
as the desired value for all training positions selected from those games. Discrete {-1, 0, +1} or {0, ½, 1} desired values are the domain of [[Automated Tuning#LogisticRegression|logistic regression]] and require the
evaluation scores to be mapped from [[Pawn Advantage, Win Percentage, and Elo|pawn advantage]] to winning probabilities using the [https://en.wikipedia.org/wiki/Sigmoid_function sigmoid function];
the cost function to minimize is then the [https://en.wikipedia.org/wiki/Mean_squared_error mean squared error] between these probabilities and the desired values, as demonstrated by [[Texel's Tuning Method]].
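The mapping and cost function above can be sketched as follows; the scaling constant <code>K</code> and its value here are placeholders that would in practice be fitted to the engine's own score scale:

```python
# Sketch of a Texel-style cost function: evaluation scores in centipawns
# are mapped to winning probabilities with a sigmoid, and the mean squared
# error against game results {0, 0.5, 1} is the quantity to minimize.

def win_probability(score_cp, K=1.13):
    """Map a centipawn score to an expected winning probability.
    K is a scaling constant (1.13 is an illustrative placeholder)."""
    return 1.0 / (1.0 + 10.0 ** (-K * score_cp / 400.0))

def texel_error(training_set, K=1.13):
    """training_set: list of (score_cp, result) pairs,
    with result in {0, 0.5, 1} from the game outcome."""
    return sum((result - win_probability(s, K)) ** 2
               for s, result in training_set) / len(training_set)
```

A score of 0 centipawns maps to a probability of 0.5, matching a drawn result exactly; the tuner perturbs evaluation weights and keeps changes that lower this error over the whole training set.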

=See also=
* [[Automated Tuning#SupervisedLearning|Supervised Learning]] in [[Automated Tuning]]
* [[Book Learning]]
* [[Chessmaps Heuristic]]
* [[CHREST]]
* [[Deep Learning]]
* [[Neural Networks]]
* [[Planning]]
* [[Reinforcement Learning]]
* [[Temporal Difference Learning]]

=Selected Publications=
==1960 ...==
* [[Arthur Samuel]] ('''1967'''). ''Some Studies in Machine Learning Using the Game of Checkers. II - Recent Progress''. [http://researcher.watson.ibm.com/researcher/files/us-beygel/samuel-checkers.pdf pdf]
==1980 ...==
* [[Thomas Nitsche]] ('''1982'''). ''A Learning Chess Program.'' [[Advances in Computer Chess 3]]
* [[Tony Marsland]] ('''1985'''). ''Evaluation-Function Factors''. [[ICGA Journal#8_2|ICCA Journal, Vol. 8, No. 2]], [http://webdocs.cs.ualberta.ca/~tony/OldPapers/evaluation.pdf pdf]
* [[Eric B. Baum]], [https://en.wikipedia.org/wiki/Frank_Wilczek Frank Wilczek] ('''1987'''). ''[http://papers.nips.cc/paper/3-supervised-learning-of-probability-distributions-by-neural-networks Supervised Learning of Probability Distributions by Neural Networks]''. [http://papers.nips.cc/book/neural-information-processing-systems-1987 NIPS 1987]
==1990 ...==
* [[Michèle Sebag]] ('''1990'''). ''A symbolic-numerical approach for supervised learning from examples and rules''. Ph.D. thesis, [https://en.wikipedia.org/wiki/Paris_Dauphine_University Paris Dauphine University]
* [[Feng-hsiung Hsu]], [[Thomas Anantharaman]], [[Murray Campbell]], [[Andreas Nowatzyk]] ('''1990'''). ''[http://www.disi.unige.it/person/DelzannoG/AI2/hsu.html A Grandmaster Chess Machine]''. [[Scientific American]], Vol. 263, No. 4
* [[Thomas Anantharaman]] ('''1997'''). ''Evaluation Tuning for Computer Chess: Linear Discriminant Methods''. [[ICGA Journal#20_4|ICCA Journal, Vol. 20, No. 4]]
==2000 ...==
* [[Michael Buro]] ('''2002'''). ''Improving Mini-max Search by Supervised Learning.'' [https://en.wikipedia.org/wiki/Artificial_Intelligence_(journal) Artificial Intelligence], Vol. 134, No. 1, [http://www.cs.ualberta.ca/%7Emburo/ps/logaij.pdf pdf]
* [[Dave Gomboc]], [[Michael Buro]], [[Tony Marsland]] ('''2005'''). ''Tuning Evaluation Functions by Maximizing Concordance''. [https://en.wikipedia.org/wiki/Theoretical_Computer_Science_%28journal%29 Theoretical Computer Science], Vol. 349, No. 2, [http://www.cs.ualberta.ca/%7Emburo/ps/tcs-learn.pdf pdf]
==2010 ...==
* [[Tor Lattimore]], [[Marcus Hutter]] ('''2011'''). ''No Free Lunch versus Occam's Razor in Supervised Learning''. [https://en.wikipedia.org/wiki/Ray_Solomonoff Solomonoff] Memorial, [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], [https://en.wikipedia.org/wiki/Springer-Verlag Springer], [https://arxiv.org/abs/1111.3846 arXiv:1111.3846] <ref>[https://en.wikipedia.org/wiki/No_free_lunch_in_search_and_optimization No free lunch in search and optimization - Wikipedia]</ref> <ref>[https://en.wikipedia.org/wiki/Occam%27s_razor Occam's razor from Wikipedia]</ref>
* [[Wen-Jie Tseng]], [[Jr-Chang Chen]], [[I-Chen Wu]], [[Ching-Hua Kuo]], [[Bo-Han Lin]] ('''2013'''). ''A Supervised Learning Method for Chinese Chess Programs''. [http://2013.conf.ai-gakkai.or.jp/english-info JSAI2013]
* [[Kunihito Hoki]], [[Tomoyuki Kaneko]] ('''2014'''). ''[https://www.jair.org/papers/paper4217.html Large-Scale Optimization for Evaluation Functions with Minimax Search]''. [https://www.jair.org/vol/vol49.html JAIR Vol. 49], [https://pdfs.semanticscholar.org/eb9c/173576577acbb8800bf96aba452d77f1dc19.pdf pdf]
* [[Christopher Clark]], [[Amos Storkey]] ('''2014'''). ''Teaching Deep Convolutional Neural Networks to Play Go''. [http://arxiv.org/abs/1412.3409 arXiv:1412.3409]
* [[Wen-Jie Tseng]], [[Jr-Chang Chen]], [[I-Chen Wu]], [[Tinghan Wei]] ('''2018'''). ''Comparison Training for Computer Chinese Chess''. [https://arxiv.org/abs/1801.07411 arXiv:1801.07411]

=Forum Posts=
* [http://www.talkchess.com/forum/viewtopic.php?t=27266&postdays=0&postorder=asc&topic_view=&start=11 Re: Insanity... or Tal style?] by [[Miguel A. Ballicora]], [[CCC]], April 02, 2009 » [[Gaviota]]
* [http://www.talkchess.com/forum/viewtopic.php?t=50823&start=10 Re: How Do You Automatically Tune Your Evaluation Tables] by [[Álvaro Begué]], [[CCC]], January 08, 2014
* [http://www.talkchess.com/forum/viewtopic.php?t=50823&start=26 The texel evaluation function optimization algorithm] by [[Peter Österlund]], [[CCC]], January 31, 2014 » [[Texel's Tuning Method]]
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=70611 SL vs RL] by [[Chris Whittington]], [[CCC]], April 28, 2019

=External Links=
* [https://en.wikipedia.org/wiki/Supervised_learning Supervised learning from Wikipedia]
* [http://www.scholarpedia.org/article/Category:Supervised_learning Category: Supervised learning - Scholarpedia]
* [https://en.wikipedia.org/wiki/Boosting_%28machine_learning%29 Boosting (machine learning) from Wikipedia]
* [https://en.wikipedia.org/wiki/Computational_learning_theory Computational learning theory from Wikipedia]
* [https://en.wikipedia.org/wiki/Support_vector_machine Support vector machine from Wikipedia]

=References=
<references />
'''[[Learning|Up one Level]]'''
