Difference between revisions of "Learning"

From Chessprogramming wiki
(28 intermediate revisions by the same user not shown)
Line 10: Line 10:
  
 
=Learning Paradigms=  
 
=Learning Paradigms=  
There are three major learning [https://en.wikipedia.org/wiki/Paradigm paradigms], each corresponding to a particular abstract learning task. These are [https://en.wikipedia.org/wiki/Supervised_learning supervised learning], [https://en.wikipedia.org/wiki/Unsupervised_learning unsupervised learning] and [[Reinforcement Learning|reinforcement learning]]. Usually any given type of [[Neural Networks|neural network]] architecture can be employed in any of those tasks.
+
There are three major learning [https://en.wikipedia.org/wiki/Paradigm paradigms], each corresponding to a particular abstract learning task. These are [[Supervised Learning|supervised learning]], [https://en.wikipedia.org/wiki/Unsupervised_learning unsupervised learning] and [[Reinforcement Learning|reinforcement learning]]. Usually any given type of [[Neural Networks|neural network]] architecture can be employed in any of those tasks.
  
 
==Supervised Learning==  
 
==Supervised Learning==  
 +
''see main page [[Supervised Learning]]''
 +
 
Supervised learning is learning from examples provided by a knowledgeable external supervisor. In machine learning, supervised learning is a technique for inferring a function from training data. The training data consist of pairs of input objects and desired outputs, for instance in computer chess a sequence of positions associated with the outcome of a game <ref>[http://www.aihorizon.com/essays/generalai/supervised_unsupervised_machine_learning.htm AI Horizon: Machine Learning, Part II: Supervised and Unsupervised Learning]</ref>.
 
Supervised learning is learning from examples provided by a knowledgeable external supervisor. In machine learning, supervised learning is a technique for inferring a function from training data. The training data consist of pairs of input objects and desired outputs, for instance in computer chess a sequence of positions associated with the outcome of a game <ref>[http://www.aihorizon.com/essays/generalai/supervised_unsupervised_machine_learning.htm AI Horizon: Machine Learning, Part II: Supervised and Unsupervised Learning]</ref>.
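
As a rough illustration of learning a function from input/output pairs, the sketch below fits a one-feature logistic model that maps a position's material balance to an estimated win probability. The feature choice, the toy data, and the names <code>TRAINING_PAIRS</code>, <code>predict</code> and <code>train</code> are assumptions made for this example only; real engines use far richer position features and much larger game collections.

```python
import math

# Hypothetical training data: (material balance in pawns, game result for White).
# The result is 1.0 for a White win and 0.0 for a White loss; all numbers are invented.
TRAINING_PAIRS = [
    (-3.0, 0.0), (-1.0, 0.0), (-0.5, 0.0),
    (0.5, 1.0), (1.0, 1.0), (3.0, 1.0),
]


def predict(weight, bias, material):
    """Estimated probability of a White win, using the logistic function."""
    return 1.0 / (1.0 + math.exp(-(weight * material + bias)))


def train(pairs, epochs=2000, rate=0.1):
    """Fit weight and bias by stochastic gradient descent on cross-entropy loss."""
    weight, bias = 0.0, 0.0
    for _ in range(epochs):
        for material, outcome in pairs:
            # Gradient of the cross-entropy loss with respect to the logit.
            error = predict(weight, bias, material) - outcome
            weight -= rate * error * material
            bias -= rate * error
    return weight, bias


weight, bias = train(TRAINING_PAIRS)
print(predict(weight, bias, 2.0))   # estimate for a two-pawn advantage
print(predict(weight, bias, -2.0))  # estimate for a two-pawn deficit
```

The "supervisor" here is simply the recorded game result attached to each position; the learned function generalizes from those labeled pairs to material balances that never appeared in the training data.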
  
Line 37: Line 39:
 
* [[Planning]]
 
* [[Planning]]
 
* [[Reinforcement Learning]]
 
* [[Reinforcement Learning]]
 +
* [[Supervised Learning]]
 
* [[Temporal Difference Learning]]
 
* [[Temporal Difference Learning]]
 
<span id="Programs"></span>
 
<span id="Programs"></span>
 
=Programs=
 
=Programs=
 +
* [[Allie]]
 
* [[AlphaZero]]
 
* [[AlphaZero]]
 
* [[Alexs]]
 
* [[Alexs]]
Line 130: Line 134:
 
* [[Ross Quinlan]] ('''1979'''). ''Discovering Rules by Induction from Large Collections of Examples''. Expert Systems in the Micro-electronic Age, pp. 168-201. Edinburgh University Press (Introducing ID3)
 
* [[Ross Quinlan]] ('''1979'''). ''Discovering Rules by Induction from Large Collections of Examples''. Expert Systems in the Micro-electronic Age, pp. 168-201. Edinburgh University Press (Introducing ID3)
 
==1980 ...==  
 
==1980 ...==  
* [[Sarah E. Goldin]],  [http://www.linkedin.com/pub/phil-klahr/8/72b/676/de Philip Klahr] ('''1981'''). ''[http://dl.acm.org/citation.cfm?id=1623197 Learning and Abstraction in Simulation]''. [http://www.informatik.uni-trier.de/~ley/db/conf/ijcai/ijcai81.html#GoldinK81  IJCAI 1981], [http://ijcai.org/Past%20Proceedings/IJCAI-81-VOL%201/PDF/042.pdf pdf]
+
* [[Sarah E. Goldin]],  [http://www.linkedin.com/pub/phil-klahr/8/72b/676/de Philip Klahr] ('''1981'''). ''[http://dl.acm.org/citation.cfm?id=1623197 Learning and Abstraction in Simulation]''. [[Conferences#IJCAI1981|IJCAI 1981]], [http://ijcai.org/Past%20Proceedings/IJCAI-81-VOL%201/PDF/042.pdf pdf]
 
* [[Paul E. Utgoff]], [[Tom Mitchell]] ('''1982'''). ''Acquisition of Appropriate Bias for Inductive Concept Learning''. [http://dblp.uni-trier.de/db/conf/aaai/aaai82.html#UtgoffM82 AAAI 1982], [https://www.aaai.org/Papers/AAAI/1982/AAAI82-099.pdf pdf]
 
* [[Paul E. Utgoff]], [[Tom Mitchell]] ('''1982'''). ''Acquisition of Appropriate Bias for Inductive Concept Learning''. [http://dblp.uni-trier.de/db/conf/aaai/aaai82.html#UtgoffM82 AAAI 1982], [https://www.aaai.org/Papers/AAAI/1982/AAAI82-099.pdf pdf]
 
* [[A. Harry Klopf]] ('''1982'''). ''The Hedonistic Neuron: A Theory of Memory, Learning, and Intelligence''. Hemisphere Publishing Corporation, [[University of Michigan]]
 
* [[A. Harry Klopf]] ('''1982'''). ''The Hedonistic Neuron: A Theory of Memory, Learning, and Intelligence''. Hemisphere Publishing Corporation, [[University of Michigan]]
Line 145: Line 149:
 
==1985 ...==  
 
==1985 ...==  
 
* [[Tony Marsland]] ('''1985'''). ''Evaluation-Function Factors''. [[ICGA Journal#8_2|ICCA Journal, Vol. 8, No. 2]], [http://webdocs.cs.ualberta.ca/~tony/OldPapers/evaluation.pdf pdf]
 
* [[Tony Marsland]] ('''1985'''). ''Evaluation-Function Factors''. [[ICGA Journal#8_2|ICCA Journal, Vol. 8, No. 2]], [http://webdocs.cs.ualberta.ca/~tony/OldPapers/evaluation.pdf pdf]
* [[Albrecht Heeffer]] ('''1985'''). ''Validating Concepts from Automated Acquisition Systems''. [[Conferences#IJCAI|IJCAI 85]], [http://ijcai.org/Past%20Proceedings/IJCAI-85-VOL1/PDF/118.pdf pdf]
+
* [[Albrecht Heeffer]] ('''1985'''). ''Validating Concepts from Automated Acquisition Systems''. [[Conferences#IJCAI1985|IJCAI 1985]], [http://ijcai.org/Past%20Proceedings/IJCAI-85-VOL1/PDF/118.pdf pdf]
 
* [[Hans Berliner]] ('''1985'''). ''Goals, Plans, and Mechanisms: Non-symbolically in an Evaluation Surface.'' Presentation at Evolution, Games, and Learning, Center for Nonlinear Studies, [[Los Alamos National Laboratory]], May 21.
 
* [[Hans Berliner]] ('''1985'''). ''Goals, Plans, and Mechanisms: Non-symbolically in an Evaluation Surface.'' Presentation at Evolution, Games, and Learning, Center for Nonlinear Studies, [[Los Alamos National Laboratory]], May 21.
 
* [[Ryszard Michalski]], [[Jaime Carbonell]], [[Tom Mitchell]] ('''1985, 2014'''). ''[https://www.elsevier.com/books/machine-learning/michalski/978-0-08-051054-5?gclid=EAIaIQobChMItc_hsp_34AIVUeR3Ch2l9QcDEAYYASABEgKW4_D_BwE Machine Learning: An Artificial Intelligence Approach, Volume I]''. [https://en.wikipedia.org/wiki/Morgan_Kaufmann_Publishers Morgan Kaufmann]
 
* [[Ryszard Michalski]], [[Jaime Carbonell]], [[Tom Mitchell]] ('''1985, 2014'''). ''[https://www.elsevier.com/books/machine-learning/michalski/978-0-08-051054-5?gclid=EAIaIQobChMItc_hsp_34AIVUeR3Ch2l9QcDEAYYASABEgKW4_D_BwE Machine Learning: An Artificial Intelligence Approach, Volume I]''. [https://en.wikipedia.org/wiki/Morgan_Kaufmann_Publishers Morgan Kaufmann]
Line 160: Line 164:
 
* [[Gerald Tesauro]], [[Terrence J. Sejnowski]] ('''1987'''). ''A 'Neural' Network that Learns to Play Backgammon''. [http://www.informatik.uni-trier.de/~ley/db/conf/nips/nips1987.html#TesauroS87 NIPS 1987]
 
* [[Gerald Tesauro]], [[Terrence J. Sejnowski]] ('''1987'''). ''A 'Neural' Network that Learns to Play Backgammon''. [http://www.informatik.uni-trier.de/~ley/db/conf/nips/nips1987.html#TesauroS87 NIPS 1987]
 
* [[Alen Shapiro]] ('''1987'''). ''Structured Induction in Expert Systems''. Turing Institute Press in association with Addison-Wesley Publishing Company, Wokingham, UK
 
* [[Alen Shapiro]] ('''1987'''). ''Structured Induction in Expert Systems''. Turing Institute Press in association with Addison-Wesley Publishing Company, Wokingham, UK
* [[Alberto Maria Segre]] ('''1987'''). ''On the Operationality/Generality Trade-off in Explanation-based Learning''. [http://dblp.uni-trier.de/db/conf/ijcai/ijcai87.html IJCAI 1987], [http://ijcai.org/Past%20Proceedings/IJCAI-87-VOL1/PDF/049.pdf pdf]
+
* [[Alberto Maria Segre]] ('''1987'''). ''On the Operationality/Generality Trade-off in Explanation-based Learning''. [[Conferences#IJCAI1987|IJCAI 1987]], [http://ijcai.org/Past%20Proceedings/IJCAI-87-VOL1/PDF/049.pdf pdf]
 
* [[Alberto Maria Segre]] ('''1987'''). ''Explanation-Based Learning of Generalized Robot Assembly Plans''. Ph.D. thesis, [[University of Illinois at Urbana-Champaign]], Advisor: [http://www.ece.illinois.edu/directory/profile.asp?mrebl Gerald Francis DeJong, II]
 
* [[Alberto Maria Segre]] ('''1987'''). ''Explanation-Based Learning of Generalized Robot Assembly Plans''. Ph.D. thesis, [[University of Illinois at Urbana-Champaign]], Advisor: [http://www.ece.illinois.edu/directory/profile.asp?mrebl Gerald Francis DeJong, II]
 
* [[Eric B. Baum]], [https://en.wikipedia.org/wiki/Frank_Wilczek Frank Wilczek] ('''1987'''). ''[http://papers.nips.cc/paper/3-supervised-learning-of-probability-distributions-by-neural-networks Supervised Learning of Probability Distributions by Neural Networks]''. [http://papers.nips.cc/book/neural-information-processing-systems-1987 NIPS 1987]
 
* [[Eric B. Baum]], [https://en.wikipedia.org/wiki/Frank_Wilczek Frank Wilczek] ('''1987'''). ''[http://papers.nips.cc/paper/3-supervised-learning-of-probability-distributions-by-neural-networks Supervised Learning of Probability Distributions by Neural Networks]''. [http://papers.nips.cc/book/neural-information-processing-systems-1987 NIPS 1987]
Line 166: Line 170:
 
* [[Bruce Abramson]] ('''1988'''). ''Learning Expected-Outcome Evaluators in Chess.'' Proceedings of the 1988 AAAI Spring Symposium Series: Computer Game Playing, 26-28.
 
* [[Bruce Abramson]] ('''1988'''). ''Learning Expected-Outcome Evaluators in Chess.'' Proceedings of the 1988 AAAI Spring Symposium Series: Computer Game Playing, 26-28.
 
* [[Richard Sutton]] ('''1988'''). ''Learning to Predict by the Methods of Temporal Differences''. [https://en.wikipedia.org/wiki/Machine_Learning_%28journal%29 Machine Learning], Vol. 3, No. 1, [https://webdocs.cs.ualberta.ca/~sutton/papers/sutton-88-with-erratum.pdf pdf]
 
* [[Richard Sutton]] ('''1988'''). ''Learning to Predict by the Methods of Temporal Differences''. [https://en.wikipedia.org/wiki/Machine_Learning_%28journal%29 Machine Learning], Vol. 3, No. 1, [https://webdocs.cs.ualberta.ca/~sutton/papers/sutton-88-with-erratum.pdf pdf]
* [[David E. Goldberg]], [[Mathematician#Holland|John H. Holland]] ('''1988'''). ''[http://www.springerlink.com/content/rw3572714v41q507/ Genetic Algorithms and Machine Learning]''. [https://en.wikipedia.org/wiki/Machine_Learning_%28journal%29 Machine Learning], Vol. 3
+
* [[David E. Goldberg]], [[Mathematician#Holland|John H. Holland]] ('''1988'''). ''[https://link.springer.com/article/10.1023/A:1022602019183 Genetic Algorithms and Machine Learning]''. [https://www.springer.com/journal/10994 Machine Learning], Vol. 3
 
* [[Mathematician#KADeJong|Kenneth A. De Jong]], [[Mathematician#ACSchultz|Alan C. Schultz]] ('''1988'''). ''Using Experience-Based Learning in Game Playing''. Proceedings of the Fifth International Machine Learning Conference, [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.52.5381 CiteSeerX] » [[Othello]]
 
* [[Mathematician#KADeJong|Kenneth A. De Jong]], [[Mathematician#ACSchultz|Alan C. Schultz]] ('''1988'''). ''Using Experience-Based Learning in Game Playing''. Proceedings of the Fifth International Machine Learning Conference, [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.52.5381 CiteSeerX] » [[Othello]]
 
* [[Kai-Fu Lee]], [[Sanjoy Mahajan]] ('''1988'''). ''[http://www.sciencedirect.com/science/article/pii/0004370288900768 A Pattern Classification Approach to Evaluation Function Learning]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 36, No. 1
 
* [[Kai-Fu Lee]], [[Sanjoy Mahajan]] ('''1988'''). ''[http://www.sciencedirect.com/science/article/pii/0004370288900768 A Pattern Classification Approach to Evaluation Function Learning]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 36, No. 1
 
* [[Paul E. Utgoff]] ('''1988'''). ''[http://dl.acm.org/citation.cfm?id=896712 ID5: An incremental ID3]''. [http://dblp.uni-trier.de/db/conf/icml/ml1988.html#Utgoff88 ML 1988]  
 
* [[Paul E. Utgoff]] ('''1988'''). ''[http://dl.acm.org/citation.cfm?id=896712 ID5: An incremental ID3]''. [http://dblp.uni-trier.de/db/conf/icml/ml1988.html#Utgoff88 ML 1988]  
 +
* [[Shaul Markovitch]], [[Mathematician#PDScott|Paul D. Scott]] ('''1988'''). ''[https://www.semanticscholar.org/paper/The-Role-of-Forgetting-in-Learning-Markovitch-Scott/adbd75db1f85dd3545b4d6b8bba509bf20d7bfce The Role of Forgetting in Learning]''. [https://dblp.uni-trier.de/db/conf/icml/ml1988.html ML 1988], [http://www.cs.technion.ac.il/~shaulm/papers/pdf/Markovitch-Scott-icml1988.pdf pdf]
 
'''1989'''
 
'''1989'''
 +
* [[David E. Goldberg]] ('''1989'''). ''Genetic Algorithms in Search, Optimization and Machine Learning''. [https://en.wikipedia.org/wiki/Addison-Wesley Addison-Wesley]
 
* [[Robert Levinson]] ('''1989'''). ''A Self-Learning, Pattern-Oriented Chess Program''. [[ICGA Journal#12_4|ICCA Journal, Vol. 12, No. 4]]
 
* [[Robert Levinson]] ('''1989'''). ''A Self-Learning, Pattern-Oriented Chess Program''. [[ICGA Journal#12_4|ICCA Journal, Vol. 12, No. 4]]
 
* [[Bruce Abramson]] ('''1989'''). ''On Learning and Testing Evaluation Functions.'' Proceedings of the Sixth Israeli Conference on Artificial Intelligence, 1989, 7-16.
 
* [[Bruce Abramson]] ('''1989'''). ''On Learning and Testing Evaluation Functions.'' Proceedings of the Sixth Israeli Conference on Artificial Intelligence, 1989, 7-16.
Line 206: Line 212:
 
'''1993'''
 
'''1993'''
 
* [[Michael Gherrity]] ('''1993'''). ''A Game Learning Machine''. Ph.D. Thesis, [http://de.wikipedia.org/wiki/University_of_California,_San_Diego University of California, San Diego], [http://www.gherrity.org/thesis.ps.gz zipped ps]
 
* [[Michael Gherrity]] ('''1993'''). ''A Game Learning Machine''. Ph.D. Thesis, [http://de.wikipedia.org/wiki/University_of_California,_San_Diego University of California, San Diego], [http://www.gherrity.org/thesis.ps.gz zipped ps]
* [[Shaul Markovitch]], [http://www.cs.huji.ac.il/labs/danss/Fairplay/ Yaron Sella] ('''1993'''). ''Learning of Resource Allocation Strategies for Game Playing'', The proceedings of the 13th International Joint Conference on Artificial Intelligence, Chambery, France. [http://www.cs.technion.ac.il/~shaulm/papers/pdf/Markovitch-Sella-coin1996.pdf pdf]
+
* [[Shaul Markovitch]], [[Yaron Sella]] ('''1993'''). ''[https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1467-8640.1996.tb00254.x Learning of Resource Allocation Strategies for Game Playing]''. [[Conferences#IJCAI1993|IJCAI 1993]], [https://www.ijcai.org/Proceedings/93-2/Papers/020.pdf pdf]
* [[David Carmel]], [[Shaul Markovitch]] ('''1993'''). ''Learning Models of Opponent's Strategy in Game Playing''. [[AAAI]] Proceedings, [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.55.6488 CiteSeerX]
+
* [[Mathematician#DCarmel|David Carmel]], [[Shaul Markovitch]] ('''1993'''). ''[https://aaai.org/Library/Symposia/Fall/1993/fs93-02-019.php Learning Models of Opponent's Strategy in Game Playing]''. [[Conferences#AAAI-93|AAAI 1993]], FS-93-02, [https://www.aaai.org/Papers/Symposia/Fall/1993/FS-93-02/FS93-02-019.pdf pdf]
* [[Dan Geiger]], [[Azaria Paz]], [[Judea Pearl]] ('''1993'''). ''Learning simple causal structures''. [http://onlinelibrary.wiley.com/journal/10.1002/%28ISSN%291098-111X International Journal of Intelligent Systems], 8, pp. 231-247.
+
* [[Mathematician#DGeiger|Dan Geiger]], [[Mathematician#APaz|Azaria Paz]], [[Judea Pearl]] ('''1993'''). ''Learning simple causal structures''. [http://onlinelibrary.wiley.com/journal/10.1002/%28ISSN%291098-111X International Journal of Intelligent Systems], Vol. 8
 
* [[Sebastian Thrun]], [[Tom Mitchell]] ('''1993'''). ''Integrating Inductive Neural Network Learning and Explanation-Based Learning''. [[Conferences#IJCAI1993|IJCAI 1993]], [http://robots.stanford.edu/papers/thrun.EBNN_ijcai93.ps.gz zipped ps]
 
* [[Sebastian Thrun]], [[Tom Mitchell]] ('''1993'''). ''Integrating Inductive Neural Network Learning and Explanation-Based Learning''. [[Conferences#IJCAI1993|IJCAI 1993]], [http://robots.stanford.edu/papers/thrun.EBNN_ijcai93.ps.gz zipped ps]
 
* [[Alois Heinz]], [[Christoph Hense]] ('''1993'''). ''[http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.56.872 Bootstrap learning of α-β-evaluation functions]''. [http://dblp.uni-trier.de/db/conf/icci/icci1993.html#HeinzH93 ICCI 1993], [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.56.872&rep=rep1&type=pdf pdf]
 
* [[Alois Heinz]], [[Christoph Hense]] ('''1993'''). ''[http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.56.872 Bootstrap learning of α-β-evaluation functions]''. [http://dblp.uni-trier.de/db/conf/icci/icci1993.html#HeinzH93 ICCI 1993], [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.56.872&rep=rep1&type=pdf pdf]
 +
* [[Nicol N. Schraudolph]], [[Peter Dayan]], [[Terrence J. Sejnowski]] ('''1993'''). ''[https://papers.nips.cc/paper/820-temporal-difference-learning-of-position-evaluation-in-the-game-of-go Temporal Difference Learning of Position Evaluation in the Game of Go]''. [https://papers.nips.cc/book/advances-in-neural-information-processing-systems-6-1993 NIPS 1993]
 
'''1994'''
 
'''1994'''
 
* [[Eduardo F. Morales]] ('''1994'''). ''Learning Patterns for Playing Strategies''. [[ICGA Journal#17_1|ICCA Journal, Vol. 17, No. 1]]
 
* [[Eduardo F. Morales]] ('''1994'''). ''Learning Patterns for Playing Strategies''. [[ICGA Journal#17_1|ICCA Journal, Vol. 17, No. 1]]
Line 220: Line 227:
 
* [[Alberto Maria Segre]], [[Charles Elkan]] ('''1994'''). ''A High-Performance Explanation-Based Learning Algorithm''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 68, Nos. 1-2
 
* [[Alberto Maria Segre]], [[Charles Elkan]] ('''1994'''). ''A High-Performance Explanation-Based Learning Algorithm''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 68, Nos. 1-2
 
* [[David E. Moriarty]], [[Risto Miikkulainen]] ('''1994'''). ''Evolving Neural Networks to focus Minimax Search''. [[AAAI|AAAI-94]], [http://www.cs.utexas.edu/~ai-lab/pubs/moriarty.focus.pdf pdf]
 
* [[David E. Moriarty]], [[Risto Miikkulainen]] ('''1994'''). ''Evolving Neural Networks to focus Minimax Search''. [[AAAI|AAAI-94]], [http://www.cs.utexas.edu/~ai-lab/pubs/moriarty.focus.pdf pdf]
* [[Nicol N. Schraudolph]], [[Peter Dayan]], [[Terrence J. Sejnowski]] ('''1994'''). ''[http://nic.schraudolph.org/bib2html/b2hd-SchDaySej94.html Temporal Difference Learning of Position Evaluation in the Game of Go]''. [http://papers.nips.cc/book/advances-in-neural-information-processing-systems-6-1993 Advances in Neural Information Processing Systems 6]
 
 
==1995 ...==  
 
==1995 ...==  
 
* [[Gerhard Mehlsam]], [[Hermann Kaindl]], [[Wilhelm Barth]] ('''1995'''). ''Feature Construction during Tree Learning''. [http://137.226.34.227/dblp/db/conf/gosler/gosler1995.html GOSLER Final Report] 1995: 391-403
 
* [[Gerhard Mehlsam]], [[Hermann Kaindl]], [[Wilhelm Barth]] ('''1995'''). ''Feature Construction during Tree Learning''. [http://137.226.34.227/dblp/db/conf/gosler/gosler1995.html GOSLER Final Report] 1995: 391-403
 
* [[Chris McConnell]] ('''1995'''). ''Tuning Evaluation Functions for Search''. [http://www.cs.cmu.edu/afs/cs.cmu.edu/user/ccm/www/papers/ml.ps ps] or [http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=9B2A0CCA8B1AFB594A879799D974111A?doi=10.1.1.53.9742&rep=rep1&type=pdf pdf] from [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.53.9742 CiteSeerX]
 
* [[Chris McConnell]] ('''1995'''). ''Tuning Evaluation Functions for Search''. [http://www.cs.cmu.edu/afs/cs.cmu.edu/user/ccm/www/papers/ml.ps ps] or [http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=9B2A0CCA8B1AFB594A879799D974111A?doi=10.1.1.53.9742&rep=rep1&type=pdf pdf] from [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.53.9742 CiteSeerX]
* [[David Heckerman]], [[Dan Geiger]], [[Max Chickering]] ('''1995'''). ''Learning Bayesian Networks: The Combination of Knowledge and Statistical Data''. [https://en.wikipedia.org/wiki/Machine_Learning_%28journal%29 Machine Learning], Vol. 20, [http://research.microsoft.com/en-us/um/people/dmax/publications/ml95.pdf pdf]
+
* [[David Heckerman]], [[Mathematician#DGeiger|Dan Geiger]], [[Max Chickering]] ('''1995'''). ''Learning Bayesian Networks: The Combination of Knowledge and Statistical Data''. [https://en.wikipedia.org/wiki/Machine_Learning_%28journal%29 Machine Learning], Vol. 20, [http://research.microsoft.com/en-us/um/people/dmax/publications/ml95.pdf pdf]
 
* [[Tristan Cazenave]] ('''1995'''). ''Learning and Problem Solving in Gogol, a Go playing program''. [http://www.lamsade.dauphine.fr/~cazenave/papers/cazenave95learning.pdf pdf]
 
* [[Tristan Cazenave]] ('''1995'''). ''Learning and Problem Solving in Gogol, a Go playing program''. [http://www.lamsade.dauphine.fr/~cazenave/papers/cazenave95learning.pdf pdf]
 
* [[Gerald Tesauro]] ('''1995'''). ''Temporal Difference Learning and TD-Gammon''. [[ACM#Communications|Communications of the ACM]] Vol. 38, No. 3
 
* [[Gerald Tesauro]] ('''1995'''). ''Temporal Difference Learning and TD-Gammon''. [[ACM#Communications|Communications of the ACM]] Vol. 38, No. 3
Line 253: Line 259:
 
* [[Don Beal]], [[Martin C. Smith]] ('''1997'''). ''Learning Piece Values Using Temporal Differences''. [[ICGA Journal#20_3|ICCA Journal, Vol. 20, No. 3]]
 
* [[Don Beal]], [[Martin C. Smith]] ('''1997'''). ''Learning Piece Values Using Temporal Differences''. [[ICGA Journal#20_3|ICCA Journal, Vol. 20, No. 3]]
 
* [[Kieran Greer]], [[Piyush Ojha]], [[David A. Bell]] ('''1997'''). ''Learning Search Heuristics from Examples: A Study in Computer Chess'', Seventh Conference of the Spanish Association for Artificial Intelligence, CAEPIA’97, November, pp. 695-704.
 
* [[Kieran Greer]], [[Piyush Ojha]], [[David A. Bell]] ('''1997'''). ''Learning Search Heuristics from Examples: A Study in Computer Chess'', Seventh Conference of the Spanish Association for Artificial Intelligence, CAEPIA’97, November, pp. 695-704.
* [[Nir Friedman]], [[Moises Goldszmidt]], [[David Heckerman]], [[Stuart Russell]] ('''1997'''). ''Where is the Impact of Bayesian Networks in Learning?'' In Proc. Fifteenth International Joint Conference on Artificial Intelligence, Nagoya, Japan, [http://www.cs.berkeley.edu/~russell/papers/ijcai97-challenge.ps ps]
+
* [[Nir Friedman]], [[Moises Goldszmidt]], [[David Heckerman]], [[Stuart Russell]] ('''1997'''). ''Where is the Impact of Bayesian Networks in Learning?'' [[Conferences#IJCAI1997|IJCAI 1997]], [http://www.cs.berkeley.edu/~russell/papers/ijcai97-challenge.ps ps]
 
* [[Ronald Parr]], [[Stuart Russell]] ('''1997'''). ''Reinforcement Learning with Hierarchies of Machines.'' In Advances in Neural Information Processing Systems 10, MIT Press, [http://www.cs.berkeley.edu/~russell/papers/nips97-ham.ps.gz zipped ps]
 
* [[Ronald Parr]], [[Stuart Russell]] ('''1997'''). ''Reinforcement Learning with Hierarchies of Machines.'' In Advances in Neural Information Processing Systems 10, MIT Press, [http://www.cs.berkeley.edu/~russell/papers/nips97-ham.ps.gz zipped ps]
* [[Tristan Cazenave]] ('''1997'''). ''Gogol (an Analytical Learning Program)''. [http://www.ijcai.org/past/ijcai-97/ IJCAI'97], [http://www.lamsade.dauphine.fr/~cazenave/papers/fost97.pdf pdf]
+
* [[Tristan Cazenave]] ('''1997'''). ''Gogol (an Analytical Learning Program)''. [[Conferences#IJCAI1997|IJCAI 1997]], [http://www.lamsade.dauphine.fr/~cazenave/papers/fost97.pdf pdf]
 
* [[Tom Mitchell]] ('''1997'''). ''[http://www.cs.cmu.edu/%7Etom/mlbook.html Machine Learning]''. [https://en.wikipedia.org/wiki/McGraw-Hill McGraw Hill]
 
* [[Tom Mitchell]] ('''1997'''). ''[http://www.cs.cmu.edu/%7Etom/mlbook.html Machine Learning]''. [https://en.wikipedia.org/wiki/McGraw-Hill McGraw Hill]
 
* [[Michèle Sebag]] ('''1997'''). ''Stochastic Heuristics for Machine Learning & Machine Learning for Stochastic Optimization''. Habilitation, [https://en.wikipedia.org/wiki/Paris-Sud_11_University Paris-Sud 11 University]
 
* [[Michèle Sebag]] ('''1997'''). ''Stochastic Heuristics for Machine Learning & Machine Learning for Stochastic Optimization''. Habilitation, [https://en.wikipedia.org/wiki/Paris-Sud_11_University Paris-Sud 11 University]
Line 284: Line 290:
 
* [[David Heckerman]] ('''1999'''). ''A tutorial on learning with Bayesian networks''. [http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=F04616607A620324B33D40A8ABB702CB?doi=10.1.1.15.4522&rep=rep1&type=pdf pdf] from [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.15.4522 CiteSeerX]
 
* [[David Heckerman]] ('''1999'''). ''A tutorial on learning with Bayesian networks''. [http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=F04616607A620324B33D40A8ABB702CB?doi=10.1.1.15.4522&rep=rep1&type=pdf pdf] from [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.15.4522 CiteSeerX]
 
* [http://www2.lifl.fr/%7Edecomite/ F. De Comité], [http://www.lif.univ-mrs.fr/%7Efdenis/ F. Denis], [http://www.grappa.univ-lille3.fr/%7Egilleron/ R. Gilleron] and [[Fabien Letouzey]] ('''1999'''). ''Positive and Unlabeled Examples help Learning'', The 10th International Conference on Algorithmic Learning Theory, [http://www.cmi.univ-mrs.fr/%7Efdenis/alt99.ps ps]
 
* [http://www2.lifl.fr/%7Edecomite/ F. De Comité], [http://www.lif.univ-mrs.fr/%7Efdenis/ F. Denis], [http://www.grappa.univ-lille3.fr/%7Egilleron/ R. Gilleron] and [[Fabien Letouzey]] ('''1999'''). ''Positive and Unlabeled Examples help Learning'', The 10th International Conference on Algorithmic Learning Theory, [http://www.cmi.univ-mrs.fr/%7Efdenis/alt99.ps ps]
* [http://www.ilsp.gr/homepages/papavasiliou_eng.html Vassilis Papavassiliou], [[Stuart Russell]] ('''1999'''). ''Convergence of reinforcement learning with general function approximators.'' In Proc. IJCAI-99, Stockholm, [http://www.cs.berkeley.edu/~russell/papers/ijcai99-bridge.ps ps]
+
* [http://www.ilsp.gr/homepages/papavasiliou_eng.html Vassilis Papavassiliou], [[Stuart Russell]] ('''1999'''). ''Convergence of reinforcement learning with general function approximators.'' [[Conferences#IJCAI1999|IJCAI 1999]], [http://www.cs.berkeley.edu/~russell/papers/ijcai99-bridge.ps ps]
 
* [[Philip G. K. Reiser]], [[Patricia J. Riddle]] ('''1999'''). ''[http://link.springer.com/chapter/10.1007%2F3-540-48873-1_19 Evolving Logic Programs to Classify Chess-Endgame Positions]''. [http://link.springer.com/book/10.1007%2F3-540-48873-1 Simulated Evolution and Learning], [https://en.wikipedia.org/wiki/Canberra Canberra], Australia. [http://www.springer.com/series/1244 Lecture Notes in Artificial Intelligence], No. 1585, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer], [http://stancomb.co.uk/Papers/seal98.pdf pdf] » [[Endgame]]
 
* [[Philip G. K. Reiser]], [[Patricia J. Riddle]] ('''1999'''). ''[http://link.springer.com/chapter/10.1007%2F3-540-48873-1_19 Evolving Logic Programs to Classify Chess-Endgame Positions]''. [http://link.springer.com/book/10.1007%2F3-540-48873-1 Simulated Evolution and Learning], [https://en.wikipedia.org/wiki/Canberra Canberra], Australia. [http://www.springer.com/series/1244 Lecture Notes in Artificial Intelligence], No. 1585, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer], [http://stancomb.co.uk/Papers/seal98.pdf pdf] » [[Endgame]]
 
* [[Marco Wiering]] ('''1999'''). ''Explorations in Efficient Reinforcement Learning''. Ph.D. thesis, [https://en.wikipedia.org/wiki/University_of_Amsterdam University of Amsterdam], advisors [[Mathematician#FGroen|Frans Groen]] and [[Jürgen Schmidhuber]]
 
* [[Marco Wiering]] ('''1999'''). ''Explorations in Efficient Reinforcement Learning''. Ph.D. thesis, [https://en.wikipedia.org/wiki/University_of_Amsterdam University of Amsterdam], advisors [[Mathematician#FGroen|Frans Groen]] and [[Jürgen Schmidhuber]]
Line 306: Line 312:
 
* [[Michael Bain]], [[Stephen Muggleton]], [[Ashwin Srinivasan]] ('''2000'''). ''Generalising Closed World Specialisation: A Chess End Game Application''. [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.24.3499 CiteSeerX]
 
* [[Michael Bain]], [[Stephen Muggleton]], [[Ashwin Srinivasan]] ('''2000'''). ''Generalising Closed World Specialisation: A Chess End Game Application''. [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.24.3499 CiteSeerX]
 
'''2001'''
 
'''2001'''
* [[Nicol N. Schraudolph]], [[Peter Dayan]], [[Terrence J. Sejnowski]]  ('''2001'''). ''[http://nic.schraudolph.org/bib2html/b2hd-SchDaySej01.html Learning to Evaluate Go Positions via Temporal Difference Methods]''. in  [[Norio Baba]], [[Lakhmi C. Jain]] (eds.) ('''2001'''). ''[http://jasss.soc.surrey.ac.uk/7/1/reviews/takama.html Computational Intelligence in Games, Studies in Fuzziness and Soft Computing]''. [http://www.springer.com/economics?SGWID=1-165-6-73481-0 Physica-Verlag], revised version of [[Nicol N. Schraudolph#1994|1994 paper]]
+
* [[Nicol N. Schraudolph]], [[Peter Dayan]], [[Terrence J. Sejnowski]]  ('''2001'''). ''[http://nic.schraudolph.org/bib2html/b2hd-SchDaySej01.html Learning to Evaluate Go Positions via Temporal Difference Methods]''. [http://jasss.soc.surrey.ac.uk/7/1/reviews/takama.html Computational Intelligence in Games, Studies in Fuzziness and Soft Computing] [http://www.springer.com/economics?SGWID=1-165-6-73481-0 Physica-Verlag], revised version of [[Nicol N. Schraudolph#1993|1993 paper]]
* [[Jonathan Schaeffer]], [[Markian Hlynka]], [[Vili Jussila]] ('''2001'''). ''Temporal Difference Learning Applied to a High-Performance Game-Playing Program''. [http://www.informatik.uni-trier.de/~ley/db/conf/ijcai/ijcai2001.html#SchaefferHJ01 IJCAI 2001]
+
* [[Jonathan Schaeffer]], [[Markian Hlynka]], [[Vili Jussila]] ('''2001'''). ''Temporal Difference Learning Applied to a High-Performance Game-Playing Program''. [[Conferences#IJCAI2001|IJCAI 2001]]
* [[Michael Bowling]], [[Manuela Veloso|Manuela M. Veloso]] ('''2001'''). ''Rational and Convergent Learning in Stochastic Games''. [http://www.informatik.uni-trier.de/~ley/db/conf/ijcai/ijcai2001.html#BowlingV01 IJCAI 2001]
+
* [[Michael Bowling]], [[Manuela Veloso|Manuela M. Veloso]] ('''2001'''). ''Rational and Convergent Learning in Stochastic Games''. [[Conferences#IJCAI2001|IJCAI 2001]]
* [[Levente Kocsis]], [[Jos Uiterwijk]], [[Jaap van den Herik]] ('''2001'''). ''Move Ordering using Neural Networks'', IEA/AIE 2001, LNCS 2070, 45-50 [http://zaphod.aml.sztaki.hu/papers/kocsis-IEA01.ps ps]
+
* [[Levente Kocsis]], [[Jos Uiterwijk]], [[Jaap van den Herik]] ('''2001'''). ''Move Ordering using Neural Networks'', IEA/AIE 2001, LNCS 2070 [http://zaphod.aml.sztaki.hu/papers/kocsis-IEA01.ps ps]
 
* [[Marty Hirsch]] ('''2001'''). ''Machine Learning in MChess Professional''. [[Advances in Computer Games 9]]
 
* [[Marty Hirsch]] ('''2001'''). ''Machine Learning in MChess Professional''. [[Advances in Computer Games 9]]
 
* [[Yngvi Björnsson]], [[Tony Marsland]] ('''2001'''). ''Learning Search Control in Adversary Games''. [[Advances in Computer Games 9]], pp. 157-174. [http://www.ru.is/faculty/yngvi/pdf/BjornssonM01b.pdf pdf]
 
* [[Yngvi Björnsson]], [[Tony Marsland]] ('''2001'''). ''Learning Search Control in Adversary Games''. [[Advances in Computer Games 9]], pp. 157-174. [http://www.ru.is/faculty/yngvi/pdf/BjornssonM01b.pdf pdf]
Line 315: Line 321:
 
* [[Jean Hayes Michie]] ('''2001'''). ''[http://www.aaai.org/ojs/index.php/aimagazine/article/view/1599/0 Machine Learning and Light Relief: A Review of Truth from Trash]''. [http://www.informatik.uni-trier.de/~ley/db/journals/aim/aim22.html#Michie01 AI Magazine Vol. 22 No. 4], [http://www.aaai.org/ojs/index.php/aimagazine/article/download/1599/1498 pdf]
 
* [[Jean Hayes Michie]] ('''2001'''). ''[http://www.aaai.org/ojs/index.php/aimagazine/article/view/1599/0 Machine Learning and Light Relief: A Review of Truth from Trash]''. [http://www.informatik.uni-trier.de/~ley/db/journals/aim/aim22.html#Michie01 AI Magazine Vol. 22 No. 4], [http://www.aaai.org/ojs/index.php/aimagazine/article/download/1599/1498 pdf]
 
* [[Pieter Spronck]], [[Ida Sprinkhuizen-Kuyper]], [[Eric Postma]] ('''2001'''). ''Infused Evolutionary Learning''. Proceedings of the Eleventh Belgian-Dutch Conference on Machine Learning, [http://www.cnts.ua.ac.be/benelearn2001/proceedings/bene01-spronck.pdf pdf], [http://ticc.uvt.nl/~pspronck/pubs/InfusedEvolutionaryLearning.pdf pdf]
 
* [[Pieter Spronck]], [[Ida Sprinkhuizen-Kuyper]], [[Eric Postma]] ('''2001'''). ''Infused Evolutionary Learning''. Proceedings of the Eleventh Belgian-Dutch Conference on Machine Learning, [http://www.cnts.ua.ac.be/benelearn2001/proceedings/bene01-spronck.pdf pdf], [http://ticc.uvt.nl/~pspronck/pubs/InfusedEvolutionaryLearning.pdf pdf]
* [[Charles Elkan]] ('''2001'''). ''The Foundations of Cost-Sensitive Learning''. [[Conferences#IJCAI|IJCAI 2001]]
+
* [[Charles Elkan]] ('''2001'''). ''The Foundations of Cost-Sensitive Learning''. [[Conferences#IJCAI2001|IJCAI 2001]]
 
* [[Alex B. Meijer]], [[Henk Koppelaar]] ('''2001'''). ''[http://www.kbs.twi.tudelft.nl/Publications/Conference/2001/2001-MeijerKoppelaar-GAMEON01.html A learning architecture for the game of Go]''. [https://www.informs.org/Attend-a-Conference/Conference-Calendar/Game-On-2001 Game-On 2001]
 
* [[Alex B. Meijer]], [[Henk Koppelaar]] ('''2001'''). ''[http://www.kbs.twi.tudelft.nl/Publications/Conference/2001/2001-MeijerKoppelaar-GAMEON01.html A learning architecture for the game of Go]''. [https://www.informs.org/Attend-a-Conference/Conference-Calendar/Game-On-2001 Game-On 2001]
 
* [[Johannes Fürnkranz]], [[Miroslav Kubat]] ('''2001'''). ''[https://www.novapublishers.com/catalog/product_info.php?products_id=720 Machines that Learn to Play Games]''. Advances in Computation: Theory and Practice, Vol. 8, [https://en.wikipedia.org/wiki/Nova_Publishers NOVA Science Publishers]
 
* [[Johannes Fürnkranz]], [[Miroslav Kubat]] ('''2001'''). ''[https://www.novapublishers.com/catalog/product_info.php?products_id=720 Machines that Learn to Play Games]''. Advances in Computation: Theory and Practice, Vol. 8, [https://en.wikipedia.org/wiki/Nova_Publishers NOVA Science Publishers]
Line 409: Line 415:
 
* [[Byoung-Tak Zhang]] ('''2008'''). ''Hypernetworks: A molecular evolutionary architecture for cognitive learning and memory''. [[IEEE|IEEE Computational Intelligence Magazine]], Vol. 3, No. 3, [https://bi.snu.ac.kr/Publications/Journals/International/IEEE_Comp_Int_3_Zhang.pdf pdf]
 
* [[Byoung-Tak Zhang]] ('''2008'''). ''Hypernetworks: A molecular evolutionary architecture for cognitive learning and memory''. [[IEEE|IEEE Computational Intelligence Magazine]], Vol. 3, No. 3, [https://bi.snu.ac.kr/Publications/Journals/International/IEEE_Comp_Int_3_Zhang.pdf pdf]
 
* [[Maria Cutumisu]], [[Michael Bowling]], [[Duane Szafron]], [[Richard Sutton]] ('''2008'''). ''Agent Learning using Action-Dependent Learning Rates in Computer Role-Playing Games''. [https://www.aaai.org/Library/AIIDE/aiide08contents.php Proceedings of the Fourth Artificial Intelligence and Interactive Digital Entertainment Conference], [https://webdocs.cs.ualberta.ca/~duane/publications/pdf/2008aiide.pdf pdf]
 
* [[Maria Cutumisu]], [[Michael Bowling]], [[Duane Szafron]], [[Richard Sutton]] ('''2008'''). ''Agent Learning using Action-Dependent Learning Rates in Computer Role-Playing Games''. [https://www.aaai.org/Library/AIIDE/aiide08contents.php Proceedings of the Fourth Artificial Intelligence and Interactive Digital Entertainment Conference], [https://webdocs.cs.ualberta.ca/~duane/publications/pdf/2008aiide.pdf pdf]
 +
* [[Balázs Csanád Csáji]], [https://dblp.dagstuhl.de/pers/hd/m/Monostori:L=aacute=szl=oacute= László Monostori] ('''2008, 2014'''). ''Adaptive stochastic resource control: a machine learning approach''. [https://en.wikipedia.org/wiki/Journal_of_Artificial_Intelligence_Research Journal of Artificial Intelligence Research], Vol. 32, [https://arxiv.org/abs/1401.3434 arXiv:1401.3434]
 
'''2009'''
 
'''2009'''
 
* [[Hamid Reza Maei]], [[Csaba Szepesvári]], [[Shalabh Bhatnagar]], [[Doina Precup]], [[David Silver]], [[Richard Sutton]] ('''2009'''). ''Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation.'' Accepted in Advances in Neural Information Processing Systems 22, Vancouver, BC. December 2009. MIT Press. [http://books.nips.cc/papers/files/nips22/NIPS2009_1121.pdf pdf]
 
* [[Hamid Reza Maei]], [[Csaba Szepesvári]], [[Shalabh Bhatnagar]], [[Doina Precup]], [[David Silver]], [[Richard Sutton]] ('''2009'''). ''Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation.'' Accepted in Advances in Neural Information Processing Systems 22, Vancouver, BC. December 2009. MIT Press. [http://books.nips.cc/papers/files/nips22/NIPS2009_1121.pdf pdf]
Line 414: Line 421:
 
* [[Martin Možina]] ('''2009'''). ''Argument Based Machine Learning'', PhD Thesis, [http://www.ailab.si/martin/mozina_phd.pdf pdf]
 
* [[Martin Možina]] ('''2009'''). ''Argument Based Machine Learning'', PhD Thesis, [http://www.ailab.si/martin/mozina_phd.pdf pdf]
 
* [[David Silver]] ('''2009'''). ''Reinforcement Learning and Simulation-Based Search''. Ph.D. thesis, [[University of Alberta]]. [http://webdocs.cs.ualberta.ca/~silver/David_Silver/Publications_files/thesis.pdf pdf]
 
* [[David Silver]] ('''2009'''). ''Reinforcement Learning and Simulation-Based Search''. Ph.D. thesis, [[University of Alberta]]. [http://webdocs.cs.ualberta.ca/~silver/David_Silver/Publications_files/thesis.pdf pdf]
* [[Eli David|Omid David]], [[Jaap van den Herik]], [[Moshe Koppel]], [[Nathan S. Netanyahu]] ('''2009'''). ''Simulating Human Grandmasters: Evolution and Coevolution of Evaluation Functions''. [[ACM]] Genetic and Evolutionary Computation Conference ([http://www.sigevo.org/gecco-2009/ GECCO '09]), pp. 1483 - 1489, [https://en.wikipedia.org/wiki/Montreal Montreal], Canada, [http://www.omiddavid.com/pubs/gm-simul.pdf pdf]
+
* [[Eli David|Omid David]], [[Jaap van den Herik]], [[Moshe Koppel]], [[Nathan S. Netanyahu]] ('''2009'''). ''Simulating Human Grandmasters: Evolution and Coevolution of Evaluation Functions''. [http://www.sigevo.org/gecco-2009/ GECCO '09], [https://arxiv.org/abs/1711.06840 arXiv:1711.06840]
 
* [[Eli David|Omid David]] ('''2009'''). ''Genetic Algorithms Based Learning for Evolving Intelligent Organisms''. Ph.D. Thesis <ref>[[Dap Hartmann]] ('''2010'''). ''Mimicking the Black Box - Genetically evolving evaluation functions and search algorithms''. Review on Omid David's Ph.D. Thesis, [[ICGA Journal#33_1|ICGA Journal, Vol. 33, No. 1]]</ref>
 
* [[Eli David|Omid David]] ('''2009'''). ''Genetic Algorithms Based Learning for Evolving Intelligent Organisms''. Ph.D. Thesis <ref>[[Dap Hartmann]] ('''2010'''). ''Mimicking the Black Box - Genetically evolving evaluation functions and search algorithms''. Review on Omid David's Ph.D. Thesis, [[ICGA Journal#33_1|ICGA Journal, Vol. 33, No. 1]]</ref>
 
* [[Nur Merve Amil]], [[Nicolas Bredèche]], [[Christian Gagné]], [[Sylvain Gelly]], [[Marc Schoenauer]], [[Olivier Teytaud]] ('''2009'''). ''A Statistical Learning Perspective of Genetic Programming''. EuroGP 2009, [http://hal.inria.fr/docs/00/36/97/82/PDF/eurogp.pdf pdf]
 
* [[Nur Merve Amil]], [[Nicolas Bredèche]], [[Christian Gagné]], [[Sylvain Gelly]], [[Marc Schoenauer]], [[Olivier Teytaud]] ('''2009'''). ''A Statistical Learning Perspective of Genetic Programming''. EuroGP 2009, [http://hal.inria.fr/docs/00/36/97/82/PDF/eurogp.pdf pdf]
Line 424: Line 431:
 
* [[Mark Levene]], [[Trevor Fenner]] ('''2009'''). ''A Methodology for Learning Players' Styles from Game Records''. [http://arxiv.org/abs/0904.2595v1 arXiv:0904.2595v1]
 
* [[Mark Levene]], [[Trevor Fenner]] ('''2009'''). ''A Methodology for Learning Players' Styles from Game Records''. [http://arxiv.org/abs/0904.2595v1 arXiv:0904.2595v1]
 
* [[Mathematician#THastie|Trevor Hastie]], [[Mathematician#RTibshirani|Robert Tibshirani]], [https://en.wikipedia.org/wiki/Jerome_H._Friedman Jerome Friedman] ('''2009'''). ''[http://www.springer.com/book/9780387848570 The Elements of Statistical Learning: Data Mining, Inference, and Prediction]''. Second Edition, [https://de.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]
 
* [[Mathematician#THastie|Trevor Hastie]], [[Mathematician#RTibshirani|Robert Tibshirani]], [https://en.wikipedia.org/wiki/Jerome_H._Friedman Jerome Friedman] ('''2009'''). ''[http://www.springer.com/book/9780387848570 The Elements of Statistical Learning: Data Mining, Inference, and Prediction]''. Second Edition, [https://de.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]
 +
* [https://dblp.uni-trier.de/pers/hd/h/Hall:Mark_A= Mark A. Hall], [https://dblp.uni-trier.de/pers/hd/f/Frank:Eibe Eibe Frank], [[Geoffrey Holmes]], [[Bernhard Pfahringer]], [https://dblp.uni-trier.de/pers/hd/r/Reutemann:Peter Peter Reutemann], [[Ian H. Witten]] ('''2009'''). ''The WEKA data mining software: an update''. [https://dblp.uni-trier.de/db/journals/sigkdd/sigkdd11.html SIGKDD Explorations], Vol. 11, No. 1, [https://www.kdd.org/exploration_files/p2V11n1.pdf pdf] <ref>[https://en.wikipedia.org/wiki/Weka_(machine_learning) Weka (machine learning) from Wikipedia]</ref>
 
==2010 ...==
 
==2010 ...==
 +
* [[Johannes Fürnkranz]], [https://de.wikipedia.org/wiki/Eyke_H%C3%BCllermeier Eyke Hüllermeier] (eds.) ('''2010'''). ''[https://link.springer.com/book/10.1007/978-3-642-14125-6 Preference Learning]''. [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]
 +
* [https://dblp.uni-trier.de/pers/hd/f/Frank:Eibe Eibe Frank], [https://dblp.uni-trier.de/pers/hd/h/Hall:Mark_A= Mark A. Hall], [[Geoffrey Holmes]], [https://dblp.uni-trier.de/pers/hd/k/Kirkby:Richard Richard Kirkby], [[Bernhard Pfahringer]], [[Ian H. Witten]], [https://dblp.uni-trier.de/pers/hd/t/Trigg:Leonard_E= Len Trigg] ('''2010'''). ''[https://link.springer.com/chapter/10.1007/978-0-387-09823-4_66 Weka-A Machine Learning Workbench for Data Mining]''. [https://link.springer.com/book/10.1007/978-0-387-09823-4 Data Mining and Knowledge Discovery Handbook], [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]
 
* [[Jacek Mańdziuk]] ('''2010'''). ''[http://link.springer.com/book/10.1007%2F978-3-642-11678-0 Knowledge-Free and Learning-Based Methods in Intelligent Game Playing]''. [http://link.springer.com/bookseries/7092 Studies in Computational Intelligence], Vol. 276, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]
 
* [[Jacek Mańdziuk]] ('''2010'''). ''[http://link.springer.com/book/10.1007%2F978-3-642-11678-0 Knowledge-Free and Learning-Based Methods in Intelligent Game Playing]''. [http://link.springer.com/bookseries/7092 Studies in Computational Intelligence], Vol. 276, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]
 
* [[Joel Veness]], [[Kee Siong Ng]], [[Marcus Hutter]], [[David Silver]] ('''2010'''). ''Reinforcement Learning via AIXI Approximation''. Association for the Advancement of Artificial Intelligence (AAAI), [http://jveness.info/publications/veness_rl_via_aixi_approx.pdf pdf]
 
* [[Joel Veness]], [[Kee Siong Ng]], [[Marcus Hutter]], [[David Silver]] ('''2010'''). ''Reinforcement Learning via AIXI Approximation''. Association for the Advancement of Artificial Intelligence (AAAI), [http://jveness.info/publications/veness_rl_via_aixi_approx.pdf pdf]
* [[Eli David|Omid David]], [[Moshe Koppel]], [[Nathan S. Netanyahu]] ('''2010'''). ''[http://www.springerlink.com/content/3346t8432n718821 Expert-Driven Genetic Algorithms for Simulating Evaluation Functions]''. [http://www.omiddavid.com/pubs/expert-driven.pdf pdf]
 
 
* [[Eli David|Omid David]], [[Nathan S. Netanyahu]], Yoav Rosenberg, Moshe Shimoni ('''2010'''). ''Genetic Algorithms for Automatic Classification of Moving Objects''. [[ACM]] Genetic and Evolutionary Computation Conference ([http://www.sigevo.org/gecco-2010/ GECCO '10]), [https://en.wikipedia.org/wiki/Portland,_Oregon Portland, OR], [http://www.omiddavid.com/pubs/object-classification.pdf pdf]
 
* [[Eli David|Omid David]], [[Nathan S. Netanyahu]], Yoav Rosenberg, Moshe Shimoni ('''2010'''). ''Genetic Algorithms for Automatic Classification of Moving Objects''. [[ACM]] Genetic and Evolutionary Computation Conference ([http://www.sigevo.org/gecco-2010/ GECCO '10]), [https://en.wikipedia.org/wiki/Portland,_Oregon Portland, OR], [http://www.omiddavid.com/pubs/object-classification.pdf pdf]
* [[Eli David|Omid David]], [[Moshe Koppel]], [[Nathan S. Netanyahu]] ('''2010'''). ''Genetic Algorithms for Automatic Search Tuning''. [[ICGA Journal#33_2|ICGA Journal, Vol 33, No. 2]]
+
* [[Eli David|Omid David]], [[Moshe Koppel]], [[Nathan S. Netanyahu]] ('''2010'''). ''Genetic Algorithms for Automatic Search Tuning''. [[ICGA Journal#33_2|ICGA Journal, Vol. 33, No. 2]]
 
* [[Mesut Kirci]] ('''2010'''). ''Feature Learning using State Differences''. Master's thesis, Department of Computing Science, [[University of Alberta]], [http://repository.library.ualberta.ca/dspace/bitstream/10048/1011/1/kirci_mesut_spring+2010.pdf pdf] » [[General Game Playing]]
 
* [[Mesut Kirci]] ('''2010'''). ''Feature Learning using State Differences''. Master's thesis, Department of Computing Science, [[University of Alberta]], [http://repository.library.ualberta.ca/dspace/bitstream/10048/1011/1/kirci_mesut_spring+2010.pdf pdf] » [[General Game Playing]]
 
* [[Amine Bourki]], [[Matthieu Coulm]], [[Philippe Rolet]], [[Olivier Teytaud]], [[Paul Vayssière]] ('''2010'''). ''[http://hal.inria.fr/inria-00467796/en/ Parameter Tuning by Simple Regret Algorithms and Multiple Simultaneous Hypothesis Testing]''. [http://hal.inria.fr/docs/00/46/77/96/PDF/tosubmit.pdf pdf]
 
* [[Amine Bourki]], [[Matthieu Coulm]], [[Philippe Rolet]], [[Olivier Teytaud]], [[Paul Vayssière]] ('''2010'''). ''[http://hal.inria.fr/inria-00467796/en/ Parameter Tuning by Simple Regret Algorithms and Multiple Simultaneous Hypothesis Testing]''. [http://hal.inria.fr/docs/00/46/77/96/PDF/tosubmit.pdf pdf]
Line 451: Line 460:
 
* [[Marcin Szubert]], [[Wojciech Jaśkowski]], [[Krzysztof Krawiec]] ('''2011'''). ''Learning Board Evaluation Function for Othello by Hybridizing Coevolution with Temporal Difference Learning''. [http://control.ibspan.waw.pl:3000/mainpage Control and Cybernetics], Vol. 40, No. 3, [http://www.cs.put.poznan.pl/wjaskowski/pub/papers/szubert2011learning.pdf pdf] » [[Othello]]
 
* [[Marcin Szubert]], [[Wojciech Jaśkowski]], [[Krzysztof Krawiec]] ('''2011'''). ''Learning Board Evaluation Function for Othello by Hybridizing Coevolution with Temporal Difference Learning''. [http://control.ibspan.waw.pl:3000/mainpage Control and Cybernetics], Vol. 40, No. 3, [http://www.cs.put.poznan.pl/wjaskowski/pub/papers/szubert2011learning.pdf pdf] » [[Othello]]
 
* [[Hamid Reza Maei]] ('''2011'''). ''Gradient Temporal-Difference Learning Algorithms''. Ph.D. thesis, [[University of Alberta]], advisor [[Richard Sutton]], [http://webdocs.cs.ualberta.ca/~sutton/papers/maei-thesis-2011.pdf pdf]  
 
* [[Hamid Reza Maei]] ('''2011'''). ''Gradient Temporal-Difference Learning Algorithms''. Ph.D. thesis, [[University of Alberta]], advisor [[Richard Sutton]], [http://webdocs.cs.ualberta.ca/~sutton/papers/maei-thesis-2011.pdf pdf]  
 +
* [[Eli David|Omid David]], [[Moshe Koppel]], [[Nathan S. Netanyahu]] ('''2011'''). ''[https://link.springer.com/article/10.1007/s10710-010-9103-4 Expert-Driven Genetic Algorithms for Simulating Evaluation Functions]''. [https://www.springer.com/journal/10710 Genetic Programming and Evolvable Machines], Vol. 12, No. 1, [https://arxiv.org/abs/1711.06841 arXiv:1711.06841]
 
'''2012'''
 
'''2012'''
 
* [[Marco Wiering]], [http://martijnvanotterlo.nl/ Martijn Van Otterlo] ('''2012'''). ''Reinforcement learning: State-of-the-art''. [http://link.springer.com/book/10.1007/978-3-642-27645-3 Adaptation, Learning, and Optimization, Vol. 12], [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]
 
* [[Marco Wiering]], [http://martijnvanotterlo.nl/ Martijn Van Otterlo] ('''2012'''). ''Reinforcement learning: State-of-the-art''. [http://link.springer.com/book/10.1007/978-3-642-27645-3 Adaptation, Learning, and Optimization, Vol. 12], [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]
Line 471: Line 481:
 
* [[Volodymyr Mnih]], [[Koray Kavukcuoglu]], [[David Silver]], [[Alex Graves]], [[Ioannis Antonoglou]], [[Daan Wierstra]], [[Martin Riedmiller]] ('''2013'''). ''Playing Atari with Deep Reinforcement Learning''. [http://arxiv.org/abs/1312.5602 arXiv:1312.5602] <ref>[http://www.nervanasys.com/demystifying-deep-reinforcement-learning/ Demystifying Deep Reinforcement Learning] by [http://www.nervanasys.com/author/tambet/ Tambet Matiisen], [http://www.nervanasys.com/ Nervana], December 21, 2015</ref> <ref>[http://www.google.com/patents/US20150100530 Patent US20150100530 - Methods and apparatus for reinforcement learning - Google Patents]</ref>
 
* [[Volodymyr Mnih]], [[Koray Kavukcuoglu]], [[David Silver]], [[Alex Graves]], [[Ioannis Antonoglou]], [[Daan Wierstra]], [[Martin Riedmiller]] ('''2013'''). ''Playing Atari with Deep Reinforcement Learning''. [http://arxiv.org/abs/1312.5602 arXiv:1312.5602] <ref>[http://www.nervanasys.com/demystifying-deep-reinforcement-learning/ Demystifying Deep Reinforcement Learning] by [http://www.nervanasys.com/author/tambet/ Tambet Matiisen], [http://www.nervanasys.com/ Nervana], December 21, 2015</ref> <ref>[http://www.google.com/patents/US20150100530 Patent US20150100530 - Methods and apparatus for reinforcement learning - Google Patents]</ref>
 
'''2014'''
 
'''2014'''
* [[Eli David|Omid E. David]], [[Jaap van den Herik]], [[Moshe Koppel]], [[Nathan S. Netanyahu]] ('''2014'''). ''Genetic Algorithms for Evolving Computer Chess Programs''. [[IEEE#EC|IEEE Transactions on Evolutionary Computation]], [http://www.genetic-programming.org/hc2014/David-Paper.pdf pdf] <ref>[http://www.liacs.nl/nieuws/jaap-van-den-herik-wint-humies-award-2014/ Jaap van den Herik wint Humies Award 2014 - LIACS - Leiden Institute of Advanced Computer Science]</ref>
+
* [[Eli David|Omid David]], [[Jaap van den Herik]], [[Moshe Koppel]], [[Nathan S. Netanyahu]] ('''2014'''). ''Genetic Algorithms for Evolving Computer Chess Programs''. [[IEEE#EC|IEEE Transactions on Evolutionary Computation]], [https://arxiv.org/abs/1711.08337 arXiv:1711.08337]
 
* [[Wojciech Jaśkowski]], [[Marcin Szubert]], [[Paweł Liskowski]] ('''2014'''). ''Multi-Criteria Comparison of Coevolution and Temporal Difference Learning on Othello''. [http://www.evostar.org/2014/ EvoApplications 2014], [http://www.springer.com/computer/theoretical+computer+science/book/978-3-662-45522-7 Springer, volume 8602] » [[Othello]]
 
* [[Wojciech Jaśkowski]], [[Marcin Szubert]], [[Paweł Liskowski]] ('''2014'''). ''Multi-Criteria Comparison of Coevolution and Temporal Difference Learning on Othello''. [http://www.evostar.org/2014/ EvoApplications 2014], [http://www.springer.com/computer/theoretical+computer+science/book/978-3-662-45522-7 Springer, volume 8602] » [[Othello]]
 
* [[Marcin Szubert]], [[Wojciech Jaśkowski]] ('''2014'''). ''Temporal Difference Learning of N-Tuple Networks for the Game 2048''. [[IEEE#CIG|IEEE Conference on Computational Intelligence and Games]], [http://www.cs.put.poznan.pl/mszubert/pub/szubert2014cig.pdf pdf] <ref>[https://en.wikipedia.org/wiki/2048_%28video_game%29 2048 (video game) from Wikipedia]</ref>
 
* [[Marcin Szubert]], [[Wojciech Jaśkowski]] ('''2014'''). ''Temporal Difference Learning of N-Tuple Networks for the Game 2048''. [[IEEE#CIG|IEEE Conference on Computational Intelligence and Games]], [http://www.cs.put.poznan.pl/mszubert/pub/szubert2014cig.pdf pdf] <ref>[https://en.wikipedia.org/wiki/2048_%28video_game%29 2048 (video game) from Wikipedia]</ref>
Line 481: Line 491:
 
* [[Mathematician#ROrtner|Ronald Ortner]], [[Mathematician#DRyabko|Daniil Ryabko]], [[Peter Auer]], [[Rémi Munos]] ('''2014'''). ''Regret bounds for restless Markov bandits''. [https://en.wikipedia.org/wiki/Theoretical_Computer_Science_%28journal%29 Theoretical Computer Science] 558, [http://daniil.ryabko.net/mabajr.pdf pdf]
 
* [[Mathematician#ROrtner|Ronald Ortner]], [[Mathematician#DRyabko|Daniil Ryabko]], [[Peter Auer]], [[Rémi Munos]] ('''2014'''). ''Regret bounds for restless Markov bandits''. [https://en.wikipedia.org/wiki/Theoretical_Computer_Science_%28journal%29 Theoretical Computer Science] 558, [http://daniil.ryabko.net/mabajr.pdf pdf]
 
==2015 ...==
 
==2015 ...==
* [[Volodymyr Mnih]], [[Koray Kavukcuoglu]], [[David Silver]], [[Andrei A. Rusu]], [[Joel Veness]], [[Marc G. Bellemare]], [[Alex Graves]], [[Martin Riedmiller]], [[Andreas K. Fidjeland]], [[Georg Ostrovski]], [[Stig Petersen]], [[Charles Beattie]], [[Amir Sadik]], [[Ioannis Antonoglou]], [[Helen King]], [[Dharshan Kumaran]], [[Daan Wierstra]], [[Shane Legg]], [[Demis Hassabis]] ('''2015'''). ''[http://www.nature.com/nature/journal/v518/n7540/abs/nature14236.html Human-level control through deep reinforcement learning]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 518
+
* [[Volodymyr Mnih]], [[Koray Kavukcuoglu]], [[David Silver]], [[Mathematician#AARusu|Andrei A. Rusu]], [[Joel Veness]], [[Marc G. Bellemare]], [[Alex Graves]], [[Martin Riedmiller]], [[Andreas K. Fidjeland]], [[Georg Ostrovski]], [[Stig Petersen]], [[Charles Beattie]], [[Amir Sadik]], [[Ioannis Antonoglou]], [[Helen King]], [[Dharshan Kumaran]], [[Daan Wierstra]], [[Shane Legg]], [[Demis Hassabis]] ('''2015'''). ''[http://www.nature.com/nature/journal/v518/n7540/abs/nature14236.html Human-level control through deep reinforcement learning]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 518
 
* [[Tobias Graf]], [[Marco Platzner]] ('''2015'''). ''Adaptive Playouts in Monte Carlo Tree Search with Policy Gradient Reinforcement Learning''. [[Advances in Computer Games 14]]
 
* [[Tobias Graf]], [[Marco Platzner]] ('''2015'''). ''Adaptive Playouts in Monte Carlo Tree Search with Policy Gradient Reinforcement Learning''. [[Advances in Computer Games 14]]
 
* [[Yuichiro Sato]], [[Hiroyuki Iida]], [[Jaap van den Herik]] ('''2015'''). ''Transfer Learning by Inductive Logic Programming''. [[Advances in Computer Games 14]]
 
* [[Yuichiro Sato]], [[Hiroyuki Iida]], [[Jaap van den Herik]] ('''2015'''). ''Transfer Learning by Inductive Logic Programming''. [[Advances in Computer Games 14]]
Line 496: Line 506:
 
* [[Jialin Liu]], [[Olivier Teytaud]], [[Tristan Cazenave]] ('''2016'''). ''Fast seed-learning algorithms for games''. [[CG 2016]]
 
* [[Jialin Liu]], [[Olivier Teytaud]], [[Tristan Cazenave]] ('''2016'''). ''Fast seed-learning algorithms for games''. [[CG 2016]]
 
* [[Eli David|Omid E. David]], [[Nathan S. Netanyahu]], [[Lior Wolf]] ('''2016'''). ''[http://link.springer.com/chapter/10.1007%2F978-3-319-44781-0_11 DeepChess: End-to-End Deep Neural Network for Automatic Learning in Chess]''. [http://icann2016.org/ ICANN 2016], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], Vol. 9887, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer], [http://www.cs.tau.ac.il/~wolf/papers/deepchess.pdf pdf preprint] » [[DeepChess]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61748 DeepChess: Another deep-learning based chess program] by [[Matthew Lai]], [[CCC]], October 17, 2016</ref> <ref>[http://icann2016.org/index.php/conference-programme/recipients-of-the-best-paper-awards/ ICANN 2016 | Recipients of the best paper awards]</ref>
 
* [[Eli David|Omid E. David]], [[Nathan S. Netanyahu]], [[Lior Wolf]] ('''2016'''). ''[http://link.springer.com/chapter/10.1007%2F978-3-319-44781-0_11 DeepChess: End-to-End Deep Neural Network for Automatic Learning in Chess]''. [http://icann2016.org/ ICANN 2016], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], Vol. 9887, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer], [http://www.cs.tau.ac.il/~wolf/papers/deepchess.pdf pdf preprint] » [[DeepChess]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61748 DeepChess: Another deep-learning based chess program] by [[Matthew Lai]], [[CCC]], October 17, 2016</ref> <ref>[http://icann2016.org/index.php/conference-programme/recipients-of-the-best-paper-awards/ ICANN 2016 | Recipients of the best paper awards]</ref>
* [https://www.linkedin.com/in/ian-goodfellow-b7187213 Ian Goodfellow], [https://en.wikipedia.org/wiki/Yoshua_Bengio Yoshua Bengio], [https://www.linkedin.com/in/aaron-courville-53a63459 Aaron Courville] ('''2016'''). ''[http://www.deeplearningbook.org/ Deep Learning]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press]
+
* [[Mathematician#IGoodfellow|Ian Goodfellow]], [[Mathematician#YBengio|Yoshua Bengio]], [[Mathematician#ACourville|Aaron Courville]] ('''2016'''). ''[http://www.deeplearningbook.org/ Deep Learning]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press]
 
* [[Max Jaderberg]], [[Volodymyr Mnih]], [[Wojciech Marian Czarnecki]], [[Tom Schaul]], [[Joel Z. Leibo]], [[David Silver]], [[Koray Kavukcuoglu]] ('''2016'''). ''Reinforcement Learning with Unsupervised Auxiliary Tasks''. [https://arxiv.org/abs/1611.05397v1 arXiv:1611.05397v1]
 
* [[Max Jaderberg]], [[Volodymyr Mnih]], [[Wojciech Marian Czarnecki]], [[Tom Schaul]], [[Joel Z. Leibo]], [[David Silver]], [[Koray Kavukcuoglu]] ('''2016'''). ''Reinforcement Learning with Unsupervised Auxiliary Tasks''. [https://arxiv.org/abs/1611.05397v1 arXiv:1611.05397v1]
 +
* [[Ian H. Witten]], [https://dblp.uni-trier.de/pers/hd/f/Frank:Eibe Eibe Frank], [https://dblp.uni-trier.de/pers/hd/h/Hall:Mark_A= Mark A. Hall], [http://www.professeurs.polymtl.ca/christopher.pal/ Christopher Pal] ('''2016'''). ''[https://www.cs.waikato.ac.nz/~ml/weka/book.html Data Mining: Practical Machine Learning Tools and Techniques]''. 4th Edition, [https://en.wikipedia.org/wiki/Morgan_Kaufmann_Publishers Morgan Kaufmann]
 
'''2017'''
 
'''2017'''
 
* [[Stephen Muggleton]] ('''2017'''). ''Meta-Interpretive Learning: Achievements and Challenges''. Invited Paper, [https://dblp.uni-trier.de/db/conf/ruleml/ruleml2017.html RuleML+RR 2017], [https://www.doc.ic.ac.uk/~shm/Papers/rulemlabs.pdf pdf]
 
* [[Stephen Muggleton]] ('''2017'''). ''Meta-Interpretive Learning: Achievements and Challenges''. Invited Paper, [https://dblp.uni-trier.de/db/conf/ruleml/ruleml2017.html RuleML+RR 2017], [https://www.doc.ic.ac.uk/~shm/Papers/rulemlabs.pdf pdf]
Line 508: Line 519:
 
* [[Matthia Sabatelli]], [[Francesco Bidoia]], [[Valeriu Codreanu]], [[Marco Wiering]] ('''2018'''). ''Learning to Evaluate Chess Positions with Deep Neural Networks and Limited Lookahead''. ICPRAM 2018, [http://www.ai.rug.nl/~mwiering/GROUP/ARTICLES/ICPRAM_CHESS_DNN_2018.pdf pdf]
 
* [[Matthia Sabatelli]], [[Francesco Bidoia]], [[Valeriu Codreanu]], [[Marco Wiering]] ('''2018'''). ''Learning to Evaluate Chess Positions with Deep Neural Networks and Limited Lookahead''. ICPRAM 2018, [http://www.ai.rug.nl/~mwiering/GROUP/ARTICLES/ICPRAM_CHESS_DNN_2018.pdf pdf]
 
* [[Takeshi Ito]] ('''2018'''). ''Game learning support system based on future position''. [[CG 2018]], [[ICGA Journal#40_4|ICGA Journal, Vol. 40, No. 4]]
 
* [[Takeshi Ito]] ('''2018'''). ''Game learning support system based on future position''. [[CG 2018]], [[ICGA Journal#40_4|ICGA Journal, Vol. 40, No. 4]]
 +
'''2019'''
 +
* [[Herilalaina Rakotoarison]], [[Marc Schoenauer]], [[Michèle Sebag]] ('''2019'''). ''Automated Machine Learning with Monte-Carlo Tree Search''. [https://arxiv.org/abs/1906.00170 arXiv:1906.00170]
 +
* [[Frank Hutter]], [https://dblp.org/pers/hd/k/Kotthoff:Lars Lars Kotthoff], [https://dblp.org/pers/hd/v/Vanschoren:Joaquin Joaquin Vanschoren] (eds.) ('''2019'''). ''[https://link.springer.com/book/10.1007%2F978-3-030-05318-5 Automated Machine Learning]''. [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]
 +
* [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Thomas Hubert]], [[Karen Simonyan]], [[Laurent Sifre]], [[Simon Schmitt]], [[Arthur Guez]], [[Edward Lockhart]], [[Demis Hassabis]], [[Thore Graepel]], [[Timothy Lillicrap]], [[David Silver]] ('''2019'''). ''Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model''. [https://arxiv.org/abs/1911.08265 arXiv:1911.08265] <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72381 New DeepMind paper] by GregNeto, [[CCC]], November 21, 2019</ref>
  
 
=Forum Posts=
 
=Forum Posts=
Line 534: Line 549:
 
* [http://www.talkchess.com/forum/viewtopic.php?t=56313 Position learning and opening books] by Forrest Hoch, [[CCC]], May 11, 2015
 
* [http://www.talkchess.com/forum/viewtopic.php?t=56313 Position learning and opening books] by Forrest Hoch, [[CCC]], May 11, 2015
 
* [http://www.talkchess.com/forum/viewtopic.php?t=61861 A database for learning evaluation functions] by [[Álvaro Begué]], [[CCC]], October 28, 2016 » [[Automated Tuning]], [[Evaluation]], [[Texel's Tuning Method]]
 
* [http://www.talkchess.com/forum/viewtopic.php?t=61861 A database for learning evaluation functions] by [[Álvaro Begué]], [[CCC]], October 28, 2016 » [[Automated Tuning]], [[Evaluation]], [[Texel's Tuning Method]]
 +
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=72020 A book on machine learning] by Mehdi Amini, [[CCC]], October 06, 2019
  
 
=External Links=  
 
=External Links=  
Line 548: Line 564:
 
* [https://en.wikipedia.org/wiki/List_of_machine_learning_concepts List of machine learning concepts from Wikipedia]
 
* [https://en.wikipedia.org/wiki/List_of_machine_learning_concepts List of machine learning concepts from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Apprenticeship_learning Apprenticeship learning from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Apprenticeship_learning Apprenticeship learning from Wikipedia]
 +
* [https://en.wikipedia.org/wiki/Automated_machine_learning Automated machine learning from Wikipedia]
 +
* [https://en.wikipedia.org/wiki/Data_mining Data mining from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Ensemble_learning Ensemble learning from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Ensemble_learning Ensemble learning from Wikipedia]
 +
** [https://en.wikipedia.org/wiki/Bootstrap_aggregating Bootstrap aggregating from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Explanation-based_learning Explanation-based learning from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Explanation-based_learning Explanation-based learning from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Meta_learning_%28computer_science%29 Meta Learning from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Meta_learning_%28computer_science%29 Meta Learning from Wikipedia]
Line 567: Line 586:
 
* [http://www.aihorizon.com/essays/generalai/no_free_lunch_machine_learning.htm AI Horizon: Machine Learning, Part III: Testing Algorithms, and The "No Free Lunch Theorem"]
 
* [http://www.aihorizon.com/essays/generalai/no_free_lunch_machine_learning.htm AI Horizon: Machine Learning, Part III: Testing Algorithms, and The "No Free Lunch Theorem"]
 
==Chess==
 
==Chess==
* [http://www.top-5000.nl/authors/rebel/hints.htm Learning Methods] by [[Ed Schroder|Ed Schröder]]
 
 
* [http://archive.ics.uci.edu/ml/datasets/Chess+%28King-Rook+vs.+King-Pawn%29 UCI Machine Learning Repository: Chess (King-Rook vs. King-Pawn) Data Set] by [[Alen Shapiro]]
 
* [http://archive.ics.uci.edu/ml/datasets/Chess+%28King-Rook+vs.+King-Pawn%29 UCI Machine Learning Repository: Chess (King-Rook vs. King-Pawn) Data Set] by [[Alen Shapiro]]
 +
* [https://en.chessbase.com/post/standing-on-the-shoulders-of-giants Standing on the shoulders of giants] by [[Albert Silver]], [[ChessBase|ChessBase News]], September 18, 2019
 
==Supervised Learning==
 
==Supervised Learning==
 
* [https://en.wikipedia.org/wiki/Supervised_learning Supervised learning from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Supervised_learning Supervised learning from Wikipedia]
 
* [http://www.scholarpedia.org/article/Category:Supervised_learning Category: Supervised learning - Scholarpedia]
 
* [http://www.scholarpedia.org/article/Category:Supervised_learning Category: Supervised learning - Scholarpedia]
 
* [https://en.wikipedia.org/wiki/Boosting_%28machine_learning%29 Boosting (machine learning) from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Boosting_%28machine_learning%29 Boosting (machine learning) from Wikipedia]
: [https://en.wikipedia.org/wiki/AdaBoost AdaBoost from Wikipedia]
+
** [https://en.wikipedia.org/wiki/AdaBoost AdaBoost from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Computational_learning_theory Computational learning theory from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Computational_learning_theory Computational learning theory from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Support_vector_machine Support vector machine from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Support_vector_machine Support vector machine from Wikipedia]
Line 596: Line 615:
 
* [https://en.wikipedia.org/wiki/Statistical_learning_theory Statistical learning theory from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Statistical_learning_theory Statistical learning theory from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Statistical_classification Statistical classification from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Statistical_classification Statistical classification from Wikipedia]
: [https://en.wikipedia.org/wiki/Naive_Bayes_classifier Naive Bayes classifier from Wikipedia]
+
** [https://en.wikipedia.org/wiki/Naive_Bayes_classifier Naive Bayes classifier from Wikipedia]
: [https://en.wikipedia.org/wiki/Probabilistic_classification Probabilistic classification from Wikipedia]
+
** [https://en.wikipedia.org/wiki/Probabilistic_classification Probabilistic classification from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Statistical_mechanics Statistical mechanics from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Statistical_mechanics Statistical mechanics from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Bayesian_network Bayesian network from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Bayesian_network Bayesian network from Wikipedia]
Line 610: Line 629:
 
* [https://en.wikipedia.org/wiki/Mean_squared_error Mean squared error from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Mean_squared_error Mean squared error from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Regression_analysis Regression analysis from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Regression_analysis Regression analysis from Wikipedia]
: [https://en.wikipedia.org/wiki/Outline_of_regression_analysis Outline of regression analysis from Wikipedia]
+
** [https://en.wikipedia.org/wiki/Outline_of_regression_analysis Outline of regression analysis from Wikipedia]
: [https://en.wikipedia.org/wiki/Linear_regression Linear regression from Wikipedia]
+
** [https://en.wikipedia.org/wiki/Linear_regression Linear regression from Wikipedia]
: [https://en.wikipedia.org/wiki/Logistic_regression Logistic regression from Wikipedia]
+
** [https://en.wikipedia.org/wiki/Logistic_regression Logistic regression from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Probability Probability from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Probability Probability from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Probability_theory Probability theory from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Probability_theory Probability theory from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Probability_density_function Probability density function from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Probability_density_function Probability density function from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Probability_distribution Probability distribution from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Probability_distribution Probability distribution from Wikipedia]
: [https://en.wikipedia.org/wiki/Normal_distribution Normal distribution from Wikipedia]
+
** [https://en.wikipedia.org/wiki/Normal_distribution Normal distribution from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Probability_measure Probability measure from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Probability_measure Probability measure from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Probability_space Probability space from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Probability_space Probability space from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Pseudorandomness Pseudorandomness from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Pseudorandomness Pseudorandomness from Wikipedia]
: [https://en.wikipedia.org/wiki/Pseudorandom_number_generator Pseudorandom number generator from Wikipedia]
+
** [https://en.wikipedia.org/wiki/Pseudorandom_number_generator Pseudorandom number generator from Wikipedia]
: [https://en.wikipedia.org/wiki/Pseudo-random_number_sampling Pseudo-random number sampling from Wikipedia]
+
** [https://en.wikipedia.org/wiki/Pseudo-random_number_sampling Pseudo-random number sampling from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Randomness Randomness from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Randomness Randomness from Wikipedia]
: [https://en.wikipedia.org/wiki/Statistical_randomness Statistical randomness from Wikipedia]
+
** [https://en.wikipedia.org/wiki/Statistical_randomness Statistical randomness from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Vapnik%E2%80%93Chervonenkis_theory Vapnik–Chervonenkis theory from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Vapnik%E2%80%93Chervonenkis_theory Vapnik–Chervonenkis theory from Wikipedia]
 
* [https://en.wikipedia.org/wiki/VC_dimension VC dimension from Wikipedia]
 
* [https://en.wikipedia.org/wiki/VC_dimension VC dimension from Wikipedia]
Line 674: Line 693:
 
* [http://www.scholarpedia.org/article/Hopfield_network Hopfield network - Scholarpedia]
 
* [http://www.scholarpedia.org/article/Hopfield_network Hopfield network - Scholarpedia]
 
* [https://en.wikipedia.org/wiki/Long_short_term_memory Long short term memory from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Long_short_term_memory Long short term memory from Wikipedia]
'''Blogs'''
 
* [https://theneural.wordpress.com/ Neural Networks Blog] by [[Ilya Sutskever]]
 
* [http://dynamicnotions.blogspot.com/ Dynamic Notions] by [http://www.blogger.com/profile/07894297206547597169 John Wakefield] , a Blog about the evolution of neural networks with [[C sharp|C#]] samples:
 
: [http://dynamicnotions.blogspot.com/2008/09/single-layer-perceptron.html The Single Layer Perceptron]
 
: [http://dynamicnotions.blogspot.com/2008/09/hidden-neurons-and-feature-space.html Hidden Neurons and Feature Space]
 
: [http://dynamicnotions.blogspot.com/2008/09/training-neural-networks-using-back.html Training Neural Networks Using Back Propagation in C#]
 
: [http://dynamicnotions.blogspot.com/2008/09/data-mining-with-artificial-neural.html Data Mining with Artificial Neural Networks (ANN)]
 
* [http://www.welchlabs.com/blog Blog - Welch Labs]
 
 
==Courses==
 
==Courses==
 
* [http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html Advanced Topics: RL] by [[David Silver]]
 
* [http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html Advanced Topics: RL] by [[David Silver]]
Line 693: Line 704:
 
=References=  
 
=References=  
 
<references />
 
<references />
 
 
'''[[Main Page|Up one Level]]'''
 
'''[[Main Page|Up one Level]]'''
 +
[[Category:Videos]]

Revision as of 18:32, 7 October 2020

Learning [1]

Learning,
the process of acquiring new knowledge, which involves synthesizing different types of information. Machine learning as an aspect of computer chess programming deals with algorithms that allow a program to change its behavior based on data, for instance data gathered while playing against a variety of opponents, taking into account the final outcome and/or the game record, for example as a history score chart indexed by ply. Related to machine learning is evolutionary computation with its sub-areas of genetic algorithms and genetic programming, which mimic the process of natural evolution, as further mentioned in automated tuning. The process of learning often implies understanding, perception or reasoning. So-called rote learning avoids understanding and focuses on memorization. Inductive learning takes examples and generalizes rather than starting with existing knowledge. Deductive learning takes abstract concepts to make sense of examples [2].

Learning inside a Chess Program

Learning inside a chess program may address several disjoint issues. A persistent hash table remembers "important" positions from earlier games inside the search, together with their exact scores [3], so that worse positions may be avoided in advance. Opening book learning appends successful novelties to the book or modifies the probability of already stored book moves based on the outcome of a game [4]. Another application is learning evaluation weights of various features, for instance piece [5] or piece-square [6] values, or mobility. Programs may also learn to control search [7] or time usage [8].
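
As an illustration of book learning, the following minimal Python sketch nudges the weight of each book move played in a game towards the result of that game. It is not the method of any cited paper; the class `LearningBook`, its methods, the neutral starting weight of 0.5 and the learning rate are all assumptions made for this example.

```python
from collections import defaultdict


class LearningBook:
    """Toy opening book whose move weights adapt to game results (hypothetical)."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        # weights[position_key][move] -> preference in [0, 1]; 0.5 is neutral
        self.weights = defaultdict(dict)

    def add_move(self, position_key, move, weight=0.5):
        """Store a book move, for instance a successful novelty."""
        self.weights[position_key][move] = weight

    def update(self, played, result):
        """Adjust the weights of the book moves played in one game.

        played: list of (position_key, move) pairs chosen from the book
        result: 1.0 win, 0.5 draw, 0.0 loss, from the program's point of view
        """
        for position_key, move in played:
            old = self.weights[position_key].get(move, 0.5)
            # move the weight a small step towards the observed result
            self.weights[position_key][move] = old + self.learning_rate * (result - old)

    def probabilities(self, position_key):
        """Return move selection probabilities proportional to the weights."""
        moves = self.weights[position_key]
        total = sum(moves.values())
        if total == 0:
            return {move: 1.0 / len(moves) for move in moves}
        return {move: weight / total for move, weight in moves.items()}


book = LearningBook(learning_rate=0.5)
book.add_move("startpos", "e2e4")
book.add_move("startpos", "d2d4")
book.update([("startpos", "e2e4")], result=1.0)  # the program won with e2e4
print(book.probabilities("startpos"))
```

After the won game, `e2e4` is chosen more often than `d2d4`. A real book would also store statistics such as the number of games per move.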

Learning Paradigms

There are three major learning paradigms, each corresponding to a particular abstract learning task. These are supervised learning, unsupervised learning and reinforcement learning. Usually any given type of neural network architecture can be employed in any of those tasks.

Supervised Learning

see main page Supervised Learning

Supervised learning is learning from examples provided by a knowledgeable external supervisor. In machine learning, supervised learning is a technique for inferring a function from training data. The training data consist of pairs of input objects and desired outputs, for instance in computer chess a sequence of positions associated with the outcome of a game [9].
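
The idea of pairing positions with game outcomes can be sketched with a small logistic regression in plain Python. This is only one possible supervised technique, chosen here for brevity. The single feature (material balance in pawns), the sample data and the function names `train` and `predict` are assumptions for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(weights, features):
    """Expected game result in (0, 1) for one position."""
    return sigmoid(sum(w * f for w, f in zip(weights, features)))


def train(samples, epochs=200, learning_rate=0.5):
    """Fit weights by stochastic gradient descent on the log loss.

    samples: list of (features, outcome) pairs with outcome in [0, 1]
    """
    weights = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        for features, outcome in samples:
            # for the log loss, the gradient w.r.t. the weighted sum is (prediction - outcome)
            error = predict(weights, features) - outcome
            for i, f in enumerate(features):
                weights[i] -= learning_rate * error * f
    return weights


# Hypothetical training data: (material balance in pawns,) -> game result
samples = [
    ([1.0], 1.0),
    ([2.0], 1.0),
    ([0.0], 0.5),
    ([-1.0], 0.0),
    ([-2.0], 0.0),
]
weights = train(samples)
print(weights, predict(weights, [3.0]))
```

The learned weight is positive, so the model predicts a better result for the side with more material.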

Unsupervised Learning

Unsupervised machine learning seems much harder: the goal is to have the computer learn how to do something that we don't tell it how to do. The learner is given only unlabeled examples, for instance a sequence of positions from a running game whose final result is still unknown. A form of reinforcement learning can be used for unsupervised learning, where an agent bases its actions on previous rewards and punishments without necessarily learning anything about the exact ways in which its actions affect the world. Clustering is another method of unsupervised learning.
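
Clustering can be illustrated with a one-dimensional k-means sketch in Python. The input values stand for some unlabeled per-position feature; the data, the initial centers and the function name `k_means_1d` are assumptions for this example.

```python
def k_means_1d(values, centers, iterations=10):
    """Group unlabeled numbers around the given initial cluster centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # assignment step: attach every value to its nearest center
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # update step: move each center to the mean of its cluster
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical unlabeled feature values, no game results attached
values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, clusters = k_means_1d(values, centers=[0.0, 10.0])
print(centers, clusters)
```

No labels are used: the two groups emerge only from the distances between the values.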

Reinforcement Learning

see main page Reinforcement Learning

Reinforcement learning is defined not by characterizing learning methods, but by characterizing a learning problem. Reinforcement learning is learning what to do - how to map situations to actions - so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them. The reinforcement learning problem is deeply indebted to the idea of Markov decision processes (MDPs) from the field of optimal control.
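
To make "discovering which actions yield the most reward by trying them" concrete, here is a minimal tabular Q-learning sketch on a tiny Markov decision process. Q-learning is not named in the paragraph above and is used here only as one well-known example. The environment (a chain of four states with a reward of 1 for reaching the last one), the parameter values and the function names are assumptions.

```python
import random


def q_learning(n_states=4, episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values on a chain 0 .. n_states-1 whose last state is the goal."""
    rng = random.Random(seed)
    steps = (-1, +1)  # action 0 moves left, action 1 moves right
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        while state != goal:
            # explore uniformly at random; Q-learning is off-policy, so it
            # still estimates the values of the greedy policy
            action = rng.randrange(2)
            next_state = min(max(state + steps[action], 0), goal)
            reward = 1.0 if next_state == goal else 0.0
            best_next = 0.0 if next_state == goal else max(q[next_state])
            # move the estimate towards reward plus discounted future value
            q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best known action for every non-terminal state."""
    return ["left" if left > right else "right" for left, right in q[:-1]]


q = q_learning()
print(greedy_policy(q))
```

The learner is never told that moving right is correct; the preference emerges from the reward signal alone.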

Learning Topics

Programs

See also

Selected Publications

[10]

1940 ...

1950 ...

Claude Shannon, John McCarthy (eds.) (1956). Automata Studies. Annals of Mathematics Studies, No. 34
Alan Turing, Jack Copeland (editor) (2004). The Essential Turing, Seminal Writings in Computing, Logic, Philosophy, Artificial Intelligence, and Artificial Life plus The Secrets of Enigma. Oxford University Press, amazon, google books

1955 ...

Claude Shannon, John McCarthy (eds.) (1956). Automata Studies. Annals of Mathematics Studies, No. 34, pdf

1960 ...

1965 ...

1970 ...

1975 ...

  • Jacques Pitrat (1976). A Program to Learn to Play Chess. Pattern Recognition and Artificial Intelligence, pp. 399-419. Academic Press Ltd. London, UK. ISBN 0-12-170950-7.
  • Jacques Pitrat (1976). Realization of a Program Learning to Find Combinations at Chess. Computer Oriented Learning Processes (ed. J. Simon). Noordhoff, Groningen, The Netherlands.
  • Pericles Negri (1977). Inductive Learning in a Hierarchical Model for Representing Knowledge in Chess End Games. pdf
  • Ryszard Michalski, Pericles Negri (1977). An experiment on inductive learning in chess endgames. Machine Intelligence 8, pdf
  • Boris Stilman (1977). The Computer Learns. in 1976 US Computer Chess Championship, by David Levy, Computer Science Press, Woodland Hills, CA, pp. 83-90
  • Richard Sutton (1978). Single channel theory: A neuronal theory of learning. Brain Theory Newsletter 3, No. 3/4, pp. 72-75.
  • Ross Quinlan (1979). Discovering Rules by Induction from Large Collections of Examples. Expert Systems in the Micro-electronic Age, pp. 168-201. Edinburgh University Press (Introducing ID3)

1980 ...

1985 ...

1986

1987

1988

1989

1990 ...

1991

1992

1993

1994

1995 ...

1996

1997

1998

Miroslav Kubat, Ivan Bratko, Ryszard Michalski (1998). A Review of Machine Learning Methods. pdf

1999

2000 ...

2001

2002

2003

2004

2005 ...

2006

2007

2008

2009

2010 ...

2011

2012

István Szita (2012). Reinforcement Learning in Games. Chapter 17

2013

2014

2015 ...

2016

2017

2018

2019

Forum Posts

1998 ...

2000 ...

2005 ...

2010 ...

2015 ...

External Links

Machine Learning

AI

Learning I
Learning II

Chess

Supervised Learning

Unsupervised Learning

Reinforcement Learning

TD Learning

Statistics

Markov Models

NNs

ANNs

Topics

RNNs

Courses

References

  1. A depiction of the world's oldest continually operating university, the University of Bologna, Italy, by Laurentius de Voltolina, second half of 14th century, Learning from Wikipedia
  2. Inductive learning vs Deductive learning
  3. David Slate (1987). A Chess Program that uses its Transposition Table to Learn from Experience. ICCA Journal, Vol. 10, No. 2
  4. Robert Hyatt (1999). Book Learning - a Methodology to Tune an Opening Book Automatically. ICCA Journal, Vol. 22, No. 1
  5. Don Beal, Martin C. Smith (1997). Learning Piece Values Using Temporal Differences. ICCA Journal, Vol. 20, No. 3
  6. Don Beal, Martin C. Smith (1999). Learning Piece-Square Values using Temporal Differences. ICCA Journal, Vol. 22, No. 4
  7. Yngvi Björnsson, Tony Marsland (2001). Learning Search Control in Adversary Games. Advances in Computer Games 9, pdf
  8. Levente Kocsis, Jos Uiterwijk, Jaap van den Herik (2000). Learning Time Allocation using Neural Networks. CG 2000, postscript
  9. AI Horizon: Machine Learning, Part II: Supervised and Unsupervised Learning
  10. online papers from Machine Learning in Games by Jay Scott
  11. Rosenblatt's Contributions
  12. Ratio Club from Wikipedia
  13. Royal Radar Establishment from Wikipedia
  14. see Swap-off by Helmut Richter
  15. The abandonment of connectionism in 1969 - Wikipedia
  16. Frank Rosenblatt (1962). Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. Spartan Books
  17. Long short term memory from Wikipedia
  18. Tsumego from Wikipedia
  19. Learnable Evolution Model from Wikipedia
  20. University of Bristol - Department of Computer Science - Technical Reports
  21. Generalized Hebbian Algorithm from Wikipedia
  22. Dap Hartmann (2010). Mimicking the Black Box - Genetically evolving evaluation functions and search algorithms. Review on Omid David's Ph.D. Thesis, ICGA Journal, Vol 33, No. 1
  23. Monte-Carlo Simulation Balancing - videolectures.net by David Silver
  24. MATLAB from Wikipedia
  25. Weka (machine learning) from Wikipedia
  26. Ms. Pac-Man from Wikipedia
  27. Demystifying Deep Reinforcement Learning by Tambet Matiisen, Nervana, December 21, 2015
  28. Patent US20150100530 - Methods and apparatus for reinforcement learning - Google Patents
  29. 2048 (video game) from Wikipedia
  30. Teaching Deep Convolutional Neural Networks to Play Go by Hiroshi Yamashita, The Computer-go Archives, December 14, 2014
  31. Teaching Deep Convolutional Neural Networks to Play Go by Michel Van den Bergh, CCC, December 16, 2014
  32. Convolutional neural network from Wikipedia
  33. Best Paper Awards | TAAI 2014
  34. DeepChess: Another deep-learning based chess program by Matthew Lai, CCC, October 17, 2016
  35. ICANN 2016 | Recipients of the best paper awards
  36. Using GAN to play chess by Evgeniy Zheltonozhskiy, CCC, February 23, 2017
  37. New DeepMind paper by GregNeto, CCC, November 21, 2019
  38. Naive Bayes classifier from Wikipedia
  39. Amir Ban (2012). Automatic Learning of Evaluation, with Applications to Computer Chess. Discussion Paper 613, The Hebrew University of Jerusalem - Center for the Study of Rationality, Givat Ram
  40. Christopher Clark, Amos Storkey (2014). Teaching Deep Convolutional Neural Networks to Play Go. arXiv:1412.3409
  41. Chris J. Maddison, Aja Huang, Ilya Sutskever, David Silver (2014). Move Evaluation in Go Using Deep Convolutional Neural Networks. arXiv:1412.6564v1

Up one Level