Revision as of 15:13, 25 August 2018


Learning ^[1]

Learning,
the process of acquiring new knowledge, which involves synthesizing different types of information. Machine learning as an aspect of computer chess programming deals with algorithms that allow the program to change its behavior based on data. Such data arises, for instance, while playing games against a variety of opponents, taking into account the final outcome and/or the game record, for example as a history score chart indexed by ply. Related to machine learning is evolutionary computation with its sub-areas of genetic algorithms and genetic programming, which mimic the process of natural evolution, as further mentioned in automated tuning. The process of learning often implies understanding, perception or reasoning. So-called rote learning avoids understanding and focuses on memorization. Inductive learning takes examples and generalizes rather than starting with existing knowledge. Deductive learning takes abstract concepts to make sense of examples ^[2].

Learning inside a Chess Program

Learning inside a chess program may address several disjoint issues. A persistent hash table remembers "important" positions from earlier games, along with their exact scores, for use inside the search ^[3], so that positions known to be worse may be avoided in advance. Opening book learning appends successful novelties or modifies the probability of moves already stored in the book, based on the outcome of a game ^[4]. Another application is learning evaluation weights of various features, for instance piece ^[5] or piece-square ^[6] values, or mobility. Programs may also learn to control search ^[7] or time usage ^[8].
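The opening book case can be illustrated with a minimal sketch. The class name `LearningBook`, the string position keys, and the multiplicative update rule are all hypothetical assumptions chosen for illustration; they are not taken from any particular engine or from the cited papers.

```python
# Hypothetical sketch of opening-book learning. The names and the update
# rule are illustrative assumptions, not a description of a real engine.

class LearningBook:
    def __init__(self):
        # position key -> {move: weight}; a higher weight means the move
        # is selected with a higher probability
        self.weights = {}

    def add_move(self, key, move, weight=1.0):
        self.weights.setdefault(key, {})[move] = weight

    def probabilities(self, key):
        """Selection probability of each stored move in the given position."""
        moves = self.weights.get(key, {})
        total = sum(moves.values())
        if total == 0:
            return {}
        return {move: w / total for move, w in moves.items()}

    def learn(self, played, result, rate=0.5, floor=0.05):
        """Adjust weights after one game.

        played: list of (key, move) pairs the book side played
        result: 1.0 win, 0.5 draw, 0.0 loss, from the book side's view
        """
        factor = 1.0 + rate * (2.0 * result - 1.0)  # >1 on win, <1 on loss
        for key, move in played:
            moves = self.weights.setdefault(key, {})
            old = moves.get(move, 1.0)  # an unseen move is appended as a novelty
            moves[move] = max(floor, old * factor)


book = LearningBook()
book.add_move("startpos", "e2e4")
book.add_move("startpos", "d2d4")
book.learn([("startpos", "e2e4")], result=1.0)
print(book.probabilities("startpos"))
```

The floor keeps a repeatedly losing move from vanishing entirely, which is one possible design choice; an engine might equally well remove such moves from the book.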

Learning Paradigms

There are three major learning paradigms, each corresponding to a particular abstract learning task. These are supervised learning, unsupervised learning and reinforcement learning. Usually any given type of neural network architecture can be employed in any of those tasks.

Supervised Learning

Supervised learning is learning from examples provided by a knowledgeable external supervisor. In machine learning, supervised learning is a technique for inferring a function from training data. The training data consist of pairs of input objects and desired outputs, for instance in computer chess a sequence of positions associated with the outcome of a game ^[9].
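As a minimal sketch of this setting, assume each position is reduced to a small feature vector (here a hypothetical pawn difference and knight difference) and labeled with the game outcome. Logistic regression fitted by plain gradient descent then infers feature weights from the labeled pairs. The feature choice, the toy data, and the function names are assumptions made for illustration only.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fit_weights(samples, epochs=2000, lr=0.1):
    """Fit one weight per feature by gradient descent on the log loss.

    samples: list of (features, outcome) pairs, where outcome is the game
    result from White's point of view (1.0 win, 0.0 loss).
    """
    n = len(samples[0][0])
    w = [0.0] * n
    for _ in range(epochs):
        grad = [0.0] * n
        for x, y in samples:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            for i in range(n):
                grad[i] += (p - y) * x[i]
        w = [wi - lr * g / len(samples) for wi, g in zip(w, grad)]
    return w

def predict(w, x):
    """Predicted winning probability for a feature vector."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

# Toy training set (invented): (pawn difference, knight difference) -> outcome
samples = [
    ([1, 0], 1.0), ([-1, 0], 0.0),
    ([0, 1], 1.0), ([0, -1], 0.0),
    ([1, -1], 0.0), ([-1, 1], 1.0),
]
weights = fit_weights(samples)
print(weights, predict(weights, [1, 1]))
```

On this toy data the knight weight ends up larger than the pawn weight, because the examples in which a pawn is traded against a knight are labeled in favor of the side with the knight.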

Unsupervised Learning

Unsupervised machine learning seems much harder: the goal is to have the computer learn how to do something that we don't tell it how to do. The learner is given only unlabeled examples, for instance a sequence of positions from a game in progress whose final result is still unknown. A form of reinforcement learning can be used for unsupervised learning, where an agent bases its actions on the previous rewards and punishments without necessarily even learning any information about the exact ways that its actions affect the world. Clustering is another method of unsupervised learning.
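Clustering can be sketched with a small k-means routine that groups unlabeled feature vectors without being told what the groups mean. The two-dimensional toy points and the deterministic initialization from the first k points are assumptions made to keep the example short and reproducible; practical implementations usually initialize more carefully.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda c: squared_distance(point, centroids[c]))

def kmeans(points, k, iterations=20):
    """Plain k-means; returns (centroids, labels)."""
    # Deterministic initialization (an assumption for this sketch):
    # the first k points serve as the initial centroids.
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels

# Unlabeled toy feature vectors, e.g. two hypothetical positional features
points = [(0, 0), (10, 10), (1, 0), (0, 1), (10, 11), (11, 10)]
centroids, labels = kmeans(points, k=2)
print(labels)
```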

Reinforcement Learning

see main page Reinforcement Learning

Reinforcement learning is defined not by characterizing learning methods, but by characterizing a learning problem. Reinforcement learning is learning what to do - how to map situations to actions - so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them. The reinforcement learning problem is deeply indebted to the idea of Markov decision processes (MDPs) from the field of optimal control.
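The mapping from situations to actions can be illustrated with tabular Q-learning on a tiny, made-up Markov decision process: a chain of five states in which only reaching the rightmost state yields a reward. The environment, the parameter values, and the purely random exploration policy are assumptions chosen for this sketch. Q-learning is only one of many reinforcement learning methods and is used here because it fits in a few lines.

```python
import random

ACTIONS = (-1, +1)  # move left or right along the chain

def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
               seed=0, max_steps=100):
    """Tabular Q-learning on a chain MDP; returns the table q[state][action]."""
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            a = rng.randrange(2)  # explore uniformly at random (off-policy)
            s2 = min(max(s + ACTIONS[a], 0), goal)
            reward = 1.0 if s2 == goal else 0.0
            # The goal state is terminal, so nothing is bootstrapped from it.
            target = reward if s2 == goal else reward + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
            if s == goal:
                break
    return q

def greedy_policy(q):
    """Best action per non-terminal state according to the learned table."""
    return [ACTIONS[row.index(max(row))] for row in q[:-1]]

q = q_learning()
print(greedy_policy(q))
```

The learner is never told that moving right is correct. It only observes the reward on reaching the goal, and the discount factor `gamma` makes states closer to the goal more valuable.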

Learning Topics

Programs

Selected Publications

^[10]

1940 ...

Walter Pitts (1942). Some observations on the simple neuron circuit. Bulletin of Mathematical Biology, Vol. 4, No. 3
Warren S. McCulloch, Walter Pitts (1943). A Logical Calculus of the Ideas Immanent in Nervous Activity. Bulletin of Mathematical Biology, Vol. 5, No. 1
Donald O. Hebb (1949). The Organization of Behavior. Wiley & Sons

1950 ...

Stephen C. Kleene (1951). Representation of Events in Nerve Nets and Finite Automata. RM-704, RAND paper, pdf, reprinted in

Claude Shannon, John McCarthy (eds.) (1956). Automata Studies. Annals of Mathematics Studies, No. 34

Paul I. Richards (1951). Machines which can learn. American Scientist, 39:711-716
Paul I. Richards (1952). On Game Learning Machines. The Scientific Monthly, Vol. 74, No. 4, April 1952
Alan Turing (1953). Chess. part of the collection Digital Computers Applied to Games in Bertram Vivian Bowden (editor), Faster Than Thought, a symposium on digital computing machines, reprinted 1988 in Computer Chess Compendium, reprinted in

Alan Turing, Jack Copeland (editor) (2004). The Essential Turing, Seminal Writings in Computing, Logic, Philosophy, Artificial Intelligence, and Artificial Life plus The Secrets of Enigma. Oxford University Press, amazon, google books

Marvin Minsky (1954). Neural Nets and the Brain Model Problem. Ph.D. dissertation, Princeton University

1955 ...

John von Neumann (1956). Probabilistic Logic and the Synthesis of Reliable Organisms From Unreliable Components. in

Claude Shannon, John McCarthy (eds.) (1956). Automata Studies. Annals of Mathematics Studies, No. 34, pdf

Frank Rosenblatt (1957). The Perceptron - a Perceiving and Recognizing Automaton. Report 85-460-1, Cornell Aeronautical Laboratory ^[11]
Albert M. Uttley (1959). Imitation of Pattern Recognition and Trial-and-error Learning in a Conditional Probability Computer. Reviews of Modern Physics, Vol. 31, April 1959, pp. 546-548 ^[12] ^[13]
Arthur Samuel (1959). Some Studies in Machine Learning Using the Game of Checkers. IBM Journal July 1959 » Checkers
Edward Feigenbaum (1959). An Information Processing Theory of Verbal Learning. RAND Paper

1960 ...

Edward Feigenbaum (1960). Information Theories of Human Verbal Learning. Ph.D. thesis, Carnegie Mellon University, advisor Herbert Simon
Edward Feigenbaum (1961). The Simulation of Verbal Learning Behavior. Proceedings Western Joint Conference, Vol. 19
Edward Feigenbaum, Herbert Simon (1961). Performance of a Reading Task by an Elementary Perceiving and Memorizing Program. RAND Paper, pdf
Donald Michie (1961). Trial and Error. Penguin Science Survey, pdf
Edward Feigenbaum, Herbert Simon (1962). A Theory of the Serial Position Effect. British Journal of Psychology, Vol. 53, 307-32, pdf
Earl B. Hunt (1962). Concept Learning: An Information Processing Problem. Wiley. google books
Frank Rosenblatt (1962). Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. Spartan Books
Allen Newell (1963). Learning, Generality and Problem Solving. Memorandum RM-3285-1-PR pdf
Herbert Simon, Edward Feigenbaum (1964). An Information-processing Theory of Some Effects of Similarity, Familiarization, and Meaningfulness in Verbal Learning. Journal of Verbal Learning and Verbal Behavior, Vol. 3, No. 5, pdf

1965 ...

James R. Slagle (1965). A multipurpose Theorem Proving Heuristic Program that learns. IFIP Congress 65, Vol. 2
Donald Michie (1966). Game Playing and Game Learning Automata. Advances in Programming and Non-Numerical Computation, Leslie Fox (ed.), pp. 183-200. Oxford, Pergamon. » Includes Appendix: Rules of SOMAC by John Maynard Smith, introduces Expectiminimax tree ^[14]
Thomas A. Throop (1966). Thoughts on the Development of Computer Learning Programs. Defense Technical Information Center
Arnold K. Griffith (1966). A new Machine-Learning Technique applied to the Game of Checkers. MIT, Project MAC, MAC-M-293
Arthur Samuel (1967). Some Studies in Machine Learning Using the Game of Checkers. II-Recent Progress. pdf
Marvin Minsky, Seymour Papert (1969). Perceptrons. ^[15] ^[16]

1970 ...

Albert Zobrist (1970). A Pattern Recognition Program which uses a Geometry-Preserving Representation of Features. Technical Report #85, pdf
Vladimir Vapnik, Alexey Chervonenkis (1971). On the Uniform Convergence of Relative Frequencies of Events to Their Probabilities. Theory of Probability and its Applications, Vol. 16, No. 2
A. Harry Klopf (1972). Brain Function and Adaptive Systems - A Heterostatic Theory. Air Force Cambridge Research Laboratories, Special Reports, No. 133, pdf
Marvin Minsky, Seymour Papert (1972). Perceptrons: An Introduction to Computational Geometry. The MIT Press, 2nd edition with corrections
Herbert Simon, Kevin J. Gilmartin (1973). A Simulation of Memory for Chess Positions. Cognitive Psychology, Vol. 5, pp. 29-46. pdf
Arnold K. Griffith (1974). A Comparison and Evaluation of Three Machine Learning Procedures as Applied to the Game of Checkers. Artificial Intelligence, Vol. 5, No. 2 » Checkers

1975 ...

Jacques Pitrat (1976). A Program to Learn to Play Chess. Pattern Recognition and Artificial Intelligence, pp. 399-419. Academic Press Ltd. London, UK. ISBN 0-12-170950-7.
Jacques Pitrat (1976). Realization of a Program Learning to Find Combinations at Chess. Computer Oriented Learning Processes (ed. J. Simon). Noordhoff, Groningen, The Netherlands.
Pericles Negri (1977). Inductive Learning in a Hierarchical Model for Representing Knowledge in Chess End Games. pdf
Ryszard Michalski, Pericles Negri (1977). An experiment on inductive learning in chess endgames. Machine Intelligence 8, pdf
Boris Stilman (1977). The Computer Learns. in 1976 US Computer Chess Championship, by David Levy, Computer Science Press, Woodland Hills, CA, pp. 83-90
Richard Sutton (1978). Single channel theory: A neuronal theory of learning. Brain Theory Newsletter 3, No. 3/4, pp. 72-75.
Ross Quinlan (1979). Discovering Rules by Induction from Large Collections of Examples. Expert Systems in the Micro-electronic Age, pp. 168-201. Edinburgh University Press (Introducing ID3)

1980 ...

Sarah E. Goldin, Philip Klahr (1981). Learning and Abstraction in Simulation. IJCAI 1981, pdf
Paul E. Utgoff, Tom Mitchell (1982). Acquisition of Appropriate Bias for Inductive Concept Learning. AAAI 1982, pdf
A. Harry Klopf (1982). The Hedonistic Neuron: A Theory of Memory, Learning, and Intelligence. Hemisphere Publishing Corporation, University of Michigan
Alen Shapiro, Tim Niblett (1982). Automatic Induction of Classification Rules for Chess End game. Advances in Computer Chess 3
Thomas Nitsche (1982). A Learning Chess Program. Advances in Computer Chess 3
Ryszard Michalski, Jaime Carbonell, Tom Mitchell (1983). Machine Learning: An Artificial Intelligence Approach. Tioga Publishing Company, ISBN 0-935382-05-4. google books
Ross Quinlan (1983). Learning efficient classification procedures and their application to chess end games. In Machine Learning: An Artificial Intelligence Approach, pages 463–482. Tioga, Palo Alto
Alen Shapiro (1983). The Role of Structured Induction in Expert Systems. University of Edinburgh, Machine Intelligence Research Unit (Ph.D. thesis)
Edward Feigenbaum, Herbert Simon (1984). EPAMlike models of recognition and learning. Cognitive Science, Vol. 8, 305-336, pdf
John E. Laird, Paul S. Rosenbloom, Allen Newell (1984). Towards Chunking as a General Learning Mechanism. AAAI 1984
Albrecht Heeffer (1984). Automated Acquisition of Concepts for the Description of Middle-game Positions in Chess. Turing Institute, Glasgow, Scotland, TIRM-84-005
Paul E. Utgoff (1984). Shift of Bias for Inductive Concept Learning. Ph.D. thesis, Rutgers University, New Brunswick
Leslie Valiant (1984). A Theory of the Learnable. Communications of the ACM, Vol. 27, No. 11, pdf

1985 ...

Tony Marsland (1985). Evaluation-Function Factors. ICCA Journal, Vol. 8, No. 2, pdf
Albrecht Heeffer (1985). Validating Concepts from Automated Acquisition Systems. IJCAI 85, pdf
Hans Berliner (1985). Goals, Plans, and Mechanisms: Non-symbolically in an Evaluation Surface. Presentation at Evolution, Games, and Learning, Center for Nonlinear Studies, Los Alamos National Laboratory, May 21.
Ryszard Michalski, Jaime Carbonell, Tom Mitchell (1985). Machine Learning: An Artificial Intelligence Approach. Morgan Kaufmann, ISBN 0-934613-09-5. google books
Igor Roizen, Judea Pearl (1985). Learning Link Probabilities in Causal Trees. Proceedings of the Second Conference on Uncertainty in Artificial Intelligence

1986

Steven Skiena (1986). An Overview of Machine Learning in Chess. ICCA Journal, Vol. 9, No. 1
Jens Christensen, Richard Korf (1986). A Unified Theory of Heuristic Evaluation functions and Its Applications to Learning. Proceedings of the AAAI-86, pp. 148-152, pdf.
Ryszard Michalski, Jaime Carbonell, Tom Mitchell (1986). Machine Learning: An Artificial Intelligence Approach, Volume II. Morgan Kaufmann, ISBN 0-934613-00-1. google books
Tom Mitchell, Jaime Carbonell, Ryszard Michalski (1986). Machine Learning: A Guide to Current Research. The Kluwer International Series in Engineering and Computer Science, Vol. 12
Ivan Bratko, Igor Kononenko (1986). Learning Rules from Incomplete and Noisy Data. Proceedings Unicom Seminar on the Scope of Artificial Intelligence in Statistics. Technical Press

1987

David Slate (1987). A Chess Program that uses its Transposition Table to Learn from Experience. ICCA Journal, Vol. 10, No. 2
Ronald L. Rivest (1987). Learning Decision Lists. Machine Learning 2,3, pdf 2001
Gerald Tesauro, Terrence J. Sejnowski (1987). A 'Neural' Network that Learns to Play Backgammon. NIPS 1987
Alen Shapiro (1987). Structured Induction in Expert Systems. Turing Institute Press in association with Addison-Wesley Publishing Company, Wokingham, UK
Alberto Maria Segre (1987). On the Operationality/Generality Trade-off in Explanation-based Learning. IJCAI 1987, pdf
Alberto Maria Segre (1987). Explanation-Based Learning of Generalized Robot Assembly Plans. Ph.D. thesis, University of Illinois at Urbana-Champaign, Advisor: Gerald Francis DeJong, II
Eric B. Baum, Frank Wilczek (1987). Supervised Learning of Probability Distributions by Neural Networks. NIPS 1987

1988

Bruce Abramson (1988). Learning Expected-Outcome Evaluators in Chess. Proceedings of the 1988 AAAI Spring Symposium Series: Computer Game Playing, 26-28.
Richard Sutton (1988). Learning to Predict by the Methods of Temporal Differences. Machine Learning, Vol. 3, No. 1, pdf
David E. Goldberg, John H. Holland (1988). Genetic Algorithms and Machine Learning. Machine Learning, Vol. 3
Kenneth A. De Jong, Alan C. Schultz (1988). Using Experience-Based Learning in Game Playing. Proceedings of the Fifth International Machine Learning Conference, CiteSeerX » Othello
Kai-Fu Lee, Sanjoy Mahajan (1988). A Pattern Classification Approach to Evaluation Function Learning. Artificial Intelligence, Vol. 36, No. 1
Paul E. Utgoff (1988). ID5: An incremental ID3. ML 1988

1989

Robert Levinson (1989). A Self-Learning, Pattern-Oriented Chess Program. ICCA Journal, Vol. 12, No. 4
Bruce Abramson (1989). On Learning and Testing Evaluation Functions. Proceedings of the Sixth Israeli Conference on Artificial Intelligence, 1989, 7-16.
Eric Wefald, Stuart Russell (1989). Adaptive Learning of Decision-Theoretic Search Control Knowledge. In Proceedings of the Sixth International Workshop on Machine Learning. Ithaca, NY: Morgan Kaufmann
Stephen Muggleton, Michael Bain, Jean Hayes Michie, Donald Michie (1989). An Experimental Comparison of Human and Machine Learning Formalisms. 6. ML 1989, pdf
Eric B. Baum (1989). A Proposal for More Powerful Learning Algorithms. Neural Computation, Vol. 1, No. 2
Susan L. Epstein (1989). The Intelligent Novice - Learning to Play Better. Heuristic Programming in Artificial Intelligence 1
Chris Watkins (1989). Learning from Delayed Rewards. Ph.D. thesis, Cambridge University, pdf

1990 ...

Richard Sutton, Andrew Barto (1990). Time Derivative Models of Pavlovian Reinforcement. Learning and Computational Neuroscience: Foundations of Adaptive Networks: 497-537.
Bruce Abramson (1990). On Learning and Testing Evaluation Functions. Journal of Experimental and Theoretical Artificial Intelligence 2: 241-251.
Tony Scherzer, Linda Scherzer, Dean Tjaden (1990). Learning in Bebe. Computers, Chess, and Cognition » Mephisto Best-Publication Award
Yves Kodratoff, Ryszard Michalski (1990). Machine Learning: An Artificial Intelligence Approach, Volume III. Morgan Kaufmann, ISBN 1-55860-119-8. google books
Michèle Sebag (1990). A symbolic-numerical approach for supervised learning from examples and rules. Ph.D. thesis, Paris Dauphine University

1991

Robert Schapire (1991). The Design and Analysis of Efficient Learning Algorithms. Ph.D. thesis, Massachusetts Institute of Technology, supervisor Ronald L. Rivest, pdf
Gerhard Mehlsam, Hermann Kaindl, Wilhelm Barth (1991). Feature Construction During Tree Learning. GWAI 1991: 50-61.
Alex van Tiggelen (1991). Neural Networks as a Guide to Optimization - The Chess Middle Game Explored. ICCA Journal, Vol. 14, No. 3
William Tunstall-Pedoe (1991). Genetic Algorithms Optimizing Evaluation Functions. ICCA Journal, Vol. 14, No. 3
Tony Scherzer, Linda Scherzer, Dean Tjaden (1991). Learning in Bebe. ICCA Journal, Vol. 14, No. 4
Steven Walczak (1991). Predicting Actions from Induction on Past Performance. Proceedings of the 8th International Workshop on Machine Learning, pp. 275-279. Morgan Kaufmann
Paul E. Utgoff, Jeffery A. Clouse (1991). Two Kinds of Training Information for Evaluation Function Learning. University of Massachusetts, Amherst, Proceedings of the AAAI 1991
Byoung-Tak Zhang, Gerd Veenker (1991). Neural networks that teach themselves through genetic discovery of novel examples. IEEE IJCNN'91, pdf
Byoung-Tak Zhang, Gerd Veenker (1991). Focused incremental learning for improved generalization with reduced training sets. ICANN'91, pdf

1992

Miroslav Kubat (1992). Introduction to Machine Learning. Advanced Topics in Artificial Intelligence 1992
Michael Bain (1992). Learning optimal chess strategies. Proc. Intl. Workshop on Inductive Logic Programming (ed. Stephen Muggleton), Institute for New Generation Computer Technology, Tokyo, Japan.
Eduardo F. Morales (1992). First-Order Induction of Patterns in Chess. Ph.D. Thesis, The Turing Institute, University of Strathclyde, Glasgow
Eduardo F. Morales (1992). Learning Chess Patterns. Inductive Logic Programming (ed. Stephen Muggleton), Academic Press, The Apic Series, London, UK
Gerald Tesauro (1992). Temporal Difference Learning of Backgammon Strategy. ML 1992
Chris Watkins, Peter Dayan (1992). Q-learning. Machine Learning, Vol. 8, No. 2
Gerald Tesauro (1992). Practical Issues in Temporal Difference Learning. Machine Learning, Vol. 8, No. 3-4
Manuela Veloso (1992). Learning by Analogical Reasoning in General Purpose Problem Solving. Ph.D. thesis, Carnegie Mellon University, advisor Jaime Carbonell

1993

Michael Gherrity (1993). A Game Learning Machine. Ph.D. Thesis, University of California, San Diego, zipped ps
Shaul Markovitch, Yaron Sella (1993). Learning of Resource Allocation Strategies for Game Playing, The proceedings of the 13th International Joint Conference on Artificial Intelligence, Chambery, France. pdf
David Carmel, Shaul Markovitch (1993). Learning Models of Opponent's Strategy in Game Playing. AAAI Proceedings, CiteSeerX
Dan Geiger, Azaria Paz, Judea Pearl (1993). Learning simple causal structures. International Journal of Intelligent Systems, 8, pp. 231-247.
Sebastian Thrun, Tom Mitchell (1993). Integrating Inductive Neural Network Learning and Explanation-Based Learning. IJCAI 1993, zipped ps
Alois Heinz, Christoph Hense (1993). Bootstrap learning of α-β-evaluation functions. ICCI 1993, pdf

1994

Eduardo F. Morales (1994). Learning Patterns for Playing Strategies. ICCA Journal, Vol. 17, No. 1
Fernand Gobet, Peter Jansen (1994). Towards a chess program based on a model of human memory. Advances in Computer Chess 7 » CHUMP
Michael Bain (1994). Learning Logical Exceptions in Chess. Ph.D. thesis, University of Strathclyde, CiteSeerX
Michael Bain, Stephen Muggleton (1994). Learning Optimal Chess Strategies. Machine Intelligence 13 (eds. K. Furukawa and Donald Michie), pp. 291-309. Oxford University Press, Oxford, UK. ISBN 0198538502.
Ryszard Michalski, George Tecuci (1994). Machine Learning: A Multistrategy Approach, Volume IV. Morgan Kaufmann, ISBN 1-55860-251-8. google books
Gerald Tesauro (1994). TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play. Neural Computation Vol. 6, No. 2
Alberto Maria Segre, Charles Elkan (1994). A High-Performance Explanation-Based Learning Algorithm. Artificial Intelligence, Vol. 68, Nos. 1-2
David E. Moriarty, Risto Miikkulainen (1994). Evolving Neural Networks to focus Minimax Search. AAAI-94, pdf
Nicol N. Schraudolph, Peter Dayan, Terrence J. Sejnowski (1994). Temporal Difference Learning of Position Evaluation in the Game of Go. Advances in Neural Information Processing Systems 6

1995 ...

Gerhard Mehlsam, Hermann Kaindl, Wilhelm Barth (1995). Feature Construction during Tree Learning. GOSLER Final Report 1995: 391-403
Chris McConnell (1995). Tuning Evaluation Functions for Search. ps or pdf from CiteSeerX
David Heckerman, Dan Geiger, Max Chickering (1995). Learning Bayesian Networks: The Combination of Knowledge and Statistical Data. Machine Learning, Vol. 20, pdf
Tristan Cazenave (1995). Learning and Problem Solving in Gogol, a Go playing program. pdf
Gerald Tesauro (1995). Temporal Difference Learning and TD-Gammon. Communications of the ACM Vol. 38, No. 3
Sebastian Thrun (1995). Learning to Play the Game of Chess. in Gerald Tesauro, David S. Touretzky, Todd K. Leen (eds.) Advances in Neural Information Processing Systems 7, MIT Press
Marco Wiering (1995). TD Learning of Game Evaluation Functions with Hierarchical Neural Architectures. Master's thesis, University of Amsterdam, pdf
Michael A. Arbib (ed.) (1995, 2002). The Handbook of Brain Theory and Neural Networks. The MIT Press
Nicol N. Schraudolph (1995). Optimization of Entropy with Neural Networks. Ph.D. thesis, University of California, San Diego
Robert W. Howard (1995). Learning and Memory: Major Ideas, Principles, Issues and Applications. Praeger, amazon.com

1996

Leemon C. Baird III, Mance E. Harmon, A. Harry Klopf (1996). Reinforcement Learning: An Alternative Approach to Machine Intelligence. pdf
Sebastian Thrun (1996). Explanation-Based Neural Network Learning: A Lifelong Learning Approach. Kluwer Academic Publishers
Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore (1996). Reinforcement Learning: A Survey. JAIR Vol. 4, pdf
Eduardo F. Morales (1996). Learning Playing Strategies in Chess. Computational Intelligence, Vol. 12, No. 1, CiteSeerX
Wee Sun Lee (1996). Agnostic Learning and Single Hidden Layer Neural Networks. Ph.D. thesis, Australian National University, ps
Johannes Fürnkranz (1996). Machine Learning in Computer Chess: The Next Generation. ICCA Journal, Vol. 19, No. 3, zipped ps
Adriaan de Groot, Fernand Gobet (1996). Perception and memory in chess. Heuristics of the professional eye. Assen: Van Gorcum, The Netherlands. ISBN 90-232-2949-5. Chapter 9; A discussion: Two authors, two different views? word
Stuart Russell (1996). Machine Learning. Chapter 4 of M. A. Boden (Ed.), Artificial Intelligence, Academic Press. Part of the Handbook of Perception and Cognition, ps
Barney Pell, Susan L. Epstein, Robert Levinson (1996). Introduction to the special issue on games: Structure and Learning. Computational Intelligence, Vol. 12, No. 1, pdf
Robert Levinson (1996). General Game-Playing and Reinforcement Learning. Computational Intelligence, Vol. 12, No. 1
Tristan Cazenave (1996). Learning to forecast by explaining the consequences of actions. pdf
Tristan Cazenave (1996). Self fuzzy learning. pdf
Yoav Freund, Robert Schapire (1996). Game Theory, On-line Prediction and Boosting. COLT 1996, pdf
Christopher D. Rosin, Richard K. Belew (1996). A Competitive Approach in Game Learning. COLT 1996, pdf

1997

Yoav Freund, Robert Schapire (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, Vol. 55, No. 1, 1996 pdf » AdaBoost
Sepp Hochreiter, Jürgen Schmidhuber (1997). Long short-term memory. Neural Computation, Vol. 9, No. 8, pdf ^[17]
Eduardo F. Morales (1997). On Learning How to Play. Advances in Computer Chess 8, CiteSeerX
Don Beal, Martin C. Smith (1997). Learning Piece Values Using Temporal Differences. ICCA Journal, Vol. 20, No. 3
Kieran Greer, Piyush Ojha, David A. Bell (1997). Learning Search Heuristics from Examples: A Study in Computer Chess, Seventh Conference of the Spanish Association for Artificial Intelligence, CAEPIA’97, November, pp. 695-704.
Nir Friedman, Moises Goldszmidt, David Heckerman, Stuart Russell (1997). Where is the Impact of Bayesian Networks in Learning? In Proc. Fifteenth International Joint Conference on Artificial Intelligence, Nagoya, Japan, ps
Ronald Parr, Stuart Russell (1997). Reinforcement Learning with Hierarchies of Machines. In Advances in Neural Information Processing Systems 10, MIT Press, zipped ps
Tristan Cazenave (1997). Gogol (an Analytical Learning Program). IJCAI'97, pdf
Tom Mitchell (1997). Machine Learning. McGraw Hill
Michèle Sebag (1997). Stochastic Heuristics for Machine Learning & Machine Learning for Stochastic Optimization. Habilitation, Paris-Sud 11 University
William Uther, Manuela M. Veloso (1997). Adversarial Reinforcement Learning. Carnegie Mellon University, ps
William Uther, Manuela M. Veloso (1997). Generalizing Adversarial Reinforcement Learning. Carnegie Mellon University, ps
Marco Wiering, Jürgen Schmidhuber (1997). HQ-learning. Adaptive Behavior, Vol. 6, No 2

1998

Jonathan Baxter, Andrew Tridgell, Lex Weaver (1998). Knightcap: A chess program that learns by combining td(λ) with game-tree search, Proceedings of the 15th International Conference on Machine Learning, pdf via citeseerX
Jonathan Baxter, Andrew Tridgell, Lex Weaver (1998). TDLeaf(lambda): Combining Temporal Difference Learning with Game-Tree Search. Australian Journal of Intelligent Information Processing Systems, Vol. 5 No. 1, arXiv:cs/9901001
Jonathan Baxter, Andrew Tridgell, Lex Weaver (1998). Experiments in Parameter Learning Using Temporal Differences. ICCA Journal, Volume 21 No. 2, pdf
Lev Finkelstein, Shaul Markovitch (1998). Learning to Play Chess Selectively by Acquiring Move Patterns. ICCA Journal, Vol. 21, No. 2, pdf
Csaba Szepesvári (1998). Reinforcement Learning: Theory and Practice. Proceedings of the 2nd Slovak Conference on Artificial Neural Networks, zipped ps
Richard Sutton, Andrew Barto (1998). Reinforcement Learning: An Introduction. MIT Press
Ryszard Michalski, Ivan Bratko, Miroslav Kubat (eds.) (1998). Machine Learning and Data Mining: Methods and Applications. John Wiley & Sons

Miroslav Kubat, Ivan Bratko, Ryszard Michalski (1998). A Review of Machine Learning Methods. pdf

Nobusuke Sasaki, Yasuji Sawada, Jin Yoshimura (1998). A Neural Network Program of Tsume-Go. CG 1998 ^[18]
Tristan Cazenave (1998). Machine Introspection for Machine Learning. Tucson 1998, pdf
Tristan Cazenave (1998). Integration of Different Reasoning Modes in a Go Playing and Learning System. pdf
Tristan Cazenave (1998). Learning with Fuzzy Definitions of Goals. pdf
Ryszard Michalski (1998). Learnable Evolution: Combining Symbolic and Evolutionary Learning. Proceedings of the Fourth International Workshop on Multistrategy Learning (MSL'98)
Krzysztof Krawiec, Roman Slowinski, Irmina Szczesniak (1998). Pedagogical Method for Extraction of Symbolic Knowledge from Neural Networks. Rough Sets and Current Trends in Computing 1998
Marco Wiering, Jürgen Schmidhuber (1998). Fast online Q (λ). Machine Learning, Vol. 33, No. 1

1999

Robert Hyatt (1999). Book Learning - a Methodology to Tune an Opening Book Automatically. ICCA Journal, Vol. 22, No. 1
Kieran Greer, Piyush Ojha, David A. Bell (1999). A Pattern-Oriented Approach to Move Ordering: the Chessmaps Heuristic. ICCA Journal, Vol. 22, No. 1
Michael Buro (1999). Toward Opening Book Learning. ICCA Journal, Vol. 22, No. 2, pdf
Don Beal, Martin C. Smith (1999). Learning Piece-Square Values using Temporal Differences. ICCA Journal, Vol. 22, No. 4
David Heckerman (1999). A tutorial on learning with Bayesian networks. pdf from CiteSeerX
F. De Comité, F. Denis, R. Gilleron and Fabien Letouzey (1999). Positive and Unlabeled Examples help Learning. The 10th International Conference on Algorithmic Learning Theory, ps
Vassilis Papavassiliou, Stuart Russell (1999). Convergence of reinforcement learning with general function approximators. In Proc. IJCAI-99, Stockholm, ps
Philip G. K. Reiser, Patricia J. Riddle (1999). Evolving Logic Programs to Classify Chess-Endgame Positions. Simulated Evolution and Learning, Canberra, Australia. Lecture Notes in Artificial Intelligence, No. 1585, Springer, pdf » Endgame
Marco Wiering (1999). Explorations in Efficient Reinforcement Learning. Ph.D. thesis, University of Amsterdam, advisors Frans Groen and Jürgen Schmidhuber
Geoffrey E. Hinton, Terrence J. Sejnowski (eds.) (1999). Unsupervised Learning: Foundations of Neural Computation. MIT Press

2000 ...

Miroslav Kubat, Jan Žižka (2000). Learning Middle Game Patterns in Chess: A Case Study. Lecture Notes in Computer Science, Vol. 1821, Springer
Vladimir Vapnik (2000). The nature of statistical learning theory. Springer
Sebastian Thrun, Michael L. Littman (2000). A Review of Reinforcement Learning. AI Magazine, Vol. 21, No. 1
Johannes Fürnkranz (2000). Machine Learning in Games: A Survey. Austrian Research Institute for Artificial Intelligence, OEFAI-TR-2000-3, pdf
Johannes Fürnkranz, Bernhard Pfahringer, Hermann Kaindl, Stefan Kramer (2000). Learning to Use Operational Advice. ECAI-00, pdf
Jack van Rijswijck (2000). Learning from Perfection: A Data Mining Approach to Evaluation Function Learning in Awari. CG 2000, pdf
Robert Levinson, Ryan Weber (2000). Chess Neighborhoods, Function Combination, and Reinforcement Learning. CG 2000
Jan Ramon, Tom Francis, Hendrik Blockeel (2000). Learning a Go Heuristic with Tilde. CG 2000
Levente Kocsis, Jos Uiterwijk, Jaap van den Herik (2000). Learning Time Allocation using Neural Networks. CG 2000, postscript
Michael Buro (2000). Toward Opening Book Learning. Games in AI Research (eds. Jaap van den Herik and Hiroyuki Iida), pp. 47-54. Universiteit Maastricht, Maastricht, The Netherlands. ISBN 90-621-6416-1.
Fabien Letouzey, François Denis, Rémi Gilleron (2000). Learning from Positive and Unlabeled Examples. ALT 2000: 71-85, ps
Andrew Ng, Stuart Russell (2000). Algorithms for inverse reinforcement learning. In Proceedings of the Seventeenth International Conference on Machine Learning, Stanford, California: Morgan Kaufmann, pdf
Dean F. Hougen, Maria Gini, James R. Slagle (2000). An Integrated Connectionist Approach to Reinforcement Learning for Robotic Control. ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Ryszard Michalski (2000). LEARNABLE EVOLUTION MODEL: Evolutionary Processes Guided by Machine Learning. Machine Learning, Vol. 38 ^[19]
Jonathan Baxter, Andrew Tridgell, Lex Weaver (2000). Learning to Play Chess Using Temporal Differences. Machine Learning, Vol 40, No. 3, pdf
Michael Bain, Stephen Muggleton, Ashwin Srinivasan (2000). Generalising Closed World Specialisation: A Chess End Game Application. CiteSeerX

2001

Nicol N. Schraudolph, Peter Dayan, Terrence J. Sejnowski (2001). Learning to Evaluate Go Positions via Temporal Difference Methods. in Norio Baba, Lakhmi C. Jain (eds.) (2001). Computational Intelligence in Games, Studies in Fuzziness and Soft Computing. Physica-Verlag, revised version of 1994 paper
Jonathan Schaeffer, Markian Hlynka, Vili Jussila (2001). Temporal Difference Learning Applied to a High-Performance Game-Playing Program. IJCAI 2001
Michael Bowling, Manuela M. Veloso (2001). Rational and Convergent Learning in Stochastic Games. IJCAI 2001
Levente Kocsis, Jos Uiterwijk, Jaap van den Herik (2001). Move Ordering using Neural Networks, IEA/AIE 2001, LNCS 2070, 45-50 ps
Marty Hirsch (2001). Machine Learning in MChess Professional. Advances in Computer Games 9
Yngvi Björnsson, Tony Marsland (2001). Learning Search Control in Adversary Games. Advances in Computer Games 9, pp. 157-174. pdf
Robert Levinson, Ryan Weber (2001). Chess Neighborhoods, Function Combination, and Reinforcement Learning. In Computers and Games (eds. Tony Marsland and I. Frank). Lecture Notes in Computer Science, Springer, pdf
Jean Hayes Michie (2001). Machine Learning and Light Relief: A Review of Truth from Trash. AI Magazine Vol. 22 No. 4, pdf
Pieter Spronck, Ida Sprinkhuizen-Kuyper, Eric Postma (2001). Infused Evolutionary Learning. Proceedings of the Eleventh Belgian-Dutch Conference on Machine Learning, pdf, pdf
Charles Elkan (2001). The Foundations of Cost-Sensitive Learning. IJCAI 2001
Alex B. Meijer, Henk Koppelaar (2001). A learning architecture for the game of Go. Game-On 2001
Johannes Fürnkranz, Miroslav Kubat (2001). Machines that Learn to Play Games. Advances in Computation: Theory and Practice, Vol. 8. NOVA Science Publishers

2002

Yngvi Björnsson, Tony Marsland (2002). Learning Control of Search Extensions. Proceedings of the 6th Joint Conference on Information Sciences (JCIS 2002), pp. 446-449. pdf
Michael Buro (2002). Improving Mini-max Search by Supervised Learning. Artificial Intelligence, Vol. 134, No. 1, pp. 85-99. ISSN 0004-3702. pdf
Levente Kocsis, Jos Uiterwijk, Eric Postma, Jaap van den Herik (2002). The Neural MoveMap Heuristic in Chess. CG 2002, ps
Erik van der Werf, Jos Uiterwijk, Eric Postma, Jaap van den Herik (2002). Local Move Prediction in Go. CG 2002
Ari Shapiro, Gil Fuchs, Robert Levinson (2002). Learning a Game Strategy Using Pattern-Weights and Self-play. CG 2002, pdf
Mark Winands, Levente Kocsis, Jos Uiterwijk, Jaap van den Herik (2002). Temporal difference learning and the Neural MoveMap heuristic in the game of Lines of Action. In Mehdi, Q., Gough, N., and Cavazza, M., editors, GAME-ON 2002 3rd International Conference on Intelligent Games and Simulation, pages 99-103. SCS Europe Bvba. pdf
Roman Grekovs (2002). Methods of Fuzzy Pattern Recognition. Riga Technical University, ps, covers Fuzzy Kora algorithm
Pieter Spronck, Ida Sprinkhuizen-Kuyper, Eric Postma (2003). Improved opponent intelligence through offline learning. International Journal of Intelligent Games & Simulation, Vol. 2
Krzysztof Krawiec (2002). Genetic Programming-based Construction of Features for Machine Learning and Knowledge Discovery Tasks. Genetic Programming and Evolvable Machines, Vol. 3, No. 4
Peter Auer, Nicolò Cesa-Bianchi, Paul Fischer (2002). Finite-time Analysis of the Multiarmed Bandit Problem. Machine Learning, Vol. 47, No. 2, pdf
Paul E. Utgoff, David J. Stracuzzi (2002). Many-Layered Learning. Neural Computation, Vol. 14, No. 10, pdf

2003

Levente Kocsis, Jaap van den Herik, Jos Uiterwijk (2003). Two Learning Algorithms for Forward Pruning. ICGA Journal, Vol. 26, No. 3, ps
Levente Kocsis (2003). Learning Search Decisions. Ph.D. thesis, Universiteit Maastricht, ps
Marco Block-Berlitz (2003). Reinforcement Learning in der Schachprogrammierung. Studienarbeit, Freie Universität Berlin, Dozent: Prof. Dr. Raúl Rojas, pdf (German)
Dave Gomboc, Tony Marsland, Michael Buro (2003). Evaluation Function Tuning via Ordinal Correlation. Advances in Computer Games 10, pdf
Stuart Russell, Peter Norvig (2003). Artificial Intelligence: A Modern Approach. 2nd edition, 3rd edition 2009
Judea Pearl, Stuart Russell (2003). Bayesian Networks. In Michael A. Arbib, Ed., The Handbook of Brain Theory and Neural Networks, 2nd edition, MIT Press, pdf
David J.C. MacKay (2003). Information Theory, Inference, and Learning Algorithms.
Pedro Campos, Thibault Langlois (2003). Abalearn: a Program that Learns How to Play Abalone. ICGA Journal, Vol. 26, No. 4
David Gleich (2003). Machine Learning in Computer Chess: Genetic Programming and KRK. Harvey Mudd College, pdf
Henk Mannen (2003). Learning to play chess using reinforcement learning with database games. Master's thesis, Cognitive Artificial Intelligence, Utrecht University
Jan Žižka, Michal Mádr (2003). Learning Representative Patterns from Real Chess Positions: A Case Study. IICAI 2003

2004

Yngvi Björnsson, Vignir Hafsteinsson, Ársæll Jóhannsson, Einar Jónsson (2004). Efficient Use of Reinforcement Learning in a Computer Game. In Computer Games: Artificial Intelligence, Design and Education (CGAIDE'04), pp. 379-383. pdf
Dave Gomboc (2004). Tuning Evaluation Functions by Maximizing Concordance. Master of Science thesis, pdf
Adam Marczyk (2004). Genetic Algorithms and Evolutionary Computation from the TalkOrigins Archive
Petr Aksenov (2004). Genetic algorithms for optimising chess position scoring, Masters thesis, pdf
Marek Strejczek (2004). Some aspects of chess programming. Technical University of Łódź, Faculty of Electrical and Electronic Engineering, Department of Computer Science, zipped pdf
Imran Ghory (2004). Reinforcement learning in board games. CSTR-04-004, Department of Computer Science, University of Bristol. pdf ^[20]
Mathieu Autonès, Aryel Beck, Phillippe Camacho, Nicolas Lassabe, Hervé Luga, François Scharffe (2004). Evaluation of Chess Position by Modular Neural network Generated by Genetic Algorithm. EuroGP 2004
Jean-Yves Audibert (2004). PAC-Bayesian Statistical Learning Theory. Ph.D. thesis, Université Paris VI, pdf, slides as pdf
Eric Wiewiora (2004). Efficient Exploration for Reinforcement Learning. MSc thesis, pdf
David B. Fogel, Timothy J. Hays, Sarah L. Hahn, James Quon (2004). A Self-Learning Evolutionary Chess Program. Proceedings of the IEEE, Vol. 92 No. 12, pp. 1947-1954, CiteSeerX
Alejandro González Romero, René Alquézar (2004). Learning Through the KRKa2 Chess Ending. CIARP 2004
Daniel Osman, Jacek Mańdziuk (2004). Comparison of TDLeaf and TD learning in Game Playing Domain. 11. ICONIP, pdf
Albert Xin Jiang (2004). Multiagent Reinforcement Learning in Stochastic Games with Continuous Action Spaces. pdf
Henk Mannen, Marco Wiering (2004). Learning to play chess using TD(λ)-learning with database games. Cognitive Artificial Intelligence, Utrecht University, Benelearn'04

2005 ...

Dave Gomboc, Michael Buro, Tony Marsland (2005). Tuning evaluation functions by maximizing concordance. Theoretical Computer Science, Vol. 349, No. 2, pp. 202-229, pdf
David B. Fogel, Timothy J. Hays, Sarah L. Hahn, James Quon (2005). Further Evolution of a Self-Learning Chess Program. IEEE Symposium on Computational Intelligence & Games, CiteSeerX
Tristan Caulfield, Joanna J. Bryson (2005). Chess by Imitation. Department of Computer Science, University of Bath, pdf ^[21]
Marco Wiering, Jan Peter Patist, Henk Mannen (2005). Learning to Play Board Games using Temporal Difference Methods. Technical Report, Utrecht University, UU-CS-2005-048, pdf
David J. Stracuzzi (2005). Scalable learning in many layers. University of Massachusetts Amherst, TR-05-02, pdf
Levente Kocsis, Csaba Szepesvári, Mark Winands (2005). RSPSA: Enhanced Parameter Optimization in Games. Advances in Computer Games 11, pdf
Christian Posthoff, Michael Schlosser (2005). Optimal strategies — Learning from examples — Boolean equations. in Klaus P. Jantke, Steffen Lange (eds.) (2005). Algorithmic Learning for Knowledge-Based Systems, Lecture Notes in Computer Science 961, Springer

2006

Levente Kocsis, Csaba Szepesvári (2006). Universal Parameter Optimisation in Games Based on SPSA. Machine Learning, Special Issue on Machine Learning and Games, Vol. 63, No. 3
Sverrir Sigmundarson, Yngvi Björnsson (2006). Value Back-Propagation vs. Backtracking in Real-Time Search. In Proceedings of the National Conference on Artificial Intelligence (AAAI), Workshop on Learning For Search, pp. 136-141, AAAI Press, Boston, Massachusetts, USA, July 2006. pdf
Sylvain Gelly, Olivier Teytaud, Nicolas Bredèche, Marc Schoenauer (2006). Universal Consistency and Bloat in GP. Some theoretical considerations about Genetic Programming from a Statistical Learning Theory viewpoint. pdf
Sylvain Gelly, Jérémie Mary, Olivier Teytaud (2006). Learning for stochastic dynamic programming. pdf
Olivier Teytaud, Sylvain Gelly (2006). General lower bounds for evolutionary algorithms. pdf
Makoto Miwa, Daisaku Yokoyama, Takashi Chikayama (2006). Automatic Construction of Static Evaluation Functions for Computer Game Players. ALT ’06
Tom Mitchell (2006). The Discipline of Machine Learning. CMU-ML-06-108, Carnegie Mellon University, pdf
Tom Mitchell (2006). Human and Machine Learning. Carnegie Mellon University, slides as pdf
Jeff Rollason (2006). Playing Stronger by learning. AI Factory, Winter 2006
Simon Lucas, Thomas Philip Runarsson (2006). Temporal Difference Learning versus Co-Evolution for Acquiring Othello Position Evaluation. IEEE Symposium on Computational Intelligence and Games » Othello
Nicolò Cesa-Bianchi, Gábor Lugosi (2006). Prediction, Learning, and Games. Cambridge University Press
David J. Stracuzzi (2006). Scalable Knowledge Acquisition through Cumulative Learning and Memory Organization. Ph.D. thesis, University of Massachusetts Amherst, advisor Paul E. Utgoff, pdf
Michael Bowling, Johannes Fürnkranz, Thore Graepel, Ron Musick (2006). Machine learning and Games. Machine Learning, Vol. 63, No. 3

2007

Sylvain Gelly, Olivier Teytaud, Jérémie Mary (2007). Active learning in regression, with application to stochastic dynamic programming. ICINCO and CAP, pdf
Sylvain Gelly (2007). A Contribution to Reinforcement Learning; Application to Computer Go. Ph.D. thesis, pdf
Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári (2007). Tuning Bandit Algorithms in Stochastic Environments. pdf
Makoto Miwa, Daisaku Yokoyama, Takashi Chikayama (2007). Automatic Generation of Evaluation Features for Computer Game Players. pdf
Yong Duan, Baoxia Cui, Xinhe Xu (2007). State Space Partition for Reinforcement Learning Based on Fuzzy Min-Max Neural Network. ISNN 2007
Yasuhiro Osaki, Kazutomo Shibahara, Yasuhiro Tajima, Yoshiyuki Kotani (2007). Reinforcement Learning of Evaluation Functions Using Temporal Difference-Monte Carlo learning method. 12th Game Programming Workshop
Igor Kononenko, Matjaž Kukar (2007). Machine Learning and Data Mining: Introduction to Principles and Algorithms.
Krzysztof Krawiec (2007). Generative Learning of Visual Concepts using Multiobjective Genetic Programming. Pattern Recognition Letters, Vol. 28, No. 16
Simon Lucas (2007). Learning to play Othello with N-tuple systems. Australian Journal of Intelligent Information Processing Systems, Special Issue on Game Technology, Vol. 9, No. 4 » Othello
Edward P. Manning (2007). Temporal Difference Learning of an Othello Evaluation Function for a Small Neural Network with Shared Weights. IEEE Symposium on Computational Intelligence and AI in Games » Othello
David J. Stracuzzi (2007). Randomized Feature Selection. in Huan Liu, Hiroshi Motoda (eds.) Computational Methods of Feature Selection. CRC Press, pdf
Johannes Fürnkranz (2007). Recent advances in machine learning and game playing. ÖGAI Journal, Vol. 26, No. 2, Computer Game Playing, pdf

2008

Marco Block, Maro Bader, Ernesto Tapia, Marte Ramírez, Ketill Gunnarsson, Erik Cuevas, Daniel Zaldivar, Raúl Rojas (2008). Using Reinforcement Learning in Chess Engines. CONCIBE SCIENCE 2008, Research in Computing Science: Special Issue in Electronics and Biomedical Engineering, Computer Science and Informatics, ISSN:1870-4069, Vol. 35, pp. 31-40, Guadalajara, Mexico, pdf
Sacha Droste, Johannes Fürnkranz (2008). Learning of Piece Values for Chess Variants. Technical Report TUD–KE–2008-07, Knowledge Engineering Group, TU Darmstadt, pdf
Sacha Droste, Johannes Fürnkranz (2008). Learning the Piece Values for three Chess Variants. ICGA Journal, Vol. 31, No. 4
Richard Sutton, Csaba Szepesvári, Hamid Reza Maei (2008). A Convergent O(n) Algorithm for Off-policy Temporal-difference Learning with Linear Function Approximation, pdf (draft)
Matej Guid, Martin Možina, Jana Krivec, Aleksander Sadikov, Ivan Bratko (2008). Learning Positional Features for Annotating Chess Games: A Case Study. CG 2008, pdf
Martin Možina, Matej Guid, Jana Krivec, Aleksander Sadikov, Ivan Bratko (2008). Fighting Knowledge Acquisition Bottleneck with Argument Based Machine Learning. 18th European Conference on Artificial Intelligence (ECAI 2008), Patras, Greece. pdf
Cécile Germain-Renaud, Julien Pérez, Balázs Kégl, Charles Loomis (2008). Grid Differentiated Services: a Reinforcement Learning Approach. In 8th IEEE Symposium on Cluster Computing and the Grid. Lyon, pdf
Yasuhiro Osaki, Kazutomo Shibahara, Yasuhiro Tajima, Yoshiyuki Kotani (2008). An Othello Evaluation Function Based on Temporal Difference Learning using Probability of Winning. CIG'08, pdf
Antonio Fernández, Antonio Salmerón (2008). BayesChess: A computer chess program based on Bayesian networks. Pattern Recognition Letters, Vol. 29, No. 8
Joaquin Vanschoren, Bernhard Pfahringer, Geoffrey Holmes (2008). Learning from the Past with Experiment Databases. PRICAI 2008, pdf
Ilya Sutskever, Vinod Nair (2008). Mimicking Go Experts with Convolutional Neural Networks. ICANN 2008, pdf » Go
Andrew Cook (2008). Chunk Learning and Move Prompting: Making Moves in Chess. Technical Report CSR-08-12, University of Birmingham
Byoung-Tak Zhang (2008). Hypernetworks: A molecular evolutionary architecture for cognitive learning and memory. IEEE Computational Intelligence Magazine, Vol. 3, No. 3, pdf
Maria Cutumisu, Michael Bowling, Duane Szafron, Richard Sutton (2008). Agent Learning using Action-Dependent Learning Rates in Computer Role-Playing Games. Proceedings of the Fourth Artificial Intelligence and Interactive Digital Entertainment Conference, pdf

2009

Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Doina Precup, David Silver, Richard Sutton (2009). Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation. Accepted in Advances in Neural Information Processing Systems 22, Vancouver, BC. December 2009. MIT Press. pdf
Joel Veness, David Silver, William Uther, Alan Blair (2009). Bootstrapping from Game Tree Search. Neural Information Processing Systems (NIPS), 2009, pdf
Martin Možina (2009). Argument Based Machine Learning, PhD Thesis, pdf
David Silver (2009). Reinforcement Learning and Simulation-Based Search. Ph.D. thesis, University of Alberta. pdf
Omid David, Jaap van den Herik, Moshe Koppel, Nathan S. Netanyahu (2009). Simulating Human Grandmasters: Evolution and Coevolution of Evaluation Functions. ACM Genetic and Evolutionary Computation Conference (GECCO '09), pp. 1483-1489, Montreal, Canada, pdf
Omid David (2009). Genetic Algorithms Based Learning for Evolving Intelligent Organisms. Ph.D. Thesis ^[22]
Nur Merve Amil, Nicolas Bredèche, Christian Gagné, Sylvain Gelly, Marc Schoenauer, Olivier Teytaud (2009). A Statistical Learning Perspective of Genetic Programming. EuroGP 2009, pdf
Richard Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, Eric Wiewiora (2009). Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation. In Proceedings of the 26th International Conference on Machine Learning (ICML-09). pdf
Mesut Kirci, Jonathan Schaeffer, Nathan Sturtevant (2009). Feature Learning Using State Differences. pdf
David Silver, Gerald Tesauro (2009). Monte-Carlo Simulation Balancing. ICML 2009, pdf ^[23]
Broch Davison (2009). Playing Chess with Matlab. M.Sc. thesis supervised by Nello Cristianini, pdf ^[24]
Marcin Szubert, Wojciech Jaśkowski, Krzysztof Krawiec (2009). Coevolutionary Temporal Difference Learning for Othello. IEEE Symposium on Computational Intelligence and Games, pdf » Othello
Mark Levene, Trevor Fenner (2009). A Methodology for Learning Players' Styles from Game Records. arXiv:0904.2595v1
Trevor Hastie, Robert Tibshirani, Jerome Friedman (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Second Edition, Springer

2010 ...

Jacek Mańdziuk (2010). Knowledge-Free and Learning-Based Methods in Intelligent Game Playing. Studies in Computational Intelligence, Vol. 276, Springer
Joel Veness, Kee Siong Ng, Marcus Hutter, David Silver (2010). Reinforcement Learning via AIXI Approximation. Association for the Advancement of Artificial Intelligence (AAAI), pdf
Omid David, Moshe Koppel, Nathan S. Netanyahu (2010). Expert-Driven Genetic Algorithms for Simulating Evaluation Functions. pdf
Omid David, Nathan S. Netanyahu, Yoav Rosenberg, Moshe Shimoni (2010). Genetic Algorithms for Automatic Classification of Moving Objects. ACM Genetic and Evolutionary Computation Conference (GECCO '10), Portland, OR, pdf
Omid David, Moshe Koppel, Nathan S. Netanyahu (2010). Genetic Algorithms for Automatic Search Tuning. ICGA Journal, Vol. 33, No. 2
Mesut Kirci (2010). Feature Learning using State Differences. Master's thesis, Department of Computing Science, University of Alberta, pdf » General Game Playing
Amine Bourki, Matthieu Coulm, Philippe Rolet, Olivier Teytaud, Paul Vayssière (2010). Parameter Tuning by Simple Regret Algorithms and Multiple Simultaneous Hypothesis Testing. pdf
Julien Pérez, Cécile Germain-Renaud, Balázs Kégl, Charles Loomis (2010). Multi-objective Reinforcement Learning for Responsive Grids. In The Journal of Grid Computing. pdf
Jean-Yves Audibert (2010). PAC-Bayesian aggregation and multi-armed bandits. Habilitation thesis, Université Paris Est, pdf, slides as pdf
Hamid Reza Maei, Richard Sutton (2010). GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces. In Proceedings of the Third Conference on Artificial General Intelligence
Karol Walędzik, Jacek Mańdziuk (2010). The Layered Learning method and its Application to Generation of Evaluation Functions for the Game of Checkers. 11. PPSN, pdf » Checkers
Krzysztof Krawiec, Marcin Szubert (2010). Coevolutionary Temporal Difference Learning for small-board Go. IEEE Congress on Evolutionary Computation » Go
Edward P. Manning (2010). Using Resource-Limited Nash Memory to Improve an Othello Evaluation Function. IEEE Transactions on Computational Intelligence and AI in Games, Vol. 2, No. 1 » Othello
Edward P. Manning (2010). Coevolution in a Large Search Space using Resource-limited Nash Memory. GECCO '10 » Othello
Marco Wiering (2010). Self-play and using an expert to learn to play backgammon with temporal difference learning. Journal of Intelligent Learning Systems and Applications, Vol. 2, No. 2

2011

Joel Veness (2011). Approximate Universal Artificial Intelligence and Self-Play Learning for Games. Ph.D. thesis, University of New South Wales, supervisors: Kee Siong Ng, Marcus Hutter, Alan Blair, William Uther, John Lloyd; pdf
Mesut Kirci, Nathan Sturtevant, Jonathan Schaeffer (2011). A GGP Feature Learning Algorithm. KI 25(1): 35-42, pdf » General Game Playing
I-Chen Wu, Hsin-Ti Tsai, Hung-Hsuan Lin, Yi-Shan Lin, Chieh-Min Chang, Ping-Hung Lin (2011). Temporal Difference Learning for Connect6. Advances in Computer Games 13
Tomoyuki Kaneko, Kunihito Hoki (2011). Analysis of Evaluation-Function Learning by Comparison of Sibling Nodes. Advances in Computer Games 13
Jiao Wang, Shiyuan Li, Jitong Chen, Xin Wei, Huizhan Lv, Xinhe Xu (2011). 4*4-Pattern and Bayesian Learning in Monte-Carlo Go. Advances in Computer Games 13
Charles Elkan (2011). Reinforcement Learning with a Bilinear Q Function. EWRL 2011
Krzysztof Krawiec, Marcin Szubert (2011). Learning N-Tuple Networks for Othello by Coevolutionary Gradient Search. GECCO 2011, pdf
Krzysztof Krawiec, Wojciech Jaśkowski, Marcin Szubert (2011). Evolving small-board Go players using Coevolutionary Temporal Difference Learning with Archives. Applied Mathematics and Computer Science, Vol. 21, No. 4
Marcin Szubert, Wojciech Jaśkowski, Krzysztof Krawiec (2011). Learning Board Evaluation Function for Othello by Hybridizing Coevolution with Temporal Difference Learning. Control and Cybernetics, Vol. 40, No. 3, pdf » Othello
Hamid Reza Maei (2011). Gradient Temporal-Difference Learning Algorithms. Ph.D. thesis, University of Alberta, advisor Richard Sutton, pdf

2012

Marco Wiering, Martijn Van Otterlo (2012). Reinforcement learning: State-of-the-art. Adaptation, Learning, and Optimization, Vol. 12, Springer
István Szita (2012). Reinforcement Learning in Games. Chapter 17 of Reinforcement learning: State-of-the-art
Sjoerd van den Dries, Marco Wiering (2012). Neural-fitted TD-leaf learning for playing Othello with structured neural networks. IEEE Transactions on Neural Networks and Learning Systems, Vol. 23, No. 11
Amir Ban (2012). Automatic Learning of Evaluation, with Applications to Computer Chess. Discussion Paper 613, The Hebrew University of Jerusalem - Center for the Study of Rationality, Givat Ram
Adrien Couetoux, Olivier Teytaud, Hassen Doghmen (2012). Learning a Move-Generator for Upper Confidence Trees. ICS 2012, Hualien, Taiwan, December 2012 » UCT
Robert Schapire, Yoav Freund (2012). Boosting: Foundations and Algorithms. MIT Press
Arthur Guez, David Silver, Peter Dayan (2012). Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search. NIPS 2012, pdf
Peter Dayan (2012). How to set the switches on this thing. Current Opinion in Neurobiology, Vol. 22, pdf

2013

Arthur Guez, David Silver, Peter Dayan (2013). Scalable and Efficient Bayes-Adaptive Reinforcement Learning Based on Monte-Carlo Tree Search. Journal of Artificial Intelligence Research, Vol. 48, pdf
Katja Grace (2013). Algorithmic Progress in Six Domains. Technical report 2013-3, Machine Intelligence Research Institute, Berkeley, CA, pdf, 5 Game Playing, 5.1 Chess, 5.2 Go, 9 Machine Learning
Marcin Szubert, Wojciech Jaśkowski, Paweł Liskowski, Krzysztof Krawiec (2013). Shaping Fitness Function for Evolutionary Learning of Game Strategies. GECCO 2013, pdf
Marcin Szubert, Wojciech Jaśkowski, Krzysztof Krawiec (2013). On Scalability, Generalization, and Hybridization of Coevolutionary Learning: a Case Study for Othello. IEEE Transactions on Computational Intelligence and AI in Games, Vol. 5, No. 3 » Othello
Michiel van der Ree, Marco Wiering (2013). Reinforcement Learning in the Game of Othello: Learning Against a Fixed Opponent and Learning from Self-Play. ADPRL 2013
Luuk Bom, Ruud Henken, Marco Wiering (2013). Reinforcement Learning to Train Ms. Pac-Man Using Higher-order Action-relative Inputs. ADPRL 2013 ^[25]
Peter Auer, Marcus Hutter, Laurent Orseau (2013). Reinforcement Learning. Dagstuhl Reports, Vol. 3, No. 8, DOI: 10.4230/DagRep.3.8.1, URN: urn:nbn:de:0030-drops-43409
Igor Roizen, Judea Pearl (2013). Learning Link-Probabilities in Causal Trees. arXiv:1304.3103
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller (2013). Playing Atari with Deep Reinforcement Learning. arXiv:1312.5602 ^[26] ^[27]

2014

Omid David, Jaap van den Herik, Moshe Koppel, Nathan S. Netanyahu (2014). Genetic Algorithms for Evolving Computer Chess Programs. IEEE Transactions on Evolutionary Computation, pdf ^[28]
Wojciech Jaśkowski, Marcin Szubert, Paweł Liskowski (2014). Multi-Criteria Comparison of Coevolution and Temporal Difference Learning on Othello. EvoApplications 2014, Springer, volume 8602 » Othello
Marcin Szubert, Wojciech Jaśkowski (2014). Temporal Difference Learning of N-Tuple Networks for the Game 2048. IEEE Conference on Computational Intelligence and Games, pdf ^[29]
Marcin Szubert (2014). Coevolutionary Shaping for Reinforcement Learning. Ph.D. thesis, Poznań University of Technology, supervisor Krzysztof Krawiec, co-supervisor Wojciech Jaśkowski, pdf
Wojciech Jaśkowski (2014). Systematic n-Tuple Networks for Othello Position Evaluation. ICGA Journal, Vol. 37, No. 2, preprint as pdf » Othello
Christopher Clark, Amos Storkey (2014). Teaching Deep Convolutional Neural Networks to Play Go. arXiv:1412.3409 » Neural Networks ^[30] ^[31] ^[32]
Chris J. Maddison, Aja Huang, Ilya Sutskever, David Silver (2014). Move Evaluation in Go Using Deep Convolutional Neural Networks. arXiv:1412.6564v1
I-Chen Wu, Kun-Hao Yeh, Chao-Chin Liang, Chia-Chuan Chang, Han Chiang (2014). Multi-Stage Temporal Difference Learning for 2048. TAAI 2014, best paper award ^[33]
Ronald Ortner, Daniil Ryabko, Peter Auer, Rémi Munos (2014). Regret bounds for restless Markov bandits. Theoretical Computer Science 558, pdf

2015 ...

Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis (2015). Human-level control through deep reinforcement learning. Nature, Vol. 518
Tobias Graf, Marco Platzner (2015). Adaptive Playouts in Monte Carlo Tree Search with Policy Gradient Reinforcement Learning. Advances in Computer Games 14
Yuichiro Sato, Hiroyuki Iida, Jaap van den Herik (2015). Transfer Learning by Inductive Logic Programming. Advances in Computer Games 14
Kokolo Ikeda, Takanari Shishido, Simon Viennot (2015). Machine-Learning of Shape Names for the Game of Go. Advances in Computer Games 14
Arun Nair, Praveen Srinivasan, Sam Blackwell, Cagdas Alcicek, Rory Fearon, Alessandro De Maria, Veda Panneershelvam, Mustafa Suleyman, Charles Beattie, Stig Petersen, Shane Legg, Volodymyr Mnih, Koray Kavukcuoglu, David Silver (2015). Massively Parallel Methods for Deep Reinforcement Learning. arXiv:1507.04296
Matthew Lai (2015). Giraffe: Using Deep Reinforcement Learning to Play Chess. M.Sc. thesis, Imperial College London, arXiv:1509.01549v1 » Giraffe
Hado van Hasselt, Arthur Guez, David Silver (2015). Deep Reinforcement Learning with Double Q-learning. arXiv:1509.06461
Tom Schaul, John Quan, Ioannis Antonoglou, David Silver (2015). Prioritized Experience Replay. arXiv:1511.05952
Miroslav Kubat (2015). An Introduction to Machine Learning. Springer
Christian Wirth, Johannes Fürnkranz (2015). On Learning From Game Annotations. IEEE Transactions on Computational Intelligence and AI in Games, Vol. 7, No. 3

2016

Dharshan Kumaran, Demis Hassabis, James L. McClelland (2016). What learning systems do intelligent agents need? Complementary Learning Systems Theory Updated. Trends in Cognitive Sciences, Vol. 20, No. 7, pdf
Ziyu Wang, Nando de Freitas, Marc Lanctot (2016). Dueling Network Architectures for Deep Reinforcement Learning. arXiv:1511.06581
Jialin Liu, Olivier Teytaud, Tristan Cazenave (2016). Fast seed-learning algorithms for games. CG 2016
Omid E. David, Nathan S. Netanyahu, Lior Wolf (2016). DeepChess: End-to-End Deep Neural Network for Automatic Learning in Chess. ICANN 2016, Lecture Notes in Computer Science, Vol. 9887, Springer, pdf preprint » DeepChess ^[34] ^[35]
Ian Goodfellow, Yoshua Bengio, Aaron Courville (2016). Deep Learning. MIT Press
Max Jaderberg, Volodymyr Mnih, Wojciech Marian Czarnecki, Tom Schaul, Joel Z. Leibo, David Silver, Koray Kavukcuoglu (2016). Reinforcement Learning with Unsupervised Auxiliary Tasks. arXiv:1611.05397v1

2017

Muthuraman Chidambaram, Yanjun Qi (2017). Style Transfer Generative Adversarial Networks: Learning to Play Chess Differently. arXiv:1702.06762v1 ^[36] » Neural Networks
Johannes Fürnkranz (2017). Machine Learning and Game Playing. in Claude Sammut, Geoffrey I. Webb (eds) (2017). Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA
David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis (2017). Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. arXiv:1712.01815 » AlphaZero

2018

Arthur Guez, Théophane Weber, Ioannis Antonoglou, Karen Simonyan, Oriol Vinyals, Daan Wierstra, Rémi Munos, David Silver (2018). Learning to Search with MCTSnets. arXiv:1802.04697 » Monte-Carlo Tree Search
Matthia Sabatelli, Francesco Bidoia, Valeriu Codreanu, Marco Wiering (2018). Learning to Evaluate Chess Positions with Deep Neural Networks and Limited Lookahead. ICPRAM 2018, pdf

Forum Posts

1998 ...

Book learning and rating bias by Don Dailey, CCC, May 01, 1998
BookLearning Under the Microscope!!! by Robert Henry Durrett, CCC, August 31, 1998
Book learning? by Werner Inmann, CCC, December 31, 1998
Book learning by James Robertson, CCC, September 12, 1999

2000 ...

question about book and learning by Uri Blass, CCC, April 26, 2002
Time to implement Learning by Tom Likens, CCC, February 26, 2004

2005 ...

RomiChess && learning or the emperor has no clothes by Michael Sherwin, Winboard Programming Forum, May 19, 2006
learning by Jim, CCC, February 03, 2008
Information on engines with learning capabilities by Martin Thoresen, CCC, April 06, 2008
naive bayes classifier by Don Dailey, CCC, July 21, 2009 ^[37]

2010 ...

[Computer-go] learning patterns for mc go by Hendrik Baier, Computer Go Archive, April 26, 2010
Positional learning by Ben-Hur Carlos Vieira Langoni Junior, CCC, December 13, 2010
Ban: Automatic Learning of Evaluation [...] by BB+, OpenChess Forum, May 10, 2012 ^[38]
Teaching Deep Convolutional Neural Networks to Play Go by Hiroshi Yamashita, The Computer-go Archives, December 14, 2014 ^[39]
Teaching Deep Convolutional Neural Networks to Play Go by Michel Van den Bergh, CCC, December 16, 2014 ^[40]

2015 ...

Piece weights with regression analysis (in Russian) by Vladimir Medvedev, CCC, April 30, 2015 » Point Value by Regression Analysis
Position learning and opening books by Forrest Hoch, CCC, May 11, 2015
A database for learning evaluation functions by Álvaro Begué, CCC, October 28, 2016 » Automated Tuning, Evaluation, Texel's Tuning Method

External Links

Machine Learning

AI

Artificial Intelligence II by Nikos Drakos, Computer Based Learning Unit, University of Leeds

Learning I

Learning II

Chess

Supervised Learning

AdaBoost from Wikipedia

Unsupervised Learning

Reinforcement Learning

TD Learning

Statistics

Naive Bayes classifier from Wikipedia
Probabilistic classification from Wikipedia
Outline of regression analysis from Wikipedia
Linear regression from Wikipedia
Logistic regression from Wikipedia
Normal distribution from Wikipedia
Pseudorandom number generator from Wikipedia
Pseudo-random number sampling from Wikipedia
Randomness from Wikipedia
Statistical randomness from Wikipedia

Markov Models

NNs

ANNs

Topics

RNNs

Blogs

Neural Networks Blog by Ilya Sutskever
Dynamic Notions by John Wakefield, a blog about the evolution of neural networks with C# samples:

The Single Layer Perceptron

Hidden Neurons and Feature Space

Training Neural Networks Using Back Propagation in C#

Data Mining with Artificial Neural Networks (ANN)

Blog - Welch Labs

Courses

References

[1] A depiction of the world's oldest continually operating university, the University of Bologna, Italy, by Laurentius de Voltolina, second half of 14th century, Learning from Wikipedia
[2] Inductive learning vs Deductive learning
[3] David Slate (1987). A Chess Program that uses its Transposition Table to Learn from Experience. ICCA Journal, Vol. 10, No. 2
[4] Robert Hyatt (1999). Book Learning - a Methodology to Tune an Opening Book Automatically. ICCA Journal, Vol. 22, No. 1
[5] Don Beal, Martin C. Smith (1997). Learning Piece Values Using Temporal Differences. ICCA Journal, Vol. 20, No. 3
[6] Don Beal, Martin C. Smith (1999). Learning Piece-Square Values using Temporal Differences. ICCA Journal, Vol. 22, No. 4
[7] Yngvi Björnsson, Tony Marsland (2001). Learning Search Control in Adversary Games. Advances in Computer Games 9, pdf
[8] Levente Kocsis, Jos Uiterwijk, Jaap van den Herik (2000). Learning Time Allocation using Neural Networks. CG 2000, postscript
[9] AI Horizon: Machine Learning, Part II: Supervised and Unsupervised Learning
[10] online papers from Machine Learning in Games by Jay Scott
[11] Rosenblatt's Contributions
[12] Ratio Club from Wikipedia
[13] Royal Radar Establishment from Wikipedia
[14] see Swap-off by Helmut Richter
[15] The abandonment of connectionism in 1969 - Wikipedia
[16] Frank Rosenblatt (1962). Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. Spartan Books
[17] Long short term memory from Wikipedia
[18] Tsumego from Wikipedia
[19] Learnable Evolution Model from Wikipedia
[20] University of Bristol - Department of Computer Science - Technical Reports
[21] Generalized Hebbian Algorithm from Wikipedia
[22] Dap Hartmann (2010). Mimicking the Black Box - Genetically evolving evaluation functions and search algorithms. Review on Omid David's Ph.D. Thesis, ICGA Journal, Vol. 33, No. 1
[23] Monte-Carlo Simulation Balancing - videolectures.net by David Silver
[24] MATLAB from Wikipedia
[25] Ms. Pac-Man from Wikipedia
[26] Demystifying Deep Reinforcement Learning by Tambet Matiisen, Nervana, December 21, 2015
[27] Patent US20150100530 - Methods and apparatus for reinforcement learning - Google Patents
[28] Jaap van den Herik wint Humies Award 2014 - LIACS - Leiden Institute of Advanced Computer Science
[29] 2048 (video game) from Wikipedia
[30] Teaching Deep Convolutional Neural Networks to Play Go by Hiroshi Yamashita, The Computer-go Archives, December 14, 2014
[31] Teaching Deep Convolutional Neural Networks to Play Go by Michel Van den Bergh, CCC, December 16, 2014
[32] Convolutional neural network from Wikipedia
[33] Best Paper Awards | TAAI 2014
[34] DeepChess: Another deep-learning based chess program by Matthew Lai, CCC, October 17, 2016
[35] ICANN 2016 | Recipients of the best paper awards
[36] Using GAN to play chess by Evgeniy Zheltonozhskiy, CCC, February 23, 2017
[37] Naive Bayes classifier from Wikipedia
[38] Amir Ban (2012). Automatic Learning of Evaluation, with Applications to Computer Chess. Discussion Paper 613, The Hebrew University of Jerusalem - Center for the Study of Rationality, Givat Ram
[39] Christopher Clark, Amos Storkey (2014). Teaching Deep Convolutional Neural Networks to Play Go. arXiv:1412.3409
[40] Chris J. Maddison, Aja Huang, Ilya Sutskever, David Silver (2014). Move Evaluation in Go Using Deep Convolutional Neural Networks. arXiv:1412.6564v1


@@ Line 224: / Line 224: @@
 * [[Gerald Tesauro]] ('''1995'''). ''Temporal Difference Learning and TD-Gammon''. [[ACM#Communications|Communications of the ACM]] Vol. 38, No. 3
 * [[Sebastian Thrun]] ('''1995'''). ''[http://robots.stanford.edu/papers/thrun.nips7.neuro-chess.html Learning to Play the Game of Chess]''. in [[Gerald Tesauro]], [https://en.wikipedia.org/wiki/David_S._Touretzky David S. Touretzky], [http://mitpress.mit.edu/authors/todd-k-leen Todd K. Leen] (eds.) Advances in Neural Information Processing Systems 7, [https://en.wikipedia.org/wiki/MIT_Press MIT Press]
-* [[Marco Wiering]] ('''1995'''). ''[https://scholar.google.com/citations?view_op=view_citation&hl=en&user=xVas0I8AAAAJ&cstart=20&citation_for_view=xVas0I8AAAAJ:roLk4NBRz8UC TD Learning of Game Evaluation Functions with Hierarchical Neural Architectures]''. Master's thesis, [https://en.wikipedia.org/wiki/University_of_Amsterdam University of Amsterdam], [http://webber.physik.uni-freiburg.de/~hon/vorlss02/Literatur/reinforcement/GameEvaluationWithNeuronal.pdf pdf]
+* [[Marco Wiering]] ('''1995'''). ''TD Learning of Game Evaluation Functions with Hierarchical Neural Architectures''. Master's thesis, [https://en.wikipedia.org/wiki/University_of_Amsterdam University of Amsterdam], [http://webber.physik.uni-freiburg.de/~hon/vorlss02/Literatur/reinforcement/GameEvaluationWithNeuronal.pdf pdf]
 * [[Mathematician#MAArbib|Michael A. Arbib]] (ed.) ('''1995, 2002'''). ''[http://mitpress.mit.edu/books/handbook-brain-theory-and-neural-networks The Handbook of Brain Theory and Neural Networks]''. [https://en.wikipedia.org/wiki/MIT_Press The MIT Press]
 * [[Nicol N. Schraudolph]] ('''1995'''). ''[http://nic.schraudolph.org/bib2html/b2hd-Schraudolph95 Optimization of Entropy with Neural Networks]''. Ph.D. thesis, [https://en.wikipedia.org/wiki/University_of_California,_San_Diego University of California, San Diego]
@@ Line 256: / Line 256: @@
 * [[William Uther]], [[Manuela Veloso|Manuela M. Veloso]] ('''1997'''). ''Adversarial Reinforcement Learning''. [[Carnegie Mellon University]], [http://www.cse.unsw.edu.au/~willu/w/papers/Uther97a.ps ps]
 * [[William Uther]], [[Manuela Veloso|Manuela M. Veloso]] ('''1997'''). ''Generalizing Adversarial Reinforcement Learning''. [[Carnegie Mellon University]], [http://www.cse.unsw.edu.au/~willu/w/papers/Uther97b.ps ps]
-* [[Marco Wiering]],  [[Jürgen Schmidhuber]] ('''1997'''). ''[https://scholar.google.com/citations?view_op=view_citation&hl=en&user=xVas0I8AAAAJ&citation_for_view=xVas0I8AAAAJ:u5HHmVD_uO8C HQ-learning]''. [https://en.wikipedia.org/wiki/Adaptive_Behavior_%28journal%29 Adaptive Behavior], Vol. 6, No 2
+* [[Marco Wiering]],  [[Jürgen Schmidhuber]] ('''1997'''). ''HQ-learning''. [https://en.wikipedia.org/wiki/Adaptive_Behavior_%28journal%29 Adaptive Behavior], Vol. 6, No 2
 '''1998'''
 * [[Jonathan Baxter]], [[Andrew Tridgell]], [[Lex Weaver]] ('''1998'''). ''Knightcap: A chess program that learns by combining td(λ) with game-tree search'', Proceedings of the 15th International Conference on Machine Learning, [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.54.8263&rep=rep1&type=pdf pdf] via [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.54.8263 citeseerX]
@@ Line 272: / Line 272: @@
 * [[Ryszard Michalski]] ('''1998'''). ''Learnable Evolution: Combining Symbolic and Evolutionary Learning''. Proceedings of the Fourth International Workshop on Multistrategy Learning (MSL'98)
 * [[Krzysztof Krawiec]], [http://www.informatik.uni-trier.de/~ley/pers/hd/s/Slowinski:Roman.html Roman Slowinski], [http://www.informatik.uni-trier.de/~ley/pers/hd/s/Szczesniak:Irmina.html Irmina Szczesniak] ('''1998'''). ''[http://link.springer.com/chapter/10.1007%2F3-540-69115-4_60 Pedagogical Method for Extraction of Symbolic Knowledge from Neural Networks]''. [http://link.springer.com/book/10.1007%2F3-540-69115-4 Rough Sets and Current Trends in Computing 1998]
-* [[Marco Wiering]],  [[Jürgen Schmidhuber]] ('''1998'''). ''[https://scholar.google.com/citations?view_op=view_citation&hl=en&user=xVas0I8AAAAJ&citation_for_view=xVas0I8AAAAJ:2osOgNQ5qMEC Fast online Q (λ)]''. [https://en.wikipedia.org/wiki/Machine_Learning_%28journal%29 Machine Learning], Vol. 33, No. 1
+* [[Marco Wiering]],  [[Jürgen Schmidhuber]] ('''1998'''). ''Fast online Q (λ)''. [https://en.wikipedia.org/wiki/Machine_Learning_%28journal%29 Machine Learning], Vol. 33, No. 1
 '''1999'''
 * [[Robert Hyatt]] ('''1999'''). ''[http://www.craftychess.com/hyatt/learning.html Book Learning - a Methodology to Tune an Opening Book Automatically]''. [[ICGA Journal#22_1|ICCA Journal, Vol. 22, No. 1]]
@@ Line 282: / Line 282: @@
 * [http://www.ilsp.gr/homepages/papavasiliou_eng.html Vassilis Papavassiliou], [[Stuart Russell]] ('''1999'''). ''Convergence of reinforcement learning with general function approximators.'' In Proc. IJCAI-99, Stockholm, [http://www.cs.berkeley.edu/~russell/papers/ijcai99-bridge.ps ps]
 * [[Philip G. K. Reiser]], [[Patricia J. Riddle]] ('''1999'''). ''[http://link.springer.com/chapter/10.1007%2F3-540-48873-1_19 Evolving Logic Programs to Classify Chess-Endgame Positions]''. [http://link.springer.com/book/10.1007%2F3-540-48873-1 Simulated Evolution and Learning], [https://en.wikipedia.org/wiki/Canberra Canberra], Australia. [http://www.springer.com/series/1244 Lecture Notes in Artificial Intelligence], No. 1585, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer], [http://stancomb.co.uk/Papers/seal98.pdf pdf] » [[Endgame]]
-* [[Marco Wiering]] ('''1999'''). ''[https://scholar.google.com/citations?view_op=view_citation&hl=en&user=xVas0I8AAAAJ&pagesize=100&citation_for_view=xVas0I8AAAAJ:9yKSN-GCB0IC Explorations in Efficient Reinforcement Learning]''. Ph.D. thesis, [https://en.wikipedia.org/wiki/University_of_Amsterdam University of Amsterdam], advisors [[Mathematician#FGroen|Frans Groen]] and [[Jürgen Schmidhuber]]
+* [[Marco Wiering]] ('''1999'''). ''Explorations in Efficient Reinforcement Learning''. Ph.D. thesis, [https://en.wikipedia.org/wiki/University_of_Amsterdam University of Amsterdam], advisors [[Mathematician#FGroen|Frans Groen]] and [[Jürgen Schmidhuber]]
 * [[Mathematician#GEHinton|Geoffrey E. Hinton]], [[Terrence J. Sejnowski]] (eds.) ('''1999'''). ''[https://mitpress.mit.edu/books/unsupervised-learning Unsupervised Learning: Foundations of Neural Computation]''.  [https://en.wikipedia.org/wiki/MIT_Press MIT Press]
 ==2000 ...==
@@ Line 352: / Line 352: @@
 * [[Daniel Osman]], [[Jacek Mańdziuk]] ('''2004'''). ''Comparison of TDLeaf and TD learning in Game Playing Domain''. [http://www.informatik.uni-trier.de/~ley/db/conf/iconip/iconip2004.html#OsmanM04 11. ICONIP], [http://www.mini.pw.edu.pl/~mandziuk/PRACE/ICONIP04.pdf pdf]
 * [[Albert Xin Jiang]] ('''2004'''). ''Multiagent Reinforcement Learning in Stochastic Games with Continuous Action Spaces''. [http://www.cs.ubc.ca/%7Ejiang/papers/continuous.pdf pdf]
-* [[Henk Mannen]], [[Marco Wiering]] ('''2004'''). ''[http://scholar.google.com/citations?view_op=view_citation&hl=en&user=xVas0I8AAAAJ&cstart=20&pagesize=80&citation_for_view=xVas0I8AAAAJ:7PzlFSSx8tAC Learning to play chess using TD(λ)-learning with database games]''. [http://students.uu.nl/en/hum/cognitive-artificial-intelligence Cognitive Artificial Intelligence], [https://en.wikipedia.org/wiki/Utrecht_University Utrecht University], Benelearn’04
+* [[Henk Mannen]], [[Marco Wiering]] ('''2004'''). ''Learning to play chess using TD(λ)-learning with database games''. [http://students.uu.nl/en/hum/cognitive-artificial-intelligence Cognitive Artificial Intelligence], [https://en.wikipedia.org/wiki/Utrecht_University Utrecht University], Benelearn’04
 ==2005 ...==
 * [[Dave Gomboc]], [[Michael Buro]], [[Tony Marsland]] ('''2005'''). ''Tuning evaluation functions by maximizing concordance'' Theoretical Computer Science, Volume 349, Issue 2, pp. 202-229, [http://www.cs.ualberta.ca/%7Emburo/ps/tcs-learn.pdf pdf]
@@ Line 433: / Line 433: @@
 * [[Edward P. Manning]] ('''2010'''). ''[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=5409565 Using Resource-Limited Nash Memory to Improve an Othello Evaluation Function]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. 2, No. 1 » [[Othello]]
 * [[Edward P. Manning]] ('''2010'''). ''[http://dl.acm.org/citation.cfm?id=1830667 Coevolution in a Large Search Space using Resource-limited Nash Memory]''. [http://www.informatik.uni-trier.de/~ley/db/conf/gecco/gecco2010.html#Manning10 GECCO '10] » [[Othello]]
-* [[Marco Wiering]] ('''2010'''). ''[https://scholar.google.com/citations?view_op=view_citation&hl=en&user=xVas0I8AAAAJ&cstart=20&citation_for_view=xVas0I8AAAAJ:_kc_bZDykSQC Self-play and using an expert to learn to play backgammon with temporal difference learning]''. [http://www.scirp.org/journal/jilsa/ Journal of Intelligent Learning Systems and Applications], Vol. 2, No. 2
+* [[Marco Wiering]] ('''2010'''). ''Self-play and using an expert to learn to play backgammon with temporal difference learning''. [http://www.scirp.org/journal/jilsa/ Journal of Intelligent Learning Systems and Applications], Vol. 2, No. 2
 '''2011'''
 * [[Joel Veness]] ('''2011'''). ''Approximate Universal Artificial Intelligence and Self-Play Learning for Games''. Ph.D. thesis, [https://en.wikipedia.org/wiki/University_of_New_South_Wales University of New South Wales], supervisors: [[Kee Siong Ng]], [[Marcus Hutter]], [[Alan Blair]], [[William Uther]], [[John Lloyd]]; [http://jveness.info/publications/veness_phd_thesis_final.pdf pdf]
@@ Line 446: / Line 446: @@
 * [[Hamid Reza Maei]] ('''2011'''). ''Gradient Temporal-Difference Learning Algorithms''. Ph.D. thesis, [[University of Alberta]], advisor [[Richard Sutton]], [http://webdocs.cs.ualberta.ca/~sutton/papers/maei-thesis-2011.pdf pdf]
 '''2012'''
-* [[Marco Wiering]], [http://martijnvanotterlo.nl/ Martijn Van Otterlo] ('''2012'''). ''[https://scholar.google.com/citations?view_op=view_citation&hl=en&user=xVas0I8AAAAJ&citation_for_view=xVas0I8AAAAJ:abG-DnoFyZgC Reinforcement learning: State-of-the-art]''. [http://link.springer.com/book/10.1007/978-3-642-27645-3 Adaptation, Learning, and Optimization, Vol. 12], [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]
+* [[Marco Wiering]], [http://martijnvanotterlo.nl/ Martijn Van Otterlo] ('''2012'''). ''Reinforcement learning: State-of-the-art''. [http://link.springer.com/book/10.1007/978-3-642-27645-3 Adaptation, Learning, and Optimization, Vol. 12], [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]
 : [[István Szita]] ('''2012'''). ''[http://link.springer.com/chapter/10.1007%2F978-3-642-27645-3_17 Reinforcement Learning in Games]''. Chapter 17
 * [http://dblp.uni-trier.de/pers/hd/d/Dries:Sjoerd_van_den Sjoerd van den Dries], [[Marco Wiering]] ('''2012'''). ''[https://scholar.google.com/citations?view_op=view_citation&hl=en&user=xVas0I8AAAAJ&cstart=40&citation_for_view=xVas0I8AAAAJ:P5F9QuxV20EC Neural-fitted TD-leaf learning for playing Othello with structured neural networks]''. [[IEEE#NN|IEEE Transactions on Neural Networks and Learning Systems]], Vol. 23, No. 11
@@ Line 459: / Line 459: @@
 * [[Marcin Szubert]], [[Wojciech Jaśkowski]], [[Paweł Liskowski]], [[Krzysztof Krawiec]] ('''2013'''). ''Shaping Fitness Function for Evolutionary Learning of Game Strategies''. [http://www.informatik.uni-trier.de/~ley/db/conf/gecco/gecco2013.html#SzubertJLK13 GECCO 2013], [http://www.cs.put.poznan.pl/wjaskowski/pub/papers/szubert2013shaping.pdf pdf]
 * [[Marcin Szubert]], [[Wojciech Jaśkowski]], [[Krzysztof Krawiec]] ('''2013'''). ''[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6504736 On Scalability, Generalization, and Hybridization of Coevolutionary Learning: a Case Study for Othello]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. 5, No. 3 » [[Othello]]
-* [http://dblp.uni-trier.de/pers/hd/r/Ree:M=_van_der Michiel van der Ree], [[Marco Wiering]] ('''2013'''). ''[https://scholar.google.com/citations?view_op=view_citation&hl=en&user=xVas0I8AAAAJ&cstart=60&pagesize=80&citation_for_view=xVas0I8AAAAJ:K3LRdlH-MEoC Reinforcement Learning in the Game of Othello: Learning Against a Fixed Opponent and Learning from Self-Play]''. [http://dblp.uni-trier.de/db/conf/adprl/adprl2013.html#ReeW13 ADPRL 2013]
+* [http://dblp.uni-trier.de/pers/hd/r/Ree:M=_van_der Michiel van der Ree], [[Marco Wiering]] ('''2013'''). ''Reinforcement Learning in the Game of Othello: Learning Against a Fixed Opponent and Learning from Self-Play''. [http://dblp.uni-trier.de/db/conf/adprl/adprl2013.html#ReeW13 ADPRL 2013]
-* [http://dblp.uni-trier.de/pers/hd/b/Bom:Luuk Luuk Bom], [http://dblp.uni-trier.de/pers/hd/h/Henken:Ruud Ruud Henken], [[Marco Wiering]] ('''2013'''). ''[https://scholar.google.com/citations?view_op=view_citation&hl=en&user=xVas0I8AAAAJ&cstart=40&citation_for_view=xVas0I8AAAAJ:l7t_Zn2s7bgC Reinforcement Learning to Train Ms. Pac-Man Using Higher-order Action-relative Inputs]''. [http://dblp.uni-trier.de/db/conf/adprl/adprl2013.html#BomHW13 ADPRL 2013] <ref>[https://en.wikipedia.org/wiki/Ms._Pac-Man Ms. Pac-Man from Wikipedia]</ref>
+* [http://dblp.uni-trier.de/pers/hd/b/Bom:Luuk Luuk Bom], [http://dblp.uni-trier.de/pers/hd/h/Henken:Ruud Ruud Henken], [[Marco Wiering]] ('''2013'''). ''Reinforcement Learning to Train Ms. Pac-Man Using Higher-order Action-relative Inputs''. [http://dblp.uni-trier.de/db/conf/adprl/adprl2013.html#BomHW13 ADPRL 2013] <ref>[https://en.wikipedia.org/wiki/Ms._Pac-Man Ms. Pac-Man from Wikipedia]</ref>
 * [[Peter Auer]], [[Marcus Hutter]], [[Laurent Orseau]] ('''2013'''). ''[http://drops.dagstuhl.de/opus/volltexte/2013/4340/ Reinforcement Learning]''. [http://dblp.uni-trier.de/db/journals/dagstuhl-reports/dagstuhl-reports3.html#AuerHO13 Dagstuhl Reports, Vol. 3, No. 8], DOI: [http://drops.dagstuhl.de/opus/volltexte/2013/4340/ 10.4230/DagRep.3.8.1], URN: [http://drops.dagstuhl.de/opus/volltexte/2013/4340/ urn:nbn:de:0030-drops-43409]
 * [[Igor Roizen]], [[Judea Pearl]] ('''2013'''). ''Learning Link-Probabilities in Causal Trees.'' [https://arxiv.org/abs/1304.3103 arXiv:1304.3103]
@@ Line 498: / Line 498: @@
 '''2018'''
 * [[Arthur Guez]], [[Théophane Weber]], [[Ioannis Antonoglou]], [[Karen Simonyan]], [[Oriol Vinyals]], [[Daan Wierstra]], [[Rémi Munos]], [[David Silver]] ('''2018'''). ''Learning to Search with MCTSnets''. [https://arxiv.org/abs/1802.04697 arXiv:1802.04697] » [[Monte-Carlo Tree Search]]
+* [[Matthia Sabatelli]], [[Francesco Bidoia]], [[Valeriu Codreanu]], [[Marco Wiering]] ('''2018'''). ''Learning to Evaluate Chess Positions with Deep Neural Networks and Limited Lookahead''. ICPRAM 2018, [http://www.ai.rug.nl/~mwiering/GROUP/ARTICLES/ICPRAM_CHESS_DNN_2018.pdf pdf]
 =Forum Posts=
