Gerald Tesauro




Gerald Tesauro is an American physicist, computer scientist and games researcher at the IBM Watson Research Center, and a pioneer in applying Neural Networks, Reinforcement Learning and Temporal Difference Learning to stochastic games, especially Backgammon. More recently, he was a member of the team around David Ferrucci that developed Watson, the Jeopardy!-playing question answering system. He explained that Watson's wagers were based on its confidence level for the category and a complex regression model called the Game State Evaluator. Gerald Tesauro holds a Ph.D. in theoretical physics from Princeton University. He is section editor of the ICGA Journal.

=Programs=

Gerald Tesauro is the author of the Backgammon programs Neurogammon, which won the Gold medal at the 1st Computer Olympiad 1989 in London, and TD-Gammon, which was improved by TD-Lambda based Temporal Difference Learning. He is also, along with Martin Müller, Broderick Arneson, Richard Segal, Markus Enzenberger and Arpad Rimmel (since 2010), co-author of the Go playing program Fuego, which won the Gold medal at the 14th Computer Olympiad in 9x9 Go, as well as the Silver medal in 19x19 Go.

=Selected Publications=

1987 ...

 * Gerald Tesauro, Terrence J. Sejnowski (1987). A 'Neural' Network that Learns to Play Backgammon. NIPS 1987
 * Gerald Tesauro (1988). Connectionist Learning of Expert Backgammon Evaluations. ML, 1988
 * Gerald Tesauro (1988). Neural network defeats creator in backgammon match. Technical report no. CCSR-88-6, Center for Complex Systems Research, University of Illinois at Urbana-Champaign
 * Gerald Tesauro (1989). Connectionist learning of expert preferences by comparison training. Advances in Neural Information Processing Systems, Morgan Kaufmann
 * Gerald Tesauro (1989). NEUROGAMMON: A Neural-Network Backgammon Learning Program. Heuristic Programming in Artificial Intelligence 1
 * Gerald Tesauro (1989). Neurogammon Wins Computer Olympiad. Neural Computation Vol. 1, No. 3
 * Gerald Tesauro, Terrence J. Sejnowski (1989). A Parallel Network that Learns to Play Backgammon. Artificial Intelligence, Vol. 39, No. 3

1990 ...

 * Gerald Tesauro (1992). Temporal Difference Learning of Backgammon Strategy. ML 1992
 * Gerald Tesauro (1992). Practical Issues in Temporal Difference Learning. Machine Learning, Vol. 8, No. 3-4
 * Gerald Tesauro (1994). TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play. Neural Computation Vol. 6, No. 2
 * Gerald Tesauro (1995). Temporal Difference Learning and TD-Gammon. Communications of the ACM, Vol. 38, No. 3

2000 ...

 * Gerald Tesauro (2001). Comparison Training of Chess Evaluation Functions. In Johannes Fürnkranz, Miroslav Kubat (eds.) (2001). Machines that learn to play games, 117–130, Nova Science Publishers » Automated Tuning, SCP, Deep Blue
 * Gerald Tesauro (2002). Programming backgammon using self-teaching neural nets. Artificial Intelligence Vol. 134 No. 1-2
 * Gerald Tesauro (2007). Reinforcement Learning in Autonomic Computing: A Manifesto and Case Studies. IEEE Internet Computing Vol. 11
 * David Silver, Gerald Tesauro (2009). Monte-Carlo Simulation Balancing. ICML 2009

=External Links=
 * Gerald Tesauro's ICGA Tournaments
 * Gerald Tesauro - IBM
 * Gerald Tesauro - IBM Watson Research Center - videolectures.net
 * Neural Network learns Backgammon by Kimon Tsinteris and David Wilson

