Gerald Tesauro

From Chessprogramming wiki

Revision as of 21:57, 3 November 2020 by GerdIsenberg (talk | contribs)

Gerald Tesauro ^[1]

Gerald Tesauro,
an American physicist, computer scientist and games researcher at the IBM Watson Research Center, and a pioneer in applying Neural Networks, Reinforcement Learning and Temporal Difference Learning to stochastic games, especially the game of Backgammon. More recently, he has been a member of the team led by David Ferrucci that developed Watson ^[2]. He explained that Watson's wagers in Jeopardy! were based on its confidence level for the category and on a complex regression model called the Game State Evaluator ^[3]. Gerald Tesauro holds a Ph.D. in theoretical physics from Princeton University. He is a section editor of the ICGA Journal.

Programs

Gerald Tesauro is the author of the Backgammon programs Neurogammon, which won the Gold medal at the 1st Computer Olympiad 1989 in London, and TD-Gammon, which learns through TD-Lambda based Temporal Difference Learning ^[4]. Along with Martin Müller, Broderick Arneson, Richard Segal, Markus Enzenberger and Arpad Rimmel (since 2010), he is a co-author of the Go playing program Fuego ^[5], which won the Gold medal in 9x9 Go as well as the Silver medal in 19x19 Go at the 14th Computer Olympiad ^[6] ^[7].

Selected Publications

1987 ...

Gerald Tesauro, Terrence J. Sejnowski (1987). A 'Neural' Network that Learns to Play Backgammon. NIPS 1987
Gerald Tesauro (1988). Connectionist Learning of Expert Backgammon Evaluations. ML, 1988
Gerald Tesauro (1988). Neural network defeats creator in backgammon match. Technical report no. CCSR-88-6, Center for Complex Systems Research, University of Illinois at Urbana-Champaign
Gerald Tesauro (1989). Connectionist learning of expert preferences by comparison training. Advances in Neural Information Processing Systems, Morgan Kaufmann.
Gerald Tesauro (1989). NEUROGAMMON: A Neural-Network Backgammon Learning Program. Heuristic Programming in Artificial Intelligence 1
Gerald Tesauro (1989). Neurogammon Wins Computer Olympiad. Neural Computation Vol. 1, No. 3
Gerald Tesauro, Terrence J. Sejnowski (1989). A Parallel Network that Learns to Play Backgammon. Artificial Intelligence, Vol. 39, No. 3

1990 ...

Gerald Tesauro (1992). Temporal Difference Learning of Backgammon Strategy. ML 1992
Gerald Tesauro (1992). Practical Issues in Temporal Difference Learning. Machine Learning, Vol. 8, No. 3-4
Gerald Tesauro (1994). TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play. Neural Computation Vol. 6, No. 2
Gerald Tesauro (1995). Temporal Difference Learning and TD-Gammon. Communications of the ACM, Vol. 38, No. 3

2000 ...

Gerald Tesauro (2001). Comparison Training of Chess Evaluation Functions. In Johannes Fürnkranz, Miroslav Kubat (eds.) (2001). Machines that learn to play games, 117–130, Nova Science Publishers » Automated Tuning, SCP, Deep Blue
Gerald Tesauro (2002). Programming backgammon using self-teaching neural nets. Artificial Intelligence Vol. 134 No. 1-2
Gerald Tesauro (2007). Reinforcement Learning in Autonomic Computing: A Manifesto and Case Studies. IEEE Internet Computing Vol. 11
David Silver, Gerald Tesauro (2009). Monte-Carlo Simulation Balancing. ICML 2009 ^[10]

External Links

Gerald Tesauro's ICGA Tournaments
Gerald Tesauro - IBM
Gerald Tesauro - IBM Watson Research Center - videolectures.net
Neural Network learns Backgammon by Kimon Tsinteris and David Wilson
Standing on the shoulders of giants by Albert Silver, ChessBase News, September 18, 2019

References

↑ IBM Research: Watson’s wagering strategies by Gerald Tesauro, February 13, 2011
↑ IBM Press room - 2011-08-29 IBM and Jeopardy! Relive History with Encore Presentation of Jeopardy!: The IBM Challenge - United States
↑ IBM Research: Watson’s wagering strategies by Gerald Tesauro, February 13, 2011
↑ Richard Sutton, Andrew Barto (1998). Reinforcement Learning: An Introduction. MIT Press, 11.1 TD-Gammon
↑ Fuego from sourceforge
↑ Markus Enzenberger, Martin Müller (2009). Fuego - An Open-source Framework for Board Games and Go Engine Based on Monte-Carlo Tree Search. Technical Report
↑ Martin Müller (2009). Fuego at the Computer Olympiad in Pamplona 2009: A Tournament Report.
↑ ICGA Reference Database (pdf)
↑ DBLP: Gerald Tesauro
↑ Monte-Carlo Simulation Balancing - videolectures.net
