Gerald Tesauro

Gerald Tesauro [1]

Gerald Tesauro is an American physicist, computer scientist and games researcher at the IBM Watson Research Center, and a pioneer in applying Neural Networks, Reinforcement Learning and Temporal Difference Learning to stochastic games, especially Backgammon. More recently, he has been a member of the team led by David Ferrucci that developed Watson, IBM's Jeopardy!-playing system [2]. He explained that Watson's wagers were based on its confidence level for the category and on a complex regression model called the Game State Evaluator [3]. Gerald Tesauro holds a Ph.D. in theoretical physics from Princeton University. He is section editor of the ICGA Journal.

Programs

Gerald Tesauro is the author of the Backgammon programs Neurogammon, which won the Gold medal at the 1st Computer Olympiad 1989 in London, and its successor TD-Gammon, improved through TD-Lambda based Temporal Difference Learning [4]. Along with Martin Müller, Broderick Arneson, Richard Segal, Markus Enzenberger and Arpad Rimmel (since 2010), he is co-author of the Go playing program Fuego [5], which won the Gold medal in 9x9 Go as well as the Silver medal in 19x19 Go at the 14th Computer Olympiad [6] [7].

Selected Publications

[8] [9]

1987 ...

1990 ...

2000 ...

External Links

References

  1. IBM Research: Watson’s wagering strategies by Gerald Tesauro, February 13, 2011
  2. IBM Press room - 2011-08-29 IBM and Jeopardy! Relive History with Encore Presentation of Jeopardy!: The IBM Challenge - United States
  3. IBM Research: Watson’s wagering strategies by Gerald Tesauro, February 13, 2011
  4. Richard Sutton, Andrew Barto (1998). Reinforcement Learning: An Introduction. MIT Press, 11.1 TD-Gammon
  5. Fuego from sourceforge
  6. Markus Enzenberger, Martin Müller (2009). Fuego - An Open-source Framework for Board Games and Go Engine Based on Monte-Carlo Tree Search. Technical Report
  7. Martin Müller (2009). Fuego at the Computer Olympiad in Pamplona 2009: A Tournament Report.
  8. ICGA Reference Database (pdf)
  9. DBLP: Gerald Tesauro
  10. Monte-Carlo Simulation Balancing - videolectures.net
