Richard Sutton

Richard Stuart Sutton is an American computer scientist and AI researcher. Since 2003, he has been a professor in the Department of Computing Science at the University of Alberta, where he is principal investigator of the Reinforcement Learning and Artificial Intelligence (RLAI) group. His research interests center on the learning problems facing a decision-maker interacting with its environment, which he sees as central to artificial intelligence. He is the author of the original paper on Temporal Difference Learning and, with Andrew Barto, of the textbook Reinforcement Learning: An Introduction. He is also interested in animal learning psychology, in connectionist networks, and generally in systems that continually improve their representations and models of the world.

=Selected Publications=

1978

 * Richard Sutton (1978). Single channel theory: A neuronal theory of learning. Brain Theory Newsletter 3, No. 3/4

1980 ...

 * Richard Sutton, Andrew Barto (1981). Toward a modern theory of adaptive networks: Expectation and prediction. Psychological Review, Vol. 88
 * Richard Sutton (1984). Temporal Credit Assignment in Reinforcement Learning. Ph.D. dissertation, University of Massachusetts
 * Richard Sutton (1988). Learning to Predict by the Methods of Temporal Differences. Machine Learning, Vol. 3, No. 1

1990 ...

 * Richard Sutton, Andrew Barto (1990). Time-Derivative Models of Pavlovian Reinforcement. In Michael Gabriel, John Moore (eds.) (1990). Learning and Computational Neuroscience: Foundations of Adaptive Networks. MIT Press
 * Doina Precup, Richard Sutton (1997). Multi-time Models for Temporally Abstract Planning. NIPS 1997
 * Richard Sutton, Andrew Barto (1998). Reinforcement Learning: An Introduction. MIT Press

2000 ...

 * Michael L. Littman, Richard Sutton, Satinder Singh (2001). Predictive Representations of State. NIPS 2001
 * Richard Sutton, Brian Tanner (2005). Temporal-Difference Networks. Advances in Neural Information Processing Systems 17
 * David Silver, Richard Sutton, Martin Müller (2007). Reinforcement learning of local shape in the game of Go. 20th IJCAI
 * Richard Sutton, Csaba Szepesvári, Hamid Reza Maei (2008). A Convergent O(n) Algorithm for Off-policy Temporal-difference Learning with Linear Function Approximation.
 * David Silver, Richard Sutton, Martin Müller (2008). Sample-Based Learning and Search with Permanent and Transient Memories. In Proceedings of the 25th International Conference on Machine Learning
 * Maria Cutumisu, Michael Bowling, Duane Szafron, Richard Sutton (2008). Agent Learning using Action-Dependent Learning Rates in Computer Role-Playing Games. Proceedings of the Fourth Artificial Intelligence and Interactive Digital Entertainment Conference
 * Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Doina Precup, David Silver, Richard Sutton (2009). Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation. NIPS 22, MIT Press
 * Richard Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, Eric Wiewiora (2009). Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation. ICML-09

2010 ...

 * Hamid Reza Maei, Richard Sutton (2010). GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces. In Proceedings of the Third Conference on Artificial General Intelligence
 * David Silver, Richard Sutton, Martin Müller (2013). Temporal-Difference Search in Computer Go. Proceedings of the ICAPS-13 Workshop on Planning and Learning
 * Huizhen Yu, A. Rupam Mahmood, Richard Sutton (2017). On Generalized Bellman Equations and Temporal-Difference Learning. Canadian Conference on AI 2017, arXiv:1704.04463

=External Links=
 * Rich Sutton's Home Page
 * Richard S. Sutton from Wikipedia
 * The Mathematics Genealogy Project - Richard Sutton
 * Reinforcement Learning: An Introduction ebook by Richard Sutton and Andrew Barto
 * Deconstructing Reinforcement Learning, videolecture by Richard Sutton, June 2009
 * Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation, videolecture by Richard Sutton, June 2009
 * DeepMind expands to Canada with new research office in Edmonton, Alberta by Demis Hassabis, DeepMind, July 5, 2017
