Revision as of 08:55, 9 June 2018


Richard Sutton ^[1]

Richard Stuart Sutton is an American computer scientist and AI researcher. Since 2003, he has been a professor in the Department of Computing Science ^[2] at the University of Alberta, where he is principal investigator of the Reinforcement Learning and Artificial Intelligence (RLAI) ^[3] group. His research interests center on the learning problems facing a decision-maker interacting with its environment, which he sees as central to artificial intelligence. He is the author of the original paper on Temporal Difference Learning ^[4] and, with Andrew Barto, of the textbook Reinforcement Learning: An Introduction ^[5]. He is also interested in animal learning psychology, in connectionist networks, and generally in systems that continually improve their representations and models of the world ^[6].

Selected Publications

^[7]^[8]

1978

Richard Sutton (1978). Single channel theory: A neuronal theory of learning. Brain Theory Newsletter 3, No. 3/4

1980 ...

Richard Sutton, Andrew Barto (1981). Toward a modern theory of adaptive networks: Expectation and prediction. Psychological Review, Vol. 88
Richard Sutton (1984). Temporal Credit Assignment in Reinforcement Learning. Ph.D. dissertation, University of Massachusetts
Richard Sutton (1988). Learning to Predict by the Methods of Temporal Differences. Machine Learning, Vol. 3, No. 1

1990 ...

Richard Sutton, Andrew Barto (1990). Time-Derivative Models of Pavlovian Reinforcement. In Michael Gabriel, John Moore (eds.) (1990). Learning and Computational Neuroscience: Foundations of Adaptive Networks. MIT Press
Doina Precup, Richard Sutton (1997). Multi-time Models for Temporally Abstract Planning. NIPS 1997
Richard Sutton, Andrew Barto (1998). Reinforcement Learning: An Introduction. MIT Press

2000 ...

Michael L. Littman, Richard Sutton, Satinder Singh (2001). Predictive Representations of State. NIPS 2001
Richard Sutton, Brian Tanner (2005). Temporal-Difference Networks. Advances in Neural Information Processing Systems 17
David Silver, Richard Sutton, Martin Müller (2007). Reinforcement learning of local shape in the game of Go. 20th IJCAI
Richard Sutton, Csaba Szepesvári, Hamid Reza Maei (2008). A Convergent O(n) Algorithm for Off-policy Temporal-difference Learning with Linear Function Approximation.
David Silver, Richard Sutton, Martin Müller (2008). Sample-Based Learning and Search with Permanent and Transient Memories. In Proceedings of the 25th International Conference on Machine Learning
Maria Cutumisu, Michael Bowling, Duane Szafron, Richard Sutton (2008). Agent Learning using Action-Dependent Learning Rates in Computer Role-Playing Games. Proceedings of the Fourth Artificial Intelligence and Interactive Digital Entertainment Conference
Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Doina Precup, David Silver, Richard Sutton (2009). Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation. NIPS 22, MIT Press
Richard Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, Eric Wiewiora (2009). Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation. ICML-09

2010 ...

Hamid Reza Maei, Richard Sutton (2010). GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces. In Proceedings of the Third Conference on Artificial General Intelligence
David Silver, Richard Sutton, Martin Müller (2013). Temporal-Difference Search in Computer Go. Proceedings of the ICAPS-13 Workshop on Planning and Learning
Huizhen Yu, A. Rupam Mahmood, Richard Sutton (2017). On Generalized Bellman Equations and Temporal-Difference Learning. Canadian Conference on AI 2017, arXiv:1704.04463

External Links

Rich Sutton's Home Page
Richard S. Sutton from Wikipedia
The Mathematics Genealogy Project - Richard Sutton
Reinforcement Learning: An Introduction ebook by Richard Sutton and Andrew Barto
Deconstructing Reinforcement Learning, videolecture by Richard Sutton, June 2009
Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation, videolecture by Richard Sutton, June 2009
DeepMind expands to Canada with new research office in Edmonton, Alberta by Demis Hassabis, DeepMind, July 5, 2017

References

[1] Richard Sutton, October 27, 2016, Image source Deep Thinkers on Deep Learning, Author Jurvetson, Menlo Park, USA
[2] Home | Department of Computing Science
[3] Reinforcement Learning and Artificial Intelligence (RLAI)
[4] Richard Sutton (1988). Learning to Predict by the Methods of Temporal Differences. Machine Learning, Vol. 3, No. 1
[5] Reinforcement Learning: An Introduction ebook by Richard Sutton and Andrew Barto
[6] Brief Biography for Richard Sutton
[7] dblp: Richard S. Sutton
[8] ICGA Reference Database (pdf)



