Latest revision as of 14:52, 12 April 2021


Richard Sutton ^[1]

Richard Stuart Sutton is an American computer scientist and AI researcher. Since 2003, he has been a professor in the Department of Computing Science ^[2] at the University of Alberta, where he is principal investigator of the Reinforcement Learning and Artificial Intelligence (RLAI) ^[3] group. His research interests center on the learning problems facing a decision-maker interacting with its environment, which he sees as central to artificial intelligence. He is the author of the original paper on Temporal Difference Learning ^[4] and, with Andrew Barto, of the textbook Reinforcement Learning: An Introduction ^[5]. He is also interested in animal learning psychology, in connectionist networks, and generally in systems that continually improve their representations and models of the world ^[6].

Selected Publications

^[7]^[8]

1978

Richard Sutton (1978). Single channel theory: A neuronal theory of learning. Brain Theory Newsletter 3, No. 3/4

1980 ...

Richard Sutton, Andrew Barto (1981). Toward a modern theory of adaptive networks: Expectation and prediction. Psychological Review, Vol. 88
Richard Sutton (1984). Temporal Credit Assignment in Reinforcement Learning. Ph.D. dissertation, University of Massachusetts
Richard Sutton (1988). Learning to Predict by the Methods of Temporal Differences. Machine Learning, Vol. 3, No. 1

1990 ...

Richard Sutton, Andrew Barto (1990). Time-Derivative Models of Pavlovian Reinforcement. In Michael Gabriel, John Moore (eds.) (1990). Learning and Computational Neuroscience: Foundations of Adaptive Networks. MIT Press
Doina Precup, Richard Sutton (1997). Multi-time Models for Temporally Abstract Planning. NIPS 1997, pdf
Richard Sutton, Andrew Barto (1998). Reinforcement Learning: An Introduction. MIT Press
Richard Sutton, Doina Precup, Satinder Singh (1999). Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, Vol. 112, pdf

2000 ...

Michael L. Littman, Richard Sutton, Satinder Singh (2001). Predictive Representations of State. NIPS 2001, pdf
Richard Sutton, Brian Tanner (2005). Temporal-Difference Networks. Advances in Neural Information Processing Systems 17
David Silver, Richard Sutton, Martin Müller (2007). Reinforcement learning of local shape in the game of Go. 20th IJCAI, pdf
Richard Sutton, Csaba Szepesvári, Hamid Reza Maei (2008). A Convergent O(n) Algorithm for Off-policy Temporal-difference Learning with Linear Function Approximation. NIPS 2008, pdf
David Silver, Richard Sutton, Martin Müller (2008). Sample-Based Learning and Search with Permanent and Transient Memories. In Proceedings of the 25th International Conference on Machine Learning, pdf
Maria Cutumisu, Michael Bowling, Duane Szafron, Richard Sutton (2008). Agent Learning using Action-Dependent Learning Rates in Computer Role-Playing Games. Proceedings of the Fourth Artificial Intelligence and Interactive Digital Entertainment Conference, pdf
Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Doina Precup, David Silver, Richard Sutton (2009). Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation. NIPS 2009, pdf
Richard Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, Eric Wiewiora (2009). Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation. ICML 2009

2010 ...

Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Richard Sutton (2010). Toward Off-Policy Learning Control with Function Approximation. ICML 2010, pdf
Hamid Reza Maei, Richard Sutton (2010). GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces. AGI 2010
David Silver, Richard Sutton, Martin Mueller (2013). Temporal-Difference Search in Computer Go. Proceedings of the ICAPS-13 Workshop on Planning and Learning, pdf
Huizhen Yu, A. Rupam Mahmood, Richard Sutton (2017). On Generalized Bellman Equations and Temporal-Difference Learning. Canadian Conference on AI 2017, arXiv:1704.04463

External Links

Rich Sutton's Home Page
Richard S. Sutton from Wikipedia
The Mathematics Genealogy Project - Richard Sutton
Reinforcement Learning: An Introduction ebook by Richard Sutton and Andrew Barto
Deconstructing Reinforcement Learning, videolecture by Richard Sutton, June 2009
Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation, videolecture by Richard Sutton, June 2009
DeepMind expands to Canada with new research office in Edmonton, Alberta by Demis Hassabis, DeepMind, July 5, 2017
Standing on the shoulders of giants by Albert Silver, ChessBase News, September 18, 2019

References

[1] Richard Sutton, October 27, 2016. Image source: Deep Thinkers on Deep Learning. Author: Jurvetson, Menlo Park, USA
[2] Home | Department of Computing Science
[3] Reinforcement Learning and Artificial Intelligence (RLAI)
[4] Richard Sutton (1988). Learning to Predict by the Methods of Temporal Differences. Machine Learning, Vol. 3, No. 1
[5] Reinforcement Learning: An Introduction ebook by Richard Sutton and Andrew Barto
[6] Brief Biography for Richard Sutton
[7] dblp: Richard S. Sutton
[8] ICGA Reference Database

