Richard Sutton
Latest revision as of 14:52, 12 April 2021
Richard Stuart Sutton is an American computer scientist and AI researcher. Since 2003, he has been a professor in the Department of Computing Science [2] at the University of Alberta and principal investigator of the Reinforcement Learning and Artificial Intelligence (RLAI) [3] group. His research interests center on the learning problems facing a decision-maker interacting with its environment, which he sees as central to artificial intelligence. He is the author of the original paper on Temporal Difference Learning [4] and, with Andrew Barto, of the textbook Reinforcement Learning: An Introduction [5]. He is also interested in animal learning psychology, in connectionist networks, and generally in systems that continually improve their representations and models of the world [6].
Selected Publications
1978
- Richard Sutton (1978). Single channel theory: A neuronal theory of learning. Brain Theory Newsletter 3, No. 3/4
1980 ...
- Richard Sutton, Andrew Barto (1981). Toward a modern theory of adaptive networks: Expectation and prediction. Psychological Review, Vol. 88
- Richard Sutton (1984). Temporal Credit Assignment in Reinforcement Learning. Ph.D. dissertation, University of Massachusetts
- Richard Sutton (1988). Learning to Predict by the Methods of Temporal Differences. Machine Learning, Vol. 3, No. 1
1990 ...
- Richard Sutton, Andrew Barto (1990). Time-Derivative Models of Pavlovian Reinforcement. in Michael Gabriel, John Moore (eds.) (1990). Learning and Computational Neuroscience: Foundations of Adaptive Networks. MIT Press
- Doina Precup, Richard Sutton (1997). Multi-time Models for Temporally Abstract Planning. NIPS 1997, pdf
- Richard Sutton, Andrew Barto (1998). Reinforcement Learning: An Introduction. MIT Press
- Richard Sutton, Doina Precup, Satinder Singh (1999). Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, Vol. 112, pdf
2000 ...
- Michael L. Littman, Richard Sutton, Satinder Singh (2001). Predictive Representations of State. NIPS 2001, pdf
- Richard Sutton, Brian Tanner (2005). Temporal-Difference Networks. Advances in Neural Information Processing Systems 17
- David Silver, Richard Sutton, Martin Müller (2007). Reinforcement learning of local shape in the game of Go. 20th IJCAI, pdf
- Richard Sutton, Csaba Szepesvári, Hamid Reza Maei (2008). A Convergent O(n) Algorithm for Off-policy Temporal-difference Learning with Linear Function Approximation. NIPS 2008, pdf
- David Silver, Richard Sutton, Martin Müller (2008). Sample-Based Learning and Search with Permanent and Transient Memories. In Proceedings of the 25th International Conference on Machine Learning, pdf
- Maria Cutumisu, Michael Bowling, Duane Szafron, Richard Sutton (2008). Agent Learning using Action-Dependent Learning Rates in Computer Role-Playing Games. Proceedings of the Fourth Artificial Intelligence and Interactive Digital Entertainment Conference, pdf
- Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Doina Precup, David Silver, Richard Sutton (2009). Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation. NIPS 2009, pdf
- Richard Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, Eric Wiewiora (2009). Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation. ICML 2009
2010 ...
- Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Richard Sutton (2010). Toward Off-Policy Learning Control with Function Approximation. ICML 2010, pdf
- Hamid Reza Maei, Richard Sutton (2010). GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces. AGI 2010
- David Silver, Richard Sutton, Martin Müller (2013). Temporal-Difference Search in Computer Go. Proceedings of the ICAPS-13 Workshop on Planning and Learning, pdf
- Huizhen Yu, A. Rupam Mahmood, Richard Sutton (2017). On Generalized Bellman Equations and Temporal-Difference Learning. Canadian Conference on AI 2017, arXiv:1704.04463
External Links
- Rich Sutton's Home Page
- Richard S. Sutton from Wikipedia
- The Mathematics Genealogy Project - Richard Sutton
- Reinforcement Learning: An Introduction ebook by Richard Sutton and Andrew Barto
- Deconstructing Reinforcement Learning, videolecture by Richard Sutton, June 2009
- Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation, videolecture by Richard Sutton, June 2009
- DeepMind expands to Canada with new research office in Edmonton, Alberta by Demis Hassabis, DeepMind, July 5, 2017
- Standing on the shoulders of giants by Albert Silver, ChessBase News, September 18, 2019
References
- [1] Richard Sutton, October 27, 2016, Image source Deep Thinkers on Deep Learning, Author Jurvetson, Menlo Park, USA
- [2] Home | Department of Computing Science
- [3] Reinforcement Learning and Artificial Intelligence (RLAI)
- [4] Richard Sutton (1988). Learning to Predict by the Methods of Temporal Differences. Machine Learning, Vol. 3, No. 1
- [5] Reinforcement Learning: An Introduction ebook by Richard Sutton and Andrew Barto
- [6] Brief Biography for Richard Sutton
- [7] dblp: Richard S. Sutton
- [8] ICGA Reference Database