Changes

Richard Sutton

851 bytes removed, 08:55, 9 June 2018

no edit summary

<ref>[http://dblp.uni-trier.de/pers/hd/s/Sutton:Richard_S= dblp: Richard S. Sutton]</ref><ref>[http://ilk.uvt.nl/icga/journal/docs/References.pdf ICGA Reference Database] (pdf)</ref>

==1978==

* [[Richard Sutton]] ('''1978'''). ''Single channel theory: A neuronal theory of learning''. Brain Theory Newsletter 3, No. 3/4~~, pp. 72-75. [http://www.cs.ualberta.ca/%7Esutton/papers/sutton-78-BTN.pdf pdf]~~

==1980 ...==

* [[Richard Sutton]], [[Andrew Barto]] ('''1981'''). ''Toward a modern theory of adaptive networks: Expectation and prediction''. Psychological Review, Vol. 88~~, pp. 135-170. [http://www.cs.ualberta.ca/%7Esutton/papers/sutton-barto-81-PsychRev.pdf pdf]~~

* [[Richard Sutton]] ('''1984'''). ''[http://scholarworks.umass.edu/dissertations/AAI8410337/ Temporal Credit Assignment in Reinforcement Learning]''. Ph.D. dissertation, [https://en.wikipedia.org/wiki/University_of_Massachusetts University of Massachusetts]

* [[Richard Sutton]] ('''1988'''). ''Learning to Predict by the Methods of Temporal Differences''. [https://en.wikipedia.org/wiki/Machine_Learning_%28journal%29 Machine Learning], Vol. 3, No. 1~~, [https://webdocs.cs.ualberta.ca/~sutton/papers/sutton-88-with-erratum.pdf pdf]~~

==1990 ...==

* [[Richard Sutton]], [[Andrew Barto]] ('''1990'''). ''Time-Derivative Models of Pavlovian Reinforcement''. in [http://node.realityspline.net/ari/work/neuro/people/showpeople.php?person=faculty/mgabriel.php Michael Gabriel], [http://people.umass.edu/jwmoore/people.htm#JWMoore John Moore] (eds.) ('''1990'''). ''Learning and Computational Neuroscience: Foundations of Adaptive Networks''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press~~], [https://webdocs.cs.ualberta.ca/~sutton/papers/sutton-barto-90.pdf pdf~~]

* [[Doina Precup]], [[Richard Sutton]] ('''1997'''). ''Multi-time Models for Temporally Abstract Planning''. [http://dblp.uni-trier.de/db/conf/nips/nips1997.html#PrecupS97 NIPS 1997]

* [[Richard Sutton]], [[Andrew Barto]] ('''1998'''). ''[http://incompleteideas.net/book/the-book.html Reinforcement Learning: An Introduction]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press]

==2000 ...==

* [[Michael L. Littman]], [[Richard Sutton]], [[Mathematician#SSingh|Satinder Singh]] ('''2001'''). ''Predictive Representations of State''. [http://dblp.uni-trier.de/db/conf/nips/nips2001.html#LittmanSS01 NIPS 2001], [http://web.eecs.umich.edu/~baveja/Papers/psr.pdf pdf]

* [[Richard Sutton]], [http://dblp.uni-trier.de/pers/hd/t/Tanner:Brian Brian Tanner] ('''2005'''). ''Temporal-Difference Networks''. Advances in Neural Information Processing Systems 17~~, pages 1377-1384. [http://www.cs.ualberta.ca/%7Esutton/papers/sutton-tanner-04.pdf pdf]~~

* [[David Silver]], [[Richard Sutton]], [[Martin Müller]] ('''2007'''). ''Reinforcement learning of local shape in the game of [[Go]]''. ~~In Twentieth International Joint Conference on Artificial Intelligence (~~[[Conferences#IJCAI2007|20th IJCAI~~), pages 1053-1058, [https://en.wikipedia.org/wiki/Hyderabad,_India Hyderabad~~]], ~~India].~~ [http://webdocs.cs.ualberta.ca/~~%7Emmueller~~~mmueller/ps/silver-ijcai2007.pdf pdf]

* [[Richard Sutton]], [[Csaba Szepesvári]], [[Hamid Reza Maei]] ('''2008'''). ''A Convergent O(n) Algorithm for Off-policy Temporal-difference Learning with Linear Function Approximation''~~, available as [http://www.sztaki~~.~~hu/%7Eszcsaba/papers/gtdnips08.pdf pdf] (draft)~~

* [[David Silver]], [[Richard Sutton]], [[Martin Müller]] ('''2008'''). ''Sample-Based Learning and Search with Permanent and Transient Memories''. In Proceedings of the 25th International Conference on Machine Learning, [http://icml2008.cs.helsinki.fi/papers/564.pdf pdf]

* [[Maria Cutumisu]], [[Michael Bowling]], [[Duane Szafron]], [[Richard Sutton]] ('''2008'''). ''Agent Learning using Action-Dependent Learning Rates in Computer Role-Playing Games''. [https://www.aaai.org/Library/AIIDE/aiide08contents.php Proceedings of the Fourth Artificial Intelligence and Interactive Digital Entertainment Conference], [https://webdocs.cs.ualberta.ca/~duane/publications/pdf/2008aiide.pdf pdf]

* [[Hamid Reza Maei]], [[Csaba Szepesvári]], [[Shalabh Bhatnagar]], [[Doina Precup]], [[David Silver]], [[Richard Sutton]] ('''2009'''). ''Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation.'' ~~Accepted in Advances in Neural Information Processing Systems~~ NIPS 22, ~~Vancouver, BC. December 2009.~~ MIT Press~~. [http://books.nips.cc/papers/files/nips22/NIPS2009_1121.pdf pdf]~~

* [[Richard Sutton]], [[Hamid Reza Maei]], [[Doina Precup]], [[Shalabh Bhatnagar]], [[David Silver]], [[Csaba Szepesvári]], [[Eric Wiewiora]]. ('''2009'''). ''Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation''. ~~In Proceedings of the 26th International Conference on Machine Learning (~~ICML-09). , [http://www.sztaki.hu/~szcsaba/papers/GTD-ICML09.pdf pdf]

==2010==

* [[Hamid Reza Maei]], [[Richard Sutton]] ('''2010'''). ''[http://www.incompleteideas.net/sutton/publications.html#GQ GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces]''. In Proceedings of the Third Conference on Artificial General Intelligence

GerdIsenberg

