Changes

Reinforcement Learning

233 bytes added, 13:21, 25 August 2018
==1990 ...==
* [[Richard Sutton]], [[Andrew Barto]] ('''1990'''). ''Time-Derivative Models of Pavlovian Reinforcement''. Learning and Computational Neuroscience: Foundations of Adaptive Networks, pp. 497-537
* [[Jürgen Schmidhuber]] ('''1990'''). ''Reinforcement Learning in Markovian and Non-Markovian Environments''. [https://dblp.uni-trier.de/db/conf/nips/nips1990.html NIPS 1990], [ftp://ftp.idsia.ch/pub/juergen/nipsnonmarkov.pdf pdf]
* [[Chris Watkins]], [[Peter Dayan]] ('''1992'''). ''[http://www.gatsby.ucl.ac.uk/~dayan/papers/wd92.html Q-learning]''. [https://en.wikipedia.org/wiki/Machine_Learning_(journal) Machine Learning], Vol. 8, Nos. 3-4
* [[Gerald Tesauro]] ('''1992'''). ''Temporal Difference Learning of Backgammon Strategy''. [http://www.informatik.uni-trier.de/~ley/db/conf/icml/ml1992.html#Tesauro92 ML 1992]
