Temporal Difference Learning

63 bytes added, 16:52, 11 June 2019
* [[Michael Gherrity]] ('''1993'''). ''A Game Learning Machine''. Ph.D. thesis, [https://de.wikipedia.org/wiki/University_of_California,_San_Diego University of California, San Diego], advisor [[Mathematician#PKube|Paul Kube]], [http://www.gherrity.org/thesis.pdf pdf], [http://www.top-5000.nl/ps/A%20game%20learning%20machine.pdf pdf]
* [[Peter Dayan]] ('''1993'''). ''Improving generalisation for temporal difference learning: The successor representation''. [https://en.wikipedia.org/wiki/Neural_Computation_(journal) Neural Computation], Vol. 5, [http://www.gatsby.ucl.ac.uk/~dayan/papers/sr93.pdf pdf]
* [[Nicol N. Schraudolph]], [[Peter Dayan]], [[Terrence J. Sejnowski]] ('''1993'''). ''[https://papers.nips.cc/paper/820-temporal-difference-learning-of-position-evaluation-in-the-game-of-go Temporal Difference Learning of Position Evaluation in the Game of Go]''. [https://papers.nips.cc/book/advances-in-neural-information-processing-systems-6-1993 NIPS 1993] <ref>[http://satirist.org/learn-game/systems/go-net.html Nici Schraudolph’s go networks], review by [[Jay Scott]]</ref>
* [[Peter Dayan]], [[Terrence J. Sejnowski]] ('''1994'''). ''TD(λ) converges with Probability 1''. [https://en.wikipedia.org/wiki/Machine_Learning_(journal) Machine Learning], Vol. 14, No. 1, [https://www.researchgate.net/profile/Terrence_Sejnowski/publication/228392650_TD_X_Converges_with_Probability/links/54a4afea0cf256bf8bb327a9.pdf?origin=publication_detail pdf]
==1995 ...==
* [[Jonathan Schaeffer]], [[Markian Hlynka]], [[Vili Jussila]] ('''2001'''). ''Temporal Difference Learning Applied to a High-Performance Game-Playing Program''. [http://www.informatik.uni-trier.de/~ley/db/conf/ijcai/ijcai2001.html#SchaefferHJ01 IJCAI 2001]
* [[Don Beal]], [[Martin C. Smith]] ('''2001'''). ''[http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6V1G-41MJ1SV-7&_user=10&_coverDate=02%2F06%2F2001&_rdoc=1&_fmt=high&_orig=search&_sort=d&_docanchor=&view=c&_searchStrId=1436661548&_rerunOrigin=google&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=d855cbad10953476dbb92258347c8e94 Temporal difference learning applied to game playing and the results of application to Shogi]''. [https://en.wikipedia.org/wiki/Theoretical_Computer_Science_%28journal%29 Theoretical Computer Science], Vol. 252, Nos. 1-2
* [[Nicol N. Schraudolph]], [[Peter Dayan]], [[Terrence J. Sejnowski]] ('''2001'''). ''[http://nic.schraudolph.org/bib2html/b2hd-SchDaySej01.html Learning to Evaluate Go Positions via Temporal Difference Methods]''. In [[Norio Baba]], [[Lakhmi C. Jain]] (eds.) ('''2001'''). ''[http://jasss.soc.surrey.ac.uk/7/1/reviews/takama.html Computational Intelligence in Games, Studies in Fuzziness and Soft Computing]''. [http://www.springer.com/economics?SGWID=1-165-6-73481-0 Physica-Verlag]
* [[Lex Weaver]], [[Jonathan Baxter]] ('''2001'''). ''STD (λ): learning state differences with TD (λ)''. [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.20.7737 CiteSeerX]
'''2002'''
