Changes

Temporal Difference Learning

183 bytes added, 13:26, 23 June 2018
* [[Don Beal]], [[Martin C. Smith]] ('''2001'''). ''[http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6V1G-41MJ1SV-7&_user=10&_coverDate=02%2F06%2F2001&_rdoc=1&_fmt=high&_orig=search&_sort=d&_docanchor=&view=c&_searchStrId=1436661548&_rerunOrigin=google&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=d855cbad10953476dbb92258347c8e94 Temporal difference learning applied to game playing and the results of application to Shogi]''. [https://en.wikipedia.org/wiki/Theoretical_Computer_Science_%28journal%29 Theoretical Computer Science], Vol. 252, Nos. 1-2
* [[Nicol N. Schraudolph]], [[Peter Dayan]], [[Terrence J. Sejnowski]] ('''2001'''). ''[http://nic.schraudolph.org/bib2html/b2hd-SchDaySej01.html Learning to Evaluate Go Positions via Temporal Difference Methods]''. in [[Norio Baba]], [[Lakhmi C. Jain]] (eds.) ('''2001'''). ''[http://jasss.soc.surrey.ac.uk/7/1/reviews/takama.html Computational Intelligence in Games, Studies in Fuzziness and Soft Computing]''. [http://www.springer.com/economics?SGWID=1-165-6-73481-0 Physica-Verlag]
* [[Lex Weaver]], [[Jonathan Baxter]] ('''2001'''). ''STD (λ): learning state differences with TD (λ)''. [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.20.7737 CiteSeerX]
'''2002'''
* [[Ari Shapiro]], [[Gil Fuchs]], [[Robert Levinson]] ('''2002'''). ''[http://www.arishapiro.com/researchportfolio/Learning%20Game%20Strategy/index.htm Learning a Game Strategy Using Pattern-Weights and Self-play]''. [[CG 2002]], [http://www.arishapiro.com//ShapiroA_CG2002.pdf pdf]
