Changes

Temporal Difference Learning

780 bytes added, 17:03, 9 December 2019
no edit summary
* [[James Swafford]] ('''2002'''). ''Optimizing Parameter Learning using Temporal Differences''. [http://www.aaai.org/Conferences/AAAI/aaai02.php AAAI-02], Student Abstracts, [https://www.aaai.org/Papers/AAAI/2002/AAAI02-150.pdf pdf]
* [[Justin A. Boyan]] ('''2002'''). ''[https://link.springer.com/article/10.1023%2FA%3A1017936530646 Technical Update: Least-Squares Temporal Difference Learning]''. [https://en.wikipedia.org/wiki/Machine_Learning_(journal) Machine Learning], Vol. 49, [http://research.cs.rutgers.edu/~lihong/project/ahlp/boyan02least.pdf pdf]
* [[Don Beal]] ('''2002'''). ''[https://www.researchgate.net/publication/221556841_TD_mu_A_Modificaiton_of_TD_lambda_That_Enables_a_Program_to_Learn_Weights_for_Good_Play_Even_if_It_Observes_Only_Bad_Play TD(µ): A Modification of TD(λ) That Enables a Program to Learn Weights for Good Play Even if It Observes Only Bad Play]''. [https://dblp.org/db/conf/jcis/jcis2002 JCIS 2002]
'''2003'''
* [[Henk Mannen]] ('''2003'''). ''Learning to play chess using reinforcement learning with database games''. Master’s thesis, [http://students.uu.nl/en/hum/cognitive-artificial-intelligence Cognitive Artificial Intelligence], [https://en.wikipedia.org/wiki/Utrecht_University Utrecht University]
* [[James L. McClelland]] ('''2015'''). ''[https://web.stanford.edu/group/pdplab/pdphandbook/handbook3.html#handbookch10.html Explorations in Parallel Distributed Processing: A Handbook of Models, Programs, and Exercises]''. Second Edition, [https://web.stanford.edu/group/pdplab/pdphandbook/handbookli1.html Contents], [https://web.stanford.edu/group/pdplab/pdphandbook/handbookch10.html Temporal-Difference Learning]
* [[Matthew Lai]] ('''2015'''). ''Giraffe: Using Deep Reinforcement Learning to Play Chess''. M.Sc. thesis, [https://en.wikipedia.org/wiki/Imperial_College_London Imperial College London], [http://arxiv.org/abs/1509.01549v1 arXiv:1509.01549v1] » [[Giraffe]]
* [[Markus Thill]] ('''2015'''). ''Temporal Difference Learning Methods with Automatic Step-Size Adaption for Strategic Board Games: Connect-4 and Dots-and-Boxes''. Master’s thesis, [https://en.wikipedia.org/wiki/Technical_University_of_Cologne Technical University of Cologne], Campus Gummersbach, [http://www.gm.fh-koeln.de/~konen/research/PaperPDF/MT-Thill2015-final.pdf pdf]
'''2016'''
* [[Kazuto Oka]], [[Kiminori Matsuzaki]] ('''2016'''). ''Systematic Selection of N-tuple Networks for 2048''. [[CG 2016]]
'''2017'''
* [[Huizhen Yu]], [[A. Rupam Mahmood]], [[Richard Sutton]] ('''2017'''). ''On Generalized Bellman Equations and Temporal-Difference Learning''. Canadian Conference on AI 2017, [https://arxiv.org/abs/1704.04463 arXiv:1704.04463]
* [[William Uther]] ('''2017'''). ''[https://link.springer.com/referenceworkentry/10.1007/978-1-4899-7687-1_817 Temporal Difference Learning]''. in [https://en.wikipedia.org/wiki/Claude_Sammut Claude Sammut], [https://en.wikipedia.org/wiki/Geoff_Webb Geoffrey I. Webb] (eds) ('''2017'''). ''[https://link.springer.com/referencework/10.1007%2F978-1-4899-7687-1 Encyclopedia of Machine Learning and Data Mining]''. [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]
