Changes

Jump to: navigation, search

Temporal Difference Learning

418 bytes removed, 15:17, 25 August 2018
no edit summary
* [[Henk Mannen]] ('''2003'''). ''Learning to play chess using reinforcement learning with database games''. Master’s thesis, [http://students.uu.nl/en/hum/cognitive-artificial-intelligence Cognitive Artificial Intelligence], [https://en.wikipedia.org/wiki/Utrecht_University Utrecht University]
'''2004'''
* [[Henk Mannen]], [[Marco Wiering]] ('''2004'''). ''[http://scholar.google.com/citations?view_op=view_citation&hl=en&user=xVas0I8AAAAJ&cstart=20&pagesize=80&citation_for_view=xVas0I8AAAAJ:7PzlFSSx8tAC Learning to play chess using TD(λ)-learning with database games]''. [http://students.uu.nl/en/hum/cognitive-artificial-intelligence Cognitive Artificial Intelligence], [https://en.wikipedia.org/wiki/Utrecht_University Utrecht University], Benelearn’04
* [[Marco Block-Berlitz|Marco Block]] ('''2004'''). ''Verwendung von Temporale-Differenz-Methoden im Schachmotor FUSc#''. Diplomarbeit, Betreuer: [[Raúl Rojas]], [[Free University of Berlin]], [http://page.mi.fu-berlin.de/block/Skripte/diplomarbeit.pdf pdf] (German)
* [[Jacek Mańdziuk]], [[Daniel Osman]] ('''2004'''). ''Temporal Difference Approach to Playing Give-Away Checkers''. [http://www.informatik.uni-trier.de/~ley/db/conf/icaisc/icaisc2004.html#MandziukO04 ICAISC 2004], [http://www.mini.pw.edu.pl/~mandziuk/PRACE/ICAISC04-3.pdf pdf]
* [http://www.cs.cmu.edu/~zkolter/ J. Zico Kolter], [[Andrew Ng]] ('''2009'''). ''Regularization and Feature Selection in Least-Squares Temporal Difference Learning''. [http://www.machinelearning.org/archive/icml2009/ ICML 2009], [http://www.cs.cmu.edu/~zkolter/pubs/kolter-icml09b-full.pdf pdf]
==2010 ...==
* [[Marco Wiering]] ('''2010'''). ''[https://scholar.google.com/citations?view_op=view_citation&hl=en&user=xVas0I8AAAAJ&cstart=20&citation_for_view=xVas0I8AAAAJ:_kc_bZDykSQC Self-play and using an expert to learn to play backgammon with temporal difference learning]''. [http://www.scirp.org/journal/jilsa/ Journal of Intelligent Learning Systems and Applications], Vol. 2, No. 2
* [[Hamid Reza Maei]], [[Richard Sutton]] ('''2010'''). ''[http://www.incompleteideas.net/sutton/publications.html#GQ GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces]''. In Proceedings of the Third Conference on Artificial General Intelligence
'''2011'''
* [[Marcin Szubert]], [[Wojciech Jaśkowski]], [[Krzysztof Krawiec]] ('''2011'''). ''Learning Board Evaluation Function for Othello by Hybridizing Coevolution with Temporal Difference Learning''. [http://control.ibspan.waw.pl:3000/mainpage Control and Cybernetics], Vol. 40, No. 3,[http://www.cs.put.poznan.pl/wjaskowski/pub/papers/szubert2011learning.pdf pdf]
'''2012'''
* [[István Szita]] ('''2012'''). ''[http://link.springer.com/chapter/10.1007%2F978-3-642-27645-3_17 Reinforcement Learning in Games]''. in [[Marco Wiering]], [http://martijnvanotterlo.nl/ Martijn Van Otterlo] (eds.). ''[https://scholar.google.com/citations?view_op=view_citation&hl=en&user=xVas0I8AAAAJ&citation_for_view=xVas0I8AAAAJ:abG-DnoFyZgC Reinforcement learning: State-of-the-art]''. [http://link.springer.com/book/10.1007/978-3-642-27645-3 Adaptation, Learning, and Optimization, Vol. 12], [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]
'''2013'''
* [[David Silver]], [[Richard Sutton]], [[Martin Müller|Martin Mueller]] ('''2013'''). ''Temporal-Difference Search in Computer Go''. Proceedings of the [http://icaps13.icaps-conference.org/technical-program/workshop-program/planning-and-learning/ ICAPS-13 Workshop on Planning and Learning], [http://webdocs.cs.ualberta.ca/~sutton/papers/SSM-ICAPS-13.pdf pdf]

Navigation menu