Changes

Temporal Difference Learning

829 bytes added, 22:52, 12 April 2021

no edit summary

* [[Marco Wiering]] ('''2010'''). ''Self-play and using an expert to learn to play backgammon with temporal difference learning''. [http://www.scirp.org/journal/jilsa/ Journal of Intelligent Learning Systems and Applications], Vol. 2, No. 2

* [[Hamid Reza Maei]], [[Richard Sutton]] ('''2010'''). ''[https://www.researchgate.net/publication/215990384_GQlambda_A_general_gradient_algorithm_for_temporal-difference_prediction_learning_with_eligibility_traces GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces]''. [https://agi-conf.org/2010/ AGI 2010]

~~'''2011'''~~

* [[Hamid Reza Maei]] ('''2011'''). ''[https://era.library.ualberta.ca/items/fd55edcb-ce47-4f84-84e2-be281d27b16a Gradient Temporal-Difference Learning Algorithms]''. Ph.D. thesis, [[University of Alberta]], advisor [[Richard Sutton]]

* [[Joel Veness]] ('''2011'''). ''Approximate Universal Artificial Intelligence and Self-Play Learning for Games''. Ph.D. thesis, [https://en.wikipedia.org/wiki/University_of_New_South_Wales University of New South Wales], supervisors: [[Kee Siong Ng]], [[Marcus Hutter]], [[Alan Blair]], [[William Uther]], [[John Lloyd]]; [http://jveness.info/publications/veness_phd_thesis_final.pdf pdf]

* [[Krzysztof Krawiec]], [[Wojciech Jaśkowski]], [[Marcin Szubert]] ('''2011'''). ''[http://www.degruyter.com/view/j/amcs.2011.21.issue-4/v10006-011-0057-3/v10006-011-0057-3.xml Evolving small-board Go players using Coevolutionary Temporal Difference Learning with Archives]''. [http://www.degruyter.com/view/j/amcs Applied Mathematics and Computer Science], Vol. 21, No. 4

* [[Marcin Szubert]], [[Wojciech Jaśkowski]], [[Krzysztof Krawiec]] ('''2011'''). ''Learning Board Evaluation Function for Othello by Hybridizing Coevolution with Temporal Difference Learning''. [http://control.ibspan.waw.pl:3000/mainpage Control and Cybernetics], Vol. 40, No. 3, [http://www.cs.put.poznan.pl/wjaskowski/pub/papers/szubert2011learning.pdf pdf]

~~'''2012'''~~

* [[István Szita]] ('''2012'''). ''[http://link.springer.com/chapter/10.1007%2F978-3-642-27645-3_17 Reinforcement Learning in Games]''. in [[Marco Wiering]], [http://martijnvanotterlo.nl/ Martijn Van Otterlo] (eds.). ''Reinforcement learning: State-of-the-art''. [http://link.springer.com/book/10.1007/978-3-642-27645-3 Adaptation, Learning, and Optimization, Vol. 12], [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]

~~'''2013'''~~

* [[David Silver]], [[Richard Sutton]], [[Martin Müller|Martin Mueller]] ('''2013'''). ''Temporal-Difference Search in Computer Go''. Proceedings of the [http://icaps13.icaps-conference.org/technical-program/workshop-program/planning-and-learning/ ICAPS-13 Workshop on Planning and Learning], [http://webdocs.cs.ualberta.ca/~sutton/papers/SSM-ICAPS-13.pdf pdf]

* [[Florian Kunz]] ('''2013'''). ''An Introduction to Temporal Difference Learning''. Seminar on Autonomous Learning Systems, [[Darmstadt University of Technology|TU Darmstadt]], [http://www.ausy.informatik.tu-darmstadt.de/uploads/Teaching/AutonomousLearningSystems/Kunz_ALS_2013.pdf pdf]

~~'''2014'''~~

* [[I-Chen Wu]], [[Kun-Hao Yeh]], [[Chao-Chin Liang]], [[Chia-Chuan Chang]], [[Han Chiang]] ('''2014'''). ''Multi-Stage Temporal Difference Learning for 2048''. [[TAAI 2014]]

* [[Wojciech Jaśkowski]], [[Marcin Szubert]], [[Paweł Liskowski]] ('''2014'''). ''Multi-Criteria Comparison of Coevolution and Temporal Difference Learning on Othello''. [http://www.evostar.org/2014/ EvoApplications 2014], [http://www.springer.com/computer/theoretical+computer+science/book/978-3-662-45522-7 Springer, volume 8602]

* [[Matthew Lai]] ('''2015'''). ''Giraffe: Using Deep Reinforcement Learning to Play Chess''. M.Sc. thesis, [https://en.wikipedia.org/wiki/Imperial_College_London Imperial College London], [http://arxiv.org/abs/1509.01549v1 arXiv:1509.01549v1] » [[Giraffe]]

* [[Markus Thill]] ('''2015'''). ''Temporal Difference Learning Methods with Automatic Step-Size Adaption for Strategic Board Games: Connect-4 and Dots-and-Boxes''. Master thesis, [https://en.wikipedia.org/wiki/Technical_University_of_Cologne Technical University of Cologne], Campus Gummersbach, [http://www.gm.fh-koeln.de/~konen/research/PaperPDF/MT-Thill2015-final.pdf pdf]

~~'''2016'''~~

* [[Kazuto Oka]], [[Kiminori Matsuzaki]] ('''2016'''). ''Systematic Selection of N-tuple Networks for 2048''. [[CG 2016]]

* [[Huizhen Yu]], [[A. Rupam Mahmood]], [[Richard Sutton]] ('''2017'''). ''On Generalized Bellman Equations and Temporal-Difference Learning''. Canadian Conference on AI 2017, [https://arxiv.org/abs/1704.04463 arXiv:1704.04463]

~~'''2017'''~~

* [[William Uther]] ('''2017'''). ''[https://link.springer.com/referenceworkentry/10.1007/978-1-4899-7687-1_817 Temporal Difference Learning]''. in [https://en.wikipedia.org/wiki/Claude_Sammut Claude Sammut], [https://en.wikipedia.org/wiki/Geoff_Webb Geoffrey I. Webb] (eds.) ('''2017'''). ''[https://link.springer.com/referencework/10.1007%2F978-1-4899-7687-1 Encyclopedia of Machine Learning and Data Mining]''. [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]

==2020 ...==

* [https://scholar.google.ca/citations?user=yVtSOt8AAAAJ&hl=en Emmanuel Bengio], [[Joelle Pineau]], [[Doina Precup]] ('''2020'''). ''Interference and Generalization in Temporal Difference Learning''. [https://arxiv.org/abs/2003.06350 arXiv:2003.06350]

* [https://scholar.google.ca/citations?user=4C5wrXIAAAAJ&hl=en Joshua Romoff], [https://scholar.google.com/citations?user=dy_JBs0AAAAJ&hl=en Peter Henderson], [https://scholar.google.com/citations?user=HUmLDxcAAAAJ&hl=en David Kanaa], [https://scholar.google.ca/citations?user=yVtSOt8AAAAJ&hl=en Emmanuel Bengio], [https://scholar.google.com/citations?user=D4LT5xAAAAAJ&hl=en Ahmed Touati], [https://scholar.google.ca/citations?user=9H77FYYAAAAJ&hl=en Pierre-Luc Bacon], [[Joelle Pineau]] ('''2020'''). ''TDprop: Does Jacobi Preconditioning Help Temporal Difference Learning?'' [https://arxiv.org/abs/2007.02786 arXiv:2007.02786]

=Forum Posts=

GerdIsenberg
