Changes


Temporal Difference Learning

76 bytes removed, 10:52, 17 July 2020
* [[Thomas Philip Runarsson]], [[Simon Lucas]] ('''2005'''). ''Coevolution versus self-play temporal difference learning for acquiring position evaluation in small-board go''. [[IEEE#EC|IEEE Transactions on Evolutionary Computation]], Vol. 9, No. 6
'''2006'''
* [[Simon Lucas]], [[Thomas Philip Runarsson]] ('''2006'''). ''[http://scholar.google.is/citations?view_op=view_citation&hl=en&user=4eWdc_sAAAAJ&citation_for_view=4eWdc_sAAAAJ:qjMakFHDy7sC Temporal Difference Learning versus Co-Evolution for Acquiring Othello Position Evaluation]''. [[IEEE#CIG|CIG 2006]]
'''2007'''
* [[Edward P. Manning]] ('''2007'''). ''[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=4219046 Temporal Difference Learning of an Othello Evaluation Function for a Small Neural Network with Shared Weights]''. [[IEEE#CIG|IEEE Symposium on Computational Intelligence and AI in Games]]
* [[Albrecht Fiebiger]] ('''2008'''). ''Einsatz von allgemeinen Evaluierungsheuristiken in Verbindung mit der Reinforcement-Learning-Strategie in der Schachprogrammierung''. [https://de.wikipedia.org/wiki/Besondere_Lernleistung Besondere Lernleistung] im [https://de.wikipedia.org/wiki/Fachgebiet Fachbereich] [https://de.wikipedia.org/wiki/Informatik Informatik], [https://en.wikipedia.org/wiki/Federal_School_of_Saxony%E2%80%93Saint_Afra Sächsisches Landesgymnasium Sankt Afra], Internal advisor: Ralf Böttcher, External advisors: [[Stefan Meyer-Kahlen]], [[Marco Block-Berlitz|Marco Block]], [http://page.mi.fu-berlin.de/block/abschlussarbeiten/Fiebiger_BeLL.pdf pdf] (German)
'''2009'''
* [[Hamid Reza Maei]], [[Csaba Szepesvári]], [[Shalabh Bhatnagar]], [[Doina Precup]], [[David Silver]], [[Richard Sutton]] ('''2009'''). ''Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation''. In Advances in Neural Information Processing Systems 22 (NIPS 2009), Vancouver, BC, December 2009. MIT Press, [http://books.nips.cc/papers/files/nips22/NIPS2009_1121.pdf pdf]
* [[Joel Veness]], [[David Silver]], [[William Uther]], [[Alan Blair]] ('''2009'''). ''[http://papers.nips.cc/paper/3722-bootstrapping-from-game-tree-search Bootstrapping from Game Tree Search]''. NIPS 2009, [http://jveness.info/publications/nips2009%20-%20bootstrapping%20from%20game%20tree%20search.pdf pdf]
* [[Richard Sutton]], [[Hamid Reza Maei]], [[Doina Precup]], [[Shalabh Bhatnagar]], [[David Silver]], [[Csaba Szepesvári]], [[Eric Wiewiora]] ('''2009'''). ''Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation''. In Proceedings of the 26th International Conference on Machine Learning (ICML-09), [http://www.sztaki.hu/~szcsaba/papers/GTD-ICML09.pdf pdf]
* [[Simon Lucas]] ('''2009'''). ''[https://ieeexplore.ieee.org/document/5286496 Temporal difference learning with interpolated table value functions]''. [[IEEE#CIG|CIG 2009]]
* [[Marcin Szubert]], [[Wojciech Jaśkowski]], [[Krzysztof Krawiec]] ('''2009'''). ''Coevolutionary Temporal Difference Learning for Othello''. [[IEEE#CIG|CIG 2009]], [http://www.cs.put.poznan.pl/wjaskowski/pub/papers/szubert09coevolutionary.pdf pdf]
* [[Marcin Szubert]] ('''2009'''). ''Coevolutionary Reinforcement Learning and its Application to Othello''. M.Sc. thesis, [https://en.wikipedia.org/wiki/Pozna%C5%84_University_of_Technology Poznań University of Technology], supervisor [[Krzysztof Krawiec]], [https://mszubert.github.io/papers/Szubert_2009_MSC.pdf pdf]
* [http://www.cs.cmu.edu/~zkolter/ J. Zico Kolter], [[Andrew Ng]] ('''2009'''). ''Regularization and Feature Selection in Least-Squares Temporal Difference Learning''. [http://www.machinelearning.org/archive/icml2009/ ICML 2009], [http://www.cs.cmu.edu/~zkolter/pubs/kolter-icml09b-full.pdf pdf]
* [[Nikolaos Papahristou]], [[Ioannis Refanidis]] ('''2011'''). ''[https://www.conftool.net/acg13/index.php?page=browseSessions&form_session=5 Improving Temporal Difference Performance in Backgammon Variants]''. [[Advances in Computer Games 13]], [http://ai.uom.gr/nikpapa/publications/Improving%20Temporal%20Difference%20Learning%20in%20Backgammon%20Variants_ACG13.pdf pdf]
* [[Krzysztof Krawiec]], [[Wojciech Jaśkowski]], [[Marcin Szubert]] ('''2011'''). ''[http://www.degruyter.com/view/j/amcs.2011.21.issue-4/v10006-011-0057-3/v10006-011-0057-3.xml Evolving small-board Go players using Coevolutionary Temporal Difference Learning with Archives]''. [http://www.degruyter.com/view/j/amcs Applied Mathematics and Computer Science], Vol. 21, No. 4
* [[Marcin Szubert]], [[Wojciech Jaśkowski]], [[Krzysztof Krawiec]] ('''2011'''). ''Learning Board Evaluation Function for Othello by Hybridizing Coevolution with Temporal Difference Learning''. [http://control.ibspan.waw.pl:3000/mainpage Control and Cybernetics], Vol. 40, No. 3, [http://www.cs.put.poznan.pl/wjaskowski/pub/papers/szubert2011learning.pdf pdf]
'''2012'''
* [[István Szita]] ('''2012'''). ''[http://link.springer.com/chapter/10.1007%2F978-3-642-27645-3_17 Reinforcement Learning in Games]''. in [[Marco Wiering]], [http://martijnvanotterlo.nl/ Martijn Van Otterlo] (eds.). ''Reinforcement learning: State-of-the-art''. [http://link.springer.com/book/10.1007/978-3-642-27645-3 Adaptation, Learning, and Optimization, Vol. 12], [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]
