Changes

Temporal Difference Learning

76 bytes removed, 10:52, 17 July 2020


* [[Thomas Philip Runarsson]], [[Simon Lucas]] ('''2005'''). ''Coevolution versus self-play temporal difference learning for acquiring position evaluation in small-board go''. [[IEEE#EC|IEEE Transactions on Evolutionary Computation]], Vol. 9, No. 6

'''2006'''

* [[Simon Lucas]], [[Thomas Philip Runarsson]] ('''2006'''). ''[http://scholar.google.is/citations?view_op=view_citation&hl=en&user=4eWdc_sAAAAJ&citation_for_view=4eWdc_sAAAAJ:qjMakFHDy7sC Temporal Difference Learning versus Co-Evolution for Acquiring Othello Position Evaluation]''. [[IEEE#CIG|CIG 2006]]

'''2007'''

* [[Edward P. Manning]] ('''2007'''). ''[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=4219046 Temporal Difference Learning of an Othello Evaluation Function for a Small Neural Network with Shared Weights]''. [[IEEE#CIG|IEEE Symposium on Computational Intelligence and AI in Games]]

* [[Albrecht Fiebiger]] ('''2008'''). ''Einsatz von allgemeinen Evaluierungsheuristiken in Verbindung mit der Reinforcement-Learning-Strategie in der Schachprogrammierung''. [https://de.wikipedia.org/wiki/Besondere_Lernleistung Besondere Lernleistung] im [https://de.wikipedia.org/wiki/Fachgebiet Fachbereich] [https://de.wikipedia.org/wiki/Informatik Informatik], [https://en.wikipedia.org/wiki/Federal_School_of_Saxony%E2%80%93Saint_Afra Sächsisches Landesgymnasium Sankt Afra], Internal advisor: Ralf Böttcher, External advisors: [[Stefan Meyer-Kahlen]], [[Marco Block-Berlitz|Marco Block]], [http://page.mi.fu-berlin.de/block/abschlussarbeiten/Fiebiger_BeLL.pdf pdf] (German)

'''2009'''

* [[Hamid Reza Maei]], [[Csaba Szepesvári]], [[Shalabh Bhatnagar]], [[Doina Precup]], [[David Silver]], [[Richard Sutton]] ('''2009'''). ''Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation''. NIPS 2009, [http://books.nips.cc/papers/files/nips22/NIPS2009_1121.pdf pdf]

* [[Joel Veness]], [[David Silver]], [[William Uther]], [[Alan Blair]] ('''2009'''). ''[http://papers.nips.cc/paper/3722-bootstrapping-from-game-tree-search Bootstrapping from Game Tree Search]''. NIPS 2009, [http://jveness.info/publications/nips2009%20-%20bootstrapping%20from%20game%20tree%20search.pdf pdf]

* [[Richard Sutton]], [[Hamid Reza Maei]], [[Doina Precup]], [[Shalabh Bhatnagar]], [[David Silver]], [[Csaba Szepesvári]], [[Eric Wiewiora]] ('''2009'''). ''Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation''. ICML-09, [http://www.sztaki.hu/~szcsaba/papers/GTD-ICML09.pdf pdf]

* [[Simon Lucas]] ('''2009'''). ''[https://ieeexplore.ieee.org/document/5286496 Temporal difference learning with interpolated table value functions]''. [[IEEE#CIG|CIG 2009]]

* [[Marcin Szubert]], [[Wojciech Jaśkowski]], [[Krzysztof Krawiec]] ('''2009'''). ''Coevolutionary Temporal Difference Learning for Othello''. [[IEEE#CIG|CIG 2009]], [http://www.cs.put.poznan.pl/wjaskowski/pub/papers/szubert09coevolutionary.pdf pdf]

* [[Marcin Szubert]] ('''2009'''). ''Coevolutionary Reinforcement Learning and its Application to Othello''. M.Sc. thesis, [https://en.wikipedia.org/wiki/Pozna%C5%84_University_of_Technology Poznań University of Technology], supervisor [[Krzysztof Krawiec]], [https://mszubert.github.io/papers/Szubert_2009_MSC.pdf pdf]

* [http://www.cs.cmu.edu/~zkolter/ J. Zico Kolter], [[Andrew Ng]] ('''2009'''). ''Regularization and Feature Selection in Least-Squares Temporal Difference Learning''. [http://www.machinelearning.org/archive/icml2009/ ICML 2009], [http://www.cs.cmu.edu/~zkolter/pubs/kolter-icml09b-full.pdf pdf]

* [[Nikolaos Papahristou]], [[Ioannis Refanidis]] ('''2011'''). ''[https://www.conftool.net/acg13/index.php?page=browseSessions&form_session=5 Improving Temporal Difference Performance in Backgammon Variants]''. [[Advances in Computer Games 13]], [http://ai.uom.gr/nikpapa/publications/Improving%20Temporal%20Difference%20Learning%20in%20Backgammon%20Variants_ACG13.pdf pdf]

* [[Krzysztof Krawiec]], [[Wojciech Jaśkowski]], [[Marcin Szubert]] ('''2011'''). ''[http://www.degruyter.com/view/j/amcs.2011.21.issue-4/v10006-011-0057-3/v10006-011-0057-3.xml Evolving small-board Go players using Coevolutionary Temporal Difference Learning with Archives]''. [http://www.degruyter.com/view/j/amcs Applied Mathematics and Computer Science], Vol. 21, No. 4

* [[Marcin Szubert]], [[Wojciech Jaśkowski]], [[Krzysztof Krawiec]] ('''2011'''). ''Learning Board Evaluation Function for Othello by Hybridizing Coevolution with Temporal Difference Learning''. [http://control.ibspan.waw.pl:3000/mainpage Control and Cybernetics], Vol. 40, No. 3, [http://www.cs.put.poznan.pl/wjaskowski/pub/papers/szubert2011learning.pdf pdf]

'''2012'''

* [[István Szita]] ('''2012'''). ''[http://link.springer.com/chapter/10.1007%2F978-3-642-27645-3_17 Reinforcement Learning in Games]''. in [[Marco Wiering]], [http://martijnvanotterlo.nl/ Martijn Van Otterlo] (eds.). ''Reinforcement learning: State-of-the-art''. [http://link.springer.com/book/10.1007/978-3-642-27645-3 Adaptation, Learning, and Optimization, Vol. 12], [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]

GerdIsenberg

