

Reinforcement Learning

1,160 bytes added, 22:44, 12 April 2021
* [http://www.ilsp.gr/homepages/papavasiliou_eng.html Vassilis Papavassiliou], [[Stuart Russell]] ('''1999'''). ''Convergence of reinforcement learning with general function approximators.'' In Proc. IJCAI-99, Stockholm, [http://www.cs.berkeley.edu/~russell/papers/ijcai99-bridge.ps ps]
* [[Marco Wiering]] ('''1999'''). ''Explorations in Efficient Reinforcement Learning''. Ph.D. thesis, [https://en.wikipedia.org/wiki/University_of_Amsterdam University of Amsterdam], advisors [[Mathematician#FGroen|Frans Groen]] and [[Jürgen Schmidhuber]]
* [[Richard Sutton]], [[Doina Precup]], [[Mathematician#SSingh|Satinder Singh]] ('''1999'''). ''Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_(journal) Artificial Intelligence], Vol. 112, [https://people.cs.umass.edu/~barto/courses/cs687/Sutton-Precup-Singh-AIJ99.pdf pdf]
==2000 ...==
* [[Sebastian Thrun]], [[Michael L. Littman]] ('''2000'''). ''A Review of Reinforcement Learning''. [http://www.informatik.uni-trier.de/~ley/db/journals/aim/aim21.html#ThrunL00 AI Magazine, Vol. 21], No. 1
* [[David Silver]] ('''2009'''). ''Reinforcement Learning and Simulation-Based Search''. Ph.D. thesis, [[University of Alberta]]. [http://webdocs.cs.ualberta.ca/~silver/David_Silver/Publications_files/thesis.pdf pdf]
* [[Marcin Szubert]] ('''2009'''). ''Coevolutionary Reinforcement Learning and its Application to Othello''. M.Sc. thesis, [https://en.wikipedia.org/wiki/Pozna%C5%84_University_of_Technology Poznań University of Technology], supervisor [[Krzysztof Krawiec]], [https://mszubert.github.io/papers/Szubert_2009_MSC.pdf pdf]
* [[Joelle Pineau]], [[Geoffrey Gordon]], [[Sebastian Thrun]] ('''2006, 2011'''). ''Anytime Point-Based Approximations for Large POMDPs''. [https://en.wikipedia.org/wiki/Journal_of_Artificial_Intelligence_Research Journal of Artificial Intelligence Research], Vol. 27, [https://arxiv.org/abs/1110.0027 arXiv:1110.0027]
==2010 ...==
* [[Joel Veness]], [[Kee Siong Ng]], [[Marcus Hutter]], [[David Silver]] ('''2010'''). ''Reinforcement Learning via AIXI Approximation''. Association for the Advancement of Artificial Intelligence (AAAI), [http://jveness.info/publications/veness_rl_via_aixi_approx.pdf pdf]
* [[Hui Wang]], [[Michael Emmerich]], [[Aske Plaat]] ('''2018'''). ''Monte Carlo Q-learning for General Game Playing''. [https://arxiv.org/abs/1802.05944 arXiv:1802.05944] » [[Monte-Carlo Tree Search|MCTS]], [[General Game Playing]]
* [[Hui Wang]], [[Michael Emmerich]], [[Aske Plaat]] ('''2018'''). ''Assessing the Potential of Classical Q-learning in General Game Playing''. [https://arxiv.org/abs/1810.06078 arXiv:1810.06078]
* [https://scholar.google.com/citations?user=n12uNYcAAAAJ&hl=en Vincent Francois-Lavet], [https://scholar.google.com/citations?user=dy_JBs0AAAAJ&hl=en Peter Henderson], [https://scholar.google.ca/citations?user=2_4Rs44AAAAJ&hl=en Riashat Islam], [https://scholar.google.com/citations?user=uyYPun0AAAAJ&hl=en Marc G. Bellemare], [[Joelle Pineau]] ('''2018'''). ''An Introduction to Deep Reinforcement Learning''. [https://arxiv.org/abs/1811.12560 arXiv:1811.12560]
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2018'''). ''[http://science.sciencemag.org/content/362/6419/1140 A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play]''. [https://en.wikipedia.org/wiki/Science_(journal) Science], Vol. 362, No. 6419 <ref>[https://deepmind.com/blog/alphazero-shedding-new-light-grand-games-chess-shogi-and-go/ AlphaZero: Shedding new light on the grand games of chess, shogi and Go] by [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]] and [[Demis Hassabis]], [[DeepMind]], December 03, 2018</ref>
* [[Tianhe Wang]], [[Tomoyuki Kaneko]] ('''2018'''). ''Application of Deep Reinforcement Learning in Werewolf Game Agents''. [[TAAI 2018]]
