Changes

Reinforcement Learning

3,068 bytes added, 23:13, 1 February 2021


* [https://dblp.org/pid/233/8144.html Indu John], [https://scholar.google.co.in/citations?user=1QlrvHkAAAAJ&hl=en Chandramouli Kamanchi], [[Shalabh Bhatnagar]] ('''2020'''). ''Generalized Speedy Q-Learning''. [[IEEE#CSL|IEEE Control Systems Letters]], Vol. 4, No. 3, [https://arxiv.org/abs/1911.00397 arXiv:1911.00397]

* [[Takuya Hiraoka]], [https://dblp.org/pers/hd/i/Imagawa:Takahisa Takahisa Imagawa], [https://dblp.org/pers/hd/t/Tangkaratt:Voot Voot Tangkaratt], [https://dblp.org/pers/hd/o/Osa:Takayuki Takayuki Osa], [https://dblp.org/pers/hd/o/Onishi:Takashi Takashi Onishi], [https://dblp.org/pers/hd/t/Tsuruoka:Yoshimasa Yoshimasa Tsuruoka] ('''2020'''). ''Meta-Model-Based Meta-Policy Optimization''. [https://arxiv.org/abs/2006.02608 arXiv:2006.02608]

* [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Thomas Hubert]], [[Karen Simonyan]], [[Laurent Sifre]], [[Simon Schmitt]], [[Arthur Guez]], [[Edward Lockhart]], [[Demis Hassabis]], [[Thore Graepel]], [[Timothy Lillicrap]], [[David Silver]] ('''2020'''). ''[https://www.nature.com/articles/s41586-020-03051-4 Mastering Atari, Go, chess and shogi by planning with a learned model]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 588 <ref>[https://deepmind.com/blog/article/muzero-mastering-go-chess-shogi-and-atari-without-rules?fbclid=IwAR3mSwrn1YXDKr9uuGm2GlFKh76wBilex7f8QvBiQecwiVmAvD6Bkyjx-rE MuZero: Mastering Go, chess, shogi and Atari without rules]</ref>

* [[Tristan Cazenave]], [[Yen-Chi Chen]], [[Guan-Wei Chen]], [[Shi-Yu Chen]], [[Xian-Dong Chiu]], [[Julien Dehos]], [[Maria Elsa]], [[Qucheng Gong]], [[Hengyuan Hu]], [[Vasil Khalidov]], [[Cheng-Ling Li]], [[Hsin-I Lin]], [[Yu-Jin Lin]], [[Xavier Martinet]], [[Vegard Mella]], [[Jeremy Rapin]], [[Baptiste Roziere]], [[Gabriel Synnaeve]], [[Fabien Teytaud]], [[Olivier Teytaud]], [[Shi-Cheng Ye]], [[Yi-Jun Ye]], [[Shi-Jim Yen]], [[Sergey Zagoruyko]] ('''2020'''). ''Polygames: Improved zero learning''. [[ICGA Journal#42_4|ICGA Journal, Vol. 42, No. 4]], [https://arxiv.org/abs/2001.09832 arXiv:2001.09832]

=Postings=

* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75411 Unsupervised reinforcement tuning from zero] by Madeleine Birchfield, [[CCC]], October 16, 2020 » [[Automated Tuning]]

* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75606 Transhuman Chess with NN and RL...] by [[Srdja Matovic]], [[CCC]], October 30, 2020 » [[Neural Networks|NN]]

* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76465 Reinforcement learning project] by [[Harm Geert Muller]], [[CCC]], January 31, 2021 » [[Texel's Tuning Method]]

=External Links=

* [http://videolectures.net/deeplearning2016_pineau_reinforcement_learning/ Introduction to Reinforcement Learning] by [[Joelle Pineau]], [[McGill University]], 2016, [https://en.wikipedia.org/wiki/YouTube YouTube] Video

: {{#evu:https://www.youtube.com/watch?v=O_1Z63EDMvQ|alignment=left|valignment=top}}

==OpenSpiel==

* [https://github.com/deepmind/open_spiel GitHub - deepmind/open_spiel: OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games] <ref>[[Marc Lanctot]], [[Edward Lockhart]], [[Jean-Baptiste Lespiau]], [[Vinicius Zambaldi]], [[Satyaki Upadhyay]], [[Julien Pérolat]], [[Sriram Srinivasan]], [[Finbarr Timbers]], [[Karl Tuyls]], [[Shayegan Omidshafiei]], [[Daniel Hennes]], [[Dustin Morrill]], [[Paul Muller]], [[Timo Ewalds]], [[Ryan Faulkner]], [[János Kramár]], [[Bart De Vylder]], [[Brennan Saeta]], [[James Bradbury]], [[David Ding]], [[Sebastian Borgeaud]], [[Matthew Lai]], [[Julian Schrittwieser]], [[Thomas Anthony]], [[Edward Hughes]], [[Ivo Danihelka]], [[Jonah Ryan-Davis]] ('''2019'''). ''OpenSpiel: A Framework for Reinforcement Learning in Games''. [https://arxiv.org/abs/1908.09453 arXiv:1908.09453]</ref>

** [https://github.com/deepmind/open_spiel/tree/master/open_spiel/algorithms open_spiel/open_spiel/algorithms at master · deepmind/open_spiel · GitHub]

*** [https://github.com/deepmind/open_spiel/tree/master/open_spiel/algorithms/alpha_zero open_spiel/open_spiel/algorithms/alpha_zero at master · deepmind/open_spiel · GitHub]

** [https://github.com/deepmind/open_spiel/tree/master/open_spiel/games open_spiel/open_spiel/games at master · deepmind/open_spiel · GitHub]

*** [https://github.com/deepmind/open_spiel/tree/master/open_spiel/games/chess open_spiel/open_spiel/games/chess at master · deepmind/open_spiel · GitHub]

=References=

GerdIsenberg

