Changes

Jump to: navigation, search

Reinforcement Learning

No change in size, 18:02, 5 September 2020
no edit summary
'''2019'''
* [https://scholar.google.co.uk/citations?user=OAkRr-YAAAAJ&hl=en Sanjeevan Ahilan], [[Peter Dayan]] ('''2019'''). ''Feudal Multi-Agent Hierarchies for Cooperative Reinforcement Learning''. [https://arxiv.org/abs/1901.08492 arXiv:1901.08492]
* [https://scholar.google.co.in/citations?user=1QlrvHkAAAAJ&hl=en Chandramouli Kamanchi], [https://scholar.google.co.in/citations?user=nx4NlpsAAAAJ&hl=en Raghuram Bharadwaj Diddigi], [[Shalabh Bhatnagar]] ('''2019'''). ''Successive Over Relaxation Q-Learning''. [https://arxiv.org/abs/1903.03812 arXiv:1903.03812]
* [https://scholar.google.co.in/citations?user=1QlrvHkAAAAJ&hl=en Chandramouli Kamanchi], [https://scholar.google.co.in/citations?user=nx4NlpsAAAAJ&hl=en Raghuram Bharadwaj Diddigi], [[Shalabh Bhatnagar]] ('''2019'''). ''Second Order Value Iteration in Reinforcement Learning''. [https://arxiv.org/abs/1905.03927 arXiv:1905.03927]
* [[Marc Lanctot]], [[Edward Lockhart]], [[Jean-Baptiste Lespiau]], [[Vinicius Zambaldi]], [[Satyaki Upadhyay]], [[Julien Pérolat]], [[Sriram Srinivasan]], [[Finbarr Timbers]], [[Karl Tuyls]], [[Shayegan Omidshafiei]], [[Daniel Hennes]], [[Dustin Morrill]], [[Paul Muller]], [[Timo Ewalds]], [[Ryan Faulkner]], [[János Kramár]], [[Bart De Vylder]], [[Brennan Saeta]], [[James Bradbury]], [[David Ding]], [[Sebastian Borgeaud]], [[Matthew Lai]], [[Julian Schrittwieser]], [[Thomas Anthony]], [[Edward Hughes]], [[Ivo Danihelka]], [[Jonah Ryan-Davis]] ('''2019'''). ''OpenSpiel: A Framework for Reinforcement Learning in Games''. [https://arxiv.org/abs/1908.09453 arXiv:1908.09453] <ref>[https://github.com/deepmind/open_spiel/blob/master/docs/contributing.md open_spiel/contributing.md at master · deepmind/open_spiel · GitHub]</ref>
* [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Thomas Hubert]], [[Karen Simonyan]], [[Laurent Sifre]], [[Simon Schmitt]], [[Arthur Guez]], [[Edward Lockhart]], [[Demis Hassabis]], [[Thore Graepel]], [[Timothy Lillicrap]], [[David Silver]] ('''2019'''). ''Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model''. [https://arxiv.org/abs/1911.08265 arXiv:1911.08265] <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72381 New DeepMind paper] by GregNeto, [[CCC]], November 21, 2019</ref>
* [[Mathematician#SrbhBose|Sourabh Bose]] ('''2019'''). ''[https://rc.library.uta.edu/uta-ir/handle/10106/28094 Learning Representations Using Reinforcement Learning]''. Ph.D. thesis, [https://en.wikipedia.org/wiki/University_of_Texas_at_Arlington University of Texas at Arlington], advisor [[Mathematician#MHuber|Manfred Huber]] <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=72810&start=6 e: Board adaptive / tuning evaluation function - no NN/AI] by Tony P., [[CCC]], January 15, 2020</ref>
* [https://scholar.google.co.in/citations?user=1QlrvHkAAAAJ&hl=en Chandramouli Kamanchi], [https://scholar.google.co.in/citations?user=nx4NlpsAAAAJ&hl=en Raghuram Bharadwaj Diddigi], [[Shalabh Bhatnagar]] ('''2019'''). ''Successive Over Relaxation Q-Learning''. [https://arxiv.org/abs/1903.03812 arXiv:1903.03812]
* [https://scholar.google.co.in/citations?user=1QlrvHkAAAAJ&hl=en Chandramouli Kamanchi], [https://scholar.google.co.in/citations?user=nx4NlpsAAAAJ&hl=en Raghuram Bharadwaj Diddigi], [[Shalabh Bhatnagar]] ('''2019'''). ''Second Order Value Iteration in Reinforcement Learning''. [https://arxiv.org/abs/1905.03927 arXiv:1905.03927]
==2020 ...==
* [[Hung Guei]], [[Ting-Han Wei]], [[I-Chen Wu]] ('''2020'''). ''2048-like games for teaching reinforcement learning''. [[ICGA Journal#42_1|ICGA Journal, Vol. 42, No. 1]]

Navigation menu