Changes


Reinforcement Learning

1,057 bytes added, 18:00, 5 September 2020
* [[Takuya Hiraoka]], [https://dblp.org/pers/hd/t/Tsuchida:Masaaki Masaaki Tsuchida], [https://dblp.org/pers/hd/w/Watanabe:Yotaro Yotaro Watanabe] ('''2017'''). ''Deep Reinforcement Learning for Inquiry Dialog Policies with Logical Formula Embeddings''. [https://arxiv.org/abs/1708.00667 arXiv:1708.00667]
* [[William Uther]] ('''2017'''). ''[https://link.springer.com/referenceworkentry/10.1007/978-1-4899-7687-1_512 Markov Decision Processes]''. in [https://en.wikipedia.org/wiki/Claude_Sammut Claude Sammut], [https://en.wikipedia.org/wiki/Geoff_Webb Geoffrey I. Webb] (eds) ('''2017'''). ''[https://link.springer.com/referencework/10.1007%2F978-1-4899-7687-1 Encyclopedia of Machine Learning and Data Mining]''. [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]
* [https://scholar.google.com/citations?user=zLksndkAAAAJ&hl=en Jayvant Anantpur], [https://dblp.org/pid/09/10702.html Nagendra Gulur Dwarakanath], [https://dblp.org/pid/16/4410.html Shivaram Kalyanakrishnan], [[Shalabh Bhatnagar]], [https://dblp.org/pid/45/3592.html R. Govindarajan] ('''2017'''). ''RLWS: A Reinforcement Learning based GPU Warp Scheduler''. [https://arxiv.org/abs/1712.04303 arXiv:1712.04303]
'''2018'''
* [[Hui Wang]], [[Michael Emmerich]], [[Aske Plaat]] ('''2018'''). ''Monte Carlo Q-learning for General Game Playing''. [https://arxiv.org/abs/1802.05944 arXiv:1802.05944] » [[Monte-Carlo Tree Search|MCTS]], [[General Game Playing]]
* [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Thomas Hubert]], [[Karen Simonyan]], [[Laurent Sifre]], [[Simon Schmitt]], [[Arthur Guez]], [[Edward Lockhart]], [[Demis Hassabis]], [[Thore Graepel]], [[Timothy Lillicrap]], [[David Silver]] ('''2019'''). ''Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model''. [https://arxiv.org/abs/1911.08265 arXiv:1911.08265] <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72381 New DeepMind paper] by GregNeto, [[CCC]], November 21, 2019</ref>
* [[Mathematician#SrbhBose|Sourabh Bose]] ('''2019'''). ''[https://rc.library.uta.edu/uta-ir/handle/10106/28094 Learning Representations Using Reinforcement Learning]''. Ph.D. thesis, [https://en.wikipedia.org/wiki/University_of_Texas_at_Arlington University of Texas at Arlington], advisor [[Mathematician#MHuber|Manfred Huber]] <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=72810&start=6 Re: Board adaptive / tuning evaluation function - no NN/AI] by Tony P., [[CCC]], January 15, 2020</ref>
* [https://scholar.google.co.in/citations?user=1QlrvHkAAAAJ&hl=en Chandramouli Kamanchi], [https://scholar.google.co.in/citations?user=nx4NlpsAAAAJ&hl=en Raghuram Bharadwaj Diddigi], [[Shalabh Bhatnagar]] ('''2019'''). ''Successive Over Relaxation Q-Learning''. [https://arxiv.org/abs/1903.03812 arXiv:1903.03812]
* [https://scholar.google.co.in/citations?user=1QlrvHkAAAAJ&hl=en Chandramouli Kamanchi], [https://scholar.google.co.in/citations?user=nx4NlpsAAAAJ&hl=en Raghuram Bharadwaj Diddigi], [[Shalabh Bhatnagar]] ('''2019'''). ''Second Order Value Iteration in Reinforcement Learning''. [https://arxiv.org/abs/1905.03927 arXiv:1905.03927]
==2020 ...==
* [[Hung Guei]], [[Ting-Han Wei]], [[I-Chen Wu]] ('''2020'''). ''2048-like games for teaching reinforcement learning''. [[ICGA Journal#42_1|ICGA Journal, Vol. 42, No. 1]]
