Changes

Learning

21 bytes added, 16:54, 4 July 2020
* [[Mathematician#ROrtner|Ronald Ortner]], [[Mathematician#DRyabko|Daniil Ryabko]], [[Peter Auer]], [[Rémi Munos]] ('''2014'''). ''Regret bounds for restless Markov bandits''. [https://en.wikipedia.org/wiki/Theoretical_Computer_Science_%28journal%29 Theoretical Computer Science], Vol. 558, [http://daniil.ryabko.net/mabajr.pdf pdf]
==2015 ...==
* [[Volodymyr Mnih]], [[Koray Kavukcuoglu]], [[David Silver]], [[Mathematician#AARusu|Andrei A. Rusu]], [[Joel Veness]], [[Marc G. Bellemare]], [[Alex Graves]], [[Martin Riedmiller]], [[Andreas K. Fidjeland]], [[Georg Ostrovski]], [[Stig Petersen]], [[Charles Beattie]], [[Amir Sadik]], [[Ioannis Antonoglou]], [[Helen King]], [[Dharshan Kumaran]], [[Daan Wierstra]], [[Shane Legg]], [[Demis Hassabis]] ('''2015'''). ''[http://www.nature.com/nature/journal/v518/n7540/abs/nature14236.html Human-level control through deep reinforcement learning]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 518
* [[Tobias Graf]], [[Marco Platzner]] ('''2015'''). ''Adaptive Playouts in Monte Carlo Tree Search with Policy Gradient Reinforcement Learning''. [[Advances in Computer Games 14]]
* [[Yuichiro Sato]], [[Hiroyuki Iida]], [[Jaap van den Herik]] ('''2015'''). ''Transfer Learning by Inductive Logic Programming''. [[Advances in Computer Games 14]]
