Changes

Jump to: navigation, search

Rémi Munos

160 bytes added, 17:53, 17 January 2019
no edit summary
* [[Mathematician#ROrtner|Ronald Ortner]], [[Mathematician#DRyabko|Daniil Ryabko]], [[Peter Auer]], [[Rémi Munos]] ('''2014'''). ''Regret bounds for restless Markov bandits''. [https://en.wikipedia.org/wiki/Theoretical_Computer_Science_%28journal%29 Theoretical Computer Science] 558, [http://daniil.ryabko.net/mabajr.pdf pdf]
* [[Rémi Munos]] ('''2014'''). ''From Bandits to Monte-Carlo Tree Search: The Optimistic Principle Applied to Optimization and Planning''. [http://dblp.uni-trier.de/db/journals/ftml/ftml7.html#Munos14 Foundations and Trends in Machine Learning, Vol. 7, No 1], [https://hal.archives-ouvertes.fr/hal-00747575 hal-00747575v5], [http://chercheurs.lille.inria.fr/~munos/papers/files/AAAI2013_slides.pdf slides as pdf]
* [[Tor Lattimore]], [[Rémi Munos]] ('''2014'''). ''Bounded Regret for Finite-Armed Structured Bandits''. [https://arxiv.org/abs/1411.2919 arXiv:1411.2919]
==2015 ...==
* [[Audrūnas Gruslys]], [[Rémi Munos]], [[Ivo Danihelka]], [[Marc Lanctot]], [[Alex Graves]] ('''2016'''). ''Memory-Efficient Backpropagation Through Time''. [https://arxiv.org/abs/1606.03401v1 arXiv:1606.03401]
* [[Jane X Wang]], [[Zeb Kurth-Nelson]], [[Dhruva Tirumala]], [[Hubert Soyer]], [[Joel Z Leibo]], [[Rémi Munos]], [[Charles Blundell]], [[Dharshan Kumaran]], [[Matt Matthew Botvinick]] ('''2016'''). ''Learning to reinforcement learn''. [https://arxiv.org/abs/1611.05763 arXiv:1611.05763]* [[Arthur Guez]], [[Théophane Weber]], [[Ioannis Antonoglou]], [[Karen Simonyan]], [[Oriol Vinyals]], [[Daan Wierstra]], [[Rémi Munos]], [[David Silver]] ('''20162018'''). ''Learning to Search with MCTSnets''. [https://arxiv.org/abs/1802.04697 arXiv:1802.04697]
=External Links=

Navigation menu