Difference between revisions of "Rémi Munos"

From Chessprogramming wiki
 
* [[Mathematician#ROrtner|Ronald Ortner]], [[Mathematician#DRyabko|Daniil Ryabko]], [[Peter Auer]], [[Rémi Munos]] ('''2014'''). ''Regret bounds for restless Markov bandits''. [https://en.wikipedia.org/wiki/Theoretical_Computer_Science_%28journal%29 Theoretical Computer Science] 558, [http://daniil.ryabko.net/mabajr.pdf pdf]
 
* [[Rémi Munos]] ('''2014'''). ''From Bandits to Monte-Carlo Tree Search: The Optimistic Principle Applied to Optimization and Planning''. [http://dblp.uni-trier.de/db/journals/ftml/ftml7.html#Munos14 Foundations and Trends in Machine Learning, Vol. 7, No 1], [https://hal.archives-ouvertes.fr/hal-00747575 hal-00747575v5], [http://chercheurs.lille.inria.fr/~munos/papers/files/AAAI2013_slides.pdf slides as pdf]
 
* [[Tor Lattimore]], [[Rémi Munos]] ('''2014'''). ''Bounded Regret for Finite-Armed Structured Bandits''. [https://arxiv.org/abs/1411.2919 arXiv:1411.2919]
 
==2015 ...==
 
* [[Audrūnas Gruslys]], [[Rémi Munos]], [[Ivo Danihelka]], [[Marc Lanctot]], [[Alex Graves]] ('''2016'''). ''Memory-Efficient Backpropagation Through Time''. [https://arxiv.org/abs/1606.03401v1 arXiv:1606.03401]
 

Latest revision as of 17:53, 17 January 2019

Rémi Munos [1]

Rémi Munos,
a French mathematician and computer scientist at Google DeepMind. From 2000 to 2006 he was Associate Professor at the Centre de Mathématiques Appliquées, Ecole Polytechnique, and was later affiliated with INRIA Lille [2]. His research interests cover reinforcement learning, multi-armed bandits, and dynamic programming. Rémi Munos contributed to the Go playing program MoGo, which uses Monte-Carlo Tree Search with patterns in the simulations and improvements in UCT.

Selected Publications

[3] [4]

1996

  • Rémi Munos (1996). A convergent reinforcement learning algorithm in the continuous case: the finite-element reinforcement learning. In International Conference on Machine Learning. Morgan Kaufmann

2005 ...

2010 ...

2015 ...

External Links

References
