Rémi Munos




Rémi Munos is a French mathematician and computer scientist at Google DeepMind. From 2000 to 2006 he was Associate Professor at the Centre de Mathématiques Appliquées, Ecole Polytechnique, and he was later affiliated with INRIA Lille. His research interests cover reinforcement learning, multi-armed bandits, and dynamic programming. Rémi Munos was a contributor to the Go playing program MoGo, which uses Monte-Carlo Tree Search with patterns in the simulations and improvements to UCT.

=Selected Publications=

1996

 * Rémi Munos (1996). A convergent reinforcement learning algorithm in the continuous case : the finite-element reinforcement learning. In International Conference on Machine Learning. Morgan Kaufmann

2005 ...

 * Sylvain Gelly, Yizao Wang, Rémi Munos, Olivier Teytaud (2006). Modification of UCT with Patterns in Monte-Carlo Go. INRIA
 * Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári (2007). Tuning Bandit Algorithms in Stochastic Environments. pdf
 * Yizao Wang, Jean-Yves Audibert, Rémi Munos (2008). Algorithms for Infinitely Many-Armed Bandits. Advances in Neural Information Processing Systems, pdf, Supplemental material - pdf
 * Rémi Munos, Csaba Szepesvári (2008). Finite time bounds for sampling based fitted value iteration. Journal of Machine Learning Research, 9:815-857, 2008. pdf, pdf
 * Raphaël Maîtrepierre, Jérémie Mary, Rémi Munos (2008). Adaptive play in Texas Hold'em Poker. ECAI 2008
 * Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári (2009). Exploration-exploitation trade-off using variance estimates in multi-armed bandits. Theoretical Computer Science, 410:1876-1902, 2009, pdf
 * Vincent Berthier, Amine Bourki, Matthieu Coulm, Guillaume Chaslot, Christophe Fiter, Sylvain Gelly, Jean-Baptiste Hoock, Rémi Munos, Julien Pérez, Arpad Rimmel, Philippe Rolet, Olivier Teytaud, Paul Vayssière, Yizao Wang, Ziqin Yu (et al.) (2009). Computer-Go is not only for Go. Korea, August 2009, slides as pdf

2010 ...

 * Rémi Munos (2010). Approximate dynamic programming. In Olivier Sigaud and Olivier Buffet, editors, Markov Decision Processes in Artificial Intelligence, chapter 3, pages 67-98. ISTE Ltd and John Wiley & Sons Inc., pdf
 * Ronald Ortner, Daniil Ryabko, Peter Auer, Rémi Munos (2014). Regret bounds for restless Markov bandits. Theoretical Computer Science 558, pdf
 * Rémi Munos (2014). From Bandits to Monte-Carlo Tree Search: The Optimistic Principle Applied to Optimization and Planning. Foundations and Trends in Machine Learning, Vol. 7, No 1, hal-00747575v5, slides as pdf

2015 ...

 * Audrūnas Gruslys, Rémi Munos, Ivo Danihelka, Marc Lanctot, Alex Graves (2016). Memory-Efficient Backpropagation Through Time. arXiv:1606.03401
 * Jane X Wang, Zeb Kurth-Nelson, Dhruva Tirumala, Hubert Soyer, Joel Z Leibo, Rémi Munos, Charles Blundell, Dharshan Kumaran, Matt Botvinick (2016). Learning to reinforcement learn. arXiv:1611.05763
 * Arthur Guez, Théophane Weber, Ioannis Antonoglou, Karen Simonyan, Oriol Vinyals, Daan Wierstra, Rémi Munos, David Silver (2018). Learning to Search with MCTSnets. arXiv:1802.04697

=External Links=
 * Remi Munos Homepage
 * Rémi Munos - Google Scholar Citations
