Marc Lanctot


Marc Lanctot is a Canadian computer scientist at Google DeepMind involved in the AlphaZero project, and formerly a post-doctoral researcher in the Games and AI Group of Mark Winands at Maastricht University. He holds an M.Sc. from McGill University (2005) and a Ph.D. from the University of Alberta (2013). Marc is generally interested in AI, machine learning, and games. His current research focuses on sampling algorithms for equilibrium computation and decision-making, as well as variants of Monte-Carlo Tree Search.
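To illustrate the kind of sampling-based decision-making mentioned above, here is a minimal flat-UCT sketch on the game of Nim (take 1-3 stones, last take wins): UCB1 selection at the root combined with random playouts. All names and parameters below are illustrative assumptions, not code from any of Lanctot's publications.

```python
import math
import random

def legal_moves(pile):
    """Moves in Nim: take 1, 2, or 3 stones (whoever takes the last stone wins)."""
    return [m for m in (1, 2, 3) if m <= pile]

def uct_best_move(pile, iterations=5000, c=1.4, rng=random):
    """Flat UCT: UCB1 selection at the root, random playouts as evaluation."""
    stats = {m: [0, 0.0] for m in legal_moves(pile)}  # move -> [visits, total reward]
    for _ in range(iterations):
        total = sum(n for n, _ in stats.values()) + 1
        # UCB1: exploit high average reward, explore rarely tried moves.
        def ucb(m):
            n, w = stats[m]
            return float('inf') if n == 0 else w / n + c * math.sqrt(math.log(total) / n)
        move = max(stats, key=ucb)
        # Random playout from the position after `move`; player 0 is the root player.
        p, to_move, winner = pile - move, 1, 0
        while p > 0:
            p -= rng.choice(legal_moves(p))
            winner = to_move      # the player who just moved wins if the pile is empty
            to_move ^= 1
        reward = 1.0 if winner == 0 else 0.0
        stats[move][0] += 1
        stats[move][1] += reward
    # Recommend the most-visited root move.
    return max(stats, key=lambda m: stats[m][0])
```

From a pile of 5 the winning move is to take 1 (leaving the opponent a multiple of 4), and `uct_best_move(5)` converges to it with enough iterations; full MCTS additionally grows a tree below the root instead of playing out randomly.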

=Selected Publications=

2005 ...

 * Marc Lanctot (2005). Adaptive Virtual Environments in Multi-player Computer Games. M.Sc. thesis, McGill University
 * Frantisek Sailer, Michael Buro, Marc Lanctot (2007). Adversarial planning through strategy simulation. Computational Intelligence and Games, pdf

2010 ...

 * Joel Veness, Marc Lanctot, Michael Bowling (2011). Variance Reduction in Monte-Carlo Tree Search. NIPS, pdf
 * Marc Lanctot, Abdallah Saffidine, Joel Veness, Christopher Archibald (2012). Sparse Sampling for Adversarial Games. ECAI CGW 2012
 * Marc Lanctot (2013). Monte Carlo Sampling and Regret Minimization for Equilibrium Computation and Decision-Making in Large Extensive Form Games. Ph.D. thesis, University of Alberta, advisor Michael Bowling
 * Marc Lanctot, Abdallah Saffidine, Joel Veness, Christopher Archibald, Mark Winands (2013). Monte Carlo *-Minimax Search. IJCAI 2013
 * Markus Esser, Michael Gras, Mark Winands, Maarten Schadd, Marc Lanctot (2013). Improving Best-Reply Search. CG 2013
 * Marc Lanctot (2013). LOA Wins Lines of Action Tournament. ICGA Journal, Vol. 36, No. 4 » 17th Computer Olympiad
 * Marc Lanctot (2013). SIA Wins Surakarta Tournament. ICGA Journal, Vol. 36, No. 4 » 17th Computer Olympiad
 * Tom Pepels, Tristan Cazenave, Mark Winands, Marc Lanctot (2014). Minimizing Simple and Cumulative Regret in Monte-Carlo Tree Search. ECAI CGW 2014
 * Marc Lanctot, Mark Winands, Tom Pepels, Nathan Sturtevant (2014). Monte Carlo Tree Search with Heuristic Evaluations using Implicit Minimax Backups. CIG 2014, arXiv:1406.0486

2015 ...

 * Johannes Heinrich, Marc Lanctot, David Silver (2015). Fictitious Self-Play in Extensive-Form Games. JMLR: W&CP, Vol. 37, pdf
 * Ziyu Wang, Nando de Freitas, Marc Lanctot (2016). Dueling Network Architectures for Deep Reinforcement Learning. arXiv:1511.06581
 * David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, Demis Hassabis (2016). Mastering the game of Go with deep neural networks and tree search. Nature, Vol. 529 » AlphaGo
 * Audrūnas Gruslys, Rémi Munos, Ivo Danihelka, Marc Lanctot, Alex Graves (2016). Memory-Efficient Backpropagation Through Time. arXiv:1606.03401
 * Marc Lanctot, Vinícius Flores Zambaldi, Audrunas Gruslys, Angeliki Lazaridou, Karl Tuyls, Julien Pérolat, David Silver, Thore Graepel (2017). A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning. arXiv:1711.00832
 * David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis (2017). Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. arXiv:1712.01815 » AlphaZero
 * David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis (2018). A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science, Vol. 362, No. 6419
 * Edward Lockhart, Marc Lanctot, Julien Pérolat, Jean-Baptiste Lespiau, Dustin Morrill, Finbarr Timbers, Karl Tuyls (2019). Computing Approximate Equilibria in Sequential Adversarial Games by Exploitability Descent. arXiv:1903.05614
 * Marc Lanctot, Edward Lockhart, Jean-Baptiste Lespiau, Vinícius Flores Zambaldi, Satyaki Upadhyay, Julien Pérolat, Sriram Srinivasan, Finbarr Timbers, Karl Tuyls, Shayegan Omidshafiei, Daniel Hennes, Dustin Morrill, Paul Muller, Timo Ewalds, Ryan Faulkner, János Kramár, Bart De Vylder, Brennan Saeta, James Bradbury, David Ding, Sebastian Borgeaud, Matthew Lai, Julian Schrittwieser, Thomas Anthony, Edward Hughes, Ivo Danihelka, Jonah Ryan-Davis (2019). OpenSpiel: A Framework for Reinforcement Learning in Games. arXiv:1908.09453

2020 ...

 * Finbarr Timbers, Edward Lockhart, Martin Schmid, Marc Lanctot, Michael Bowling (2020). Approximate exploitability: Learning a best response in large games. arXiv:2004.09677
 * Samuel Sokota, Edward Lockhart, Finbarr Timbers, Elnaz Davoodi, Ryan D'Orazio, Neil Burch, Martin Schmid, Michael Bowling, Marc Lanctot (2021). Solving Common-Payoff Games with Approximate Policy Iteration. arXiv:2101.04237

=External Links=
 * Marc Lanctot's Web Page
 * Marc Lanctot - Google Scholar Citations
