Latest revision as of 16:19, 30 May 2021

Home * People * Marc Lanctot

Marc Lanctot ^[1]

Marc Lanctot,
a Canadian computer scientist at Google DeepMind involved in the AlphaZero project, and before post-doctoral researcher for the Maastricht University Games and AI Group ^[2] of Mark Winands. He holds a M.Sc. from McGill University in 2005 ^[3] , and a Ph.D. from University of Alberta in 2013 ^[4] . Marc is generally interested in AI, machine learning, and games. His current research focus is on sampling algorithms for equilibrium computation and decision-making, as well as variants of Monte-Carlo Tree Search.

Selected Publications

^[5]

2005 ...

Marc Lanctot (2005). Adaptive Virtual Environments in Multi-player Computer Games. MSc. Thesis, McGill University
Frantisek Sailer, Michael Buro, Marc Lanctot (2007). Adversarial planning through strategy simulation. Computational Intelligence and Games, pdf

2010 ...

Joel Veness, Marc Lanctot, Michael Bowling (2011). Variance Reduction in Monte-Carlo Tree Search. NIPS, pdf
Marc Lanctot, Abdallah Saffidine, Joel Veness, Christopher Archibald (2012). Sparse Sampling for Adversarial Games. ECAI CGW 2012
Marc Lanctot (2013). Monte Carlo Sampling and Regret Minimization for Equilibrium Computation and Decision-Making in Large Extensive Form Games. Ph.D. thesis, University of Alberta, advisor Michael Bowling
Marc Lanctot, Abdallah Saffidine, Joel Veness, Christopher Archibald, Mark Winands (2013). Monte Carlo *-Minimax Search. IJCAI 2013
Markus Esser, Michael Gras, Mark Winands, Maarten Schadd, Marc Lanctot (2013). Improving Best-Reply Search. CG 2013
Marc Lanctot (2013). LOA Wins Lines of Action Tournament. ICGA Journal, Vol. 36, No. 4 » 17th Computer Olympiad
Marc Lanctot (2013). SIA Wins Surakarta Tournament. ICGA Journal, Vol. 36, No. 4 » 17th Computer Olympiad
Tom Pepels, Tristan Cazenave, Mark Winands, Marc Lanctot (2014). Minimizing Simple and Cumulative Regret in Monte-Carlo Tree Search. ECAI CGW 2014
Marc Lanctot, Mark Winands, Tom Pepels, Nathan Sturtevant (2014). Monte Carlo Tree Search with Heuristic Evaluations using Implicit Minimax Backups. CIG 2014, arXiv:1406.0486

2015 ...

Johannes Heinrich, Marc Lanctot, David Silver (2015). Fictitious Self-Play in Extensive-Form Games. JMLR: W&CP, Vol. 37, pdf
Ziyu Wang, Nando de Freitas, Marc Lanctot (2016). Dueling Network Architectures for Deep Reinforcement Learning. arXiv:1511.06581
David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, Demis Hassabis (2016). Mastering the game of Go with deep neural networks and tree search. Nature, Vol. 529 » AlphaGo
Audrūnas Gruslys, Rémi Munos, Ivo Danihelka, Marc Lanctot, Alex Graves (2016). Memory-Efficient Backpropagation Through Time. arXiv:1606.03401
Marc Lanctot, Vinícius Flores Zambaldi, Audrunas Gruslys, Angeliki Lazaridou, Karl Tuyls, Julien Pérolat, David Silver, Thore Graepel (2017). A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning. arXiv:1711.00832
David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis (2017). Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. arXiv:1712.01815 » AlphaZero
David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis (2018). A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science, Vol. 362, No. 6419 ^[6]
Edward Lockhart, Marc Lanctot, Julien Pérolat, Jean-Baptiste Lespiau, Dustin Morrill, Finbarr Timbers, Karl Tuyls (2019). Computing Approximate Equilibria in Sequential Adversarial Games by Exploitability Descent. arXiv:1903.05614
Marc Lanctot, Edward Lockhart, Jean-Baptiste Lespiau, Vinícius Flores Zambaldi, Satyaki Upadhyay, Julien Pérolat, Sriram Srinivasan, Finbarr Timbers, Karl Tuyls, Shayegan Omidshafiei, Daniel Hennes, Dustin Morrill, Paul Muller, Timo Ewalds, Ryan Faulkner, János Kramár, Bart De Vylder, Brennan Saeta, James Bradbury, David Ding, Sebastian Borgeaud, Matthew Lai, Julian Schrittwieser, Thomas Anthony, Edward Hughes, Ivo Danihelka, Jonah Ryan-Davis (2019). OpenSpiel: A Framework for Reinforcement Learning in Games. arXiv:1908.09453 ^[7]

2020 ...

Finbarr Timbers, Edward Lockhart, Martin Schmid, Marc Lanctot, Michael Bowling (2020). Approximate exploitability: Learning a best response in large games. arXiv:2004.09677
Samuel Sokota, Edward Lockhart, Finbarr Timbers, Elnaz Davoodi, Ryan D'Orazio, Neil Burch,Martin Schmid, Michael Bowling, Marc Lanctot (2021). Solving Common-Payoff Games with Approximate Policy Iteration. arXiv:2101.04237

External Links

References

↑ Marc Lanctot's Web Page
↑ Games and AI Group
↑ Marc Lanctot (2005). Adaptive Virtual Environments in Multi-player Computer Games. MSc. Thesis, McGill University
↑ Marc Lanctot (2013). Monte Carlo Sampling and Regret Minimization for Equilibrium Computation and Decision-Making in Large Extensive Form Games. Ph.D. thesis, University of Alberta, advisor Michael Bowling
↑ dblp: Marc Lanctot
↑ AlphaZero: Shedding new light on the grand games of chess, shogi and Go by David Silver, Thomas Hubert, Julian Schrittwieser and Demis Hassabis, DeepMind, December 03, 2018
↑ open_spiel/contributing.md at master · deepmind/open_spiel · GitHub

Up one level

[1] Marc Lanctot's Web Page

[2] Games and AI Group

[3] Marc Lanctot (2005). Adaptive Virtual Environments in Multi-player Computer Games. MSc. Thesis, McGill University

[4] Marc Lanctot (2013). Monte Carlo Sampling and Regret Minimization for Equilibrium Computation and Decision-Making in Large Extensive Form Games. Ph.D. thesis, University of Alberta, advisor Michael Bowling

[5] : Marc Lanctot

[6] AlphaZero: Shedding new light on the grand games of chess, shogi and Go by David Silver, Thomas Hubert, Julian Schrittwieser and Demis Hassabis, DeepMind, December 03, 2018

[7] _spiel/contributing.md at master · deepmind/open_spiel · GitHub

[1]

[2]

[3]

[4]

[5]

[6]

[7]

@@ Line 20: / Line 20: @@
 * [[Marc Lanctot]] ('''2013'''). ''SIA Wins Surakarta Tournament''. [[ICGA Journal#36_4|ICGA Journal, Vol. 36, No. 4]] » [[17th Computer Olympiad#Surakarta|17th Computer Olympiad]]
 * [[Tom Pepels]], [[Tristan Cazenave]], [[Mark Winands]], [[Marc Lanctot]] ('''2014'''). ''Minimizing Simple and Cumulative Regret in Monte-Carlo Tree Search''. [[ECAI CGW 2014]]
+* [[Marc Lanctot]], [[Mark Winands]], [[Tom Pepels]], [[Nathan Sturtevant]] ('''2014'''). ''Monte Carlo Tree Search with Heuristic Evaluations using Implicit Minimax Backups''. [https://dblp.uni-trier.de/db/conf/cig/cig2014.html#LanctotWPS14 CIG 2014], [https://arxiv.org/abs/1406.0486 arXiv:1406.0486]
 ==2015 ...==
 * [[Johannes Heinrich]], [[Marc Lanctot]], [[David Silver]] ('''2015'''). ''Fictitious Self-Play in Extensive-Form Games''. [http://proceedings.mlr.press/v37/ JMLR: W&CP, Vol. 37], [http://proceedings.mlr.press/v37/heinrich15.pdf pdf]
@@ Line 28: / Line 29: @@
 * [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2017'''). ''Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm''. [https://arxiv.org/abs/1712.01815 arXiv:1712.01815] » [[AlphaZero]]
 * [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2018'''). ''[http://science.sciencemag.org/content/362/6419/1140 A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play]''. [https://en.wikipedia.org/wiki/Science_(journal) Science], Vol. 362, No. 6419 <ref>[https://deepmind.com/blog/alphazero-shedding-new-light-grand-games-chess-shogi-and-go/ AlphaZero: Shedding new light on the grand games of chess, shogi and Go] by [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]] and [[Demis Hassabis]], [[DeepMind]], December 03, 2018</ref>
-* [[Marc Lanctot]], [[Edward Lockhart]], [[Jean-Baptiste Lespiau]], [[Vinicius Zambaldi]], [[Satyaki Upadhyay]], [[Julien Pérolat]], [[Sriram Srinivasan]], [[Finbarr Timbers]], [[Karl Tuyls]], [[Shayegan Omidshafiei]], [[Daniel Hennes]], [[Dustin Morrill]], [[Paul Muller]], [[Timo Ewalds]], [[Ryan Faulkner]], [[János Kramár]], [[Bart De Vylder]], [[Brennan Saeta]], [[James Bradbury]], [[David Ding]], [[Sebastian Borgeaud]], [[Matthew Lai]], [[Julian Schrittwieser]], [[Thomas Anthony]], [[Edward Hughes]], [[Ivo Danihelka]], [[Jonah Ryan-Davis]] ('''2019'''). ''OpenSpiel: A Framework for Reinforcement Learning in Games''. [https://arxiv.org/abs/1908.09453 arXiv:1908.09453] <ref>[https://github.com/deepmind/open_spiel/blob/master/docs/contributing.md open_spiel/contributing.md at master · deepmind/open_spiel · GitHub]</ref>
+* [[Edward Lockhart]], [[Marc Lanctot]], [[Julien Pérolat]], [[Jean-Baptiste Lespiau]], [[Dustin Morrill]], [[Finbarr Timbers]], [[Karl Tuyls]] ('''2019'''). ''Computing Approximate Equilibria in Sequential Adversarial Games by Exploitability Descent''. [https://arxiv.org/abs/1903.05614 arXiv:1903.05614]
+* [[Marc Lanctot]], [[Edward Lockhart]], [[Jean-Baptiste Lespiau]], [[Vinícius Flores Zambaldi]], [[Satyaki Upadhyay]], [[Julien Pérolat]], [[Sriram Srinivasan]], [[Finbarr Timbers]], [[Karl Tuyls]], [[Shayegan Omidshafiei]], [[Daniel Hennes]], [[Dustin Morrill]], [[Paul Muller]], [[Timo Ewalds]], [[Ryan Faulkner]], [[János Kramár]], [[Bart De Vylder]], [[Brennan Saeta]], [[James Bradbury]], [[David Ding]], [[Sebastian Borgeaud]], [[Matthew Lai]], [[Julian Schrittwieser]], [[Thomas Anthony]], [[Edward Hughes]], [[Ivo Danihelka]], [[Jonah Ryan-Davis]] ('''2019'''). ''OpenSpiel: A Framework for Reinforcement Learning in Games''. [https://arxiv.org/abs/1908.09453 arXiv:1908.09453] <ref>[https://github.com/deepmind/open_spiel/blob/master/docs/contributing.md open_spiel/contributing.md at master · deepmind/open_spiel · GitHub]</ref>
+==2020 ...==
+* [[Finbarr Timbers]], [[Edward Lockhart]], [[Mathematician#MSchmid|Martin Schmid]], [[Marc Lanctot]], [[Michael Bowling]] ('''2020'''). ''Approximate exploitability: Learning a best response in large games''. [https://arxiv.org/abs/2004.09677 arXiv:2004.09677]
+* [[Samuel Sokota]], [[Edward Lockhart]], [[Finbarr Timbers]], [[Elnaz Davoodi]], [[Ryan D'Orazio]], [[Neil Burch]],[[Mathematician#MSchmid|Martin Schmid]], [[Michael Bowling]], [[Marc Lanctot]] ('''2021'''). ''Solving Common-Payoff Games with Approximate Policy Iteration''. [https://arxiv.org/abs/2101.04237 arXiv:2101.04237]
 =External Links=
@@ Line 38: / Line 43: @@
 '''[[People|Up one level]]'''
+[[Category:Researcher|Lanctot]]

Difference between revisions of "Marc Lanctot"

Latest revision as of 16:19, 30 May 2021

Contents

Selected Publications

2005 ...

2010 ...

2015 ...

2020 ...

External Links

References

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Navigation

Tools