Changes

Learning

595 bytes added, 11:50, 18 November 2018

no edit summary

* [[Golch]]

* [[KnightCap]]

* [[Leela Chess Zero]]

* [[Meep]]

* [[Morph]]

* [[Tempo (engine)|Tempo]]

* [[Winter]]

* [[Yace]]

=See also=

* [[Hans Berliner]] ('''1985'''). ''Goals, Plans, and Mechanisms: Non-symbolically in an Evaluation Surface.'' Presentation at Evolution, Games, and Learning, Center for Nonlinear Studies, [[Los Alamos National Laboratory]], May 21.

* [[Ryszard Michalski]], [[Jaime Carbonell]], [[Tom Mitchell]] ('''1985'''). ''Machine Learning: An Artificial Intelligence Approach''. Morgan Kaufmann, ISBN 0-934613-09-5. [http://books.google.com/books?id=TWzuUd5gsnkC&dq=isbn%3A0935382054&hl=de&source=gbs_book_other_versions google books]

* [[Igor Roizen]], [[Judea Pearl]] ('''1985'''). ''Learning Link Probabilities in Causal Trees.'' Proceedings of the Second Conference on Uncertainty in Artificial Intelligence

'''1986'''

* [[Steven Skiena]] ('''1986'''). ''An Overview of Machine Learning in Chess.'' [[ICGA Journal#9_1|ICCA Journal, Vol. 9, No. 1]]

* [[Steven Walczak]] ('''1991'''). ''Predicting Actions from Induction on Past Performance''. Proceedings of the 8th International Workshop on Machine Learning, pp. 275-279. Morgan Kaufmann

* [[Paul E. Utgoff]], [[Jeffery A. Clouse]] ('''1991'''). ''[http://scholarworks.umass.edu/cs_faculty_pubs/193/ Two Kinds of Training Information for Evaluation Function Learning]''. [https://en.wikipedia.org/wiki/University_of_Massachusetts_Amherst University of Massachusetts, Amherst], Proceedings of the [[AAAI]] 1991

* [[Byoung-Tak Zhang]], [[Gerd Veenker]] ('''1991'''). ''[http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=170480&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D170480 Neural networks that teach themselves through genetic discovery of novel examples]''. [http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000500 IEEE IJCNN'91], [https://bi.snu.ac.kr/Publications/Conferences/International/IJCNN91.pdf pdf]

* [[Byoung-Tak Zhang]], [[Gerd Veenker]] ('''1991'''). ''Focused incremental learning for improved generalization with reduced training sets''. ICANN'91, [https://bi.snu.ac.kr/Publications/Conferences/International/ICANN'91.pdf pdf]

'''1992'''

* [[Miroslav Kubat]] ('''1992'''). ''Introduction to Machine Learning''. [http://dblp.uni-trier.de/db/conf/ac/ai1992.html#Kubat92 Advanced Topics in Artificial Intelligence 1992]

* [[Gerald Tesauro]] ('''1995'''). ''Temporal Difference Learning and TD-Gammon''. [[ACM#Communications|Communications of the ACM]] Vol. 38, No. 3

* [[Sebastian Thrun]] ('''1995'''). ''[http://robots.stanford.edu/papers/thrun.nips7.neuro-chess.html Learning to Play the Game of Chess]''. in [[Gerald Tesauro]], [https://en.wikipedia.org/wiki/David_S._Touretzky David S. Touretzky], [http://mitpress.mit.edu/authors/todd-k-leen Todd K. Leen] (eds.) Advances in Neural Information Processing Systems 7, [https://en.wikipedia.org/wiki/MIT_Press MIT Press]

* [[Marco Wiering]] ('''1995'''). ''TD Learning of Game Evaluation Functions with Hierarchical Neural Architectures''. Master's thesis, [https://en.wikipedia.org/wiki/University_of_Amsterdam University of Amsterdam], [http://webber.physik.uni-freiburg.de/~hon/vorlss02/Literatur/reinforcement/GameEvaluationWithNeuronal.pdf pdf]

* [[Mathematician#MAArbib|Michael A. Arbib]] (ed.) ('''1995, 2002'''). ''[http://mitpress.mit.edu/books/handbook-brain-theory-and-neural-networks The Handbook of Brain Theory and Neural Networks]''. [https://en.wikipedia.org/wiki/MIT_Press The MIT Press]

* [[Nicol N. Schraudolph]] ('''1995'''). ''[http://nic.schraudolph.org/bib2html/b2hd-Schraudolph95 Optimization of Entropy with Neural Networks]''. Ph.D. thesis, [https://en.wikipedia.org/wiki/University_of_California,_San_Diego University of California, San Diego]

* [[Tristan Cazenave]] ('''1996'''). ''Self fuzzy learning''. [http://www.lamsade.dauphine.fr/~cazenave/papers/fuzzy.pdf pdf]

* [[Yoav Freund]], [[Robert Schapire]] ('''1996'''). ''Game Theory, On-line Prediction and Boosting''. [http://dblp.uni-trier.de/db/conf/colt/colt1996.html#FreundS96 COLT 1996], [http://www.cs.princeton.edu/~schapire/papers/FreundSc96b.pdf pdf]

* [[Christopher D. Rosin]], [https://scholar.google.com/citations?user=vqrY_hgAAAAJ&hl=en Richard K. Belew] ('''1996'''). ''A Competitive Approach in Game Learning''. [https://dblp.uni-trier.de/db/conf/colt/colt1996.html COLT 1996], [http://www.sci.brooklyn.cuny.edu/~sklar/teaching/f05/alife/papers/rosin-96competitive.pdf pdf]

'''1997'''

* [[Yoav Freund]], [[Robert Schapire]] ('''1997'''). ''A decision-theoretic generalization of on-line learning and an application to boosting''. [https://en.wikipedia.org/wiki/Journal_of_Computer_and_System_Sciences Journal of Computer and System Sciences], Vol. 55, No. 1, [http://cseweb.ucsd.edu/~yfreund/papers/adaboost.pdf 1996 pdf] » [https://en.wikipedia.org/wiki/AdaBoost AdaBoost]

* [[William Uther]], [[Manuela Veloso|Manuela M. Veloso]] ('''1997'''). ''Adversarial Reinforcement Learning''. [[Carnegie Mellon University]], [http://www.cse.unsw.edu.au/~willu/w/papers/Uther97a.ps ps]

* [[William Uther]], [[Manuela Veloso|Manuela M. Veloso]] ('''1997'''). ''Generalizing Adversarial Reinforcement Learning''. [[Carnegie Mellon University]], [http://www.cse.unsw.edu.au/~willu/w/papers/Uther97b.ps ps]

* [[Marco Wiering]], [[Jürgen Schmidhuber]] ('''1997'''). ''HQ-learning''. [https://en.wikipedia.org/wiki/Adaptive_Behavior_%28journal%29 Adaptive Behavior], Vol. 6, No. 2

'''1998'''

* [[Jonathan Baxter]], [[Andrew Tridgell]], [[Lex Weaver]] ('''1998'''). ''KnightCap: A chess program that learns by combining TD(λ) with game-tree search''. Proceedings of the 15th International Conference on Machine Learning, [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.54.8263&rep=rep1&type=pdf pdf] via [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.54.8263 citeseerX]

* [[Ryszard Michalski]] ('''1998'''). ''Learnable Evolution: Combining Symbolic and Evolutionary Learning''. Proceedings of the Fourth International Workshop on Multistrategy Learning (MSL'98)

* [[Krzysztof Krawiec]], [http://www.informatik.uni-trier.de/~ley/pers/hd/s/Slowinski:Roman.html Roman Slowinski], [http://www.informatik.uni-trier.de/~ley/pers/hd/s/Szczesniak:Irmina.html Irmina Szczesniak] ('''1998'''). ''[http://link.springer.com/chapter/10.1007%2F3-540-69115-4_60 Pedagogical Method for Extraction of Symbolic Knowledge from Neural Networks]''. [http://link.springer.com/book/10.1007%2F3-540-69115-4 Rough Sets and Current Trends in Computing 1998]

* [[Marco Wiering]], [[Jürgen Schmidhuber]] ('''1998'''). ''Fast online Q(λ)''. [https://en.wikipedia.org/wiki/Machine_Learning_%28journal%29 Machine Learning], Vol. 33, No. 1

'''1999'''

* [[Robert Hyatt]] ('''1999'''). ''[http://www.craftychess.com/hyatt/learning.html Book Learning - a Methodology to Tune an Opening Book Automatically]''. [[ICGA Journal#22_1|ICCA Journal, Vol. 22, No. 1]]

* [http://www.ilsp.gr/homepages/papavasiliou_eng.html Vassilis Papavassiliou], [[Stuart Russell]] ('''1999'''). ''Convergence of reinforcement learning with general function approximators.'' In Proc. IJCAI-99, Stockholm, [http://www.cs.berkeley.edu/~russell/papers/ijcai99-bridge.ps ps]

* [[Philip G. K. Reiser]], [[Patricia J. Riddle]] ('''1999'''). ''[http://link.springer.com/chapter/10.1007%2F3-540-48873-1_19 Evolving Logic Programs to Classify Chess-Endgame Positions]''. [http://link.springer.com/book/10.1007%2F3-540-48873-1 Simulated Evolution and Learning], [https://en.wikipedia.org/wiki/Canberra Canberra], Australia. [http://www.springer.com/series/1244 Lecture Notes in Artificial Intelligence], No. 1585, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer], [http://stancomb.co.uk/Papers/seal98.pdf pdf] » [[Endgame]]

* [[Marco Wiering]] ('''1999'''). ''Explorations in Efficient Reinforcement Learning''. Ph.D. thesis, [https://en.wikipedia.org/wiki/University_of_Amsterdam University of Amsterdam], advisors [[Mathematician#FGroen|Frans Groen]] and [[Jürgen Schmidhuber]]

* [[Mathematician#GEHinton|Geoffrey E. Hinton]], [[Terrence J. Sejnowski]] (eds.) ('''1999'''). ''[https://mitpress.mit.edu/books/unsupervised-learning Unsupervised Learning: Foundations of Neural Computation]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press]

==2000 ...==

* [[Peter Auer]], [[Nicolò Cesa-Bianchi]], [[Paul Fischer]] ('''2002'''). ''[http://link.springer.com/article/10.1023%2FA%3A1013689704352 Finite-time Analysis of the Multiarmed Bandit Problem]''. [https://en.wikipedia.org/wiki/Machine_Learning_%28journal%29 Machine Learning], Vol. 47, No. 2, [http://homes.di.unimi.it/~cesabian/Pubblicazioni/ml-02.pdf pdf]

* [[Paul E. Utgoff]], [[David J. Stracuzzi]] ('''2002'''). ''Many-Layered Learning''. [https://en.wikipedia.org/wiki/Neural_Computation_%28journal%29 Neural Computation], Vol. 14, No. 10, [http://people.cs.umass.edu/~utgoff/papers/neco-stl.pdf pdf]

* [[Ryan Rifkin]] ('''2002'''). ''Everything Old Is New Again: A Fresh Look at Historical Approaches to Machine Learning''. Ph.D. thesis, [[Massachusetts Institute of Technology|MIT]], [http://cbcl.mit.edu/publications/theses/thesis-rifkin.pdf pdf]

'''2003'''

* [[Levente Kocsis]], [[Jaap van den Herik]], [[Jos Uiterwijk]] ('''2003'''). ''Two Learning Algorithms for Forward Pruning''. [[ICGA Journal#26_3|ICGA Journal, Vol. 26, No. 3]], [http://zaphod.aml.sztaki.hu/papers/kocsis-ICGA03.ps ps]

* [[Daniel Osman]], [[Jacek Mańdziuk]] ('''2004'''). ''Comparison of TDLeaf and TD learning in Game Playing Domain''. [http://www.informatik.uni-trier.de/~ley/db/conf/iconip/iconip2004.html#OsmanM04 11. ICONIP], [http://www.mini.pw.edu.pl/~mandziuk/PRACE/ICONIP04.pdf pdf]

* [[Albert Xin Jiang]] ('''2004'''). ''Multiagent Reinforcement Learning in Stochastic Games with Continuous Action Spaces''. [http://www.cs.ubc.ca/%7Ejiang/papers/continuous.pdf pdf]

* [[Henk Mannen]], [[Marco Wiering]] ('''2004'''). ''Learning to play chess using TD(λ)-learning with database games''. [http://students.uu.nl/en/hum/cognitive-artificial-intelligence Cognitive Artificial Intelligence], [https://en.wikipedia.org/wiki/Utrecht_University Utrecht University], Benelearn’04

==2005 ...==

* [[Dave Gomboc]], [[Michael Buro]], [[Tony Marsland]] ('''2005'''). ''Tuning evaluation functions by maximizing concordance''. Theoretical Computer Science, Volume 349, Issue 2, pp. 202-229, [http://www.cs.ualberta.ca/%7Emburo/ps/tcs-learn.pdf pdf]

* [[Martin Možina]] ('''2009'''). ''Argument Based Machine Learning'', PhD Thesis, [http://www.ailab.si/martin/mozina_phd.pdf pdf]

* [[David Silver]] ('''2009'''). ''Reinforcement Learning and Simulation-Based Search''. Ph.D. thesis, [[University of Alberta]]. [http://webdocs.cs.ualberta.ca/~silver/David_Silver/Publications_files/thesis.pdf pdf]

* [[Eli David|Omid David]], [[Jaap van den Herik]], [[Moshe Koppel]], [[Nathan S. Netanyahu]] ('''2009'''). ''Simulating Human Grandmasters: Evolution and Coevolution of Evaluation Functions''. [[ACM]] Genetic and Evolutionary Computation Conference ([http://www.sigevo.org/gecco-2009/ GECCO '09]), pp. 1483-1489, [https://en.wikipedia.org/wiki/Montreal Montreal], Canada, [http://www.omiddavid.com/pubs/gm-simul.pdf pdf]

* [[Eli David|Omid David]] ('''2009'''). ''Genetic Algorithms Based Learning for Evolving Intelligent Organisms''. Ph.D. thesis <ref>[[Dap Hartmann]] ('''2010'''). ''Mimicking the Black Box - Genetically evolving evaluation functions and search algorithms''. Review on Omid David's Ph.D. thesis, [[ICGA Journal#33_1|ICGA Journal, Vol. 33, No. 1]]</ref>

* [[Nur Merve Amil]], [[Nicolas Bredèche]], [[Christian Gagné]], [[Sylvain Gelly]], [[Marc Schoenauer]], [[Olivier Teytaud]] ('''2009'''). ''A Statistical Learning Perspective of Genetic Programming''. EuroGP 2009, [http://hal.inria.fr/docs/00/36/97/82/PDF/eurogp.pdf pdf]

* [[Richard Sutton]], [[Hamid Reza Maei]], [[Doina Precup]], [[Shalabh Bhatnagar]], [[David Silver]], [[Csaba Szepesvári]], [[Eric Wiewiora]] ('''2009'''). ''Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation''. In Proceedings of the 26th International Conference on Machine Learning (ICML-09). [http://www.sztaki.hu/~szcsaba/papers/GTD-ICML09.pdf pdf]

* [[Jacek Mańdziuk]] ('''2010'''). ''[http://link.springer.com/book/10.1007%2F978-3-642-11678-0 Knowledge-Free and Learning-Based Methods in Intelligent Game Playing]''. [http://link.springer.com/bookseries/7092 Studies in Computational Intelligence], Vol. 276, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]

* [[Joel Veness]], [[Kee Siong Ng]], [[Marcus Hutter]], [[David Silver]] ('''2010'''). ''Reinforcement Learning via AIXI Approximation''. Association for the Advancement of Artificial Intelligence (AAAI), [http://jveness.info/publications/veness_rl_via_aixi_approx.pdf pdf]

* [[Eli David|Omid David]], [[Moshe Koppel]], [[Nathan S. Netanyahu]] ('''2010'''). ''[http://www.springerlink.com/content/3346t8432n718821 Expert-Driven Genetic Algorithms for Simulating Evaluation Functions]''. [http://www.omiddavid.com/pubs/expert-driven.pdf pdf]

* [[Eli David|Omid David]], [[Nathan S. Netanyahu]], Yoav Rosenberg, Moshe Shimoni ('''2010'''). ''Genetic Algorithms for Automatic Classification of Moving Objects''. [[ACM]] Genetic and Evolutionary Computation Conference ([http://www.sigevo.org/gecco-2010/ GECCO '10]), [https://en.wikipedia.org/wiki/Portland,_Oregon Portland, OR], [http://www.omiddavid.com/pubs/object-classification.pdf pdf]

* [[Eli David|Omid David]], [[Moshe Koppel]], [[Nathan S. Netanyahu]] ('''2010'''). ''Genetic Algorithms for Automatic Search Tuning''. [[ICGA Journal#33_2|ICGA Journal, Vol. 33, No. 2]]

* [[Mesut Kirci]] ('''2010'''). ''Feature Learning using State Differences''. Master's thesis, Department of Computing Science, [[University of Alberta]], [http://repository.library.ualberta.ca/dspace/bitstream/10048/1011/1/kirci_mesut_spring+2010.pdf pdf] » [[General Game Playing]]

* [[Amine Bourki]], [[Matthieu Coulm]], [[Philippe Rolet]], [[Olivier Teytaud]], [[Paul Vayssière]] ('''2010'''). ''[http://hal.inria.fr/inria-00467796/en/ Parameter Tuning by Simple Regret Algorithms and Multiple Simultaneous Hypothesis Testing]''. [http://hal.inria.fr/docs/00/46/77/96/PDF/tosubmit.pdf pdf]

* [[Edward P. Manning]] ('''2010'''). ''[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=5409565 Using Resource-Limited Nash Memory to Improve an Othello Evaluation Function]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. 2, No. 1 » [[Othello]]

* [[Edward P. Manning]] ('''2010'''). ''[http://dl.acm.org/citation.cfm?id=1830667 Coevolution in a Large Search Space using Resource-limited Nash Memory]''. [http://www.informatik.uni-trier.de/~ley/db/conf/gecco/gecco2010.html#Manning10 GECCO '10] » [[Othello]]

* [[Marco Wiering]] ('''2010'''). ''Self-play and using an expert to learn to play backgammon with temporal difference learning''. [http://www.scirp.org/journal/jilsa/ Journal of Intelligent Learning Systems and Applications], Vol. 2, No. 2

'''2011'''

* [[Joel Veness]] ('''2011'''). ''Approximate Universal Artificial Intelligence and Self-Play Learning for Games''. Ph.D. thesis, [https://en.wikipedia.org/wiki/University_of_New_South_Wales University of New South Wales], supervisors: [[Kee Siong Ng]], [[Marcus Hutter]], [[Alan Blair]], [[William Uther]], [[John Lloyd]]; [http://jveness.info/publications/veness_phd_thesis_final.pdf pdf]

* [[Hamid Reza Maei]] ('''2011'''). ''Gradient Temporal-Difference Learning Algorithms''. Ph.D. thesis, [[University of Alberta]], advisor [[Richard Sutton]], [http://webdocs.cs.ualberta.ca/~sutton/papers/maei-thesis-2011.pdf pdf]

'''2012'''

* [[Marco Wiering]], [http://martijnvanotterlo.nl/ Martijn Van Otterlo] ('''2012'''). ''Reinforcement learning: State-of-the-art''. [http://link.springer.com/book/10.1007/978-3-642-27645-3 Adaptation, Learning, and Optimization, Vol. 12], [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]

: [[István Szita]] ('''2012'''). ''[http://link.springer.com/chapter/10.1007%2F978-3-642-27645-3_17 Reinforcement Learning in Games]''. Chapter 17

* [http://dblp.uni-trier.de/pers/hd/d/Dries:Sjoerd_van_den Sjoerd van den Dries], [[Marco Wiering]] ('''2012'''). ''[https://scholar.google.com/citations?view_op=view_citation&hl=en&user=xVas0I8AAAAJ&cstart=40&citation_for_view=xVas0I8AAAAJ:P5F9QuxV20EC Neural-fitted TD-leaf learning for playing Othello with structured neural networks]''. [[IEEE#NN|IEEE Transactions on Neural Networks and Learning Systems]], Vol. 23, No. 11

* [[Marcin Szubert]], [[Wojciech Jaśkowski]], [[Paweł Liskowski]], [[Krzysztof Krawiec]] ('''2013'''). ''Shaping Fitness Function for Evolutionary Learning of Game Strategies''. [http://www.informatik.uni-trier.de/~ley/db/conf/gecco/gecco2013.html#SzubertJLK13 GECCO 2013], [http://www.cs.put.poznan.pl/wjaskowski/pub/papers/szubert2013shaping.pdf pdf]

* [[Marcin Szubert]], [[Wojciech Jaśkowski]], [[Krzysztof Krawiec]] ('''2013'''). ''[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6504736 On Scalability, Generalization, and Hybridization of Coevolutionary Learning: a Case Study for Othello]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. 5, No. 3 » [[Othello]]

* [http://dblp.uni-trier.de/pers/hd/r/Ree:M=_van_der Michiel van der Ree], [[Marco Wiering]] ('''2013'''). ''Reinforcement Learning in the Game of Othello: Learning Against a Fixed Opponent and Learning from Self-Play''. [http://dblp.uni-trier.de/db/conf/adprl/adprl2013.html#ReeW13 ADPRL 2013]

* [http://dblp.uni-trier.de/pers/hd/b/Bom:Luuk Luuk Bom], [http://dblp.uni-trier.de/pers/hd/h/Henken:Ruud Ruud Henken], [[Marco Wiering]] ('''2013'''). ''Reinforcement Learning to Train Ms. Pac-Man Using Higher-order Action-relative Inputs''. [http://dblp.uni-trier.de/db/conf/adprl/adprl2013.html#BomHW13 ADPRL 2013] <ref>[https://en.wikipedia.org/wiki/Ms._Pac-Man Ms. Pac-Man from Wikipedia]</ref>

* [[Peter Auer]], [[Marcus Hutter]], [[Laurent Orseau]] ('''2013'''). ''[http://drops.dagstuhl.de/opus/volltexte/2013/4340/ Reinforcement Learning]''. [http://dblp.uni-trier.de/db/journals/dagstuhl-reports/dagstuhl-reports3.html#AuerHO13 Dagstuhl Reports, Vol. 3, No. 8], DOI: [http://drops.dagstuhl.de/opus/volltexte/2013/4340/ 10.4230/DagRep.3.8.1], URN: [http://drops.dagstuhl.de/opus/volltexte/2013/4340/ urn:nbn:de:0030-drops-43409]

* [[Igor Roizen]], [[Judea Pearl]] ('''2013'''). ''Learning Link-Probabilities in Causal Trees.'' [https://arxiv.org/abs/1304.3103 arXiv:1304.3103]

* [[Volodymyr Mnih]], [[Koray Kavukcuoglu]], [[David Silver]], [[Alex Graves]], [[Ioannis Antonoglou]], [[Daan Wierstra]], [[Martin Riedmiller]] ('''2013'''). ''Playing Atari with Deep Reinforcement Learning''. [http://arxiv.org/abs/1312.5602 arXiv:1312.5602] <ref>[http://www.nervanasys.com/demystifying-deep-reinforcement-learning/ Demystifying Deep Reinforcement Learning] by [http://www.nervanasys.com/author/tambet/ Tambet Matiisen], [http://www.nervanasys.com/ Nervana], December 21, 2015</ref> <ref>[http://www.google.com/patents/US20150100530 Patent US20150100530 - Methods and apparatus for reinforcement learning - Google Patents]</ref>

'''2014'''

* [[Eli David|Omid E. David]], [[Jaap van den Herik]], [[Moshe Koppel]], [[Nathan S. Netanyahu]] ('''2014'''). ''Genetic Algorithms for Evolving Computer Chess Programs''. [[IEEE#EC|IEEE Transactions on Evolutionary Computation]], [http://www.genetic-programming.org/hc2014/David-Paper.pdf pdf] <ref>[http://www.liacs.nl/nieuws/jaap-van-den-herik-wint-humies-award-2014/ Jaap van den Herik wint Humies Award 2014 - LIACS - Leiden Institute of Advanced Computer Science]</ref>

* [[Wojciech Jaśkowski]], [[Marcin Szubert]], [[Paweł Liskowski]] ('''2014'''). ''Multi-Criteria Comparison of Coevolution and Temporal Difference Learning on Othello''. [http://www.evostar.org/2014/ EvoApplications 2014], [http://www.springer.com/computer/theoretical+computer+science/book/978-3-662-45522-7 Springer, volume 8602] » [[Othello]]

* [[Marcin Szubert]], [[Wojciech Jaśkowski]] ('''2014'''). ''Temporal Difference Learning of N-Tuple Networks for the Game 2048''. [[IEEE#CIG|IEEE Conference on Computational Intelligence and Games]], [http://www.cs.put.poznan.pl/mszubert/pub/szubert2014cig.pdf pdf] <ref>[https://en.wikipedia.org/wiki/2048_%28video_game%29 2048 (video game) from Wikipedia]</ref>

* [[Ziyu Wang]], [[Nando de Freitas]], [[Marc Lanctot]] ('''2016'''). ''Dueling Network Architectures for Deep Reinforcement Learning''. [http://arxiv.org/abs/1511.06581 arXiv:1511.06581]

* [[Jialin Liu]], [[Olivier Teytaud]], [[Tristan Cazenave]] ('''2016'''). ''Fast seed-learning algorithms for games''. [[CG 2016]]

* [[Eli David|Omid E. David]], [[Nathan S. Netanyahu]], [[Lior Wolf]] ('''2016'''). ''[http://link.springer.com/chapter/10.1007%2F978-3-319-44781-0_11 DeepChess: End-to-End Deep Neural Network for Automatic Learning in Chess]''. [http://icann2016.org/ ICANN 2016], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], Vol. 9887, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer], [http://www.cs.tau.ac.il/~wolf/papers/deepchess.pdf pdf preprint] » [[DeepChess]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61748 DeepChess: Another deep-learning based chess program] by [[Matthew Lai]], [[CCC]], October 17, 2016</ref> <ref>[http://icann2016.org/index.php/conference-programme/recipients-of-the-best-paper-awards/ ICANN 2016 | Recipients of the best paper awards]</ref>

* [https://www.linkedin.com/in/ian-goodfellow-b7187213 Ian Goodfellow], [https://en.wikipedia.org/wiki/Yoshua_Bengio Yoshua Bengio], [https://www.linkedin.com/in/aaron-courville-53a63459 Aaron Courville] ('''2016'''). ''[http://www.deeplearningbook.org/ Deep Learning]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press]

* [[Max Jaderberg]], [[Volodymyr Mnih]], [[Wojciech Marian Czarnecki]], [[Tom Schaul]], [[Joel Z. Leibo]], [[David Silver]], [[Koray Kavukcuoglu]] ('''2016'''). ''Reinforcement Learning with Unsupervised Auxiliary Tasks''. [https://arxiv.org/abs/1611.05397v1 arXiv:1611.05397v1]

* [[Johannes Fürnkranz]] ('''2017'''). ''Machine Learning and Game Playing''. In [https://en.wikipedia.org/wiki/Claude_Sammut Claude Sammut], [https://en.wikipedia.org/wiki/Geoff_Webb Geoffrey I. Webb] (eds.) ('''2017'''). ''[https://link.springer.com/referencework/10.1007%2F978-1-4899-7687-1 Encyclopedia of Machine Learning and Data Mining]''. [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer], [https://en.wikipedia.org/wiki/Boston Boston, MA]

* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2017'''). ''Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm''. [https://arxiv.org/abs/1712.01815 arXiv:1712.01815] » [[AlphaZero]]

'''2018'''

* [[Arthur Guez]], [[Théophane Weber]], [[Ioannis Antonoglou]], [[Karen Simonyan]], [[Oriol Vinyals]], [[Daan Wierstra]], [[Rémi Munos]], [[David Silver]] ('''2018'''). ''Learning to Search with MCTSnets''. [https://arxiv.org/abs/1802.04697 arXiv:1802.04697] » [[Monte-Carlo Tree Search]]

* [[Matthia Sabatelli]], [[Francesco Bidoia]], [[Valeriu Codreanu]], [[Marco Wiering]] ('''2018'''). ''Learning to Evaluate Chess Positions with Deep Neural Networks and Limited Lookahead''. ICPRAM 2018, [http://www.ai.rug.nl/~mwiering/GROUP/ARTICLES/ICPRAM_CHESS_DNN_2018.pdf pdf]

=Forum Posts=

GerdIsenberg

