Changes

Marco Wiering

1,263 bytes added, 08:39, 27 May 2021
* [[Marco Wiering]] ('''1999'''). ''Explorations in Efficient Reinforcement Learning''. Ph.D. thesis, [https://en.wikipedia.org/wiki/University_of_Amsterdam University of Amsterdam], advisors [[Mathematician#FGroen|Frans Groen]] and [[Jürgen Schmidhuber]]
==2000 ...==
* [[Marco Wiering]] ('''2000'''). ''Multi-agent reinforcement learning for traffic light control''. ICML, [http://www.dcsc.tudelft.nl/~sc4081/assign/pap/Reinforcement_Learning.pdf pdf]
* [[Henk Mannen]], [[Marco Wiering]] ('''2004'''). ''[https://www.semanticscholar.org/paper/Learning-to-Play-Chess-using-TD(lambda)-learning-Mannen-Wiering/00a6f81c8ebe8408c147841f26ed27eb13fb07f3 Learning to play chess using TD(λ)-learning with database games]''. [http://students.uu.nl/en/hum/cognitive-artificial-intelligence Cognitive Artificial Intelligence], [https://en.wikipedia.org/wiki/Utrecht_University Utrecht University], Benelearn’04, [https://www.ai.rug.nl/~mwiering/GROUP/ARTICLES/learning-chess.pdf pdf]
* [https://dblp.org/pid/20/4400.html Jan Peter Patist], [[Marco Wiering]] ('''2004'''). ''Learning to Play Draughts using Temporal Difference Learning with Neural Networks and Databases''. [http://students.uu.nl/en/hum/cognitive-artificial-intelligence Cognitive Artificial Intelligence], [https://en.wikipedia.org/wiki/Utrecht_University Utrecht University], Benelearn’04
==2005 ...==
* [[Marco Wiering]], [https://dblp.org/pid/20/4400.html Jan Peter Patist], [[Henk Mannen]] ('''2005'''). ''[https://www.semanticscholar.org/paper/Learning-to-Play-Board-Games-using-Temporal-Methods-Wiering-Patist/7410e2bf16ed184db89f0e3acbbfdad473623b7a Learning to Play Board Games using Temporal Difference Methods]''. Technical Report, [https://en.wikipedia.org/wiki/Utrecht_University Utrecht University], UU-CS-2005-048, [http://webdoc.sub.gwdg.de/ebook/serien/ah/UU-CS/2005-048.pdf pdf]
* [[Marco Wiering]] ('''2005'''). ''QV (λ)-learning: A new on-policy reinforcement learning algorithm''. Proceedings of the 7th European Workshop on Reinforcement Learning, [http://www.ai.rug.nl/~mwiering/GROUP/ARTICLES/QV_learning_ewrl.pdf pdf]
==2010 ...==
==2015 ...==
* [[Matthia Sabatelli]], [[Francesco Bidoia]], [[Valeriu Codreanu]], [[Marco Wiering]] ('''2018'''). ''Learning to Evaluate Chess Positions with Deep Neural Networks and Limited Lookahead''. ICPRAM 2018, [http://www.ai.rug.nl/~mwiering/GROUP/ARTICLES/ICPRAM_CHESS_DNN_2018.pdf pdf]
* [[Matthia Sabatelli]], [https://github.com/glouppe Gilles Louppe], [https://scholar.google.com/citations?user=tyFTsmIAAAAJ&hl=en Pierre Geurts], [[Marco Wiering]] ('''2018'''). ''Deep Quality-Value (DQV) Learning''. [https://arxiv.org/abs/1810.00368 arXiv:1810.00368]
* [[Matthia Sabatelli]], [https://github.com/glouppe Gilles Louppe], [https://scholar.google.com/citations?user=tyFTsmIAAAAJ&hl=en Pierre Geurts], [[Marco Wiering]] ('''2019'''). ''Approximating two value functions instead of one: towards characterizing a new family of Deep Reinforcement Learning algorithms''. [https://arxiv.org/abs/1909.01779 arXiv:1909.01779]
==2020 ...==
* [[Matthia Sabatelli]], [https://github.com/glouppe Gilles Louppe], [https://scholar.google.com/citations?user=tyFTsmIAAAAJ&hl=en Pierre Geurts], [[Marco Wiering]] ('''2020'''). ''The Deep Quality-Value Family of Deep Reinforcement Learning Algorithms''. [https://dblp.org/db/conf/ijcnn/ijcnn2020.html#SabatelliLGW20 IJCNN 2020] <ref>[https://github.com/paintception/Deep-Quality-Value-DQV-Learning- GitHub - paintception/Deep-Quality-Value-DQV-Learning-: DQV-Learning: a novel faster synchronous Deep Reinforcement Learning algorithm]</ref>
=External Links=
