Latest revision as of 08:39, 27 May 2021

Marco Wiering ^[1]

Marco Alexander Wiering,
a Dutch mathematician, computer scientist, and assistant professor of artificial intelligence and cognitive engineering at the Faculty of Mathematics and Natural Sciences, University of Groningen, on a tenure track for associate professor, and until September 2007 assistant professor at Utrecht University. He received his Ph.D. on the topic of reinforcement learning from the University of Amsterdam in 1999; his thesis advisors were Frans Groen and Jürgen Schmidhuber. His research interests include artificial intelligence, machine learning, neural networks, object recognition, pattern recognition, evolutionary computation, robotics, game playing, multi-agent systems, time-series analysis, and computer vision ^[2].

Selected Publications

^[3] ^[4]

1995 ...

Marco Wiering (1995). Learning of Game Evaluation Functions with Hierarchical Neural Architectures. Master's thesis, University of Amsterdam, pdf
Marco Wiering, Jürgen Schmidhuber (1997). HQ-learning. Adaptive Behavior, Vol. 6, No. 2
Marco Wiering, Jürgen Schmidhuber (1998). Fast Online Q(λ). Machine Learning, Vol. 33, No. 1
Marco Wiering (1999). Explorations in Efficient Reinforcement Learning. Ph.D. thesis, University of Amsterdam, advisors Frans Groen and Jürgen Schmidhuber

2000 ...

Henk Mannen, Marco Wiering (2004). Learning to play chess using TD(λ)-learning with database games. Cognitive Artificial Intelligence, Utrecht University, Benelearn’04, pdf
Jan Peter Patist, Marco Wiering (2004). Learning to Play Draughts using Temporal Difference Learning with Neural Networks and Databases. Cognitive Artificial Intelligence, Utrecht University, Benelearn’04

2005 ...

Marco Wiering, Jan Peter Patist, Henk Mannen (2005). Learning to Play Board Games using Temporal Difference Methods. Technical Report, Utrecht University, UU-CS-2005-048, pdf
Marco Wiering (2005). QV(λ)-learning: A new on-policy reinforcement learning algorithm. Proceedings of the 7th European Workshop on Reinforcement Learning, pdf

2010 ...

Marco Wiering (2010). Self-play and using an expert to learn to play backgammon with temporal difference learning. Journal of Intelligent Learning Systems and Applications, Vol. 2, No. 2
Marco Wiering, Martijn Van Otterlo (eds.) (2012). Reinforcement learning: State-of-the-art. Adaptation, Learning, and Optimization, Vol. 12, Springer
Sjoerd van den Dries, Marco Wiering (2012). Neural-fitted TD-leaf learning for playing Othello with structured neural networks. IEEE Transactions on Neural Networks and Learning Systems, Vol. 23, No. 11
Michiel van der Ree, Marco Wiering (2013). Reinforcement Learning in the Game of Othello: Learning Against a Fixed Opponent and Learning from Self-Play. ADPRL 2013
Luuk Bom, Ruud Henken, Marco Wiering (2013). Reinforcement Learning to Train Ms. Pac-Man Using Higher-order Action-relative Inputs. ADPRL 2013 ^[5]

2015 ...

Matthia Sabatelli, Francesco Bidoia, Valeriu Codreanu, Marco Wiering (2018). Learning to Evaluate Chess Positions with Deep Neural Networks and Limited Lookahead. ICPRAM 2018, pdf
Matthia Sabatelli, Gilles Louppe, Pierre Geurts, Marco Wiering (2018). Deep Quality-Value (DQV) Learning. arXiv:1810.00368
Matthia Sabatelli, Gilles Louppe, Pierre Geurts, Marco Wiering (2019). Approximating two value functions instead of one: towards characterizing a new family of Deep Reinforcement Learning algorithms. arXiv:1909.01779

2020 ...

Matthia Sabatelli, Gilles Louppe, Pierre Geurts, Marco Wiering (2020). The Deep Quality-Value Family of Deep Reinforcement Learning Algorithms. IJCNN 2020 ^[6]

External Links

References

[1] Marco Wiering | Universität Groningen

[2] Marco Wiering's Home Page

[3] Marco Wiering's publications page

[4] : Marco Wiering

[5] Ms. Pac-Man from Wikipedia

[6] GitHub - paintception/Deep-Quality-Value-DQV-Learning-: DQV-Learning: a novel faster synchronous Deep Reinforcement Learning algorithm

@@ Line 14: / Line 14: @@
 * [[Marco Wiering]] ('''1999'''). ''Explorations in Efficient Reinforcement Learning''. Ph.D. thesis, [https://en.wikipedia.org/wiki/University_of_Amsterdam University of Amsterdam], advisors [[Mathematician#FGroen|Frans Groen]] and [[Jürgen Schmidhuber]]
 ==2000 ...==
-* [[Marco Wiering]] ('''2000'''). ''Multi-agent reinforcement learning for traffic light control''. ICML, [http://www.dcsc.tudelft.nl/~sc4081/assign/pap/Reinforcement_Learning.pdf pdf]
+* [[Henk Mannen]], [[Marco Wiering]] ('''2004'''). ''[https://www.semanticscholar.org/paper/Learning-to-Play-Chess-using-TD(lambda)-learning-Mannen-Wiering/00a6f81c8ebe8408c147841f26ed27eb13fb07f3 Learning to play chess using TD(λ)-learning with database games]''. Cognitive Artiﬁcial Intelligence, [https://en.wikipedia.org/wiki/Utrecht_University Utrecht University], Benelearn’04, [https://www.ai.rug.nl/~mwiering/GROUP/ARTICLES/learning-chess.pdf pdf]
-* [[Henk Mannen]], [[Marco Wiering]] ('''2004'''). ''Learning to play chess using TD(λ)-learning with database games''. [http://students.uu.nl/en/hum/cognitive-artificial-intelligence Cognitive Artiﬁcial Intelligence], [https://en.wikipedia.org/wiki/Utrecht_University Utrecht University], Benelearn’04
+* [https://dblp.org/pid/20/4400.html Jan Peter Patist], [[Marco Wiering]] ('''2004'''). ''Learning to Play Draughts using Temporal Difference Learning with Neural Networks and Databases''. [http://students.uu.nl/en/hum/cognitive-artificial-intelligence Cognitive Artiﬁcial Intelligence], [https://en.wikipedia.org/wiki/Utrecht_University Utrecht University], Benelearn’04
-* [http://dblp.uni-trier.de/pers/hd/p/Patist:Jan_Peter Jan Peter Patist], [[Marco Wiering]] ('''2004'''). ''Learning to Play Draughts using Temporal Difference Learning with Neural Networks and Databases''. [http://students.uu.nl/en/hum/cognitive-artificial-intelligence Cognitive Artiﬁcial Intelligence], [https://en.wikipedia.org/wiki/Utrecht_University Utrecht University], Benelearn’04
 ==2005 ...==
-* [[Marco Wiering]], [http://dblp.uni-trier.de/pers/hd/p/Patist:Jan_Peter Jan Peter Patist], [[Henk Mannen]] ('''2005'''). ''Learning to Play Board Games using Temporal Difference Methods''. Technical Report, [https://en.wikipedia.org/wiki/Utrecht_University Utrecht University], UU-CS-2005-048, [http://www.ai.rug.nl/~mwiering/GROUP/ARTICLES/learning_games_TR.pdf pdf]
+* [[Marco Wiering]], [https://dblp.org/pid/20/4400.html Jan Peter Patist], [[Henk Mannen]] ('''2005'''). ''[https://www.semanticscholar.org/paper/Learning-to-Play-Board-Games-using-Temporal-Methods-Wiering-Patist/7410e2bf16ed184db89f0e3acbbfdad473623b7a Learning to Play Board Games using Temporal Difference Methods]''. Technical Report, [https://en.wikipedia.org/wiki/Utrecht_University Utrecht University], UU-CS-2005-048, [http://webdoc.sub.gwdg.de/ebook/serien/ah/UU-CS/2005-048.pdf pdf]
 * [[Marco Wiering]] ('''2005'''). ''QV (λ)-learning: A new on-policy reinforcement learning algorithm''. Proceedings of the 7th European Workshop on Reinforcement Learning, [http://www.ai.rug.nl/~mwiering/GROUP/ARTICLES/QV_learning_ewrl.pdf pdf]
 ==2010 ...==
@@ Line 28: / Line 27: @@
 ==2015 ...==
 * [[Matthia Sabatelli]], [[Francesco Bidoia]], [[Valeriu Codreanu]], [[Marco Wiering]] ('''2018'''). ''Learning to Evaluate Chess Positions with Deep Neural Networks and Limited Lookahead''. ICPRAM 2018, [http://www.ai.rug.nl/~mwiering/GROUP/ARTICLES/ICPRAM_CHESS_DNN_2018.pdf pdf]
+* [[Matthia Sabatelli]], [https://github.com/glouppe Gilles Louppe], [https://scholar.google.com/citations?user=tyFTsmIAAAAJ&hl=en Pierre Geurts], [[Marco Wiering]] ('''2018'''). ''Deep Quality-Value (DQV) Learning''. [https://arxiv.org/abs/1810.00368 arXiv:1810.00368]
+* [[Matthia Sabatelli]], [https://github.com/glouppe Gilles Louppe], [https://scholar.google.com/citations?user=tyFTsmIAAAAJ&hl=en Pierre Geurts], [[Marco Wiering]] ('''2019'''). ''Approximating two value functions instead of one: towards characterizing a new family of Deep Reinforcement Learning algorithms''. [https://arxiv.org/abs/1909.01779 arXiv:1909.01779]
+==2020 ...==
+* [[Matthia Sabatelli]], [https://github.com/glouppe Gilles Louppe], [https://scholar.google.com/citations?user=tyFTsmIAAAAJ&hl=en Pierre Geurts], [[Marco Wiering]] ('''2020'''). ''The Deep Quality-Value Family of Deep Reinforcement Learning Algorithms''. [https://dblp.org/db/conf/ijcnn/ijcnn2020.html#SabatelliLGW20 IJCNN 2020] <ref>[https://github.com/paintception/Deep-Quality-Value-DQV-Learning- GitHub - paintception/Deep-Quality-Value-DQV-Learning-: DQV-Learning: a novel faster synchronous Deep Reinforcement Learning algorithm]</ref>
 =External Links=

Difference between revisions of "Marco Wiering"
