Difference between revisions of "Peter Dayan"

 
(One intermediate revision by the same user not shown)
Line 9: Line 9:
  
 
=Work=
 
Peter Dayan's work has been influential in several fields impinging on [[Cognition|cognitive science]], including [[Learning|machine learning]], [https://en.wikipedia.org/wiki/Mathematical_statistics mathematical statistics], [https://en.wikipedia.org/wiki/Neuroscience neuroscience] and [[Psychology|psychology]] - he has articulated a view in which [[Neural Networks|neural computation]] is akin to a [https://en.wikipedia.org/wiki/Bayesian_inference Bayesian inference] process <ref>[https://cognitivesciencesociety.org/rumelhart-prize/  2012 Recipient Peter Dayan] | [https://en.wikipedia.org/wiki/Rumelhart_Prize The David E. Rumelhart Prize 2012]</ref>. His research centers around [[Supervised Learning|self-supervised learning]], [[Reinforcement Learning|reinforcement learning]], [[Temporal Difference Learning|temporal difference learning]] and [https://en.wikipedia.org/wiki/Neural_coding#Population_coding population coding].  
+
Peter Dayan's work has been influential in several fields impinging on [[Cognition|cognitive science]], including [[Learning|machine learning]], [https://en.wikipedia.org/wiki/Mathematical_statistics mathematical statistics], [https://en.wikipedia.org/wiki/Neuroscience neuroscience] and [[Psychology|psychology]] - he has articulated a view in which [[Neural Networks|neural computation]] is akin to a [https://en.wikipedia.org/wiki/Bayesian_inference Bayesian inference] process <ref>[https://cognitivesciencesociety.org/rumelhart-prize/  2012 Recipient Peter Dayan] | [https://en.wikipedia.org/wiki/Rumelhart_Prize The David E. Rumelhart Prize 2012]</ref>. His research centers around [[Supervised Learning|self-supervised learning]], [[Reinforcement Learning|reinforcement learning]], [[Temporal Difference Learning|temporal difference learning]], [https://en.wikipedia.org/wiki/Neural_coding#Population_coding population coding] and [[Monte-Carlo Tree Search|Monte-Carlo tree search]]. He researched and published on [https://en.wikipedia.org/wiki/Q-learning Q-learning] with [[Chris Watkins]] <ref> [[Chris Watkins]], [[Peter Dayan]] ('''1992'''). ''[http://www.gatsby.ucl.ac.uk/~dayan/papers/wd92.html Q-learning]''. [https://en.wikipedia.org/wiki/Machine_Learning_(journal) Machine Learning], Vol. 8, No. 2</ref>,  
He researched and published on [https://en.wikipedia.org/wiki/Q-learning Q-learning] with [[Chris Watkins]] <ref> [[Chris Watkins]], [[Peter Dayan]] ('''1992'''). ''[http://www.gatsby.ucl.ac.uk/~dayan/papers/wd92.html Q-learning]''. [https://en.wikipedia.org/wiki/Machine_Learning_(journal) Machine Learning], Vol. 8, No. 2</ref>,  
 
 
and provided a proof of convergence of [[Temporal Difference Learning#TDLamba|TD(λ)]] for arbitrary λ <ref>[[Peter Dayan]] ('''1992'''). ''[https://link.springer.com/article/10.1023/A:1022632907294 The convergence of TD (λ) for general λ]''. [https://en.wikipedia.org/wiki/Machine_Learning_(journal) Machine Learning], Vol. 8, No. 3</ref>.  
 
  
Line 23: Line 22:
 
* [[Chris Watkins]], [[Peter Dayan]] ('''1992'''). ''[http://www.gatsby.ucl.ac.uk/~dayan/papers/wd92.html Q-learning]''. [https://en.wikipedia.org/wiki/Machine_Learning_(journal) Machine Learning], Vol. 8, No. 2
 
 
* [[Peter Dayan]] ('''1992'''). ''[https://www.researchgate.net/publication/227208155_The_Convergence_of_TDl_for_General_l The convergence of TD (λ) for general λ]''. [https://en.wikipedia.org/wiki/Machine_Learning_(journal) Machine Learning], Vol. 8, No. 3
 
* [[Peter Dayan]], [[Mathematician#GEHinton|Geoffrey E. Hinton]] ('''1992'''). ''Feudal reinforcement learning''. [https://papers.nips.cc/book/advances-in-neural-information-processing-systems-5-1992 NIPS 1992], [http://www.gatsby.ucl.ac.uk/~Dayan/papers/dh93.pdf pdf]
+
* [[Peter Dayan]], [[Mathematician#GEHinton|Geoffrey E. Hinton]] ('''1992'''). ''[https://papers.nips.cc/paper/714-feudal-reinforcement-learning Feudal reinforcement learning]''. [https://papers.nips.cc/book/advances-in-neural-information-processing-systems-5-1992 NIPS 1992]
 
* [[Peter Dayan]] ('''1993'''). ''Improving generalisation for temporal difference learning: The successor representation''. [https://en.wikipedia.org/wiki/Neural_Computation_(journal) Neural Computation], Vol. 5, [http://www.gatsby.ucl.ac.uk/~dayan/papers/sr93.pdf pdf]
 
 
* [[Nicol N. Schraudolph]], [[Peter Dayan]], [[Terrence J. Sejnowski]] ('''1993'''). ''[https://papers.nips.cc/paper/820-temporal-difference-learning-of-position-evaluation-in-the-game-of-go Temporal Difference Learning of Position Evaluation in the Game of Go]''. [https://papers.nips.cc/book/advances-in-neural-information-processing-systems-6-1993 NIPS 1993]
 

Latest revision as of 20:35, 11 June 2019


Peter Dayan [1]

Peter Dayan,
a British mathematician, computer scientist and neuroscientist, and director at the Max Planck Institute for Biological Cybernetics in Tübingen, Germany. Since early 2019, he has also been affiliated with the SMARTStart training program of the Bernstein Network Computational Neuroscience [2] [3]. From 1998 until 2018, he was professor of computational neuroscience at University College London, and director of UCL's Gatsby Computational Neuroscience Unit [4].

Peter Dayan obtained a B.Sc. in mathematics from the University of Cambridge and a Ph.D. in artificial intelligence from the University of Edinburgh under David Wallace, with a focus on Bayesian network and neural network models of machine learning [5]. He was a postdoctoral researcher at the Salk Institute for Biological Studies, working with Terrence J. Sejnowski, and at the University of Toronto with Geoffrey E. Hinton, and was also an assistant professor at MIT before relocating to UCL.

Work

Peter Dayan's work has been influential in several fields impinging on cognitive science, including machine learning, mathematical statistics, neuroscience and psychology; he has articulated a view in which neural computation is akin to a Bayesian inference process [6]. His research centers on self-supervised learning, reinforcement learning, temporal difference learning, population coding and Monte-Carlo tree search. He researched and published on Q-learning with Chris Watkins [7], and provided a proof of convergence of TD(λ) for arbitrary λ [8].

Learning Go

Along with Nicol N. Schraudolph and Terrence J. Sejnowski, Peter Dayan researched and published on the use of temporal difference learning to evaluate positions in Go [9] [10].

Selected Publications

[11]

1990 ...

2000 ...

2010 ...

External Links

References
