
Peter Dayan

65 bytes added, 21:35, 11 June 2019
'''[[Main Page|Home]] * [[People]] * Peter Dayan'''
[[FILE:Peter Dayan Royal Society.jpg|border|right|thumb|240px| Peter Dayan <ref>Peter Dayan at the [https://en.wikipedia.org/wiki/Royal_Society Royal Society] [https://en.wikipedia.org/wiki/Fellow_of_the_Royal_Society#Admission admissions day], [https://en.wikipedia.org/wiki/London London], July 13, 2018, by [https://commons.wikimedia.org/wiki/User:Duncan.Hull Duncan.Hull], [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons]</ref> ]]
'''Peter Dayan''',<br/>
=Work=
Peter Dayan's work has been influential in several fields bearing on [[Cognition|cognitive science]], including [[Learning|machine learning]], [https://en.wikipedia.org/wiki/Mathematical_statistics mathematical statistics], [https://en.wikipedia.org/wiki/Neuroscience neuroscience] and [[Psychology|psychology]]. He has articulated a view in which [[Neural Networks|neural computation]] is akin to a [https://en.wikipedia.org/wiki/Bayesian_inference Bayesian inference] process <ref>[https://cognitivesciencesociety.org/rumelhart-prize/ 2012 Recipient Peter Dayan] | [https://en.wikipedia.org/wiki/Rumelhart_Prize The David E. Rumelhart Prize 2012]</ref>. His research centers on [[Supervised Learning|self-supervised learning]], [[Reinforcement Learning|reinforcement learning]], [[Temporal Difference Learning|temporal difference learning]], [https://en.wikipedia.org/wiki/Neural_coding#Population_coding population coding] and [[Monte-Carlo Tree Search|Monte-Carlo tree search]]. He researched and published on [https://en.wikipedia.org/wiki/Q-learning Q-learning] with [[Chris Watkins]] <ref>[[Chris Watkins]], [[Peter Dayan]] ('''1992'''). ''[http://www.gatsby.ucl.ac.uk/~dayan/papers/wd92.html Q-learning]''. [https://en.wikipedia.org/wiki/Machine_Learning_(journal) Machine Learning], Vol. 8, No. 2</ref>,
and provided a proof of convergence of [[Temporal Difference Learning#TDLamba|TD(λ)]] for arbitrary λ <ref>[[Peter Dayan]] ('''1992'''). ''[https://link.springer.com/article/10.1023/A:1022632907294 The convergence of TD (λ) for general λ]''. [https://en.wikipedia.org/wiki/Machine_Learning_(journal) Machine Learning], Vol. 8, No. 3</ref>.
* [[Chris Watkins]], [[Peter Dayan]] ('''1992'''). ''[http://www.gatsby.ucl.ac.uk/~dayan/papers/wd92.html Q-learning]''. [https://en.wikipedia.org/wiki/Machine_Learning_(journal) Machine Learning], Vol. 8, No. 2
* [[Peter Dayan]] ('''1992'''). ''[https://www.researchgate.net/publication/227208155_The_Convergence_of_TDl_for_General_l The convergence of TD (λ) for general λ]''. [https://en.wikipedia.org/wiki/Machine_Learning_(journal) Machine Learning], Vol. 8, No. 3
* [[Peter Dayan]], [[Mathematician#GEHinton|Geoffrey E. Hinton]] ('''1992'''). ''[https://papers.nips.cc/paper/714-feudal-reinforcement-learning Feudal reinforcement learning]''. [https://papers.nips.cc/book/advances-in-neural-information-processing-systems-5-1992 NIPS 1992], [http://www.gatsby.ucl.ac.uk/~Dayan/papers/dh93.pdf pdf]
* [[Peter Dayan]] ('''1993'''). ''Improving generalisation for temporal difference learning: The successor representation''. [https://en.wikipedia.org/wiki/Neural_Computation_(journal) Neural Computation], Vol. 5, [http://www.gatsby.ucl.ac.uk/~dayan/papers/sr93.pdf pdf]
* [[Nicol N. Schraudolph]], [[Peter Dayan]], [[Terrence J. Sejnowski]] ('''1993'''). ''[https://papers.nips.cc/paper/820-temporal-difference-learning-of-position-evaluation-in-the-game-of-go Temporal Difference Learning of Position Evaluation in the Game of Go]''. [https://papers.nips.cc/book/advances-in-neural-information-processing-systems-6-1993 NIPS 1993]
