Peter Dayan

52 bytes added, 20:35, 11 June 2019
no edit summary
=Work=
Peter Dayan's work has been influential in several fields impinging on [[Cognition|cognitive science]], including [[Learning|machine learning]], [https://en.wikipedia.org/wiki/Mathematical_statistics mathematical statistics], [https://en.wikipedia.org/wiki/Neuroscience neuroscience] and [[Psychology|psychology]]. He has articulated a view in which [[Neural Networks|neural computation]] is akin to a [https://en.wikipedia.org/wiki/Bayesian_inference Bayesian inference] process <ref>[https://cognitivesciencesociety.org/rumelhart-prize/ 2012 Recipient Peter Dayan] | [https://en.wikipedia.org/wiki/Rumelhart_Prize The David E. Rumelhart Prize 2012]</ref>. His research centers on [[Supervised Learning|self-supervised learning]], [[Reinforcement Learning|reinforcement learning]], [[Temporal Difference Learning|temporal difference learning]], [https://en.wikipedia.org/wiki/Neural_coding#Population_coding population coding] and [[Monte-Carlo Tree Search|Monte-Carlo tree search]]. He researched and published on [https://en.wikipedia.org/wiki/Q-learning Q-learning] with [[Chris Watkins]] <ref>[[Chris Watkins]], [[Peter Dayan]] ('''1992'''). ''[http://www.gatsby.ucl.ac.uk/~dayan/papers/wd92.html Q-learning]''. [https://en.wikipedia.org/wiki/Machine_Learning_(journal) Machine Learning], Vol. 8, No. 2</ref>,
and provided a proof of convergence of [[Temporal Difference Learning#TDLamba|TD(λ)]] for arbitrary λ <ref>[[Peter Dayan]] ('''1992'''). ''[https://link.springer.com/article/10.1023/A:1022632907294 The convergence of TD (λ) for general λ]''. [https://en.wikipedia.org/wiki/Machine_Learning_(journal) Machine Learning], Vol. 8, No. 3</ref>.
