Changes


Arthur Guez

15 bytes added, 12:10, 4 June 2018
'''Arthur Guez''' is a Canadian computer scientist and neuroscientist, currently a researcher at [[Google]] [[DeepMind]] with expertise in [[Learning|machine learning]], in particular [[Deep Learning|deep learning]], and involved in the [[AlphaGo]] and [[AlphaZero]] projects. He received an M.Sc. in machine learning from [[McGill University]] in 2010 and a Ph.D. from the ''Gatsby Computational Neuroscience Unit'' at [https://en.wikipedia.org/wiki/University_College_London University College London] in 2015 with the thesis ''Sample-based Search Methods for Bayes-Adaptive Planning'', supervised by [[Peter Dayan]] and [[David Silver]].
=Ph.D. Thesis=
In his Ph.D. thesis, Arthur Guez elaborates on [[Search|search]] and [[Planning|planning]] methods in the face of [https://en.wikipedia.org/wiki/Uncertainty uncertainty] about the environment, which induces the [https://en.wikipedia.org/wiki/Exploration exploration] versus [https://en.wikipedia.org/wiki/Exploitation exploitation] trade-off of an [https://en.wikipedia.org/wiki/Agent-based_model agent-based model]. To [https://en.wikipedia.org/wiki/Optimization_problem optimize] the return, the agent maintains a [https://en.wikipedia.org/wiki/Posterior_probability posterior distribution] over possible environments and considers all possible future paths. This optimization is equivalent to solving a [https://en.wikipedia.org/wiki/Markov_decision_process Markov decision process] (MDP) whose hyperstate comprises the agent’s beliefs about the environment as well as its current state in that environment. The corresponding process is called a [https://en.wikipedia.org/wiki/Bayes%27_theorem Bayes-Adaptive] MDP (BAMDP), and the thesis addresses it with a tailored [[Monte-Carlo Tree Search|Monte-Carlo tree search]]. In ''historical notes on Bayesian Adaptive control'', Arthur Guez mentions [[Mathematician#AWald|Abraham Wald's]] [[Match Statistics#SPRT|Sequential Probability Ratio Test (SPRT)]] <ref>[[Mathematician#AWald|Abraham Wald]] ('''1945'''). ''Sequential Tests of Statistical Hypotheses''. [https://en.wikipedia.org/wiki/Annals_of_Mathematical_Statistics Annals of Mathematical Statistics], Vol. 16, No. 2, [https://en.wikipedia.org/wiki/Digital_object_identifier doi]: [http://projecteuclid.org/euclid.aoms/1177731118 10.1214/aoms/1177731118]</ref>, and notes that [[Alan Turing]], assisted by [[Jack Good]], used a similar sequential testing technique to help decipher [https://en.wikipedia.org/wiki/Enigma_machine Enigma codes] at [https://en.wikipedia.org/wiki/Bletchley_Park Bletchley Park] <ref>[[Jack Good]] ('''1979'''). ''[https://www.jstor.org/stable/2335677 Studies in the history of probability and statistics. XXXVII A. M. Turing’s statistical work in World War II]''. [https://en.wikipedia.org/wiki/Biometrika Biometrika], Vol. 66, No. 2</ref> <ref>[[Arthur Guez]] ('''2015'''). ''Sample-based Search Methods for Bayes-Adaptive Planning''. Ph.D. thesis, Gatsby Computational Neuroscience Unit, [https://en.wikipedia.org/wiki/University_College_London University College London], [http://www.gatsby.ucl.ac.uk/~aguez/files/guez_phdthesis2015.pdf pdf]</ref>.
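
The hyperstate idea from the thesis summary, a pair of the current state and the agent's belief about the environment, can be illustrated with a minimal Python sketch. It is not taken from the thesis: the Dirichlet-count belief over transitions, the class name <code>DirichletBelief</code> and the function <code>hyperstate_step</code> are assumptions made purely for illustration.

```python
import random


class DirichletBelief:
    """Belief over unknown transition probabilities, kept as Dirichlet counts.

    Assumption for illustration: a symmetric Dirichlet prior per (state, action).
    """

    def __init__(self, n_states, prior=1.0):
        self.n_states = n_states
        self.prior = prior
        self.counts = {}  # (state, action, next_state) -> number of observations

    def _alphas(self, state, action):
        # Posterior Dirichlet parameters: prior pseudo-count plus observed count.
        return [self.prior + self.counts.get((state, action, t), 0.0)
                for t in range(self.n_states)]

    def update(self, state, action, next_state):
        # Bayesian update after observing one transition.
        key = (state, action, next_state)
        self.counts[key] = self.counts.get(key, 0.0) + 1.0

    def predictive(self, state, action):
        # Posterior mean of the transition distribution for (state, action).
        alphas = self._alphas(state, action)
        total = sum(alphas)
        return [a / total for a in alphas]

    def sample_transition(self, state, action, rng):
        # Draw one plausible transition distribution from the posterior
        # (a Dirichlet sample built from normalized Gamma draws).
        draws = [rng.gammavariate(a, 1.0) for a in self._alphas(state, action)]
        total = sum(draws)
        return [d / total for d in draws]


def hyperstate_step(state, belief, action, next_state):
    """Advance the hyperstate (state, belief) after an observed transition."""
    belief.update(state, action, next_state)
    return next_state, belief


belief = DirichletBelief(n_states=2)
state, belief = hyperstate_step(0, belief, action=0, next_state=1)
print(state, belief.predictive(0, 0))
```

In this sketch, planning over hyperstates means that each simulated transition changes the belief as well as the state, which is how exploration acquires value in the Bayes-adaptive setting.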
