'''[[Main Page|Home]] * [[People]] * Ari Weinstein'''

'''Ari Weinstein''',<br/>
an American computer scientist at [[Google]] [[DeepMind]]. He holds a Ph.D. from [https://en.wikipedia.org/wiki/Rutgers_University Rutgers University], where he was advised by [[Michael L. Littman]], for the thesis ''Local Planning For Continuous Markov Decision Processes'', covering algorithms that create plans to maximize a numeric reward over time. While general formulations of this problem in terms of
[[Reinforcement Learning|reinforcement learning]] have traditionally been restricted to small discrete domains, Weinstein's thesis addresses both continuous and high-dimensional domains, with simulations of swimming, riding a bicycle, and walking as concrete examples.

<span id="FSSS-Minimax"></span>
=FSSS-Minimax=
In the paper ''Rollout-based Game-tree Search Outprunes Traditional Alpha-beta'' <ref>[[Ari Weinstein]], [[Michael L. Littman]], [[Sergiu Goschin]] ('''2013'''). ''[http://proceedings.mlr.press/v24/weinstein12a.html Rollout-based Game-tree Search Outprunes Traditional Alpha-beta]''. [http://proceedings.mlr.press/ PMLR], Vol. 24</ref>, written with [[Sergiu Goschin]] and his advisor [[Michael L. Littman]], Weinstein introduced the rollout-based ''FSSS'' (Forward-Search Sparse Sampling) <ref>[[Thomas J. Walsh]], [[Sergiu Goschin]], [[Michael L. Littman]] ('''2010'''). ''Integrating sample-based planning and model-based reinforcement learning.'' [[Conferences#AAAI-2010|AAAI-2010]], [https://pdfs.semanticscholar.org/bdc9/bfb6ecc6fb5afb684df03d7220c46ebdbf4e.pdf pdf]</ref> applied to game-tree [[Search|search]], outpruning [[Alpha-Beta|alpha-beta]] both empirically and formally. FSSS-Minimax visits only parts of the tree that alpha-beta visits, and in terms of related work is similar to the ''Score Bounded'' [[Monte-Carlo Tree Search]] introduced by [[Tristan Cazenave]] and [[Abdallah Saffidine]] <ref>[[Tristan Cazenave]], [[Abdallah Saffidine]] ('''2010'''). ''Score Bounded Monte-Carlo Tree Search''. [[CG 2010]], [http://www.lamsade.dauphine.fr/%7Ecazenave/papers/mcsolver.pdf pdf]</ref>.

Recently, rollout-based planning and search methods have emerged as an alternative to traditional tree-search methods. The fundamental operation in rollout-based tree search is the generation of trajectories in the search tree from root to leaf. Game-playing programs based on Monte-Carlo rollout methods such as "[[UCT]]" have proven remarkably effective at using information from trajectories to make state-of-the-art decisions at the root. In this paper, we show that trajectories can be used to prune more aggressively than classical alpha-beta search. We modify a rollout-based method, FSSS, to allow for use in game-tree search and show it outprunes alpha-beta both empirically and formally.

While FSSS-Minimax is guaranteed to never expand more leaves than alpha-beta, its [[Best-First|best-first]] approach comes at the price of higher memory requirements and computational overhead per node visited.
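The core rollout mechanism can be illustrated with a minimal Python sketch (an illustration of the bound-tightening idea, not the authors' implementation; tie-breaking rules and the sparse-sampling machinery of full FSSS are omitted): each node keeps an interval [lo, hi] bounding its minimax value, each trajectory descends from the root to an unsolved leaf, and the leaf's exact value is backed up to tighten the ancestors' bounds. Solved subtrees are never revisited, which is how leaves are pruned.

```python
import math

class Node:
    """Game-tree node; a node with no children is a leaf with a known payoff."""
    def __init__(self, children=None, value=None):
        self.children = children or []
        self.value = value
        self.lo, self.hi = -math.inf, math.inf  # bounds on the minimax value

def fsss_minimax(root):
    """Run root-to-leaf trajectories, tightening [lo, hi] bounds,
    until the root's value is decided. Returns (value, leaves expanded)."""
    leaves = 0
    while root.lo < root.hi:
        # Descend: at max nodes follow the largest upper bound,
        # at min nodes the smallest lower bound, skipping solved children.
        path, node = [root], root
        while node.children:
            is_max = (len(path) % 2 == 1)  # the root (depth 0) is a max node
            open_kids = [c for c in node.children if c.lo < c.hi]
            node = (max(open_kids, key=lambda c: c.hi) if is_max
                    else min(open_kids, key=lambda c: c.lo))
            path.append(node)
        node.lo = node.hi = node.value     # expand (solve) the leaf
        leaves += 1
        # Back up: recompute each ancestor's bounds from its children.
        for depth in range(len(path) - 2, -1, -1):
            n = path[depth]
            agg = max if depth % 2 == 0 else min
            n.lo = agg(c.lo for c in n.children)
            n.hi = agg(c.hi for c in n.children)
    return root.lo, leaves
```

On a depth-two tree with min children {3, 5} and {2, 9}, the sketch settles on the minimax value 3 after expanding three leaves, never touching the leaf with value 9; this is the same leaf alpha-beta would cut off.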

=Selected Publications=
<ref>[http://dblp.uni-trier.de/pers/hd/w/Weinstein:Ari dblp: Ari Weinstein]</ref> <ref>[http://cs.brown.edu/~mlittman/theses/ Doctoral Dissertations Advised by Michael Littman]</ref>
* [[Ari Weinstein]], [[Michael L. Littman]] ('''2012'''). ''[https://www.aaai.org/ocs/index.php/ICAPS/ICAPS12/paper/view/4697 Bandit-Based Planning and Learning in Continuous-Action Markov Decision Processes]''. [http://dblp.uni-trier.de/db/conf/aips/icaps2012.html ICAPS 2012]
* [[Ari Weinstein]], [[Michael L. Littman]], [[Sergiu Goschin]] ('''2013'''). ''[http://proceedings.mlr.press/v24/weinstein12a.html Rollout-based Game-tree Search Outprunes Traditional Alpha-beta]''. [http://proceedings.mlr.press/ PMLR], Vol. 24 » [[Monte-Carlo Tree Search|MCTS]], [[UCT]]
* [[Sergiu Goschin]], [[Ari Weinstein]], [[Michael L. Littman]] ('''2013'''). ''The Cross-Entropy Method Optimizes for Quantiles''. [http://dblp.uni-trier.de/db/conf/icml/icml2013.html ICML 2013]
* [[Ari Weinstein]], [[Michael L. Littman]] ('''2013'''). ''Open-Loop Planning in Large-Scale Stochastic Domains''. [[Conferences#AAAI-2013|AAAI-2013]]
* [[Ari Weinstein]] ('''2013'''). ''Local Planning For Continuous Markov Decision Processes''. Ph.D. thesis, [https://en.wikipedia.org/wiki/Rutgers_University Rutgers University], advisor [[Michael L. Littman]], [http://cs.brown.edu/~mlittman/theses/weinstein.pdf pdf]
* [[Ari Weinstein]], [[Matthew Botvinick]] ('''2017'''). ''Structure Learning in Motor Control: A Deep Reinforcement Learning Model''. [https://arxiv.org/abs/1706.06827 arXiv:1706.06827]

=External Links=
* [https://scholar.google.com/citations?user=MnUboHYAAAAJ&hl=en Ari Weinstein - Google Scholar Citations]
* [https://genealogy.math.ndsu.nodak.edu/id.php?id=186285 Ari Weinstein - The Mathematics Genealogy Project]

=References=
<references />

'''[[People|Up one Level]]'''
