Latest revision as of 14:20, 7 March 2019
Yasuhiro Osaki is a Japanese software engineer and computer scientist at Sony. Until 2010, he was affiliated with the laboratory of Professor Yoshiyuki Kotani at the Tokyo University of Agriculture and Technology.
TD(λ)-MC
Yasuhiro Osaki's research focused on reinforcement learning and the application of TD(λ) based on Monte-Carlo simulations in computer games. The program committee of the 12th Game Programming Workshop 2007 gave him the best presentation award for TD(λ)-MC, a reinforcement learning approach that uses Monte-Carlo simulations [2] [3].
Selected Publications
- Yasuhiro Osaki, Kazutomo Shibahara, Yasuhiro Tajima, Yoshiyuki Kotani (2007). Reinforcement Learning of Evaluation Functions Using Temporal Difference-Monte Carlo learning method. 12th Game Programming Workshop
- Yasuhiro Osaki, Kazutomo Shibahara, Yasuhiro Tajima, Yoshiyuki Kotani (2008). An Othello Evaluation Function Based on Temporal Difference Learning using Probability of Winning. CIG'08, pdf
- Yasuhiro Osaki, Yoshiyuki Kotani (2009). A Learning Method of Evaluation Function Based on Selective Simulations. 14th Game Programming Workshop
External Links
References
- [1] YasuhiroOsaki (Yasuhiro Osaki) · GitHub
- [2] Yasuhiro Osaki, Kazutomo Shibahara, Yasuhiro Tajima, Yoshiyuki Kotani (2007). Reinforcement Learning of Evaluation Functions Using Temporal Difference-Monte Carlo learning method. 12th Game Programming Workshop
- [3] TD-Lambda from Wikipedia
- [4] dblp: Yasuhiro Osaki