Difference between revisions of "Yasuhiro Osaki"

From Chessprogramming wiki
(Created page with "'''Home * People * Yasuhiro Osaki''' FILE:YasuhiroOsaki.png|border|right|thumb|link=https://github.com/YasuhiroOsaki/| Yasuhiro Osaki <ref>[https://github...")
 

Revision as of 13:11, 7 March 2019


Yasuhiro Osaki [1]

Yasuhiro Osaki is a Japanese software engineer and computer scientist at Sony. Until 2010, he was affiliated with the laboratory of Professor Yoshiyuki Kotani at the Tokyo University of Agriculture and Technology.

TD(λ)-MC

Yasuhiro Osaki's research focused on reinforcement learning and the application of TD(λ) based on Monte Carlo simulations in computer games. The program committee of the 12th Game Programming Workshop 2007 gave him the best presentation award for TD(λ)-MC, a reinforcement learning approach with Monte Carlo simulations [2] [3].

Selected Publications

[4]

External Links

References

  1. YasuhiroOsaki (Yasuhiro Osaki) · GitHub
  2. Yasuhiro Osaki, Kazutomo Shibahara, Yasuhiro Tajima, Yoshiyuki Kotani (2007). Reinforcement Learning of Evaluation Functions Using Temporal Difference-Monte Carlo learning method. 12th Game Programming Workshop
  3. TD-Lambda from Wikipedia
  4. dblp: Yasuhiro Osaki
