Revision as of 13:11, 7 March 2019

Yasuhiro Osaki ^[1]

Yasuhiro Osaki is a Japanese software engineer and computer scientist at Sony. Until 2010, he was affiliated with the laboratory of Professor Yoshiyuki Kotani at the Tokyo University of Agriculture and Technology.

TD(λ)-MC

Yasuhiro Osaki's research focused on reinforcement learning and the application of TD(λ) based on Monte Carlo simulations in computer games. The program committee of the 12th Game Programming Workshop 2007 gave the best presentation award to Yasuhiro Osaki for his presentation on TD(λ)-MC, a reinforcement learning approach with Monte Carlo simulations ^[2] ^[3].

Selected Publications

^[4]

Yasuhiro Osaki, Kazutomo Shibahara, Yasuhiro Tajima, Yoshiyuki Kotani (2007). Reinforcement Learning of Evaluation Functions Using Temporal Difference-Monte Carlo learning method. 12th Game Programming Workshop
Yasuhiro Osaki, Kazutomo Shibahara, Yasuhiro Tajima, Yoshiyuki Kotani (2008). An Othello Evaluation Function Based on Temporal Difference Learning using Probability of Winning. CIG'08
Yasuhiro Osaki, Yoshiyuki Kotani (2009). A Learning Method of Evaluation Function Based on Selective Simulations. 14th Game Programming Workshop

References

[1] YasuhiroOsaki (Yasuhiro Osaki) · GitHub
[2] Yasuhiro Osaki, Kazutomo Shibahara, Yasuhiro Tajima, Yoshiyuki Kotani (2007). Reinforcement Learning of Evaluation Functions Using Temporal Difference-Monte Carlo learning method. 12th Game Programming Workshop
[3] TD-Lambda from Wikipedia
[4] dblp: Yasuhiro Osaki

