Latest revision as of 14:20, 7 March 2019


Yasuhiro Osaki ^[1]

Yasuhiro Osaki is a Japanese software engineer and computer scientist at Sony. Until 2010, he was affiliated with the laboratory of Professor Yoshiyuki Kotani at the Tokyo University of Agriculture and Technology.

TD(λ)-MC

Yasuhiro Osaki's research concerned reinforcement learning and the application of TD(λ) based on Monte Carlo simulations in computer games. The program committee of the 12th Game Programming Workshop 2007 gave him the best presentation award for TD(λ)-MC, a reinforcement learning approach using Monte Carlo simulations ^[2] ^[3].

Selected Publications

^[4]

Yasuhiro Osaki, Kazutomo Shibahara, Yasuhiro Tajima, Yoshiyuki Kotani (2007). Reinforcement Learning of Evaluation Functions Using Temporal Difference-Monte Carlo learning method. 12th Game Programming Workshop
Yasuhiro Osaki, Kazutomo Shibahara, Yasuhiro Tajima, Yoshiyuki Kotani (2008). An Othello Evaluation Function Based on Temporal Difference Learning using Probability of Winning. CIG'08, pdf
Yasuhiro Osaki, Yoshiyuki Kotani (2009). A Learning Method of Evaluation Function Based on Selective Simulations. 14th Game Programming Workshop

External Links

References

↑ YasuhiroOsaki (Yasuhiro Osaki) · GitHub
↑ Yasuhiro Osaki, Kazutomo Shibahara, Yasuhiro Tajima, Yoshiyuki Kotani (2007). Reinforcement Learning of Evaluation Functions Using Temporal Difference-Monte Carlo learning method. 12th Game Programming Workshop
↑ TD-Lambda from Wikipedia
↑ dblp: Yasuhiro Osaki


