Changes

Jonathan Baxter

183 bytes added, 12:25, 23 June 2018
no edit summary
* [[Jonathan Baxter]], [[Peter Bartlett]] ('''2000'''). ''Reinforcement Learning on POMDPs via Direct Gradient Ascent''. [http://dblp.uni-trier.de/db/conf/icml/icml2000.html ICML 2000], [https://pdfs.semanticscholar.org/b874/98f0879d312c308889135203b17069aa0486.pdf pdf]
* [[Jonathan Baxter]], [[Peter Bartlett]] ('''2001'''). ''Infinite-Horizon Policy-Gradient Estimation''. [https://en.wikipedia.org/wiki/Journal_of_Artificial_Intelligence_Research Journal of Artificial Intelligence Research] 15, [http://neuro.bstu.by/ai/To-dom/My_research/Papers-2.0/STDP/Reinforcement-L/A/Refs/baxter01a.pdf pdf]
* [[Lex Weaver]], [[Jonathan Baxter]] ('''2001'''). ''STD(λ): learning state differences with TD(λ)''. [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.20.7737 CiteSeerX]
* [https://www.linkedin.com/in/dougaberdeen/ Douglas Aberdeen], [[Jonathan Baxter]] ('''2001'''). ''[https://onlinelibrary.wiley.com/doi/abs/10.1002/cpe.549 Emmerald: a fast matrix-matrix multiply using Intel's SSE instructions]''. [https://onlinelibrary.wiley.com/journal/15320634 Concurrency and Computation: Practice and Experience], Vol. 13, No. 2
==2010 ...==
