Changes


Tor Lattimore

1 byte added, 17:52, 17 January 2019
no edit summary
* [[Tor Lattimore]], [[Marcus Hutter]] ('''2012'''). ''PAC Bounds for Discounted MDPs''. [http://www.informatik.uni-trier.de/~ley/db/conf/alt/alt2012.htm Algorithmic Learning Theory], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], [https://en.wikipedia.org/wiki/Springer-Verlag Springer] <ref>[https://en.wikipedia.org/wiki/Markov_decision_process Markov decision process from Wikipedia]</ref>
* [[Tor Lattimore]], [[Marcus Hutter]] ('''2014'''). ''[https://link.springer.com/chapter/10.1007/978-3-319-11662-4_13 Bayesian Reinforcement Learning with Exploration]''. [http://dblp.uni-trier.de/db/conf/alt/alt2014.html Algorithmic Learning Theory], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science] 8776, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]
* [[Tor Lattimore]], [[Rémi Munos]] ('''2014'''). ''Bounded Regret for Finite-Armed Structured Bandits''. [https://arxiv.org/abs/1411.2919 arXiv:1411.2919]
==2015 ...==
* [[Tor Lattimore]] ('''2015'''). ''Optimally Confident UCB: Improved Regret for Finite-Armed Bandits''. [https://arxiv.org/abs/1507.07880 arXiv:1507.07880]
