Latest revision as of 22:41, 23 May 2019

Lex Weaver,
an Australian computer scientist, since 2004 a manager in the Australian Government, and before that a lecturer in the Department of Computer Science at the Australian National University, Canberra, Australia, where he also earned his B.Sc. and Ph.D. degrees in 1994 and 2003 respectively ^[1]. His research covers machine learning, in particular Temporal Difference Learning, and he is a co-author of the chess program KnightCap along with Jonathan Baxter and Andrew Tridgell ^[2].

Selected Publications

^[3]

1997 ...

Jonathan Baxter, Andrew Tridgell, Lex Weaver (1997). Knightcap: A chess program that learns by combining td(λ) with minimax search. 15th International Conference on Machine Learning, pdf via citeseerX
Lex Weaver, Terry Bossomaier (1998). Evolution of Neural Networks to Play the Game of Dots-and-Boxes. arXiv:cs/9809111
Jonathan Baxter, Andrew Tridgell, Lex Weaver (1998). Experiments in Parameter Learning Using Temporal Differences. ICCA Journal, Vol. 21, No. 2
Jonathan Baxter, Andrew Tridgell, Lex Weaver (1999). TDLeaf(lambda): Combining Temporal Difference Learning with Game-Tree Search. Australian Journal of Intelligent Information Processing Systems, Vol. 5 No. 1, arXiv:cs/9901001
Jonathan Baxter, Andrew Tridgell, Lex Weaver (1999). KnightCap: A chess program that learns by combining TD(lambda) with game-tree search. arXiv:cs/9901002

2000 ...

Jonathan Baxter, Andrew Tridgell, Lex Weaver (2000). Learning to Play Chess Using Temporal Differences. Machine Learning, Vol 40, No. 3, pdf
Lex Weaver, Jonathan Baxter (2001). STD (λ): learning state differences with TD (λ). CiteSeerX

2010 ...

Peter Bartlett, Jonathan Baxter, Lex Weaver (2011). Experiments with Infinite-Horizon, Policy-Gradient Estimation. arXiv:1106.0666
Lex Weaver, Nigel Tao (2013). The Optimal Reward Baseline for Gradient-Based Reinforcement Learning. arXiv:1301.2315

References

[1] Lex Weaver | LinkedIn

[2] Welcome to the KnightCap home page

[3] : Lex Weaver
