=ANNs=
[https://en.wikipedia.org/wiki/Artificial_neural_network Artificial Neural Networks] ('''ANNs''') are a family of [https://en.wikipedia.org/wiki/Machine_learning statistical learning] devices or algorithms used in [https://en.wikipedia.org/wiki/Regression_analysis regression] and [https://en.wikipedia.org/wiki/Binary_classification binary] or [https://en.wikipedia.org/wiki/Multiclass_classification multiclass classification], implemented in [[Hardware|hardware]] or [[Software|software]], and inspired by their biological counterparts. The [https://en.wikipedia.org/wiki/Artificial_neuron artificial neurons] of one or more layers receive one or more inputs (representing dendrites), weight them, and sum them to produce an output (representing a neuron's axon). The sum is passed through a [https://en.wikipedia.org/wiki/Nonlinear_system nonlinear] function known as an [https://en.wikipedia.org/wiki/Activation_function activation function] or transfer function. The transfer functions usually have a [https://en.wikipedia.org/wiki/Sigmoid_function sigmoid shape], but they may also take the form of other non-linear functions, [https://en.wikipedia.org/wiki/Piecewise piecewise] linear functions, or [https://en.wikipedia.org/wiki/Artificial_neuron#Step_function step functions] <ref>[https://en.wikipedia.org/wiki/Artificial_neuron Artificial neuron from Wikipedia]</ref>. The weights of the inputs of each layer are tuned to minimize a [https://en.wikipedia.org/wiki/Loss_function cost or loss function], which is a task in [https://en.wikipedia.org/wiki/Mathematical_optimization mathematical optimization] and machine learning.
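
A single artificial neuron therefore computes a weighted sum of its inputs and passes it through the transfer function. The following minimal C sketch illustrates this for a sigmoid activation; the function and variable names as well as the explicit bias term are purely illustrative:
<pre>
#include <math.h>

/* sigmoid transfer function */
double sigmoid(double x) {
   return 1.0 / (1.0 + exp(-x));
}

/* output of one artificial neuron: the inputs are weighted, summed
   together with a bias, and passed through the activation function */
double neuronOutput(const double *input, const double *weight,
                    double bias, int nInputs) {
   double sum = bias;                      /* start from the bias        */
   for (int i = 0; i < nInputs; ++i)
      sum += weight[i] * input[i];         /* weighted sum of the inputs */
   return sigmoid(sum);                    /* nonlinear activation       */
}
</pre>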
  
 
==Perceptron==
The [https://en.wikipedia.org/wiki/Perceptron perceptron] is an algorithm for [[Supervised Learning|supervised learning]] of [https://en.wikipedia.org/wiki/Binary_classification binary classifiers]. It was the first artificial neural network, introduced in 1957 by [https://en.wikipedia.org/wiki/Frank_Rosenblatt Frank Rosenblatt] <ref>[https://en.wikipedia.org/wiki/Frank_Rosenblatt Frank Rosenblatt] ('''1957'''). ''The Perceptron - a Perceiving and Recognizing Automaton''. Report 85-460-1, [https://en.wikipedia.org/wiki/Calspan#History Cornell Aeronautical Laboratory]</ref>, implemented in custom hardware. In its basic form it consists of a single neuron with multiple inputs and associated weights.
  
[[Supervised Learning|Supervised learning]] is applied using a set D of labeled [https://en.wikipedia.org/wiki/Test_set training data] with pairs of [https://en.wikipedia.org/wiki/Feature_vector feature vectors] (x) and given results as desired output (d), usually starting with a cleared or randomly initialized weight vector w. The output is calculated from all inputs of a sample, each multiplied by its corresponding weight, and the sum is passed to the activation function f. The difference between the desired and the actual output is then immediately used to modify the weights for all features, using a learning rate 0.0 < α <= 1.0:
 
<pre>
   for (j=0, Σ = 0.0; j < nSamples; ++j) {
      ...
   }
</pre>
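
A minimal, self-contained C sketch of this update rule - with purely illustrative names, a simple step activation f, and alpha standing in for the learning rate α - could look as follows:
<pre>
/* step activation of the basic perceptron */
double f(double sum) {
   return sum > 0.0 ? 1.0 : 0.0;
}

/* one pass over the labeled training set D:
   x[j] are the feature vectors, d[j] the desired outputs,
   w the weight vector, alpha the learning rate (0 < alpha <= 1) */
void perceptronEpoch(int nSamples, int nFeatures,
                     double **x, const double *d,
                     double *w, double alpha) {
   for (int j = 0; j < nSamples; ++j) {
      double sum = 0.0;
      for (int i = 0; i < nFeatures; ++i)
         sum += w[i] * x[j][i];                 /* weighted sum of the inputs */
      double y = f(sum);                        /* actual output              */
      for (int i = 0; i < nFeatures; ++i)
         w[i] += alpha * (d[j] - y) * x[j][i];  /* immediate weight update    */
   }
}
</pre>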

Typical CNN <ref>Typical [https://en.wikipedia.org/wiki/Convolutional_neural_network CNN] architecture, Image by Aphex34, December 16, 2015, [https://creativecommons.org/licenses/by-sa/4.0/deed.en CC BY-SA 4.0], [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons]</ref>

<span id="Residual"></span>
==Residual Net==
 
[[FILE:ResiDualBlock.png|border|right|thumb|link=https://arxiv.org/abs/1512.03385| A residual block <ref>The fundamental building block of residual networks. Figure 2 in [https://scholar.google.com/citations?user=DhtAFkwAAAAJ Kaiming He], [https://scholar.google.com/citations?user=yuB-cfoAAAAJ&hl=en Xiangyu Zhang], [http://shaoqingren.com/ Shaoqing Ren], [http://www.jiansun.org/ Jian Sun] ('''2015'''). ''Deep Residual Learning for Image Recognition''. [https://arxiv.org/abs/1512.03385 arXiv:1512.03385]</ref> <ref>[https://blog.waya.ai/deep-residual-learning-9610bb62c355 Understand Deep Residual Networks — a simple, modular learning framework that has redefined state-of-the-art] by [https://blog.waya.ai/@waya.ai Michael Dietz], [https://blog.waya.ai/ Waya.ai], May 02, 2017</ref> ]]
A '''Residual net''' (ResNet) adds the input of a layer, typically composed of a convolutional layer and a [https://en.wikipedia.org/wiki/Rectifier_(neural_networks) ReLU] layer, to its output. This modification, like convolutional nets inspired by image classification, enables faster training and deeper networks <ref>[[Tristan Cazenave]] ('''2017'''). ''[http://ieeexplore.ieee.org/document/7875402/ Residual Networks for Computer Go]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. PP, No. 99, [http://www.lamsade.dauphine.fr/~cazenave/papers/resnet.pdf pdf]</ref> <ref>[https://wiki.tum.de/display/lfdv/Deep+Residual+Networks Deep Residual Networks] from [https://wiki.tum.de/ TUM Wiki], [[Technical University of Munich]]</ref> <ref>[https://towardsdatascience.com/understanding-and-visualizing-resnets-442284831be8 Understanding and visualizing ResNets] by Pablo Ruiz, October 8, 2018</ref>.
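
As a rough sketch of the idea (and not the architecture of any particular program), a residual block computes a transformed output F(x) and then adds the unchanged input x back onto it via a skip connection; in the following self-contained C example the convolutional layer is replaced by a trivial per-element weighting:
<pre>
#include <stddef.h>

double relu(double x) { return x > 0.0 ? x : 0.0; }

/* stand-in for the convolution + ReLU transform F(x); a real residual
   block would apply one or more convolutional layers here             */
void transform(const double *x, const double *w, double *out, size_t n) {
   for (size_t i = 0; i < n; ++i)
      out[i] = relu(w[i] * x[i]);
}

/* residual block: output = F(x) + x (the skip or shortcut connection) */
void residualBlock(const double *x, const double *w, double *y, size_t n) {
   transform(x, w, y, n);
   for (size_t i = 0; i < n; ++i)
      y[i] += x[i];                        /* add the block's input back */
}
</pre>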
  
 
=ANNs in Games=
 
<span id="AlphaZero"></span>
===Alpha Zero===
In December 2017, the [[Google]] [[DeepMind]] team along with former [[Giraffe]] author [[Matthew Lai]] reported on their generalized [[AlphaZero]] algorithm, combining [[Deep Learning|Deep learning]] with [[Monte-Carlo Tree Search]]. AlphaZero can achieve, tabula rasa, superhuman performance in many challenging domains with some training effort. Starting from random play, and given no domain knowledge except the game rules, AlphaZero achieved a superhuman level of play in the games of chess and [[Shogi]] as well as Go, and convincingly defeated a world-champion program in each case <ref>[[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2017'''). ''Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm''. [https://arxiv.org/abs/1712.01815 arXiv:1712.01815]</ref>. The open source projects [[Leela Zero]] (Go) and its chess adaptation [[Leela Chess Zero]] successfully re-implemented the ideas of DeepMind.

===NNUE===
[[NNUE]], the reverse of &#398;U&#1048;&#1048; (Efficiently Updatable Neural Network), is an NN architecture intended to replace the [[Evaluation|evaluation]] of [[Shogi]], [[Chess|chess]] and other board game playing [[Alpha-Beta|alpha-beta]] searchers. NNUE was introduced in 2018 by [[Yu Nasu]] <ref>[[Yu Nasu]] ('''2018'''). ''&#398;U&#1048;&#1048; Efficiently Updatable Neural-Network based Evaluation Functions for Computer Shogi''. Ziosoft Computer Shogi Club, [https://github.com/ynasu87/nnue/blob/master/docs/nnue.pdf pdf] (Japanese with English abstract)</ref> and was used in Shogi adaptations of [[Stockfish]] such as [[YaneuraOu]] <ref>[https://github.com/yaneurao/YaneuraOu GitHub - yaneurao/YaneuraOu: YaneuraOu is the World's Strongest Shogi engine(AI player), WCSC29 1st winner, educational and USI compliant engine]</ref> and [[Kristallweizen]] <ref>[https://github.com/Tama4649/Kristallweizen/ GitHub - Tama4649/Kristallweizen: 第29回世界コンピュータ将棋選手権 準優勝のKristallweizenです。]</ref>, apparently with [[AlphaZero]] strength <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72754 The Stockfish of shogi] by [[Larry Kaufman]], [[CCC]], January 07, 2020</ref>. [[Hisayori Noda|Nodchip]] incorporated NNUE into the chess playing Stockfish 10 as a proof of concept <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74059 Stockfish NN release (NNUE)] by [[Henk Drost]], [[CCC]], May 31, 2020</ref>, resulting in the hype around [[Stockfish NNUE]] in summer 2020 <ref>[http://yaneuraou.yaneu.com/2020/06/19/stockfish-nnue-the-complete-guide/ Stockfish NNUE – The Complete Guide], June 19, 2020 (Japanese and English)</ref>. Its heavily overparametrized and computationally most expensive input layer is efficiently [[Incremental Updates|incrementally updated]] during [[Make Move|make]] and [[Unmake Move|unmake move]].
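
The principle of the incremental update can be sketched as follows: the first layer accumulates one weight column per active binary input feature, and since a move toggles only a few features, the accumulator is adjusted rather than recomputed from scratch. The sizes and names below are purely illustrative and do not correspond to any concrete NNUE network:
<pre>
#define NUM_FEATURES 768   /* illustrative: piece type times square inputs */
#define NUM_HIDDEN   256   /* illustrative width of the first hidden layer */

/* trained weights of the input layer */
double inputWeights[NUM_FEATURES][NUM_HIDDEN];

/* first-layer sums, kept alongside the position in the search stack */
typedef struct { double sum[NUM_HIDDEN]; } Accumulator;

/* a move activates or deactivates only a few input features, so the
   accumulator is updated incrementally in make and unmake move       */
void addFeature(Accumulator *acc, int feature) {
   for (int i = 0; i < NUM_HIDDEN; ++i)
      acc->sum[i] += inputWeights[feature][i];
}

void removeFeature(Accumulator *acc, int feature) {
   for (int i = 0; i < NUM_HIDDEN; ++i)
      acc->sum[i] -= inputWeights[feature][i];
}
</pre>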

<span id="engines"></span>
===NN Chess Programs===
* [[:Category:NN]]
  
 
=See also=
 
* [[Memory]]
* [[Neural MoveMap Heuristic]]
* [[NNUE]]
* [[Pattern Recognition]]
* [[Temporal Difference Learning]]

=Selected Publications=
 
* [[John von Neumann]] ('''1956'''). ''Probabilistic Logic and the Synthesis of Reliable Organisms From Unreliable Components''. in  
 
: [[Claude Shannon]], [[John McCarthy]] (eds.) ('''1956'''). ''Automata Studies''.  [http://press.princeton.edu/math/series/amh.html Annals of Mathematics Studies], No. 34, [http://www.dna.caltech.edu/courses/cs191/paperscs191/VonNeumann56.pdf pdf]
 
* [[Nathaniel Rochester]], [[Mathematician#Holland|John H. Holland]], [https://dblp.uni-trier.de/pers/hd/h/Haibt:L=_H= L. H. Haibt], [https://dblp.uni-trier.de/pers/hd/d/Duda:William_L= William L. Duda] ('''1956'''). ''[https://www.semanticscholar.org/paper/Tests-on-a-cell-assembly-theory-of-the-action-of-a-Rochester-Holland/878d615b84cf779e162f62c4a9192d6bddeefbf9 Tests on a Cell Assembly Theory of the Action of the Brain, Using a Large Digital Computer]''. [https://dblp.uni-trier.de/db/journals/tit/tit2n.html IRE Transactions on Information Theory, Vol. 2], No. 3
 
* [https://en.wikipedia.org/wiki/Frank_Rosenblatt Frank Rosenblatt] ('''1957'''). ''The Perceptron - a Perceiving and Recognizing Automaton''. Report 85-460-1, [https://en.wikipedia.org/wiki/Calspan#History Cornell Aeronautical Laboratory] <ref>[http://csis.pace.edu/~ctappert/srd2011/rosenblatt-contributions.htm Rosenblatt's Contributions]</ref>
 
==1960 ...==
 
* [[Eric B. Baum]] ('''1989'''). ''[http://papers.nips.cc/paper/226-the-perceptron-algorithm-is-fast-for-non-malicious-distributions The Perceptron Algorithm Is Fast for Non-Malicious Distributions]''. [http://papers.nips.cc/book/advances-in-neural-information-processing-systems-2-1989 NIPS 1989]
 
* [[Eric B. Baum]] ('''1989'''). ''[http://www.mitpressjournals.org/doi/abs/10.1162/neco.1989.1.2.201#.VfGX0JdpluM A Proposal for More Powerful Learning Algorithms]''. [https://en.wikipedia.org/wiki/Neural_Computation_%28journal%29 Neural Computation], Vol. 1, No. 2
 
* [http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/i/Irani:E=_A=.html Erach A. Irani], [http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/m/Matts:John_P=.html John P. Matts], [http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/l/Long:John_M=.html John M. Long], [[James R. Slagle]], POSCH group ('''1989'''). ''Using Artificial Neural Nets for Statistical Discovery: Observations after Using Backpropogation, Expert Systems, and Multiple-Linear Regression on Clinical Trial Data''. [[University of Minnesota]], Minneapolis, MN 55455, USA, Complex Systems 3, [http://www.complex-systems.com/pdf/03-3-5.pdf pdf]
 
* [[Gerald Tesauro]], [[Terrence J. Sejnowski]] ('''1989'''). ''A Parallel Network that Learns to Play Backgammon''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 39, No. 3
 
* [[Mathematician#EGelenbe|Erol Gelenbe]] ('''1989'''). ''[http://cognet.mit.edu/journal/10.1162/neco.1989.1.4.502 Random Neural Networks with Negative and Positive Signals and Product Form Solution]''. [https://en.wikipedia.org/wiki/Neural_Computation_(journal) Neural Computation], Vol. 1, No. 4
 
* [[Mathematician#XZhang|Xiru Zhang]], [https://dblp.uni-trier.de/pers/hd/m/McKenna:Michael Michael McKenna], [[Mathematician#JPMesirov|Jill P. Mesirov]], [[David Waltz]] ('''1989'''). ''[http://papers.neurips.cc/paper/281-an-efficient-implementation-of-the-back-propagation-algorithm-on-the-connection-machine-cm-2 An Efficient Implementation of the Back-propagation Algorithm on the Connection Machine CM-2]''. [https://dblp.uni-trier.de/db/conf/nips/nips1989.html NIPS 1989]
 
==1990 ...==
 
* [[Mathematician#PWerbos|Paul Werbos]] ('''1990'''). ''Backpropagation Through Time: What It Does and How to Do It''. Proceedings of the [[IEEE]], Vol. 78, No. 10, [http://deeplearning.cs.cmu.edu/pdfs/Werbos.backprop.pdf pdf]
 
* [https://dblp.uni-trier.de/pers/hd/r/Ruck:Dennis_W= Dennis W. Ruck], [http://spie.org/profile/Steven.Rogers-5480?SSO=1 Steven K. Rogers], [https://dblp.uni-trier.de/pers/hd/k/Kabrisky:Matthew Matthew Kabrisky], [[Mathematician#MEOxley|Mark E. Oxley]], [[Bruce W. Suter]] ('''1990'''). ''[https://ieeexplore.ieee.org/document/80266 The multilayer perceptron as an approximation to a Bayes optimal discriminant function]''.  [[IEEE#NN|IEEE Transactions on Neural Networks]], Vol. 1, No. 4
 
* [https://dblp.uni-trier.de/pers/hd/h/Hellstrom:Benjamin_J= Benjamin J. Hellstrom], [[Laveen Kanal|Laveen N. Kanal]] ('''1990'''). ''[https://ieeexplore.ieee.org/document/5726889 The definition of necessary hidden units in neural networks for combinatorial optimization]''. [https://dblp.uni-trier.de/db/conf/ijcnn/ijcnn1990.html IJCNN 1990]
 
* [[Mathematician#XZhang|Xiru Zhang]], [https://dblp.uni-trier.de/pers/hd/m/McKenna:Michael Michael McKenna], [[Mathematician#JPMesirov|Jill P. Mesirov]], [[David Waltz]] ('''1990'''). ''[https://www.sciencedirect.com/science/article/pii/016781919090084M The backpropagation algorithm on grid and hypercube architectures]''. [https://www.journals.elsevier.com/parallel-computing Parallel Computing], Vol. 14, No. 3
* [[Simon Lucas]], [https://dblp.uni-trier.de/pers/hd/d/Damper:Robert_I= Robert I. Damper] ('''1990'''). ''[https://www.tandfonline.com/doi/abs/10.1080/09540099008915669 Syntactic Neural Networks]''. [https://www.tandfonline.com/toc/ccos20/current Connection Science], Vol. 2, No. 3
 
'''1991'''
 
* [[Mathematician#SHochreiter|Sepp Hochreiter]] ('''1991'''). ''Untersuchungen zu dynamischen neuronalen Netzen''. Diploma thesis, [[Technical University of Munich|TU Munich]], advisor [[Jürgen Schmidhuber]], [http://people.idsia.ch/~juergen/SeppHochreiter1991ThesisAdvisorSchmidhuber.pdf pdf] (German) <ref>[http://people.idsia.ch/~juergen/fundamentaldeeplearningproblem.html Sepp Hochreiter's Fundamental Deep Learning Problem (1991)] by [[Jürgen Schmidhuber]], 2013</ref>
 
* [[Alex van Tiggelen]] ('''1991'''). ''Neural Networks as a Guide to Optimization - The Chess Middle Game Explored''. [[ICGA Journal#14_3|ICCA Journal, Vol. 14, No. 3]]
 
* [[Mathematician#TMartinetz|Thomas Martinetz]], [[Mathematician#KSchulten|Klaus Schulten]] ('''1991'''). ''A "Neural-Gas" Network Learns Topologies''. In [[Mathematician#TKohonen|Teuvo Kohonen]], [https://dblp.uni-trier.de/pers/hd/m/Makisara:Kai Kai Mäkisara], [http://users.ics.tkk.fi/ollis/ Olli Simula], [http://cis.legacy.ics.tkk.fi/jari/ Jari Kangas] (eds.) ('''1991'''). ''[https://www.elsevier.com/books/artificial-neural-networks/makisara/978-0-444-89178-5 Artificial Neural Networks]''. [https://en.wikipedia.org/wiki/Elsevier Elsevier], [http://www.ks.uiuc.edu/Publications/Papers/PDF/MART91B/MART91B.pdf pdf]
* [[Jürgen Schmidhuber]], [[Rudolf Huber]] ('''1991'''). ''[https://www.researchgate.net/publication/2290900_Using_Adaptive_Sequential_Neurocontrol_For_Efficient_Learning_Of_Translation_And_Rotation_Invariance Using sequential adaptive Neuro-control for efficient Learning of Rotation and Translation Invariance]''. In [[Mathematician#TKohonen|Teuvo Kohonen]], [https://dblp.uni-trier.de/pers/hd/m/Makisara:Kai Kai Mäkisara], [http://users.ics.tkk.fi/ollis/ Olli Simula], [http://cis.legacy.ics.tkk.fi/jari/ Jari Kangas] (eds.) ('''1991'''). ''[https://www.sciencedirect.com/book/9780444891785/artificial-neural-networks#book-description Artificial Neural Networks]''. [https://en.wikipedia.org/wiki/Elsevier Elsevier]
 
* [[Jürgen Schmidhuber]] ('''1991'''). ''[http://www.idsia.ch/%7Ejuergen/promotion/ Dynamische neuronale Netze und das fundamentale raumzeitliche Lernproblem]'' (Dynamic Neural Nets and the Fundamental Spatio-Temporal Credit Assignment Problem). Ph.D. thesis
 
* [[Yoav Freund]], [[Mathematician#DHHaussler|David Haussler]] ('''1991'''). ''Unsupervised Learning of Distributions of Binary Vectors Using 2-Layer Networks''. [http://dblp.uni-trier.de/db/conf/nips/nips1991.html#FreundH91 NIPS 1991]
 
* [[Byoung-Tak Zhang]], [[Gerd Veenker]] ('''1991'''). ''[http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=170480&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D170480 Neural networks that teach themselves through genetic discovery of novel examples]''. [http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000500 IEEE IJCNN'91], [https://bi.snu.ac.kr/Publications/Conferences/International/IJCNN91.pdf pdf]
 
* [[Simon Lucas]], [https://dblp.uni-trier.de/pers/hd/d/Damper:Robert_I= Robert I. Damper] ('''1991'''). ''[https://link.springer.com/chapter/10.1007/978-1-4615-3752-6_30 Syntactic neural networks in VLSI]''. [https://link.springer.com/book/10.1007/978-1-4615-3752-6 VLSI for Artificial Intelligence and Neural Networks]
* [[Simon Lucas]] ('''1991'''). ''[https://eprints.soton.ac.uk/256263/ Connectionist architectures for syntactic pattern recognition]''. Ph.D. thesis, [https://en.wikipedia.org/wiki/University_of_Southampton University of Southampton]
 
'''1992'''
 
* [[Michael Reiss]] ('''1992'''). ''Temporal Sequence Processing in Neural Networks''. Ph.D. thesis, [https://en.wikipedia.org/wiki/King%27s_College_London King's College London], advisor [[Mathematician#JGTaylor|John G. Taylor]], [http://www.reiss.demon.co.uk/misc/m_reiss_phd.pdf pdf]
 
'''1994'''
 
* [[Mathematician#PWerbos|Paul Werbos]] ('''1994'''). ''The Roots of Backpropagation. From Ordered Derivatives to Neural Networks and Political Forecasting''. [https://en.wikipedia.org/wiki/John_Wiley_%26_Sons John Wiley & Sons]
 
* [[David E. Moriarty]], [[Risto Miikkulainen]] ('''1994'''). ''[http://nn.cs.utexas.edu/?moriarty:aaai94 Evolving Neural Networks to focus Minimax Search]''. [[Conferences#AAAI-94|AAAI-94]] » [[Othello]]
 
* [[Eric Postma]] ('''1994'''). ''SCAN: A Neural Model of Covert Attention''. Ph.D. thesis, [[Maastricht University]], advisor [[Jaap van den Herik]]
 
* [[Sebastian Thrun]] ('''1994'''). ''Neural Network Learning in the Domain of Chess''. Machines That Learn, [http://snowbird.djvuzone.org/ Snowbird], Extended abstract
 
'''1995'''
 
* [https://peterbraspenning.wordpress.com/ Peter J. Braspenning], [[Frank Thuijsman]], [https://scholar.google.com/citations?user=Ba9L7CAAAAAJ Ton Weijters] (eds) ('''1995'''). ''[http://link.springer.com/book/10.1007%2FBFb0027019 Artificial neural networks: an introduction to ANN theory and practice]''. [https://de.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science LNCS] 931, [https://de.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]
 
* [[David E. Moriarty]], [[Risto Miikkulainen]] ('''1995'''). ''[http://nn.cs.utexas.edu/?moriarty:connsci95 Discovering Complex Othello Strategies Through Evolutionary Neural Networks]''. [https://www.scimagojr.com/journalsearch.php?q=24173&tip=sid Connection Science], Vol. 7
 
* [[Anton Leouski]] ('''1995'''). ''Learning of Position Evaluation in the Game of Othello''. Master's Project, [https://en.wikipedia.org/wiki/University_of_Massachusetts University of Massachusetts], [https://en.wikipedia.org/wiki/Amherst,_Massachusetts Amherst, Massachusetts], [http://people.ict.usc.edu/~leuski/publications/papers/UM-CS-1995-023.pdf pdf]  
 
* [[Mathematician#SHochreiter|Sepp Hochreiter]], [[Jürgen Schmidhuber]] ('''1995'''). ''[http://www.idsia.ch/%7Ejuergen/nipsfm/ Simplifying Neural Nets by Discovering Flat Minima]''. In [[Gerald Tesauro]], [http://www.cs.cmu.edu/%7Edst/home.html David S. Touretzky] and [http://www.bme.ogi.edu/%7Etleen/ Todd K. Leen] (eds.), ''[http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&tid=8420 Advances in Neural Information Processing Systems 7]'', NIPS'7, pages 529-536. [https://en.wikipedia.org/wiki/MIT_Press MIT Press]
 
* [[Pieter Spronck]] ('''1996'''). ''Elegance: Genetic Algorithms in Neural Reinforcement Control''. Master thesis, [[Delft University of Technology]], [http://ticc.uvt.nl/~pspronck/pubs/Elegance.pdf pdf]
 
* [[Raúl Rojas]] ('''1996'''). ''Neural Networks - A Systematic Introduction''. Springer, available as [http://www.inf.fu-berlin.de/inst/ag-ki/rojas_home/documents/1996/NeuralNetworks/neuron.pdf pdf ebook]
 
* [[Ida Sprinkhuizen-Kuyper]], [https://dblp.org/pers/hd/b/Boers:Egbert_J=_W= Egbert J. W. Boers] ('''1996'''). ''[https://ieeexplore.ieee.org/abstract/document/6796246 The Error Surface of the Simplest XOR Network Has Only Global Minima]''. [https://en.wikipedia.org/wiki/Neural_Computation_(journal) Neural Computation], Vol. 8, No. 6, [http://www.socsci.ru.nl/idak/publications/papers/NeuralComputation.pdf pdf]
 
'''1997'''
 
* [[Mathematician#SHochreiter|Sepp Hochreiter]], [[Jürgen Schmidhuber]] ('''1997'''). ''Long short-term memory''. [https://en.wikipedia.org/wiki/Neural_Computation_%28journal%29 Neural Computation], Vol. 9, No. 8, [http://deeplearning.cs.cmu.edu/pdfs/Hochreiter97_lstm.pdf pdf] <ref>[https://en.wikipedia.org/wiki/Long_short_term_memory Long short term memory from Wikipedia]</ref>
 
* [[Don Beal]], [[Martin C. Smith]] ('''1997'''). ''Learning Piece Values Using Temporal Differences''. [[ICGA Journal#20_3|ICCA Journal, Vol. 20, No. 3]]
 
* [https://dblp.uni-trier.de/pers/hd/t/Thiesing:Frank_M= Frank M. Thiesing], [[Oliver Vornberger]] ('''1997'''). ''Forecasting Sales Using Neural Networks''. [https://dblp.uni-trier.de/db/conf/fuzzy/fuzzy1997.html Fuzzy Days 1997], [http://www2.inf.uos.de/papers_pdf/fuzzydays_97.pdf pdf]
 
* [[Simon Lucas]] ('''1997'''). ''[https://link.springer.com/chapter/10.1007/BFb0032531 Forward-Backward Building Blocks for Evolving Neural Networks with Intrinsic Learning Behaviors]''. [https://dblp.uni-trier.de/db/conf/iwann/iwann1997.html IWANN 1997]
 
'''1998'''
 
* [[Kieran Greer]] ('''1998'''). ''A Neural Network Based Search Heuristic and its Application to Computer Chess''. D.Phil. Thesis, [https://en.wikipedia.org/wiki/University_of_Ulster University of Ulster]
 
* <span id="FundamentalsNAI1st"></span>[[Toshinori Munakata]] ('''1998'''). ''[http://cis.csuohio.edu/~munakata/publs/book/sp.html Fundamentals of the New Artificial Intelligence: Beyond Traditional Paradigms]''. 1st edition, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer], [[Neural Networks#FundamentalsNAI2nd|2nd edition 2008]]
 
* [[Lex Weaver]], [https://bjbs.csu.edu.au/schools/computing-and-mathematics/staff/profiles/professorial-staff/terry-bossomaier Terry Bossomaier] ('''1998'''). ''Evolution of Neural Networks to Play the Game of Dots-and-Boxes''. [https://arxiv.org/abs/cs/9809111 arXiv:cs/9809111]
 
* [[Norman Richards]], [[David E. Moriarty]], [[Risto Miikkulainen]] ('''1998'''). ''[http://nn.cs.utexas.edu/?richards:apin98 Evolving Neural Networks to Play Go]''. [https://www.springer.com/journal/10489 Applied Intelligence], Vol. 8, No. 1
 
'''1999'''
 
* [[Kumar Chellapilla]], [[David B. Fogel]] ('''1999'''). ''[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=784222 Evolution, Neural Networks, Games, and Intelligence]''. Proceedings of the IEEE, September, pp. 1471-1496. [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.99.979 CiteSeerX]
 
* [[Mathematician#GEHinton|Geoffrey E. Hinton]], [[Terrence J. Sejnowski]] (eds.) ('''1999'''). ''[https://mitpress.mit.edu/books/unsupervised-learning Unsupervised Learning: Foundations of Neural Computation]''.  [https://en.wikipedia.org/wiki/MIT_Press MIT Press]
 
* [[Peter Dayan]] ('''1999'''). ''Recurrent Sampling Models for the Helmholtz Machine''. [https://en.wikipedia.org/wiki/Neural_Computation_(journal) Neural Computation], Vol. 11, No. 3, [http://www.gatsby.ucl.ac.uk/~dayan/papers/rechelm99.pdf pdf] <ref>[https://en.wikipedia.org/wiki/Helmholtz_machine Helmholtz machine from Wikipedia]</ref>
 
* [[Ida Sprinkhuizen-Kuyper]], [https://dblp.org/pers/hd/b/Boers:Egbert_J=_W= Egbert J. W. Boers] ('''1999'''). ''[https://ieeexplore.ieee.org/document/774274 A local minimum for the 2-3-1 XOR network]''.  [[IEEE#NN|IEEE Transactions on Neural Networks]], Vol. 10, No. 4
 
==2000 ...==
 
* [[Levente Kocsis]], [[Jos Uiterwijk]], [[Jaap van den Herik]] ('''2000'''). ''[http://link.springer.com/chapter/10.1007%2F3-540-45579-5_11 Learning Time Allocation using Neural Networks]''. [[CG 2000]]
 
* [[Levente Kocsis]], [[Jos Uiterwijk]], [[Jaap van den Herik]] ('''2001'''). ''Move Ordering using Neural Networks''. IEA/AIE 2001, [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science LNCS] 2070, [http://www.pradu.us/old/Nov27_2008/Buzz/research/parallel/fulltext.pdf pdf]
 
* [[Kee Siong Ng]] ('''2001'''). ''Neural Networks for Structured Data''. BSc-Thesis, [http://users.cecs.anu.edu.au/~kee/hon-thesis.ps.gz zipped ps]
 
* [[Jonathan Schaeffer]], [[Markian Hlynka]], [[Vili Jussila]] ('''2001'''). ''Temporal Difference Learning Applied to a High-Performance Game-Playing Program''. [http://www.informatik.uni-trier.de/~ley/db/conf/ijcai/ijcai2001.html#SchaefferHJ01 IJCAI 2001]
 
* [[Don Beal]], [[Martin C. Smith]] ('''2001'''). ''Temporal difference learning applied to game playing and the results of application to Shogi''. Theoretical Computer Science, Volume 252, Issues 1-2, pp. 105-119
 
* [[Nicol N. Schraudolph]], [[Peter Dayan]], [[Terrence J. Sejnowski]]  ('''2001'''). ''[http://nic.schraudolph.org/bib2html/b2hd-SchDaySej01.html Learning to Evaluate Go Positions via Temporal Difference Methods]''. [http://jasss.soc.surrey.ac.uk/7/1/reviews/takama.html Computational Intelligence in Games, Studies in Fuzziness and Soft Computing], [http://www.springer.com/economics?SGWID=1-165-6-73481-0 Physica-Verlag]
 
 
* [[Peter Dayan]], [https://en.wikipedia.org/wiki/Larry_Abbott Laurence F. Abbott] ('''2001, 2005'''). ''[http://www.gatsby.ucl.ac.uk/~dayan/book/index.html Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press]  
 
'''2002'''
 
* [[Moshe Sipper]] ('''2002''') ''[http://books.google.com/books/about/Machine_Nature.html?id=fbFQAAAAMAAJ&redir_esc=y Machine Nature: The Coming Age of Bio-Inspired Computing]''. [https://en.wikipedia.org/wiki/McGraw-Hill_Financial McGraw-Hill, New York]
 
* [[Paul E. Utgoff]], [[David J. Stracuzzi]] ('''2002'''). ''Many-Layered Learning''. [https://en.wikipedia.org/wiki/Neural_Computation_%28journal%29 Neural Computation], Vol. 14, No. 10, [http://people.cs.umass.edu/~utgoff/papers/neco-stl.pdf pdf]
 
* [[Mathematician#MIJordan|Michael I. Jordan]], [[Terrence J. Sejnowski]] (eds.) ('''2002'''). ''[https://mitpress.mit.edu/books/graphical-models Graphical Models: Foundations of Neural Computation]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press]
* [[Kenneth O. Stanley]], [[Risto Miikkulainen]] ('''2002'''). ''[http://nn.cs.utexas.edu/?stanley:ec02 Evolving Neural Networks Through Augmenting Topologies]''. [https://en.wikipedia.org/wiki/Evolutionary_Computation_(journal) Evolutionary Computation], Vol. 10, No. 2
 
'''2003'''
 
* [[Levente Kocsis]] ('''2003'''). ''Learning Search Decisions''. Ph.D thesis, [[Maastricht University]], [https://project.dke.maastrichtuniversity.nl/games/files/phd/Kocsis_thesis.pdf pdf]
 
* [[Mathematician#GMontavon|Grégoire Montavon]] ('''2013'''). ''[https://opus4.kobv.de/opus4-tuberlin/frontdoor/index/index/docId/4467 On Layer-Wise Representations in Deep Neural Networks]''. Ph.D. Thesis, [https://en.wikipedia.org/wiki/Technical_University_of_Berlin TU Berlin], advisor [[Mathematician#KRMueller|Klaus-Robert Müller]]
 
* [[Volodymyr Mnih]], [[Koray Kavukcuoglu]], [[David Silver]], [[Alex Graves]], [[Ioannis Antonoglou]], [[Daan Wierstra]], [[Martin Riedmiller]] ('''2013'''). ''Playing Atari with Deep Reinforcement Learning''. [http://arxiv.org/abs/1312.5602 arXiv:1312.5602] <ref>[http://www.nervanasys.com/demystifying-deep-reinforcement-learning/ Demystifying Deep Reinforcement Learning] by [http://www.nervanasys.com/author/tambet/ Tambet Matiisen], [http://www.nervanasys.com/ Nervana], December 21, 2015</ref>
 
* [[Risto Miikkulainen]] ('''2013'''). ''Evolving Neural Networks''. [https://dblp.org/db/conf/ijcnn/ijcnn2013 IJCNN 2013], [http://nn.cs.utexas.edu/downloads/slides/miikkulainen.ijcnn13.pdf pdf]
 
'''2014'''
 
* [[Mathematician#YDauphin|Yann Dauphin]], [[Mathematician#RPascanu|Razvan Pascanu]], [[Mathematician#CGulcehre|Caglar Gulcehre]], [[Mathematician#KCho|Kyunghyun Cho]], [[Mathematician#SGanguli|Surya Ganguli]], [[Mathematician#YBengio|Yoshua Bengio]] ('''2014'''). ''Identifying and attacking the saddle point problem in high-dimensional non-convex optimization''. [https://arxiv.org/abs/1406.2572 arXiv:1406.2572] <ref>[https://groups.google.com/d/msg/fishcooking/wOfRuzTSi_8/VgjN8MmSBQAJ high dimensional optimization] by [[Warren D. Smith]], [[Computer Chess Forums|FishCooking]], December 27, 2019</ref>
* [[Mathematician#IGoodfellow|Ian Goodfellow]], [[Jean Pouget-Abadie]], [[Mehdi Mirza]], [[Bing Xu]], [[David Warde-Farley]], [[Sherjil Ozair]], [[Mathematician#ACourville|Aaron Courville]], [[Mathematician#YBengio|Yoshua Bengio]] ('''2014'''). ''Generative Adversarial Networks''. [https://arxiv.org/abs/1406.2661v1 arXiv:1406.2661v1] <ref>[https://en.wikipedia.org/wiki/Generative_adversarial_networks Generative adversarial networks from Wikipedia]</ref>
 
* [[Christopher Clark]], [[Amos Storkey]] ('''2014'''). ''Teaching Deep Convolutional Neural Networks to Play Go''. [http://arxiv.org/abs/1412.3409 arXiv:1412.3409] <ref>[http://computer-go.org/pipermail/computer-go/2014-December/007010.html Teaching Deep Convolutional Neural Networks to Play Go] by [[Hiroshi Yamashita]], [http://computer-go.org/pipermail/computer-go/ The Computer-go Archives], December 14, 2014</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=54663 Teaching Deep Convolutional Neural Networks to Play Go] by [[Michel Van den Bergh]], [[CCC]], December 16, 2014</ref>  
 
* [[Chris J. Maddison]],  [[Shih-Chieh Huang|Aja Huang]], [[Ilya Sutskever]], [[David Silver]] ('''2014'''). ''Move Evaluation in Go Using Deep Convolutional Neural Networks''. [http://arxiv.org/abs/1412.6564v1 arXiv:1412.6564v1] » [[Go]]
 
* [[James L. McClelland]] ('''2015'''). ''[https://web.stanford.edu/group/pdplab/pdphandbook/handbook3.html#handbookch10.html Explorations in Parallel Distributed Processing: A Handbook of Models, Programs, and Exercises]''. Second Edition, [https://web.stanford.edu/group/pdplab/pdphandbook/handbookli1.html Contents]
 
* [[Gábor Melis]] ('''2015'''). ''[http://jmlr.org/proceedings/papers/v42/meli14.html Dissecting the Winning Solution of the HiggsML Challenge]''. [https://nips.cc/Conferences/2014 NIPS 2014]
 
* [[Volodymyr Mnih]], [[Koray Kavukcuoglu]], [[David Silver]], [[Mathematician#AARusu|Andrei A. Rusu]], [[Joel Veness]], [[Marc G. Bellemare]], [[Alex Graves]], [[Martin Riedmiller]], [[Andreas K. Fidjeland]], [[Georg Ostrovski]], [[Stig Petersen]], [[Charles Beattie]], [[Amir Sadik]], [[Ioannis Antonoglou]], [[Helen King]], [[Dharshan Kumaran]], [[Daan Wierstra]], [[Shane Legg]], [[Demis Hassabis]] ('''2015'''). ''[http://www.nature.com/nature/journal/v518/n7540/abs/nature14236.html Human-level control through deep reinforcement learning]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 518
 
* [[Jürgen Schmidhuber]] ('''2015'''). ''[http://people.idsia.ch/~juergen/deep-learning-overview.html Deep Learning in Neural Networks: An Overview]''. [https://en.wikipedia.org/wiki/Neural_Networks_(journal) Neural Networks], Vol. 61
 
* [https://scholar.google.fr/citations?user=MN9Kfg8AAAAJ&hl=en Zachary C. Lipton], [https://www.linkedin.com/in/john-berkowitz-92b24a7b John Berkowitz], [[Charles Elkan]] ('''2015'''). ''A Critical Review of Recurrent Neural Networks for Sequence Learning''. [https://arxiv.org/abs/1506.00019 arXiv:1506.00019v4]
 
* [[Guillaume Desjardins]], [[Karen Simonyan]], [[Mathematician#RPascanu|Razvan Pascanu]], [[Koray Kavukcuoglu]] ('''2015'''). ''Natural Neural Networks''. [https://arxiv.org/abs/1507.00210 arXiv:1507.00210]
 
* [[Barak Oshri]], [[Nishith Khandwala]] ('''2015'''). ''Predicting Moves in Chess using Convolutional Neural Networks''. [http://cs231n.stanford.edu/reports/ConvChess.pdf pdf] <ref>[https://github.com/BarakOshri/ConvChess GitHub - BarakOshri/ConvChess: Predicting Moves in Chess Using Convolutional Neural Networks]</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=63458 ConvChess CNN] by [[Brian Richardson]], [[CCC]], March 15, 2017</ref>
 
* [[Barak Oshri]], [[Nishith Khandwala]] ('''2015'''). ''Predicting Moves in Chess using Convolutional Neural Networks''. [http://cs231n.stanford.edu/reports/ConvChess.pdf pdf] <ref>[https://github.com/BarakOshri/ConvChess GitHub - BarakOshri/ConvChess: Predicting Moves in Chess Using Convolutional Neural Networks]</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=63458 ConvChess CNN] by [[Brian Richardson]], [[CCC]], March 15, 2017</ref>
* [https://en.wikipedia.org/wiki/Yann_LeCun Yann LeCun], [[Yoshua Bengio]], [[Mathematician#GEHinton|Geoffrey E. Hinton]] ('''2015'''). ''[http://www.nature.com/nature/journal/v521/n7553/full/nature14539.html Deep Learning]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 521 <ref>[[Jürgen Schmidhuber]] ('''2015''') ''[http://people.idsia.ch/~juergen/deep-learning-conspiracy.html Critique of Paper by "Deep Learning Conspiracy" (Nature 521 p 436)]''.</ref>
+
* [[Mathematician#YLeCun|Yann LeCun]], [[Mathematician#YBengio|Yoshua Bengio]], [[Mathematician#GEHinton|Geoffrey E. Hinton]] ('''2015'''). ''[http://www.nature.com/nature/journal/v521/n7553/full/nature14539.html Deep Learning]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 521 <ref>[[Jürgen Schmidhuber]] ('''2015''') ''[http://people.idsia.ch/~juergen/deep-learning-conspiracy.html Critique of Paper by "Deep Learning Conspiracy" (Nature 521 p 436)]''.</ref>
 
* [[Matthew Lai]] ('''2015'''). ''Giraffe: Using Deep Reinforcement Learning to Play Chess''. M.Sc. thesis, [https://en.wikipedia.org/wiki/Imperial_College_London Imperial College London],  [http://arxiv.org/abs/1509.01549v1 arXiv:1509.01549v1] » [[Giraffe]]
 
* [[Matthew Lai]] ('''2015'''). ''Giraffe: Using Deep Reinforcement Learning to Play Chess''. M.Sc. thesis, [https://en.wikipedia.org/wiki/Imperial_College_London Imperial College London],  [http://arxiv.org/abs/1509.01549v1 arXiv:1509.01549v1] » [[Giraffe]]
 
* [[Nikolai Yakovenko]], [[Liangliang Cao]], [[Colin Raffel]], [[James Fan]] ('''2015'''). ''Poker-CNN: A Pattern Learning Strategy for Making Draws and Bets in Poker Games''. [https://arxiv.org/abs/1509.06731 arXiv:1509.06731]
 
* [[Nikolai Yakovenko]], [[Liangliang Cao]], [[Colin Raffel]], [[James Fan]] ('''2015'''). ''Poker-CNN: A Pattern Learning Strategy for Making Draws and Bets in Poker Games''. [https://arxiv.org/abs/1509.06731 arXiv:1509.06731]
* [[Eli David|Omid E. David]], [[Nathan S. Netanyahu]], [[Lior Wolf]] ('''2016'''). ''[http://link.springer.com/chapter/10.1007%2F978-3-319-44781-0_11 DeepChess: End-to-End Deep Neural Network for Automatic Learning in Chess]''. [http://icann2016.org/ ICAAN 2016], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], Vol. 9887, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer], [http://www.cs.tau.ac.il/~wolf/papers/deepchess.pdf pdf preprint] » [[DeepChess]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61748 DeepChess: Another deep-learning based chess program] by [[Matthew Lai]], [[CCC]], October 17, 2016</ref> <ref>[http://icann2016.org/index.php/conference-programme/recipients-of-the-best-paper-awards/ ICANN 2016 | Recipients of the best paper awards]</ref>
* [[Dror Sholomon]], [[Eli David|Omid E. David]], [[Nathan S. Netanyahu]] ('''2016'''). ''[http://link.springer.com/chapter/10.1007/978-3-319-44781-0_21 DNN-Buddies: A Deep Neural Network-Based Estimation Metric for the Jigsaw Puzzle Problem]''. [http://icann2016.org/ ICAAN 2016], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], Vol. 9887, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer] <ref>[https://en.wikipedia.org/wiki/Jigsaw_puzzle Jigsaw puzzle from Wikipedia]</ref>
* [[Mathematician#IGoodfellow|Ian Goodfellow]], [[Mathematician#YBengio|Yoshua Bengio]], [[Mathematician#ACourville|Aaron Courville]] ('''2016'''). ''[http://www.deeplearningbook.org/ Deep Learning]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press]
* [[Volodymyr Mnih]], [[Adrià Puigdomènech Badia]], [[Mehdi Mirza]], [[Alex Graves]], [[Timothy Lillicrap]], [[Tim Harley]], [[David Silver]], [[Koray Kavukcuoglu]] ('''2016'''). ''Asynchronous Methods for Deep Reinforcement Learning''. [https://arxiv.org/abs/1602.01783 arXiv:1602.01783v2]
* [https://scholar.google.ca/citations?user=mZfgLA4AAAAJ&hl=en Vincent Dumoulin], [https://scholar.google.it/citations?user=kaAnZw0AAAAJ&hl=en Francesco Visin] ('''2016'''). ''A guide to convolution arithmetic for deep learning''. [https://arxiv.org/abs/1603.07285 arXiv:1603.07285]
* [https://en.wikipedia.org/wiki/Patricia_Churchland Patricia Churchland], [[Terrence J. Sejnowski]] ('''2016'''). ''[https://mitpress.mit.edu/books/computational-brain-0 The Computational Brain, 25th Anniversary Edition]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press]
* [[Ilya Loshchilov]], [[Frank Hutter]] ('''2016'''). ''CMA-ES for Hyperparameter Optimization of Deep Neural Networks''. [https://arxiv.org/abs/1604.07269 arXiv:1604.07269] <ref>[https://en.wikipedia.org/wiki/CMA-ES CMA-ES from Wikipedia]</ref>
* [[Audrūnas Gruslys]], [[Rémi Munos]], [[Ivo Danihelka]], [[Marc Lanctot]], [[Alex Graves]] ('''2016'''). ''Memory-Efficient Backpropagation Through Time''. [https://arxiv.org/abs/1606.03401v1 arXiv:1606.03401]
* [[Mathematician#AARusu|Andrei A. Rusu]], [[Neil C. Rabinowitz]], [[Guillaume Desjardins]], [[Hubert Soyer]], [[James Kirkpatrick]], [[Koray Kavukcuoglu]], [[Mathematician#RPascanu|Razvan Pascanu]], [[Mathematician#RHadsell|Raia Hadsell]] ('''2016'''). ''Progressive Neural Networks''. [https://arxiv.org/abs/1606.04671 arXiv:1606.04671]
* [[George Rajna]] ('''2016'''). ''Deep Neural Networks''. [http://vixra.org/abs/1609.0126 viXra:1609.0126]
* [[James Kirkpatrick]], [[Mathematician#RPascanu|Razvan Pascanu]], [[Neil C. Rabinowitz]], [[Joel Veness]], [[Guillaume Desjardins]], [[Mathematician#AARusu|Andrei A. Rusu]], [[Kieran Milan]], [[John Quan]], [[Tiago Ramalho]], [[Agnieszka Grabska-Barwinska]], [[Demis Hassabis]], [[Claudia Clopath]], [[Dharshan Kumaran]], [[Mathematician#RHadsell|Raia Hadsell]] ('''2016'''). ''Overcoming catastrophic forgetting in neural networks''. [https://arxiv.org/abs/1612.00796 arXiv:1612.00796] <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=70704 catastrophic forgetting] by [[Daniel Shawul]], [[CCC]], May 09, 2019</ref>
* [https://dblp.uni-trier.de/pers/hd/n/Niu:Zhenxing Zhenxing Niu], [https://dblp.uni-trier.de/pers/hd/z/Zhou:Mo Mo Zhou], [https://dblp.uni-trier.de/pers/hd/w/Wang_0003:Le Le Wang], [[Xinbo Gao]], [https://dblp.uni-trier.de/pers/hd/h/Hua_0001:Gang Gang Hua] ('''2016'''). ''Ordinal Regression with Multiple Output CNN for Age Estimation''. [https://dblp.uni-trier.de/db/conf/cvpr/cvpr2016.html CVPR 2016], [https://www.cv-foundation.org/openaccess/content_cvpr_2016/app/S21-20.pdf pdf]
* [[Li Jing]], [[Yichen Shen]], [[Tena Dubček]], [[John Peurifoy]], [[Scott Skirlo]], [[Mathematician#YLeCun|Yann LeCun]], [[Max Tegmark]], [[Marin Soljačić]] ('''2016'''). ''Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs''. [https://arxiv.org/abs/1612.05231 arXiv:1612.05231]
'''2017'''
* [[Yutian Chen]], [[Matthew W. Hoffman]], [[Sergio Gomez Colmenarejo]], [[Misha Denil]], [[Timothy Lillicrap]], [[Matthew Botvinick]], [[Nando de Freitas]] ('''2017'''). ''Learning to Learn without Gradient Descent by Gradient Descent''. [https://arxiv.org/abs/1611.03824v6 arXiv:1611.03824v6], [http://dblp.uni-trier.de/db/conf/icml/icml2017.html ICML 2017]
* [https://dblp.org/pers/hd/s/Serb:Alexander Alexantrou Serb], [[Edoardo Manino]], [https://dblp.org/pers/hd/m/Messaris:Ioannis Ioannis Messaris], [https://dblp.org/pers/hd/t/Tran=Thanh:Long Long Tran-Thanh], [https://www.orc.soton.ac.uk/people/tp1f12 Themis Prodromakis] ('''2017'''). ''[https://eprints.soton.ac.uk/425616/ Hardware-level Bayesian inference]''. [https://nips.cc/Conferences/2017 NIPS 2017] » [[Analog Evaluation]]
'''2018'''
* [[Yu Nasu]] ('''2018'''). ''&#398;U&#1048;&#1048; Efficiently Updatable Neural-Network based Evaluation Functions for Computer Shogi''. Ziosoft Computer Shogi Club, [https://github.com/ynasu87/nnue/blob/master/docs/nnue.pdf pdf] (Japanese with English abstract) » [[NNUE]]
* [[Kei Takada]], [[Hiroyuki Iizuka]], [[Masahito Yamamoto]] ('''2018'''). ''[https://link.springer.com/chapter/10.1007%2F978-3-319-75931-9_2 Computer Hex Algorithm Using a Move Evaluation Method Based on a Convolutional Neural Network]''. [https://link.springer.com/bookseries/7899 Communications in Computer and Information Science] » [[Hex]]
* [[Matthia Sabatelli]], [[Francesco Bidoia]], [[Valeriu Codreanu]], [[Marco Wiering]] ('''2018'''). ''Learning to Evaluate Chess Positions with Deep Neural Networks and Limited Lookahead''. ICPRAM 2018, [http://www.ai.rug.nl/~mwiering/GROUP/ARTICLES/ICPRAM_CHESS_DNN_2018.pdf pdf]
'''2019'''
* [[Marius Lindauer]], [[Frank Hutter]] ('''2019'''). ''Best Practices for Scientific Research on Neural Architecture Search''. [https://arxiv.org/abs/1909.02453 arXiv:1909.02453]
* [[Guy Haworth]] ('''2019'''). ''Chess endgame news: an endgame challenge for neural nets''. [[ICGA Journal#41_3|ICGA Journal, Vol. 41, No. 3]] » [[Endgame]]
==2020 ...==
* [[Oisín Carroll]], [[Joeran Beel]] ('''2020'''). ''Finite Group Equivariant Neural Networks for Games''. [https://arxiv.org/abs/2009.05027 arXiv:2009.05027]
  
 
=Blog & Forum Posts=
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68119 Instruction for running Scorpio with neural network on linux] by [[Daniel Shawul]], [[CCC]], August 01, 2018 » [[Scorpio]]
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69069 Are draws hard to predict?] by [[Daniel Shawul]], [[CCC]], November 27, 2018 » [[Draw]]
* [https://groups.google.com/d/msg/lczero/EGcJSrZYLiw/netJ4S38CgAJ use multiple neural nets?] by [[Warren D. Smith]], [[Computer Chess Forums|LCZero Forum]], December 25, 2018 » [[Leela Chess Zero]]
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69393 neural network architecture] by jackd, [[CCC]], December 26, 2018
'''2019'''
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71301 A question to MCTS + NN experts] by [[Maksim Korzh]], [[CCC]], July 17, 2019 » [[Monte-Carlo Tree Search]]
: [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71301&start=3 Re: A question to MCTS + NN experts] by [[Daniel Shawul]], [[CCC]], July 17, 2019
* [https://groups.google.com/d/msg/fishcooking/wOfRuzTSi_8/VgjN8MmSBQAJ high dimensional optimization] by [[Warren D. Smith]], [[Computer Chess Forums|FishCooking]], December 27, 2019 <ref>[[Mathematician#YDauphin|Yann Dauphin]], [[Mathematician#RPascanu|Razvan Pascanu]], [[Mathematician#CGulcehre|Caglar Gulcehre]], [[Mathematician#KCho|Kyunghyun Cho]], [[Mathematician#SGanguli|Surya Ganguli]], [[Mathematician#YBengio|Yoshua Bengio]] ('''2014'''). ''Identifying and attacking the saddle point problem in high-dimensional non-convex optimization''. [https://arxiv.org/abs/1406.2572 arXiv:1406.2572]</ref>
==2020 ...==
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74077 How to work with batch size in neural network] by Gertjan Brouwer, [[CCC]], June 02, 2020
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74531 NNUE accessible explanation] by [[Martin Fierz]], [[CCC]], July 21, 2020 [[NNUE]]
: [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74531&start=1 Re: NNUE accessible explanation] by [[Jonathan Rosenthal]], [[CCC]], July 23, 2020
: [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74531&start=5 Re: NNUE accessible explanation] by [[Jonathan Rosenthal]], [[CCC]], July 24, 2020
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74607 LC0 vs. NNUE - some tech details...] by [[Srdja Matovic]], [[CCC]], July 29, 2020 » [[Leela Chess Zero#Lc0|Lc0]]
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74771 AB search with NN on GPU...] by [[Srdja Matovic]], [[CCC]], August 13, 2020 » [[GPU]] <ref>[https://forums.developer.nvidia.com/t/kernel-launch-latency/62455 kernel launch latency - CUDA / CUDA Programming and Performance - NVIDIA Developer Forums] by LukeCuda, June 18, 2018</ref>
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74777 Neural Networks weights type] by [[Fabio Gobbato]], [[CCC]], August 13, 2020 » [[Stockfish NNUE]]
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74955 Train a neural network evaluation] by [[Fabio Gobbato]], [[CCC]], September 01, 2020 » [[Automated Tuning]], [[NNUE]]
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75042 Neural network quantization] by [[Fabio Gobbato]], [[CCC]], September 08, 2020 » [[NNUE]]
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75190 First success with neural nets] by [[Jonathan Kreuzer]], [[CCC]], September 23, 2020
  
 
=External Links=
: [https://towardsdatascience.com/types-of-convolutions-in-deep-learning-717013397f4d An Introduction to different Types of Convolutions in Deep Learning] by [http://plpp.de/ Paul-Louis Pröve], July 22, 2017
: [https://towardsdatascience.com/squeeze-and-excitation-networks-9ef5e71eacd7 Squeeze-and-Excitation Networks] by [http://plpp.de/ Paul-Louis Pröve], October 17, 2017
* [https://towardsdatascience.com/deep-convolutional-neural-networks-ccf96f830178 Deep Convolutional Neural Networks] by Pablo Ruiz, October 11, 2018
===ResNet===
* [https://en.wikipedia.org/wiki/Residual_neural_network Residual neural network from Wikipedia]
* [https://wiki.tum.de/display/lfdv/Deep+Residual+Networks Deep Residual Networks] from [https://wiki.tum.de/ TUM Wiki], [[Technical University of Munich]]
* [https://towardsdatascience.com/understanding-and-visualizing-resnets-442284831be8 Understanding and visualizing ResNets] by Pablo Ruiz, October 8, 2018
===RNNs===
* [https://en.wikipedia.org/wiki/Recurrent_neural_network Recurrent neural network from Wikipedia]
* [https://en.wikipedia.org/wiki/Rectifier_(neural_networks) Rectifier (neural networks) from Wikipedia]
* [https://en.wikipedia.org/wiki/Sigmoid_function Sigmoid function from Wikipedia]
* [https://en.wikipedia.org/wiki/Softmax_function Softmax function from Wikipedia]
==Backpropagation==
* [https://en.wikipedia.org/wiki/Backpropagation Backpropagation from Wikipedia]
: [https://www.youtube.com/watch?v=9KM9Td6RVgQ Part 6: Training]
: [https://www.youtube.com/watch?v=S4ZUwgesjS8 Part 7: Overfitting, Testing, and Regularization]
* [https://www.youtube.com/playlist?list=PLgomWLYGNl1dL1Qsmgumhcg4HOcWZMd3k NN - Fully Connected Tutorial], [https://en.wikipedia.org/wiki/YouTube YouTube] Videos by [[Finn Eggers]]
* [https://www.youtube.com/watch?v=UdSK7nnJKHU Deep Learning Master Class] by [[Ilya Sutskever]], [https://en.wikipedia.org/wiki/YouTube YouTube] Video
* [https://www.youtube.com/watch?v=Ih5Mr93E-2c&hd=1 Lecture 10 - Neural Networks] from [http://work.caltech.edu/telecourse.html Learning From Data - Online Course (MOOC)] by [https://en.wikipedia.org/wiki/Yaser_Abu-Mostafa Yaser Abu-Mostafa], [https://en.wikipedia.org/wiki/California_Institute_of_Technology Caltech], [https://en.wikipedia.org/wiki/YouTube YouTube] Video

Revision as of 22:21, 24 September 2020


Artificial Neural Network [1]

Neural Networks,
a series of connected neurons which communicate via neurotransmission. The interface through which neurons interact with their neighbors consists of axon terminals connected via synapses to dendrites on other neurons. If the sum of the input signals into one neuron surpasses a certain threshold, the neuron generates an action potential at the axon hillock and transmits this electrical signal along the axon.

In 1949, Donald O. Hebb introduced his theory in The Organization of Behavior, stating that learning consists of adapting the weight vectors (persistent synaptic plasticity) of a neuron's pre-synaptic inputs, whose dot-product activates or controls the post-synaptic output; this is the basis of neural network learning [2].

AN

Already in the early 40s, Warren S. McCulloch and Walter Pitts introduced the artificial neuron as a logical element with multiple analogue inputs and a single digital output with a boolean result. The output fired "true" if the sum of the inputs exceeded a threshold. In their 1943 paper A Logical Calculus of the Ideas Immanent in Nervous Activity [3], they attempted to demonstrate that a Turing machine program could be implemented in a finite network of such neurons computing the combinatorial logic functions AND, OR and NOT.
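
The behaviour of such a threshold unit is easy to sketch in code. The following minimal C example is illustrative only and is not taken from the original paper; it models a McCulloch-Pitts style unit with binary inputs and a step activation, and shows how AND and OR emerge from different thresholds, with NOT written here simply as negation rather than via an explicit inhibitory input.

   #include <stdio.h>

   /* McCulloch-Pitts style unit: binary inputs with unit weights,
      step activation - fires 1 if the input sum reaches the threshold. */
   static int mcp_neuron(const int *inputs, int n, int threshold) {
       int sum = 0;
       for (int i = 0; i < n; ++i)
           sum += inputs[i];
       return sum >= threshold ? 1 : 0;
   }

   int main(void) {
       for (int a = 0; a <= 1; ++a)
           for (int b = 0; b <= 1; ++b) {
               int in[2] = { a, b };
               int and_out = mcp_neuron(in, 2, 2);      /* AND: threshold 2 */
               int or_out  = mcp_neuron(in, 2, 1);      /* OR:  threshold 1 */
               int not_a   = 1 - mcp_neuron(&a, 1, 1);  /* NOT, modelled as negation */
               printf("a=%d b=%d  AND=%d OR=%d NOT(a)=%d\n", a, b, and_out, or_out, not_a);
           }
       return 0;
   }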

ANNs

Artificial Neural Networks (ANNs) are a family of statistical learning devices or algorithms used in regression, and binary or multiclass classification, implemented in hardware or software inspired by their biological counterparts. The artificial neurons of one or more layers receive one or more inputs (representing dendrites), and after being weighted, sum them to produce an output (representing a neuron's axon). The sum is passed through a nonlinear function known as an activation function or transfer function. The transfer functions usually have a sigmoid shape, but they may also take the form of other non-linear functions, piecewise linear functions, or step functions [4]. The weights of the inputs of each layer are tuned to minimize a cost or loss function, which is a task in mathematical optimization and machine learning.

Perceptron

Perceptron [5]

The perceptron is an algorithm for supervised learning of binary classifiers. It was the first artificial neural network, introduced in 1957 by Frank Rosenblatt [6], implemented in custom hardware. In its basic form it consists of a single neuron with multiple inputs and associated weights.

Supervised learning is applied using a set D of labeled training data with pairs of feature vectors (x) and given results as desired output (d), usually starting with a cleared or randomly initialized weight vector w. The output is calculated from all inputs of a sample, each multiplied by its corresponding weight, with the sum passed to the activation function f. The difference between desired and actual value is then immediately used to modify the weights of all features using a learning rate 0.0 < α <= 1.0:

   for (j = 0, Σ = 0.0; j < nSamples; ++j) {
     for (i = 0, X = bias; i < nFeatures; ++i)   /* weighted sum of sample j */
       X += w[i] * x[j][i];
     y = f(X);                                   /* activation function */
     Σ += fabs(Δ = d[j] - y);                    /* fabs, not abs: the error is a double */
     for (i = 0; i < nFeatures; ++i)             /* delta rule weight update */
       w[i] += α * Δ * x[j][i];
   }

AI Winter

Three layer, XOR capable Perceptron [7]

Although the perceptron initially seemed promising, it was soon proved that perceptrons could not be trained to recognise many classes of patterns. This led to neural network research stagnating for many years, the AI-winter, before it was recognised that a feedforward neural network with two or more layers had far greater processing power than one with a single layer. Single layer perceptrons are only capable of learning linearly separable patterns. In their 1969 book Perceptrons, Marvin Minsky and Seymour Papert wrote that it was impossible for these classes of network to learn the XOR function. It is often believed that they also conjectured (incorrectly) that a similar result would hold for a multilayer perceptron [8]. However, this is not true, as both Minsky and Papert already knew that multilayer perceptrons were capable of producing an XOR function [9].

Backpropagation

In 1974, Paul Werbos began to end the AI winter concerning neural networks when he first described the mathematical process of training multilayer perceptrons through backpropagation of errors [10], derived in the context of control theory by Henry J. Kelley in 1960 [11] and by Arthur E. Bryson in 1961 [12] using principles of dynamic programming, and simplified by Stuart E. Dreyfus in 1961 applying the chain rule [13]. It was in 1982 that Werbos applied an automatic differentiation method, described in 1970 by Seppo Linnainmaa [14], to neural networks in the way that is widely used today [15] [16] [17] [18].

Backpropagation is a generalization of the delta rule to multilayered feedforward networks, made possible by using the chain rule to iteratively compute gradients for each layer. Backpropagation requires that the activation function used by the artificial neurons be differentiable, which is true for the common sigmoid logistic function or its softmax generalization in multiclass classification.

Along with an optimization method such as gradient descent, it calculates the gradient of a cost or loss function with respect to all the weights in the neural network. The gradient is fed to the optimization method, which in turn uses it to update the weights in an attempt to minimize the loss function, whose choice depends on the learning type (supervised, unsupervised, reinforcement) and the activation function; mean squared error or the cross-entropy error function are used in binary classification [19]. The gradient is almost always used in a simple stochastic gradient descent algorithm. In 1983, Yurii Nesterov contributed an accelerated version of gradient descent that converges considerably faster than ordinary gradient descent [20] [21] [22] [23].

Backpropagation algorithm for a 3-layer network [24]:

   initialize the weights in the network (often small random values)
   do
      for each example e in the training set
         O = neural-net-output(network, e)  // forward pass
         T = teacher output for e
         compute error (T - O) at the output units
         compute delta_wh for all weights from hidden layer to output layer  // backward pass
         compute delta_wi for all weights from input layer to hidden layer   // backward pass continued
         update the weights in the network
   until all examples classified correctly or stopping criterion satisfied
   return the network
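
The pseudocode above can be made concrete. The following self-contained C sketch is illustrative only; the 2-2-1 topology, learning rate and epoch count are arbitrary choices, not taken from the article. It trains a small sigmoid network on XOR by stochastic gradient descent with backpropagation; depending on the random initialization it may need more epochs to converge.

   #include <stdio.h>
   #include <stdlib.h>
   #include <math.h>

   #define N_IN  2
   #define N_HID 2
   #define N_OUT 1

   static double sigmoid(double x) { return 1.0 / (1.0 + exp(-x)); }
   static double frand(void)       { return (double)rand() / RAND_MAX - 0.5; }

   int main(void) {
       const double x[4][N_IN] = { {0,0}, {0,1}, {1,0}, {1,1} };
       const double d[4]       = { 0, 1, 1, 0 };          /* desired XOR outputs */
       double w1[N_HID][N_IN + 1], w2[N_OUT][N_HID + 1];  /* last entry: bias weight */
       const double alpha = 0.5;                          /* learning rate */
       int i, j, k;

       for (j = 0; j < N_HID; ++j) for (i = 0; i <= N_IN;  ++i) w1[j][i] = frand();
       for (k = 0; k < N_OUT; ++k) for (j = 0; j <= N_HID; ++j) w2[k][j] = frand();

       for (int epoch = 0; epoch < 20000; ++epoch) {
           for (int s = 0; s < 4; ++s) {
               double h[N_HID], o[N_OUT], delta_o[N_OUT], delta_h[N_HID];
               /* forward pass */
               for (j = 0; j < N_HID; ++j) {
                   double sum = w1[j][N_IN];
                   for (i = 0; i < N_IN; ++i) sum += w1[j][i] * x[s][i];
                   h[j] = sigmoid(sum);
               }
               for (k = 0; k < N_OUT; ++k) {
                   double sum = w2[k][N_HID];
                   for (j = 0; j < N_HID; ++j) sum += w2[k][j] * h[j];
                   o[k] = sigmoid(sum);
               }
               /* backward pass: output deltas, then hidden deltas via the chain rule */
               for (k = 0; k < N_OUT; ++k)
                   delta_o[k] = (d[s] - o[k]) * o[k] * (1.0 - o[k]);
               for (j = 0; j < N_HID; ++j) {
                   double sum = 0.0;
                   for (k = 0; k < N_OUT; ++k) sum += delta_o[k] * w2[k][j];
                   delta_h[j] = sum * h[j] * (1.0 - h[j]);
               }
               /* weight updates (stochastic gradient descent) */
               for (k = 0; k < N_OUT; ++k) {
                   for (j = 0; j < N_HID; ++j) w2[k][j] += alpha * delta_o[k] * h[j];
                   w2[k][N_HID] += alpha * delta_o[k];
               }
               for (j = 0; j < N_HID; ++j) {
                   for (i = 0; i < N_IN; ++i) w1[j][i] += alpha * delta_h[j] * x[s][i];
                   w1[j][N_IN] += alpha * delta_h[j];
               }
           }
       }
       /* print the learned mapping */
       for (int s = 0; s < 4; ++s) {
           double h[N_HID], sum;
           for (j = 0; j < N_HID; ++j) {
               sum = w1[j][N_IN];
               for (i = 0; i < N_IN; ++i) sum += w1[j][i] * x[s][i];
               h[j] = sigmoid(sum);
           }
           sum = w2[0][N_HID];
           for (j = 0; j < N_HID; ++j) sum += w2[0][j] * h[j];
           printf("%g XOR %g -> %.3f\n", x[s][0], x[s][1], sigmoid(sum));
       }
       return 0;
   }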

Deep Learning

Deep learning has been characterized as a buzzword, or a rebranding of neural networks. A deep neural network (DNN) is an ANN with multiple hidden layers of units between the input and output layers which can be discriminatively trained with the standard backpropagation algorithm. Two common issues with naive training are overfitting and computation time.

Convolutional NNs

Convolutional neural networks (CNN) form a subclass of feedforward neural networks that have special weight constraints: individual neurons are tiled in such a way that they respond to overlapping regions. A neuron of a convolutional layer is connected to a corresponding receptive field of the previous layer, a small subset of its neurons. A distinguishing feature of CNNs is that many neurons share the same bias and vector of weights, dubbed filter. This reduces the memory footprint because a single bias and a single vector of weights are used across all receptive fields sharing that filter, rather than each receptive field having its own bias and vector of weights. Convolutional NNs are suited for deep learning and are highly suitable for parallelization on GPUs [25]. They have been a research topic in the game of Go since 2008 [26], and, along with the residual modification, were successfully applied in Go and other games, most spectacularly by AlphaGo in 2015 and AlphaZero in 2017.
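
As a minimal illustration of weight sharing (not taken from any particular engine; the 8x8 plane, the 3x3 filter, zero padding and the ReLU activation are arbitrary choices), the following C sketch applies one filter with a single shared bias across a board-sized input plane:

   #include <stdio.h>

   #define SIZE 8   /* board-sized feature plane */
   #define K    3   /* 3x3 filter */

   /* One convolutional filter: the same 3x3 weights and one shared bias are
      applied at every square (weight sharing), with zero padding at the edges. */
   static void conv2d(double in[SIZE][SIZE], double out[SIZE][SIZE],
                      double w[K][K], double bias) {
       for (int r = 0; r < SIZE; ++r)
           for (int c = 0; c < SIZE; ++c) {
               double sum = bias;
               for (int i = 0; i < K; ++i)
                   for (int j = 0; j < K; ++j) {
                       int rr = r + i - K / 2, cc = c + j - K / 2;
                       if (rr >= 0 && rr < SIZE && cc >= 0 && cc < SIZE)
                           sum += w[i][j] * in[rr][cc];
                   }
               out[r][c] = sum > 0.0 ? sum : 0.0;   /* ReLU activation */
           }
   }

   int main(void) {
       double in[SIZE][SIZE] = { {0} }, out[SIZE][SIZE];
       double w[K][K] = { {0, 1, 0}, {1, -4, 1}, {0, 1, 0} };  /* illustrative weights */
       in[3][4] = 1.0;                                         /* a single active input */
       conv2d(in, out, w, 0.0);
       for (int r = 0; r < SIZE; ++r) {
           for (int c = 0; c < SIZE; ++c) printf("%5.1f ", out[r][c]);
           printf("\n");
       }
       return 0;
   }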

Typical CNN [27]

Residual Net

A residual block [28] [29]

A residual net (ResNet) adds the input of a layer, typically composed of a convolutional layer and a ReLU layer, to its output. This modification, like convolutional nets inspired by image classification, enables faster training and deeper networks [30] [31] [32].
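
As a minimal sketch of that idea (illustrative only; a real residual block uses 2D convolutions and usually batch normalization, and details differ between architectures), the block below adds its unchanged input to the output of a small convolution-plus-ReLU transform:

   #include <stdio.h>

   #define N 8

   /* The "inner" layers of the block: here a toy 1D convolution followed by ReLU.
      In a real residual block this would be one or two 2D convolutional layers. */
   static void transform(const double x[N], double f[N]) {
       const double w[3] = { 0.25, 0.5, 0.25 };        /* illustrative filter */
       for (int i = 0; i < N; ++i) {
           double sum = 0.0;
           for (int k = -1; k <= 1; ++k)
               if (i + k >= 0 && i + k < N) sum += w[k + 1] * x[i + k];
           f[i] = sum > 0.0 ? sum : 0.0;               /* ReLU */
       }
   }

   /* Residual block: output = transform(x) + x, the skip connection. */
   static void residual_block(const double x[N], double y[N]) {
       double f[N];
       transform(x, f);
       for (int i = 0; i < N; ++i)
           y[i] = f[i] + x[i];
   }

   int main(void) {
       double x[N] = { 0, 0, 1, 0, 0, -1, 0, 0 }, y[N];
       residual_block(x, y);
       for (int i = 0; i < N; ++i) printf("%.2f ", y[i]);
       printf("\n");
       return 0;
   }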

ANNs in Games

Applications of neural networks in computer games and chess are the learning of evaluation and search control. Evaluation topics include feature selection and automated tuning; search control topics include move ordering, selectivity and time management. The perceptron looks like the ideal learning algorithm for automated evaluation tuning.

Backgammon

In the late 80s, Gerald Tesauro pioneered the application of ANNs to the game of Backgammon. His program Neurogammon won the Gold medal at the 1st Computer Olympiad 1989, and the approach was further improved by TD-Lambda based Temporal Difference Learning within TD-Gammon [33]. Today all strong backgammon programs rely on heavily trained neural networks.

Go

In 2014, two teams independently investigated whether deep convolutional neural networks could be used to directly represent and learn a move evaluation function for the game of Go. Christopher Clark and Amos Storkey trained an 8-layer convolutional neural network by supervised learning from a database of human professional games, which, without any search, defeated the traditional search program Gnu Go in 86% of the games [34] [35] [36] [37]. In their paper Move Evaluation in Go Using Deep Convolutional Neural Networks [38], Chris J. Maddison, Aja Huang, Ilya Sutskever, and David Silver report they trained a large 12-layer convolutional neural network in a similar way, to beat Gnu Go in 97% of the games, and matched the performance of a state-of-the-art Monte-Carlo Tree Search that simulates a million positions per move [39].

In 2015, a team affiliated with Google DeepMind around David Silver and Aja Huang, supported by Google researchers John Nham and Ilya Sutskever, built a Go-playing program dubbed AlphaGo [40], combining Monte-Carlo tree search with their 12-layer networks [41].

Chess

Logistic regression as applied in Texel's Tuning Method may be interpreted as a supervised learning application of a single-layer perceptron with one neuron. This is also true for reinforcement learning approaches, such as TD-Leaf in KnightCap or Meep's TreeStrap, where the evaluation consists of a weighted linear combination of features. Despite these similarities with the perceptron, these engines are not considered to use ANNs, since they rely on manually selected, chess-specific feature construction concepts like material, piece-square tables, pawn structure, mobility etc.
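
For illustration, one stochastic gradient descent step of such a single-neuron model might look like the sketch below. This is a toy example under assumptions (the feature values, the number of features and the learning rate are placeholders), not code from Texel, KnightCap or Meep:

   #include <stdio.h>
   #include <math.h>

   #define N_FEATURES 4   /* e.g. material difference, mobility, ... (illustrative) */

   static double sigmoid(double x) { return 1.0 / (1.0 + exp(-x)); }

   /* One stochastic gradient descent step for a single-neuron (logistic regression)
      evaluation: predict the winning probability from a weighted feature sum,
      then nudge the weights towards the observed game result. */
   static void sgd_step(double w[N_FEATURES], const double feature[N_FEATURES],
                        double result,   /* 1 = win, 0.5 = draw, 0 = loss */
                        double alpha) {  /* learning rate */
       double eval = 0.0;
       for (int i = 0; i < N_FEATURES; ++i)
           eval += w[i] * feature[i];
       double p = sigmoid(eval);          /* predicted winning probability */
       double error = result - p;         /* gradient of the cross-entropy loss w.r.t. eval */
       for (int i = 0; i < N_FEATURES; ++i)
           w[i] += alpha * error * feature[i];
   }

   int main(void) {
       double w[N_FEATURES] = { 0 };
       const double features[N_FEATURES] = { 1.0, -2.0, 3.0, 0.5 };  /* one training position */
       for (int iter = 0; iter < 1000; ++iter)
           sgd_step(w, features, 1.0, 0.01);          /* position labelled as a win */
       printf("learned weights: %f %f %f %f\n", w[0], w[1], w[2], w[3]);
       return 0;
   }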

More sophisticated attempts to replace static evaluation by neural networks and perceptrons, fed with rawer feature sets such as board representations and attack tables, were not yet as successful as in other games. Chess evaluation seems not that well suited to neural nets, but contributing factors were also too-weak models and feature recognizers, as addressed by Gian-Carlo Pascutto with Stoofvlees [42], the huge training effort, and weak floating point performance. There is still hope due to progress in hardware and parallelization using SIMD instructions and GPUs, and due to deeper and more powerful neural network structures and methods successful in other domains. In December 2017, Google DeepMind published their generalized AlphaZero algorithm.

Move Ordering

Concerning move ordering, there were interesting NN proposals like the Chessmaps Heuristic by Kieran Greer et al. [43], and the Neural MoveMap Heuristic by Levente Kocsis et al. [44].

Giraffe & Zurichess

In 2015, Matthew Lai trained Giraffe's deep neural network by TD-Leaf [45]. Zurichess by Alexandru Moșoi uses the TensorFlow library for automated tuning; in its two-layer neural network, the second layer is responsible for a tapered eval that blends endgame and middlegame scores by game phase [46].

DeepChess

In 2016, Omid E. David, Nathan S. Netanyahu, and Lior Wolf introduced DeepChess, obtaining a grandmaster-level chess playing performance using a learning method incorporating two deep neural networks, which are trained using a combination of unsupervised pretraining and supervised training. The unsupervised training extracts high level features from a given chess position, and the supervised training learns to compare two chess positions to select the more favorable one. In order to use DeepChess inside a chess program, a novel version of alpha-beta is used that does not require bounds but positions αpos and βpos [47].

Alpha Zero

In December 2017, the Google DeepMind team along with former Giraffe author Matthew Lai reported on their generalized AlphaZero algorithm, combining Deep learning with Monte-Carlo Tree Search. AlphaZero can achieve, tabula rasa, superhuman performance in many challenging domains with some training effort. Starting from random play, and given no domain knowledge except the game rules, AlphaZero achieved a superhuman level of play in the games of chess and Shogi as well as Go, and convincingly defeated a world-champion program in each case [48]. The open source projects Leela Zero (Go) and its chess adaptation Leela Chess Zero successfully re-implemented the ideas of DeepMind.

NNUE

NNUE, the reverse of ƎUИИ - Efficiently Updatable Neural Networks - is an NN architecture intended to replace the evaluation function of Shogi, chess and other board game playing alpha-beta searchers. NNUE was introduced in 2018 by Yu Nasu [49], and was used in Shogi adaptations of Stockfish such as YaneuraOu [50] and Kristallweizen [51], apparently with AlphaZero strength [52]. Nodchip incorporated NNUE into the chess playing Stockfish 10 as a proof of concept [53], leading to the hype about Stockfish NNUE in summer 2020 [54]. Its heavily overparametrized and computationally most expensive input layer is efficiently updated incrementally during make and unmake move.
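
The incremental update can be sketched as follows. This is a simplified illustration under assumptions (the feature count, layer width and integer types are placeholders and do not reflect the actual Stockfish NNUE layout): because the first layer is a linear transform of sparse binary inputs, making a move only subtracts the weight rows of features that disappeared and adds those that appeared, while unmaking the move swaps the two lists; a full recomputation is only needed when too much changes, e.g. a king move in a king-relative feature set.

   #include <stdint.h>

   #define N_FEATURES 768   /* (piece, square) inputs - illustrative; real NNUE nets are larger */
   #define N_HIDDEN   256   /* width of the first (accumulator) layer - illustrative */

   /* First-layer weights and the accumulator, i.e. the pre-activation
      outputs of the first layer for the current position. */
   static int16_t weight[N_FEATURES][N_HIDDEN];
   static int32_t accumulator[N_HIDDEN];

   /* Full refresh from the list of currently active (set) input features. */
   static void refresh(const int *active, int n_active) {
       for (int h = 0; h < N_HIDDEN; ++h) accumulator[h] = 0;
       for (int i = 0; i < n_active; ++i)
           for (int h = 0; h < N_HIDDEN; ++h)
               accumulator[h] += weight[active[i]][h];
   }

   /* Incremental update on make move: only the few features that changed
      are touched, instead of recomputing the whole first layer. */
   static void update(const int *removed, int n_removed,
                      const int *added, int n_added) {
       for (int i = 0; i < n_removed; ++i)
           for (int h = 0; h < N_HIDDEN; ++h)
               accumulator[h] -= weight[removed[i]][h];
       for (int i = 0; i < n_added; ++i)
           for (int h = 0; h < N_HIDDEN; ++h)
               accumulator[h] += weight[added[i]][h];
   }

   int main(void) {
       int active[]  = { 100, 200, 300 };   /* illustrative feature indices of a position */
       int removed[] = { 200 };             /* a move clears one feature ... */
       int added[]   = { 250 };             /* ... and sets another */
       refresh(active, 3);
       update(removed, 1, added, 1);        /* make move; unmake would swap the lists */
       return 0;
   }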

NN Chess Programs

See also

AlphaGo
Keynote Lecture CG 2016 Conference by Aja Huang

Selected Publications

1940 ...

1950 ...

Claude Shannon, John McCarthy (eds.) (1956). Automata Studies. Annals of Mathematics Studies, No. 34, pdf

1960 ...

1970 ...

1980 ...

1987

1988

1989

1990 ...

1991

1992

1993

1994

1995

1996

1997

1998

1999

2000 ...

2001

2002

2003

2004

2006

2007

2008

2009

  • Daniel Abdi, Simon Levine, Girma T. Bitsuamlak (2009). Application of an Artificial Neural Network Model for Boundary Layer Wind Tunnel Profile Development. 11th Americas conference on wind Engineering, pdf

2010 ...

2011

2012

Nicol N. Schraudolph (2012). Centering Neural Network Gradient Factors.
Léon Bottou (2012). Stochastic Gradient Descent Tricks. Microsoft Research, pdf
Ronan Collobert, Koray Kavukcuoglu, Clément Farabet (2012). Implementing Neural Networks Efficiently. [71]

2013

2014

2015

2016

2017

2018

2019

2020 ...

Blog & Forum Posts

1996 ...

Re: Evaluation by neural network ? by Jay Scott, CCC, November 10, 1997 [93]

2000 ...

Re: Whatever happened to Neural Network Chess programs? by Andy Walker, rgcc, March 28, 2000  » Advances in Computer Chess 1, Ron Atkin
Combining Neural Networks and Alpha-Beta by Matthias Lüscher, rgcc, April 01, 2000 » Chessterfield
Neural nets in backgammon by Albert Silver, CCC, April 07, 2004

2005 ...

2010 ...

Re: Chess program with Artificial Neural Networks (ANN)? by Gian-Carlo Pascutto, CCC, January 07, 2010 » Stoofvlees
Re: Chess program with Artificial Neural Networks (ANN)? by Gian-Carlo Pascutto, CCC, January 08, 2010
Re: Chess program with Artificial Neural Networks (ANN)? by Volker Annuss, CCC, January 08, 2010 » Hermann

2015 ...

2016

Re: Deep Learning Chess Engine ? by Alexandru Mosoi, CCC, July 21, 2016 » Zurichess
Re: Deep Learning Chess Engine ? by Matthew Lai, CCC, August 04, 2016 » Giraffe [97]

2017

Re: Is AlphaGo approach unsuitable to chess? by Peter Österlund, CCC, May 31, 2017 » Texel

2018

2019

Re: A question to MCTS + NN experts by Daniel Shawul, CCC, July 17, 2019

2020 ...

Re: NNUE accessible explanation by Jonathan Rosenthal, CCC, July 23, 2020
Re: NNUE accessible explanation by Jonathan Rosenthal, CCC, July 24, 2020

External Links

Biological

ANNs

Topics

Neurogrid from Wikipedia

Perceptron

History of the Perceptron

CNNs

Convolutional Neural Networks
Deep Residual Networks
An Introduction to different Types of Convolutions in Deep Learning by Paul-Louis Pröve, July 22, 2017
Squeeze-and-Excitation Networks by Paul-Louis Pröve, October 17, 2017

ResNet

RNNs

Restricted Boltzmann machine from Wikipedia

Activation Functions

Backpropagation

Gradient

Momentum from Wikipedia

Software

Neural Lab from Wikipedia
SNNS from Wikipedia

Libraries

Blogs

The Single Layer Perceptron
The Sigmoid Function in C#
Hidden Neurons and Feature Space
Training Neural Networks Using Back Propagation in C#
Data Mining with Artificial Neural Networks (ANN)
Neural Net in C++ Tutorial on Vimeo (also on YouTube)

Courses

Part 1: Data and Architecture, YouTube Videos
Part 2: Forward Propagation
Part 3: Gradient Descent
Part 4: Backpropagation
Part 5: Numerical Gradient Checking
Part 6: Training
Part 7: Overfitting, Testing, and Regularization
But what *is* a Neural Network? | Chapter 1
Gradient descent, how neural networks learn | Chapter 2
What is backpropagation really doing? | Chapter 3
Backpropagation calculus | Appendix to Chapter 3
Lecture 1 | Introduction to Convolutional Neural Networks for Visual Recognition by Justin Johnson, slides
Lecture 2 | Image Classification by Justin Johnson, slides
Lecture 3 | Loss Functions and Optimization by Justin Johnson, slides
Lecture 4 | Introduction to Neural Networks by Serena Yeung, slides
Lecture 5 | Convolutional Neural Networks by Serena Yeung, slides
Lecture 6 | Training Neural Networks I by Serena Yeung, slides
Lecture 7 | Training Neural Networks II by Justin Johnson, slides
Lecture 8 | Deep Learning Software by Justin Johnson, slides
Lecture 9 | CNN Architectures by Serena Yeung, slides
Lecture 10 | Recurrent Neural Networks by Justin Johnson, slides
Lecture 11 | Detection and Segmentation by Justin Johnson, slides
Lecture 12 | Visualizing and Understanding by Justin Johnson, slides
Lecture 13 | Generative Models by Serena Yeung, slides
Lecture 14 | Deep Reinforcement Learning by Serena Yeung, slides
Lecture 15 | Efficient Methods and Hardware for Deep Learning by Song Han, slides

Music

Marc Ribot, Kenny Wollesen, Joey Baron, Jamie Saft, Trevor Dunn, Cyro Baptista, John Zorn

References

  1. An example artificial neural network with a hidden layer, Image by Colin M.L. Burnett with Inkscape, December 27, 2006, CC BY-SA 3.0, Artificial Neural Networks/Neural Network Basics - Wikibooks, Wikimedia Commons
  2. Biological neural network - Early study - from Wikipedia
  3. Warren S. McCulloch, Walter Pitts (1943). A Logical Calculus of the Ideas Immanent in Nervous Activity. Bulletin of Mathematical Biology, Vol. 5, No. 1, pdf
  4. Artificial neuron from Wikipedia
  5. The appropriate weights are applied to the inputs, and the resulting weighted sum passed to a function that produces the output y, image created by mat_the_w, based on raster image Perceptron.gif by 'Paskari', using Inkscape 0.46 for OSX, Wikimedia Commons, Perceptron from Wikipedia
  6. Frank Rosenblatt (1957). The Perceptron - a Perceiving and Recognizing Automaton. Report 85-460-1, Cornell Aeronautical Laboratory
  7. A two-layer neural network capable of calculating XOR. The numbers within the neurons represent each neuron's explicit threshold (which can be factored out so that all neurons have the same threshold, usually 1). The numbers that annotate arrows represent the weight of the inputs. This net assumes that if the threshold is not reached, zero (not -1) is output. Note that the bottom layer of inputs is not always considered a real neural network layer, Feedforward neural network from Wikipedia
  8. multilayer perceptron is a misnomer for a more complicated neural network
  9. Perceptron from Wikipedia
  10. Paul Werbos (1974). Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. Ph. D. thesis, Harvard University
  11. Henry J. Kelley (1960). Gradient Theory of Optimal Flight Paths. ARS Journal, Vol. 30, No. 10
  12. Arthur E. Bryson (1961). A gradient method for optimizing multi-stage allocation processes. In Proceedings of the Harvard University Symposium on digital computers and their applications
  13. Stuart E. Dreyfus (1961). The numerical solution of variational problems. RAND paper P-2374
  14. Seppo Linnainmaa (1970). The representation of the cumulative rounding error of an algorithm as a Taylor expansion of the local rounding errors. Master's thesis, University of Helsinki
  15. Paul Werbos (1982). Applications of advances in nonlinear sensitivity analysis. System Modeling and Optimization, Springer, pdf
  16. Paul Werbos (1994). The Roots of Backpropagation. From Ordered Derivatives to Neural Networks and Political Forecasting. John Wiley & Sons
  17. Deep Learning - Scholarpedia | Backpropagation by Jürgen Schmidhuber
  18. Who Invented Backpropagation? by Jürgen Schmidhuber (2014, 2015)
  19. "Using cross-entropy error function instead of sum of squares leads to faster training and improved generalization", from Sargur Srihari, Neural Network Training (pdf)
  20. Yurii Nesterov from Wikipedia
  21. ORF523: Nesterov’s Accelerated Gradient Descent by Sébastien Bubeck, I’m a bandit, April 1, 2013
  22. Nesterov’s Accelerated Gradient Descent for Smooth and Strongly Convex Optimization by Sébastien Bubeck, I’m a bandit, March 6, 2014
  23. Revisiting Nesterov’s Acceleration by Sébastien Bubeck, I’m a bandit, June 30, 2015
  24. Backpropagation algorithm from Wikipedia
  25. PARsE | Education | GPU Cluster | Efficient mapping of the training of Convolutional Neural Networks to a CUDA-based cluster
  26. Ilya Sutskever, Vinod Nair (2008). Mimicking Go Experts with Convolutional Neural Networks. ICANN 2008, pdf
  27. Typical CNN architecture, Image by Aphex34, December 16, 2015, CC BY-SA 4.0, Wikimedia Commons
  28. The fundamental building block of residual networks. Figure 2 in Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun (2015). Deep Residual Learning for Image Recognition. arXiv:1512.03385
  29. Understand Deep Residual Networks — a simple, modular learning framework that has redefined state-of-the-art by Michael Dietz, Waya.ai, May 02, 2017
  30. Tristan Cazenave (2017). Residual Networks for Computer Go. IEEE Transactions on Computational Intelligence and AI in Games, Vol. PP, No. 99, pdf
  31. Deep Residual Networks from TUM Wiki, Technical University of Munich
  32. Understanding and visualizing ResNets by Pablo Ruiz, October 8, 2018
  33. Richard Sutton, Andrew Barto (1998). Reinforcement Learning: An Introduction. MIT Press, 11.1 TD-Gammon
  34. Christopher Clark, Amos Storkey (2014). Teaching Deep Convolutional Neural Networks to Play Go. arXiv:1412.3409
  35. Teaching Deep Convolutional Neural Networks to Play Go by Hiroshi Yamashita, The Computer-go Archives, December 14, 2014
  36. Why Neural Networks Look Set to Thrash the Best Human Go Players for the First Time | MIT Technology Review, December 15, 2014
  37. Teaching Deep Convolutional Neural Networks to Play Go by Michel Van den Bergh, CCC, December 16, 2014
  38. Chris J. Maddison, Aja Huang, Ilya Sutskever, David Silver (2014). Move Evaluation in Go Using Deep Convolutional Neural Networks. arXiv:1412.6564v1
  39. Move Evaluation in Go Using Deep Convolutional Neural Networks by Aja Huang, The Computer-go Archives, December 19, 2014
  40. AlphaGo | Google DeepMind
  41. David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, Demis Hassabis (2016). Mastering the game of Go with deep neural networks and tree search. Nature, Vol. 529
  42. Re: Chess program with Artificial Neural Networks (ANN)? by Gian-Carlo Pascutto, CCC, January 07, 2010
  43. Kieran Greer, Piyush Ojha, David A. Bell (1999). A Pattern-Oriented Approach to Move Ordering: the Chessmaps Heuristic. ICCA Journal, Vol. 22, No. 1
  44. Levente Kocsis, Jos Uiterwijk, Eric Postma, Jaap van den Herik (2002). The Neural MoveMap Heuristic in Chess. CG 2002
  45. *First release* Giraffe, a new engine based on deep learning by Matthew Lai, CCC, July 08, 2015
  46. Re: Deep Learning Chess Engine ? by Alexandru Mosoi, CCC, July 21, 2016
  47. Omid E. David, Nathan S. Netanyahu, Lior Wolf (2016). DeepChess: End-to-End Deep Neural Network for Automatic Learning in Chess. ICAAN 2016, Lecture Notes in Computer Science, Vol. 9887, Springer, pdf preprint
  48. David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis (2017). Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. arXiv:1712.01815
  49. Yu Nasu (2018). ƎUИИ Efficiently Updatable Neural-Network based Evaluation Functions for Computer Shogi. Ziosoft Computer Shogi Club, pdf (Japanese with English abstract)
  50. GitHub - yaneurao/YaneuraOu: YaneuraOu is the World's Strongest Shogi engine(AI player), WCSC29 1st winner, educational and USI compliant engine
  51. GitHub - Tama4649/Kristallweizen: Kristallweizen, runner-up of the 29th World Computer Shogi Championship (WCSC29)
  52. The Stockfish of shogi by Larry Kaufman, CCC, January 07, 2020
  53. Stockfish NN release (NNUE) by Henk Drost, CCC, May 31, 2020
  54. Stockfish NNUE – The Complete Guide, June 19, 2020 (Japanese and English)
  55. Rosenblatt's Contributions
  56. The abandonment of connectionism in 1969 - Wikipedia
  57. Frank Rosenblatt (1962). Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. Spartan Books
  58. Seppo Linnainmaa (1976). Taylor expansion of the accumulated rounding error. BIT Numerical Mathematics, Vol. 16, No. 2
  59. Backpropagation from Wikipedia
  60. Paul Werbos (1994). The Roots of Backpropagation. From Ordered Derivatives to Neural Networks and Political Forecasting. John Wiley & Sons
  61. Neocognitron - Scholarpedia by Kunihiko Fukushima
  62. Classical conditioning from Wikipedia
  63. Sepp Hochreiter's Fundamental Deep Learning Problem (1991) by Jürgen Schmidhuber, 2013
  64. Nici Schraudolph’s go networks, review by Jay Scott
  65. Re: Evaluation by neural network ? by Jay Scott, CCC, November 10, 1997
  66. Long short term memory from Wikipedia
  67. Tsumego from Wikipedia
  68. Helmholtz machine from Wikipedia
  69. Who introduced the term “deep learning” to the field of Machine Learning by Jürgen Schmidhuber, Google+, March 18, 2015
  70. Presentation for a neural net learning chess program by Dann Corbit, CCC, April 06, 2004
  71. Clément Farabet | Code
  72. Demystifying Deep Reinforcement Learning by Tambet Matiisen, Nervana, December 21, 2015
  73. high dimensional optimization by Warren D. Smith, FishCooking, December 27, 2019
  74. Generative adversarial networks from Wikipedia
  75. Teaching Deep Convolutional Neural Networks to Play Go by Hiroshi Yamashita, The Computer-go Archives, December 14, 2014
  76. Teaching Deep Convolutional Neural Networks to Play Go by Michel Van den Bergh, CCC, December 16, 2014
  77. Arasan 19.2 by Jon Dart, CCC, November 03, 2016 » Arasan's Tuning
  78. GitHub - BarakOshri/ConvChess: Predicting Moves in Chess Using Convolutional Neural Networks
  79. ConvChess CNN by Brian Richardson, CCC, March 15, 2017
  80. Jürgen Schmidhuber (2015) Critique of Paper by "Deep Learning Conspiracy" (Nature 521 p 436).
  81. How Facebook’s AI Researchers Built a Game-Changing Go Engine | MIT Technology Review, December 04, 2015
  82. Combining Neural Networks and Search techniques (GO) by Michael Babigian, CCC, December 08, 2015
  83. DeepChess: Another deep-learning based chess program by Matthew Lai, CCC, October 17, 2016
  84. ICANN 2016 | Recipients of the best paper awards
  85. Jigsaw puzzle from Wikipedia
  86. CMA-ES from Wikipedia
  87. catastrophic forgetting by Daniel Shawul, CCC, May 09, 2019
  88. Using GAN to play chess by Evgeniy Zheltonozhskiy, CCC, February 23, 2017
  89. AlphaGo Zero: Learning from scratch by Demis Hassabis and David Silver, DeepMind, October 18, 2017
  90. Google's AlphaGo team has been working on chess by Peter Kappler, CCC, December 06, 2017
  91. Residual Networks for Computer Go by Brahim Hamadicharef, CCC, December 07, 2017
  92. AlphaZero: Shedding new light on the grand games of chess, shogi and Go by David Silver, Thomas Hubert, Julian Schrittwieser and Demis Hassabis, DeepMind, December 03, 2018
  93. Alois Heinz (1994). Efficient Neural Net α-β-Evaluators. pdf
  94. Mathieu Autonès, Aryel Beck, Phillippe Camacho, Nicolas Lassabe, Hervé Luga, François Scharffe (2004). Evaluation of Chess Position by Modular Neural network Generated by Genetic Algorithm. EuroGP 2004
  95. Naive Bayes classifier from Wikipedia
  96. GitHub - pluskid/Mocha.jl: Deep Learning framework for Julia
  97. Rectifier (neural networks) from Wikipedia
  98. Muthuraman Chidambaram, Yanjun Qi (2017). Style Transfer Generative Adversarial Networks: Learning to Play Chess Differently. arXiv:1702.06762v1
  99. Yann Dauphin, Razvan Pascanu, Caglar Gulcehre, Kyunghyun Cho, Surya Ganguli, Yoshua Bengio (2014). Identifying and attacking the saddle point problem in high-dimensional non-convex optimization. arXiv:1406.2572
  100. kernel launch latency - CUDA / CUDA Programming and Performance - NVIDIA Developer Forums by LukeCuda, June 18, 2018
  101. erikbern/deep-pink · GitHub
  102. Neural networks (NN) explained by Erin Dame, CCC, December 20, 2017
