Neural Networks
==Backpropagation==
In 1974, [[Mathematician#PWerbos|Paul Werbos]] started to end the AI winter concerning neural networks when he first described the mathematical process of training [https://en.wikipedia.org/wiki/Multilayer_perceptron multilayer perceptrons] through [https://en.wikipedia.org/wiki/Backpropagation backpropagation] of errors <ref>[[Mathematician#PWerbos|Paul Werbos]] ('''1974'''). ''[http://aitopics.org/publication/beyond-regression-new-tools-prediction-and-analysis-behavioral-sciences Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences]''. Ph.D. thesis, [[Harvard University]]</ref>. The technique had been derived in the context of [https://en.wikipedia.org/wiki/Control_theory control theory] by [https://en.wikipedia.org/wiki/Henry_J._Kelley Henry J. Kelley] in 1960 <ref>[https://en.wikipedia.org/wiki/Henry_J._Kelley Henry J. Kelley] ('''1960'''). ''[http://arc.aiaa.org/doi/abs/10.2514/8.5282?journalCode=arsj& Gradient Theory of Optimal Flight Paths]''. [http://arc.aiaa.org/loi/arsj ARS Journal], Vol. 30, No. 10</ref> and by [https://en.wikipedia.org/wiki/Arthur_E._Bryson Arthur E. Bryson] in 1961 <ref>[https://en.wikipedia.org/wiki/Arthur_E._Bryson Arthur E. Bryson] ('''1961'''). ''A gradient method for optimizing multi-stage allocation processes''. In Proceedings of the [[Harvard University]] Symposium on digital computers and their applications</ref> using principles of [[Dynamic Programming|dynamic programming]], and simplified by [https://en.wikipedia.org/wiki/Stuart_Dreyfus Stuart Dreyfus] in 1961 applying the [https://en.wikipedia.org/wiki/Chain_rule chain rule] <ref>[https://en.wikipedia.org/wiki/Stuart_Dreyfus Stuart Dreyfus] ('''1961'''). ''[http://www.rand.org/pubs/papers/P2374.html The numerical solution of variational problems]''. RAND paper P-2374</ref>. In 1982, Werbos applied an [https://en.wikipedia.org/wiki/Automatic_differentiation automatic differentiation] method, described in 1970 by [[Mathematician#SLinnainmaa|Seppo Linnainmaa]] <ref>[[Mathematician#SLinnainmaa|Seppo Linnainmaa]] ('''1970'''). ''The representation of the cumulative rounding error of an algorithm as a Taylor expansion of the local rounding errors''. Master's thesis, [https://en.wikipedia.org/wiki/University_of_Helsinki University of Helsinki]</ref>, to neural networks in the way that is widely used today <ref>[[Mathematician#PWerbos|Paul Werbos]] ('''1982'''). ''Applications of advances in nonlinear sensitivity analysis''. [http://link.springer.com/book/10.1007%2FBFb0006119 System Modeling and Optimization], [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer], [http://werbos.com/Neural/SensitivityIFIPSeptember1981.pdf pdf]</ref> <ref>[[Mathematician#PWerbos|Paul Werbos]] ('''1994'''). ''[http://eu.wiley.com/WileyCDA/WileyTitle/productCd-0471598976.html The Roots of Backpropagation. From Ordered Derivatives to Neural Networks and Political Forecasting]''. [https://en.wikipedia.org/wiki/John_Wiley_%26_Sons John Wiley & Sons]</ref> <ref>[http://www.scholarpedia.org/article/Deep_Learning#Backpropagation Deep Learning - Scholarpedia | Backpropagation] by [[Jürgen Schmidhuber]]</ref> <ref>[http://people.idsia.ch/~juergen/who-invented-backpropagation.html Who Invented Backpropagation?] by [[Jürgen Schmidhuber]] (2014, 2015)</ref>.
Backpropagation is a generalization of the [https://en.wikipedia.org/wiki/Delta_rule delta rule] to multilayered [https://en.wikipedia.org/wiki/Feedforward_neural_network feedforward networks], made possible by using the [https://en.wikipedia.org/wiki/Chain_rule chain rule] to iteratively compute [https://en.wikipedia.org/wiki/Gradient gradients] layer by layer. Backpropagation requires that the [https://en.wikipedia.org/wiki/Activation_function activation function] used by the artificial neurons be [https://en.wikipedia.org/wiki/Differentiable_function differentiable], which holds for the common [https://en.wikipedia.org/wiki/Sigmoid_function sigmoid] [https://en.wikipedia.org/wiki/Logistic_function logistic function] and its [https://en.wikipedia.org/wiki/Softmax_function softmax] generalization in [https://en.wikipedia.org/wiki/Multiclass_classification multiclass classification].
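As a concrete illustration, the following minimal Python sketch trains a two-layer feedforward network with sigmoid activations on the XOR problem; the layer sizes, learning rate, seed, and iteration count are illustrative assumptions, not values from the cited papers. Since the sigmoid s(z) satisfies s'(z) = s(z)(1 - s(z)), the backward pass can reuse the activations stored during the forward pass.
<pre>
# Minimal backpropagation sketch (illustrative assumptions throughout)
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# XOR truth table as a tiny training set
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(1)
W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)   # input  -> hidden
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)   # hidden -> output
lr = 0.5                                         # learning rate (assumed)

for _ in range(20000):
    # forward pass
    h   = sigmoid(X @ W1 + b1)                   # hidden activations
    out = sigmoid(h @ W2 + b2)                   # network output
    # backward pass: chain rule applied layer by layer;
    # sigmoid'(z) = s * (1 - s), so deltas reuse the stored activations
    d_out = (out - y) * out * (1 - out)          # output-layer delta (squared error)
    d_h   = (d_out @ W2.T) * h * (1 - h)         # delta propagated back to hidden layer
    # gradient-descent updates
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(out.round(2))   # should approach [[0], [1], [1], [0]]
</pre>
Being plain gradient descent, the sketch is only guaranteed to reach a local minimum; a different seed or more iterations may be needed to fit XOR exactly.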
==1970 ...==
* [[Mathematician#SGrossberg|Stephen Grossberg]] ('''1973'''). ''Contour Enhancement, Short Term Memory, and Constancies in Reverberating Neural Networks''. [https://en.wikipedia.org/wiki/Studies_in_Applied_Mathematics Studies in Applied Mathematics], Vol. 52, [http://cns.bu.edu/~steve/Gro1973StudiesAppliedMath.pdf pdf]
* [[Mathematician#SGrossberg|Stephen Grossberg]] ('''1974'''). ''[http://techlab.bu.edu/resources/article_view/classical_and_instrumental_learning_by_neural_networks/ Classical and instrumental learning by neural networks]''. Progress in Theoretical Biology. [https://en.wikipedia.org/wiki/Academic_Press Academic Press]
* [[Mathematician#PWerbos|Paul Werbos]] ('''1974'''). ''[http://aitopics.org/publication/beyond-regression-new-tools-prediction-and-analysis-behavioral-sciences Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences]''. Ph.D. thesis, [[Harvard University]] <ref>[https://en.wikipedia.org/wiki/Backpropagation Backpropagation from Wikipedia]</ref> <ref>[[Mathematician#PWerbos|Paul Werbos]] ('''1994'''). ''[http://eu.wiley.com/WileyCDA/WileyTitle/productCd-0471598976.html The Roots of Backpropagation. From Ordered Derivatives to Neural Networks and Political Forecasting]''. [https://en.wikipedia.org/wiki/John_Wiley_%26_Sons John Wiley & Sons]</ref>
* [[Richard Sutton]] ('''1978'''). ''Single channel theory: A neuronal theory of learning''. Brain Theory Newsletter 3, No. 3/4, pp. 72-75. [http://www.cs.ualberta.ca/%7Esutton/papers/sutton-78-BTN.pdf pdf]
==1980 ...==
* [http://www.scholarpedia.org/article/User:Kunihiko_Fukushima Kunihiko Fukushima] ('''1980'''). ''Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position''. [http://link.springer.com/journal/422 Biological Cybernetics], Vol. 36 <ref>[http://www.scholarpedia.org/article/Neocognitron Neocognitron - Scholarpedia] by [http://www.scholarpedia.org/article/User:Kunihiko_Fukushima Kunihiko Fukushima]</ref>
* [[Richard Sutton]], [[Andrew Barto]] ('''1981'''). ''Toward a modern theory of adaptive networks: Expectation and prediction''. Psychological Review, Vol. 88, pp. 135-170. [http://www.cs.ualberta.ca/%7Esutton/papers/sutton-barto-81-PsychRev.pdf pdf]
* [[Mathematician#PWerbos|Paul Werbos]] ('''1982'''). ''Applications of advances in nonlinear sensitivity analysis''. [http://link.springer.com/book/10.1007%2FBFb0006119 System Modeling and Optimization], [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer], [http://werbos.com/Neural/SensitivityIFIPSeptember1981.pdf pdf]
* [[A. Harry Klopf]] ('''1982'''). ''The Hedonistic Neuron: A Theory of Memory, Learning, and Intelligence''. Hemisphere Publishing Corporation, [[University of Michigan]]
* [[Mathematician#DHAckley|David H. Ackley]], [[Mathematician#GEHinton|Geoffrey E. Hinton]], [[Terrence J. Sejnowski]] ('''1985'''). ''A Learning Algorithm for Boltzmann Machines''. Cognitive Science, Vol. 9, No. 1, [https://web.archive.org/web/20110718022336/http://learning.cs.toronto.edu/~hinton/absps/cogscibm.pdf pdf]
* [[Mathematician#EGelenbe|Erol Gelenbe]] ('''1989'''). ''[http://cognet.mit.edu/journal/10.1162/neco.1989.1.4.502 Random Neural Networks with Negative and Positive Signals and Product Form Solution]''. [https://en.wikipedia.org/wiki/Neural_Computation_(journal) Neural Computation], Vol. 1, No. 4
==1990 ...==
* [[Mathematician#PWerbos|Paul Werbos]] ('''1990'''). ''Backpropagation Through Time: What It Does and How to Do It''. Proceedings of the [[IEEE]], Vol. 78, No. 10, [http://deeplearning.cs.cmu.edu/pdfs/Werbos.backprop.pdf pdf]
* [[Gordon Goetsch]] ('''1990'''). ''Maximization of Mutual Information in a Context Sensitive Neural Network''. Ph.D. thesis
* [[Vadim Anshelevich]] ('''1990'''). ''Neural Networks''. Review, in Multi Component Systems (in Russian)
* [[Martin Riedmiller]], [[Heinrich Braun]] ('''1993'''). ''A direct adaptive method for faster backpropagation learning: The RPROP algorithm''. [http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=1059 IEEE International Conference On Neural Networks], [http://paginas.fe.up.pt/~ee02162/dissertacao/RPROP%20paper.pdf pdf]
'''1994'''
* [[Mathematician#PWerbos|Paul Werbos]] ('''1994'''). ''[http://eu.wiley.com/WileyCDA/WileyTitle/productCd-0471598976.html The Roots of Backpropagation. From Ordered Derivatives to Neural Networks and Political Forecasting]''. [https://en.wikipedia.org/wiki/John_Wiley_%26_Sons John Wiley & Sons]
* [[David E. Moriarty]], [[Risto Miikkulainen]] ('''1994'''). ''Evolving Neural Networks to focus Minimax Search''. [[AAAI|AAAI-94]], [http://www.cs.utexas.edu/~ai-lab/pubs/moriarty.focus.pdf pdf]
* [[Eric Postma]] ('''1994'''). ''SCAN: A Neural Model of Covert Attention''. Ph.D. thesis, [[Maastricht University]], advisor [[Jaap van den Herik]]
