A fast semi-linear backpropagation learning algorithm

Published on Sep 9, 2007
· DOI: 10.1007/978-3-540-74690-4_20
Bertha Guijarro-Berdiñas (University of A Coruña), Oscar Fontenla-Romero (University of A Coruña), Paula Fraguela (University of A Coruña)
Abstract
Ever since the first gradient-based algorithm, the brilliant backpropagation proposed by Rumelhart, a variety of new training algorithms have emerged to improve different aspects of the learning process for feed-forward neural networks. One of these aspects is learning speed. In this paper, we present a learning algorithm that combines linear least squares with gradient descent. The theoretical basis for the method is given, and its performance is illustrated by applying it to several well-known data sets and comparing it with other learning algorithms. Results show that the proposed algorithm improves the learning speed of the basic backpropagation algorithm by several orders of magnitude, while maintaining good optimization accuracy. Its performance and low computational cost make it an interesting alternative even to second-order methods, especially when dealing with large networks and training sets.
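As a rough illustration of the idea the abstract describes, the sketch below trains a toy one-hidden-layer network by solving the output weights exactly with linear least squares at each epoch while the hidden weights follow an ordinary gradient step. It is a minimal sketch of the general hybrid approach, not the authors' semi-linear algorithm; the function name, activation choice, and hyperparameters are all illustrative.

```python
import numpy as np

def train_hybrid(X, Y, n_hidden=20, epochs=100, lr=1e-3, seed=0):
    """Toy hybrid trainer: output layer by linear least squares,
    hidden layer by gradient descent. Illustrative sketch only."""
    rng = np.random.default_rng(seed)
    W1 = 0.1 * rng.standard_normal((X.shape[1], n_hidden))  # hidden-layer weights
    for _ in range(epochs):
        H = np.tanh(X @ W1)                                 # hidden activations
        # Linear least-squares step: optimal output weights for the current H
        W2, *_ = np.linalg.lstsq(H, Y, rcond=None)
        # Gradient-descent step on the hidden layer for the squared error
        E = H @ W2 - Y                                      # residual
        G = (E @ W2.T) * (1.0 - H**2)                       # backprop through tanh
        W1 -= lr * X.T @ G / len(X)
    return W1, W2
```

In this toy setting, re-solving the output layer every epoch confines the iterative search to the hidden weights, which is one way such hybrids can cut the number of training epochs.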
References (19)
#1Grégoire MontavonH-Index: 1
#2Geneviève OrrH-Index: 1
Last. Klaus-Robert MüllerH-Index: 1
The last twenty years have been marked by an increase in available data and computing power. In parallel to this trend, the focus of neural network research and the practice of training neural networks have undergone a number of important changes, for example, the use of deep learning machines. The second edition of the book augments the first edition with more tricks, which have resulted from 14 years of theory and experimentation by some of the world's most prominent neural network researchers. The...
772 Citations
#1Tao Xiong (UMN: University of Minnesota)H-Index: 10
#2Vladimir Cherkassky (UMN: University of Minnesota)H-Index: 36
This paper describes a new large margin classifier, named SVM/LDA. This classifier can be viewed as an extension of the support vector machine (SVM) by incorporating some global information about the data. The SVM/LDA classifier can also be seen as a generalization of linear discriminant analysis (LDA) by incorporating the idea of (local) margin maximization into the standard LDA formulation. We show that existing SVM software can be used to solve the SVM/LDA formulation. We also present empirical compa...
988 Citations
#1Deniz Erdogmus (OHSU: Oregon Health & Science University)H-Index: 54
#2Oscar Fontenla-Romero (University of A Coruña)H-Index: 19
Last. Enrique Castillo (UC: University of Cantabria)H-Index: 62
Training multilayer neural networks is typically carried out using descent techniques such as the gradient-based backpropagation (BP) of error or quasi-Newton approaches, including the Levenberg-Marquardt algorithm. This is basically because there are no analytical methods to find the optimal weights, so iterative local or global optimization techniques are necessary. The success of iterative optimization procedures is strictly dependent on the initial conditions; therefore, in t...
57 Citations
#1Oscar Fontenla-Romero (University of A Coruña)H-Index: 19
#2Deniz Erdogmus (UF: University of Florida)H-Index: 54
Last. Enrique Castillo (UCLM: University of Castilla–La Mancha)H-Index: 62
This paper presents two algorithms to aid the supervised learning of feedforward neural networks. Specifically, an initialization and a learning algorithm are presented. The proposed methods are based on the independent optimization of a subnetwork using linear least squares. An advantage of these methods is that the dimensionality of the effective search space for the non-linear algorithm is reduced, which decreases the number of training epochs required to find a good solu...
11 Citations
#1Okyay KaynakH-Index: 62
#2Ethem AlpaydinH-Index: 27
Last. Lei XuH-Index: 49
129 Citations
#1Enrique Castillo (UCLM: University of Castilla–La Mancha)H-Index: 62
#2Oscar Fontenla-Romero (University of A Coruña)H-Index: 19
Last. Amparo Alonso-Betanzos (University of A Coruña)H-Index: 29
The article presents a method for learning the weights in one-layer feed-forward neural networks by minimizing either the sum of squared errors or the maximum absolute error, measured in the input scale. This leads to the existence of a global optimum that can be easily obtained by solving linear systems of equations or linear programming problems, using much less computational power than standard methods. Another version of the method allows computing a large set of estima...
61 Citations
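A minimal sketch of that "error measured in the input scale" idea, assuming a tanh output unit: inverting the activation on the targets turns training into an ordinary linear least-squares problem with a global optimum obtainable from a single linear system. The function name, bias handling, and clipping constant below are illustrative, not from the article.

```python
import numpy as np

def one_layer_ls(X, Y, eps=1e-6):
    # One-layer net y = tanh(X @ W): measure the squared error before the
    # nonlinearity, i.e. solve X @ W ~ atanh(Y), which is linear in W.
    Xb = np.hstack([X, np.ones((len(X), 1))])      # append a bias column
    T = np.arctanh(np.clip(Y, -1 + eps, 1 - eps))  # invert the activation on the targets
    W, *_ = np.linalg.lstsq(Xb, T, rcond=None)     # global least-squares solution
    return W
```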
Jan 1, 1998 in NeurIPS (Neural Information Processing Systems)
#1Yann LeCunH-Index: 119
#2Léon BottouH-Index: 70
Last. Klaus-Robert MüllerH-Index: 134
1,493 Citations
#1Johan A. K. SuykensH-Index: 89
#2Joos VandewalleH-Index: 87
Preface. 1. Neural Nets and Related Model Structures for Nonlinear System Identification J. Sjoberg, L.S.H. Ngia. 2. Enhanced Multi-Stream Kalman Filter Training for Recurrent Networks L.A. Feldkamp, et al. 3. The Support Vector Method of Function Estimation V. Vapnik. 4. Parametric Density Estimation for the Classification of Acoustic Feature Vectors in Speech Recognition S. Basu, C.A. Micchelli. 5. Wavelet Based Modeling of Nonlinear Systems Yi Yu, et al. 6. Nonlinear Identification Based on F...
135 Citations
#1Yat-Fung Yam (CityU: City University of Hong Kong)H-Index: 4
#2Tommy W. S. Chow (CityU: City University of Hong Kong)H-Index: 50
Last. Chi-Tat Leung (CityU: City University of Hong Kong)H-Index: 6
An algorithm for determining the optimal initial weights of feedforward neural networks based on a linear algebraic method is developed. The optimal initial weights are evaluated by using a least squares method at each layer. With the optimal initial weights determined, the initial error is substantially smaller and therefore the number of iterations required to achieve the error criterion is reduced. For a character recognition task, the number of iterations required for the network st...
45 Citations
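The flavor of such a layer-wise least-squares initialization can be sketched as follows: draw the first layer at random, then set the output layer to its least-squares optimum so that training starts from a much smaller initial error. This is a hypothetical sketch of the idea, not the procedure from the cited paper; all names and sizes are illustrative.

```python
import numpy as np

def ls_output_init(X, Y, n_hidden=20, seed=0):
    # Random hidden layer, then output weights chosen by least squares,
    # so the initial training error is already small before any iteration.
    rng = np.random.default_rng(seed)
    W1 = 0.1 * rng.standard_normal((X.shape[1], n_hidden))
    H = np.tanh(X @ W1)                          # hidden activations
    W2, *_ = np.linalg.lstsq(H, Y, rcond=None)   # optimal output layer for this W1
    return W1, W2
```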
#1Christopher M. Bishop (Aston University)H-Index: 62
From the Publisher: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition. After introducing the basic concepts, the book examines techniques for modelling probability density functions and the properties and merits of the multi-layer perceptron and radial basis function network models. Also covered are various forms of error functions, principal algorithms for error function minimization, learning and generalization i...
18.7k Citations
Cited By (1)
The random neural network is a biologically inspired neural model in which neurons interact by probabilistically exchanging positive and negative unit-amplitude signals, and it has superior learning capabilities compared to other artificial neural networks. This paper considers nonnegative least squares supervised learning in this context, and develops an approach that achieves fast execution and excellent learning capacity. This speedup is a result of significant enhancements in the solution of the n...
2 Citations
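For reference, a nonnegative least-squares fit of the kind mentioned above can be computed with an off-the-shelf solver; SciPy's nnls is a stand-in here, not necessarily the solver used in that work, and the data below is synthetic.

```python
import numpy as np
from scipy.optimize import nnls

# Toy nonnegative least squares: minimize ||A w - b|| subject to w >= 0.
# A and b are made-up stand-ins for the model's activations and targets.
rng = np.random.default_rng(0)
A = rng.random((100, 10))
b = A @ rng.random(10)
w, resid = nnls(A, b)          # returns the solution and its residual norm
print(w.min() >= 0.0, resid)   # all fitted weights are nonnegative
```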