An Efficient Elman Neural Networks Based on Improved Conjugate Gradient Method with Generalized Armijo Search

Published on Aug 15, 2018 in ICIC (International Conference on Intelligent Computing)
· DOI: 10.1007/978-3-319-95930-6_1
Mingyue Zhu (China University of Petroleum), Estimated H-index: 1
Tao Gao (China University of Petroleum), Estimated H-index: 3
+ 2 Authors
Jian Wang (China University of Petroleum), Estimated H-index: 15
Abstract
The Elman neural network is a typical class of recurrent network models. Gradient descent is the most common strategy for training Elman networks; however, it is inefficient owing to its linear convergence rate. In this paper, based on the generalized Armijo search technique, we propose a novel conjugate gradient method that speeds up convergence when training Elman networks. The algorithm introduces a conjugate gradient coefficient that constructs a conjugate gradient direction with the sufficient descent property. Numerical results demonstrate that this method is more stable and efficient than existing training methods. In addition, simulations show that the error function decreases monotonically and that the gradient norm of the corresponding function tends to zero.
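As a sketch of the kind of training step the abstract describes, the Python code below combines a nonlinear conjugate gradient direction with a backtracking Armijo line search. The coefficient used here is the standard PRP+ formula and the quadratic objective stands in for an Elman network's error function; neither is the specific coefficient or model proposed in the paper.

```python
import numpy as np

def armijo_step(f, grad_f, w, d, alpha0=1.0, rho=0.5, c=1e-4):
    """Backtrack until f(w + alpha*d) <= f(w) + c*alpha*<grad, d> (Armijo)."""
    g = grad_f(w)
    alpha = alpha0
    while f(w + alpha * d) > f(w) + c * alpha * (g @ d):
        alpha *= rho
    return alpha

def cg_train(f, grad_f, w, iters=200):
    """Nonlinear CG with PRP+ coefficient and Armijo backtracking."""
    g = grad_f(w)
    d = -g                                   # first direction: steepest descent
    for _ in range(iters):
        alpha = armijo_step(f, grad_f, w, d)
        w = w + alpha * d
        g_new = grad_f(w)
        if np.linalg.norm(g_new) < 1e-10:    # gradient norm tends to zero
            break
        # PRP+ conjugate gradient coefficient (clipped at zero)
        beta = max(0.0, g_new @ (g_new - g) / (g @ g))
        d = -g_new + beta * d
        if g_new @ d >= 0:                   # safeguard: keep a descent direction
            d = -g_new
        g = g_new
    return w

# Toy convex stand-in for the network's error function; minimum at (0.2, 0.4).
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
f = lambda w: 0.5 * w @ A @ w - b @ w
grad_f = lambda w: A @ w - b

w_star = cg_train(f, grad_f, np.zeros(2))
```

The descent-direction safeguard mirrors the "sufficient descent" requirement mentioned in the abstract: whenever the conjugate direction fails to be a descent direction, the method restarts from steepest descent.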
References (12)
#1 Jian Wang (China University of Petroleum), H-Index: 15
#2 Bingjie Zhang (China University of Petroleum), H-Index: 4
Last: Qingying Sun (China University of Petroleum), H-Index: 1
Abstract: In this paper, a novel multilayer backpropagation (BP) neural network model is proposed based on the conjugate gradient (CG) method with generalized Armijo search. The presented algorithm requires little memory and converges quickly in practical applications. One reason is that the constructed conjugate direction guarantees sufficient descent behavior in minimizing the given objective function. The other stems from the fact that the generalized Armijo method can automatically ...
#1 Shih Yin Ooi (MMU: Multimedia University), H-Index: 5
#2 Shing Chiang Tan (MMU: Multimedia University), H-Index: 15
Last: Wooi Ping Cheah (MMU: Multimedia University), H-Index: 7
The Elman network is an extension of the multilayer perceptron (MLP): it introduces a single-hidden-layer architecture together with an additional state table that stores the time-delayed activations of the hidden neurons. This additional state table enables sequential prediction, which is not possible in an MLP. To examine its general performance as a temporal classifier, a Weka implementation of the Elman network is evaluated on 11 public temporal datasets released by the UCI Machine Learning Repository.
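The context ("state table") mechanism described here can be sketched in a few lines of Python; all sizes and weight names below are illustrative, not taken from the reference:

```python
import numpy as np

# Elman network forward pass: the previous hidden activations are copied
# into context units and fed back as extra inputs at the next time step.
rng = np.random.default_rng(0)

n_in, n_hidden, n_out = 3, 5, 2
W_in  = rng.normal(scale=0.1, size=(n_hidden, n_in))      # input  -> hidden
W_ctx = rng.normal(scale=0.1, size=(n_hidden, n_hidden))  # context -> hidden
W_out = rng.normal(scale=0.1, size=(n_out, n_hidden))     # hidden -> output

def elman_forward(x_seq):
    """Run a sequence through the network; the context starts at zero."""
    context = np.zeros(n_hidden)
    outputs = []
    for x in x_seq:
        hidden = np.tanh(W_in @ x + W_ctx @ context)
        outputs.append(W_out @ hidden)
        context = hidden        # the "state table": a copy of the hidden units
    return np.array(outputs)

ys = elman_forward(rng.normal(size=(4, n_in)))  # 4 time steps of 3-dim input
```

Because the context is a plain copy of the previous hidden state, the output at each step depends on the whole input history, which is what allows sequential prediction.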
#1 Mohd Rivaie, H-Index: 8
#2 Muhammad Fauzi, H-Index: 4
Last: Ismail Mohd, H-Index: 4
Conjugate gradient (CG) methods play a significant role in solving unconstrained optimization problems. This paper presents a new modification of the conjugate gradient coefficient (β_k) that has global convergence properties; the global convergence result is established using exact line searches. Numerical results show that the proposed formula is superior to other CG coefficients and further suggest that the method possesses global convergence properties.
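For context, new coefficients such as this one are usually benchmarked against the classical choices below; this is a standard textbook summary, not the specific formula proposed in the reference:

```latex
% Nonlinear CG search direction:
%   d_{k+1} = -g_{k+1} + \beta_k d_k, \qquad d_0 = -g_0
% Two classical coefficients (Fletcher--Reeves and Polak--Ribiere--Polyak):
\beta_k^{FR}  = \frac{\|g_{k+1}\|^{2}}{\|g_k\|^{2}},
\qquad
\beta_k^{PRP} = \frac{g_{k+1}^{\top}\,(g_{k+1} - g_k)}{\|g_k\|^{2}}
```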
#1 Jian Wang (DUT: Dalian University of Technology), H-Index: 15
#2 Wei Wu (DUT: Dalian University of Technology), H-Index: 67
Last: Jacek M. Zurada (University of Louisville), H-Index: 42
Conjugate gradient methods have many advantages in real numerical experiments, such as fast convergence and low memory requirements. This paper considers a class of conjugate gradient learning methods for backpropagation neural networks with three layers. We propose a new learning algorithm for almost-cyclic learning of neural networks based on the PRP conjugate gradient method. We then establish deterministic convergence properties for three different learning modes, i.e., batch mode, cyclic an...
In this study, Elman recurrent neural networks trained with the conjugate gradient algorithm are used to determine the depth of anesthesia during the continuation stage of anesthesia and to estimate the amount of medicine to be applied at that moment. Feed-forward neural networks are also used for comparison. The conjugate gradient algorithm is compared with backpropagation (BP) for training the neural networks. The applied artificial neural network is composed of three layers,...
#1 Wei Wu (DUT: Dalian University of Technology), H-Index: 15
#2 Dongpo Xu (DUT: Dalian University of Technology), H-Index: 10
Last: Zhengxue Li (DUT: Dalian University of Technology), H-Index: 7
The gradient method for training Elman networks with a finite training sample set is considered. Monotonicity of the error function during the iteration is shown. Weak and strong convergence results are proved, indicating that the gradient of the error function goes to zero and that the weight sequence converges to a fixed point, respectively. A numerical example is given to support the theoretical findings.
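The monotonicity result can be illustrated numerically. In the hedged sketch below, a small quadratic stands in for the Elman network's error function (it is not the paper's setup), and plain gradient descent with a small fixed learning rate produces a non-increasing error sequence:

```python
import numpy as np

# Illustration of monotone error decrease under gradient descent with a
# sufficiently small learning rate, on a toy quadratic "error function".
A = np.array([[2.0, 0.5], [0.5, 1.0]])
E = lambda w: 0.5 * w @ A @ w              # stand-in error function
grad_E = lambda w: A @ w

w = np.array([1.0, -1.0])
eta = 0.1                                   # small, fixed learning rate
errors = [E(w)]
for _ in range(100):
    w = w - eta * grad_E(w)
    errors.append(E(w))

# The error sequence never increases, matching the monotonicity theorem.
monotone = all(e1 >= e2 for e1, e2 in zip(errors, errors[1:]))
```

Choosing eta below the reciprocal of the largest eigenvalue of A guarantees the decrease; with a larger step the sequence can oscillate, which is exactly the kind of condition such convergence theorems impose.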
#1 Dongpo Xu, H-Index: 10
#2 Zhengxue Li, H-Index: 7
Last: Wei Wu (DUT: Dalian University of Technology), H-Index: 67
An approximated gradient method for training Elman networks is considered. For a finite sample set, the error function is proved to be monotone in the training process, and the approximated gradient of the error function tends to zero if the weight sequence is bounded. Furthermore, under an additional moderate condition, the weight sequence itself is also proved to be convergent. A numerical example is given to support the theoretical findings.
#1 A. Asuncion, H-Index: 1
#1 DeLiang Wang (OSU: Ohio State University), H-Index: 87
#2 Xiaomei Liu (OSU: Ohio State University), H-Index: 1
Last: Stanley C. Ahalt (OSU: Ohio State University), H-Index: 21
Abstract: Simple recurrent networks (Elman networks) have been widely used in temporal processing applications. In this study, we investigate the temporal generalization of simple recurrent networks, drawing comparisons between network capabilities and human performance. Elman networks are trained to generate temporal trajectories sampled at different rates. The networks are then tested with trajectories at the trained rates and at other sampling rates, including trajectories representing mixtures of dif...
#1 Jeffrey L. Elman (UCSD: University of California, San Diego), H-Index: 53
Time underlies many interesting human behaviors. Thus, the question of how to represent time in connectionist models is very important. One approach is to represent time implicitly by its effects on processing rather than explicitly (as in a spatial representation). The current report develops a proposal along these lines first described by Jordan (1986) which involves the use of recurrent links in order to provide networks with a dynamic memory. In this approach, hidden unit patterns are fed ba...
Cited By (1)
#1 Tao Gao (China University of Petroleum), H-Index: 3
#2 Xiaoling Gong (China University of Petroleum), H-Index: 2
Last: Jacek M. Zurada (ITI: Information Technology Institute), H-Index: 42
Abstract: The Elman network is a classical recurrent neural network with an internal delay feedback. In this paper, we propose a recalling-enhanced recurrent neural network (RERNN) which has a selective memory property. In addition, an improved conjugate gradient algorithm with the generalized Armijo search technique, which speeds up the convergence rate, is used to train the RERNN model. Further performance enhancement is achieved with adaptive learning coefficients. Finally, we prove weak and strong convergence of...