Variants of combinations of additive and multiplicative updates for GRU neural networks

Published on May 2, 2018
DOI: 10.1109/SIU.2018.8404457
Ali H. Mirza
Estimated H-index: 3
(Bilkent University)
In this paper, we formulate several variants that mix additive and multiplicative updates, based on the stochastic gradient descent (SGD) and exponentiated gradient (EG) algorithms, respectively. We apply these updates to gated recurrent unit (GRU) networks and derive the corresponding gradient-based updates for the GRU parameters. We propose four update variants for the GRU network: mean, minimum, even-odd, and balanced. Through an extensive set of experiments, we demonstrate that these variants perform better than plain SGD and EG updates. Overall, the GRU-Mean update achieved the lowest cumulative and steady-state error. We also ran the same set of experiments on long short-term memory (LSTM) networks.
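To make the difference between additive and multiplicative updates concrete, here is a minimal Python sketch on a toy quadratic loss rather than a GRU. The abstract does not define the four variants, so the "mean" rule (averaging the SGD and EG results) and the "even-odd" rule (alternating SGD and EG by time step) shown here are assumptions about what those names could mean, not the paper's formulas. The minimum and balanced variants are left out because the abstract gives no basis for guessing them. All function names, the unnormalized form of EG, and the toy loss are illustrative choices.

```python
import math

def sgd_update(w, grad, lr):
    """Additive (SGD) step: w <- w - lr * grad, elementwise."""
    return [wi - lr * gi for wi, gi in zip(w, grad)]

def eg_update(w, grad, lr):
    """Multiplicative (unnormalized EG) step: w <- w * exp(-lr * grad).
    Assumes positive weights; the sign of each weight never changes."""
    return [wi * math.exp(-lr * gi) for wi, gi in zip(w, grad)]

def mean_update(w, grad, lr):
    """ASSUMED 'mean' variant: elementwise average of the SGD and EG results."""
    additive = sgd_update(w, grad, lr)
    multiplicative = eg_update(w, grad, lr)
    return [0.5 * (a + m) for a, m in zip(additive, multiplicative)]

def even_odd_update(w, grad, lr, t):
    """ASSUMED 'even-odd' variant: SGD on even time steps, EG on odd ones."""
    if t % 2 == 0:
        return sgd_update(w, grad, lr)
    return eg_update(w, grad, lr)

# Toy problem (not from the paper): minimize 0.5 * sum((w - target)^2),
# whose gradient is simply w - target.
target = [0.5, 2.0]
w = [1.0, 1.0]
for t in range(200):
    grad = [wi - ti for wi, ti in zip(w, target)]
    w = mean_update(w, grad, lr=0.1)
```

In a GRU, the same rules would be applied to each weight matrix using gradients obtained by backpropagation through time. Note that this unnormalized EG step can only rescale a weight, so weights that must change sign would need a different parameterization.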
