Nonlinear Theory and Its Applications, IEICE
Online ISSN : 2185-4106
ISSN-L : 2185-4106
Special Issue on Recent Progress in Nonlinear Theory and Its Applications
aSNAQ: An adaptive stochastic Nesterov's accelerated quasi-Newton method for training RNNs
Indrapriyadarsini Sendilkkumaar, Shahrzad Mahboubi, Hiroshi Ninomiya, Hideki Asai
2020, Volume 11, Issue 4, Pages 409-421

Abstract

Recurrent Neural Networks (RNNs) are powerful sequence models that are particularly difficult to train. This paper proposes an adaptive stochastic Nesterov's accelerated quasi-Newton (aSNAQ) method for training RNNs. Several algorithms have previously been proposed for training RNNs; however, due to its high computational cost, very few methods use second-order curvature information despite its ability to improve convergence. The proposed method is an accelerated second-order method that incorporates curvature information while maintaining a low per-iteration cost. Furthermore, direction normalization is introduced to address the vanishing and/or exploding gradient problems that are prominent in training RNNs. The performance of the proposed method is evaluated in TensorFlow on benchmark sequence-modeling problems. The results show that the proposed aSNAQ method is effective in training RNNs with a low per-iteration cost and improved performance compared to the second-order adaQN and the first-order Adagrad and Adam methods.
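The abstract cites direction normalization as the remedy for vanishing and exploding gradients. The sketch below illustrates the general idea only: rescaling the search direction to unit norm so the step size is governed entirely by the learning rate, which bounds an exploding direction and rescales a vanishing one. The function name, NumPy implementation, and hyperparameters are assumptions for illustration; the paper's exact formulation applies normalization to the quasi-Newton update direction within aSNAQ.

import numpy as np

def normalized_step(params, direction, lr=0.01, eps=1e-8):
    """Illustrative parameter update with direction normalization.

    The search direction is rescaled to unit norm, so the magnitude
    of the update is controlled by the learning rate alone. This is
    a generic sketch of the idea, not the authors' implementation.
    """
    norm = np.linalg.norm(direction)
    direction = direction / (norm + eps)  # unit-norm search direction
    return params - lr * direction

# Toy usage on the quadratic loss f(w) = 0.5 * ||w||^2 (gradient = w):
# the two components differ by twelve orders of magnitude, yet each
# update moves the parameters by exactly lr in the descent direction.
w = np.array([1e6, 1e-6])
for _ in range(5):
    w = normalized_step(w, direction=w, lr=0.1)
print(w)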

© 2020 The Institute of Electronics, Information and Communication Engineers