Proposal and evaluation of recurrent neural network training by multi-phase quantization optimizer

Hiiro Yamazaki; Itsuki Akeno; Koki Nobori; Tetsuya Asai; Kota Ando

doi:10.1587/nolta.16.30

Abstract

In recent years, edge AI, which runs artificial intelligence (AI) offline without relying on the cloud and emphasizes response time, has attracted attention. The demand for training neural networks at the edge has been actively discussed, but the memory and computation it requires exceed the resources available on edge devices. Much of that memory is allocated to the optimizer, which holds and updates the model parameters; advanced optimizers additionally store auxiliary values such as moments (past gradient information) for each parameter. This research therefore aims to reduce the memory allocated to the optimizer and realize edge AI by using Holmes, an optimizer designed for implementation at the edge. In this study, we verify the applicability of Holmes to recurrent neural networks (RNNs), a variant of neural networks commonly used for time-series data. A previous study showed that Holmes achieves sufficient accuracy on the MNIST dataset, but its use in RNNs has not been well confirmed. The difficulty in applying Holmes to RNNs is that their feedback paths make the network effectively deeper. Proceeding step by step from relatively easy verifications to fuller evaluations, we found that Holmes can potentially be applied to RNNs. We present several validations, mainly on training for function prediction and language processing, and compare Holmes with other optimizers in terms of training accuracy. Through this proof-of-concept implementation and evaluation of Holmes in RNNs, we expand the possibilities for edge training of RNNs. The study shows Holmes' potential for efficient edge AI applications, enabling resource-constrained devices to handle complex RNN tasks with accuracy comparable to traditional optimizers.

This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license.
https://creativecommons.org/licenses/by-nc-nd/4.0/
