Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
33rd (2019)
Session ID : 2H3-J-2-02

Adaptive Learning Rate Adjustment with Short-Term Pre-Training in Data-Parallel Deep Learning
*Kazuki YAMADA, Haruki MORI, Tetsuya YOUKAWA, Yuki MIYAUCHI, Shintaro IZUMI, Masahiko YOSHIMOTO, Hiroshi KAWAGUCHI
Abstract

This paper describes a short-term pre-training (STPT) algorithm that adaptively selects an optimum learning rate (LR). The proposed STPT algorithm is beneficial for quick model prototyping in data-parallel deep learning. It adaptively finds an appropriate LR from a set of candidate LRs by short-term pre-training, in which the candidate LRs are evaluated within the first few iterations of an epoch. STPT shortcuts the LR tuning process that conventional training procedures require as hyper-parameter tuning, even for unknown models. Therefore, the proposed STPT reduces computational time and increases throughput in finding the best LR for network training. The algorithm reduces the computational time by 87.5% compared with the conventional method when eight candidate LRs are evaluated on eight parallel workers. We verified an accuracy improvement of 4.8% compared with the conventional method with a reference LR of 0.1, and no accuracy deterioration was observed. The algorithm also shows better training convergence and an advantage in training time, especially for unknown models, compared with other cases such as a fixed LR.
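To illustrate the idea described in the abstract, the following is a minimal sketch of STPT-style LR selection. It assumes a PyTorch model: each candidate LR is probed for a few iterations from the same weight snapshot, and the LR giving the largest loss reduction is kept for the rest of training. The candidate set, the probe length (stpt_iters), and the sequential evaluation loop are illustrative assumptions; the paper evaluates the candidates on parallel workers (e.g., one LR per worker across eight workers, which is what yields the reported 87.5% reduction in tuning time).

import copy
import torch


def select_lr_by_stpt(model, loader, criterion, candidate_lrs, stpt_iters=5):
    """Probe each candidate LR for a few iterations and return the one
    that lowers the training loss the most, starting every probe from
    the same initial weights. (Illustrative sketch, not the authors' code.)"""
    init_state = copy.deepcopy(model.state_dict())
    best_lr, best_loss = None, float("inf")

    for lr in candidate_lrs:
        # Restart each probe from the epoch-start snapshot.
        model.load_state_dict(copy.deepcopy(init_state))
        optimizer = torch.optim.SGD(model.parameters(), lr=lr)
        running_loss = 0.0
        for i, (x, y) in enumerate(loader):
            if i >= stpt_iters:
                break
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
        if running_loss < best_loss:
            best_loss, best_lr = running_loss, lr

    # Restore the snapshot before the actual epoch begins with the chosen LR.
    model.load_state_dict(init_state)
    return best_lr

In a data-parallel setting, each worker would run one probe concurrently and the workers would then agree on the winning LR, so the short-term pre-training cost stays at a few iterations regardless of how many candidate LRs are evaluated.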

© 2019 The Japanese Society for Artificial Intelligence