Host: The Japanese Society for Artificial Intelligence
Name: 34th Annual Conference, 2020
Number: 34
Location: Online
Date: June 09, 2020 - June 12, 2020
Deep neural networks (DNNs), which have extremely large numbers of parameters, have outperformed other machine learning methods by training on enormous volumes of data. Because training a DNN requires a significant amount of computation time, large-scale parallelization is employed to reduce the training time. Large-batch training increases the batch size to reduce the number of required iterations and thereby speeds up training. However, recent research has shown that this speedup hits a limit as the batch size becomes very large. In this paper, we conduct experiments to study the relationship between the batch size and the number of required iterations as the batch size is increased up to the full batch, using LARS (Layer-wise Adaptive Rate Scaling), a commonly used method for adjusting the learning rate. Our results experimentally verify that LARS is superior to other optimization methods both in reducing the number of iterations and in generalization performance.
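For readers unfamiliar with LARS, the sketch below illustrates the core idea of the layer-wise learning rate adjustment referred to in the abstract: each layer's update is scaled by the ratio of its weight norm to its gradient norm, so that no single layer dominates or stalls when the global learning rate is raised for large batches. This is a minimal NumPy illustration; the function name, hyperparameter values, and plain-SGD update form are assumptions for the example, not the experimental setup used in the paper.

```python
import numpy as np

def lars_step(weights, grads, global_lr=0.1, trust_coef=0.001, weight_decay=5e-4):
    """One illustrative LARS update over a list of per-layer parameters.

    Each layer gets a local learning rate proportional to
    ||w|| / (||g|| + weight_decay * ||w||), the "trust ratio".
    """
    updated = []
    for w, g in zip(weights, grads):
        w_norm = np.linalg.norm(w)
        g_norm = np.linalg.norm(g)
        # Layer-wise trust ratio; fall back to 1.0 when a norm is zero.
        if w_norm > 0 and g_norm > 0:
            local_lr = trust_coef * w_norm / (g_norm + weight_decay * w_norm)
        else:
            local_lr = 1.0
        # Scaled SGD update with weight decay applied to the gradient.
        updated.append(w - global_lr * local_lr * (g + weight_decay * w))
    return updated

# Toy usage: two "layers" with random weights and gradients.
rng = np.random.default_rng(0)
ws = [rng.normal(size=(4, 4)), rng.normal(size=(4,))]
gs = [rng.normal(size=(4, 4)), rng.normal(size=(4,))]
ws = lars_step(ws, gs)
```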