Neural networks have a rich ability to learn complex representations and have achieved remarkable results on various tasks. However, they are prone to overfitting when the number of training samples is limited, so regularizing their learning process is essential. In this paper, we propose a regularization method that estimates the parameters of a large convolutional neural network as probability distributions using a hypernetwork, a network that generates the parameters of another network. Additionally, we perform model averaging to further improve performance. We apply the proposed method to large models such as wide residual networks. The experimental results demonstrate that both our method and its model-averaged variant outperform the commonly used maximum a posteriori estimation with L2 regularization.
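To make the mechanism concrete, the following is a minimal PyTorch-style sketch of the general idea, not the paper's implementation: a hypernetwork maps a learned per-layer embedding to a mean and log-variance for a convolution's weights, weights are sampled via the reparameterization trick during training, and model averaging is approximated by averaging softmax outputs over several weight samples. All names here (HyperConv2d, averaged_predict, the embedding size z_dim) are illustrative assumptions.

    # Sketch: hypernetwork-generated weight distributions plus model averaging.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class HyperConv2d(nn.Module):
        """Conv layer whose weight distribution is produced by a hypernetwork."""
        def __init__(self, in_ch, out_ch, k=3, z_dim=16, hidden=64):
            super().__init__()
            self.shape = (out_ch, in_ch, k, k)
            n_w = out_ch * in_ch * k * k
            self.z = nn.Parameter(torch.randn(z_dim))   # learned layer embedding
            self.hyper = nn.Sequential(                 # hypernetwork: z -> (mu, log_var)
                nn.Linear(z_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, 2 * n_w),
            )

        def forward(self, x):
            mu, log_var = self.hyper(self.z).chunk(2, dim=-1)
            if self.training:
                # Reparameterization trick: sample w ~ N(mu, sigma^2).
                w = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)
            else:
                w = mu                                  # deterministic mean weight
            return F.conv2d(x, w.view(self.shape), padding=1)

    def averaged_predict(model, x, n_samples=8):
        """Model averaging: mean of softmax outputs over sampled weights.

        Note: model.train() keeps weight sampling active; in a real model,
        layers such as batch norm would need separate handling.
        """
        model.train()
        with torch.no_grad():
            probs = torch.stack(
                [F.softmax(model(x), dim=-1) for _ in range(n_samples)]
            )
        return probs.mean(dim=0)

In this sketch, the hypernetwork's output parameterizes a Gaussian over the convolution weights, so each forward pass during training uses a fresh weight sample, and averaging the predictions of several samples plays the role of model averaging described above.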