Neural networks have a rich ability to learn complex representations and have achieved remarkable results on various tasks. However, they are prone to overfitting when the number of training samples is limited, so regularizing their learning process is essential. In this paper, we propose a regularization method that estimates the parameters of a large convolutional neural network as probability distributions using a hypernetwork, a network that generates the parameters of another network. Additionally, we perform model averaging to further improve performance. We apply the proposed method to large models such as wide residual networks. The experimental results demonstrate that both our method and its model-averaged variant outperform the commonly used maximum a posteriori estimation with L2 regularization.
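To make the mechanism concrete, the following is a minimal PyTorch-style sketch of the general idea, not the paper's implementation: a hypernetwork maps a learned per-layer embedding to a mean and log-variance for a convolution's weights, weights are sampled via the reparameterization trick during training, and model averaging is approximated by averaging softmax outputs over several weight samples. All names here (HyperConv2d, averaged_predict, the embedding size z_dim) are illustrative assumptions.

    # Sketch: hypernetwork-generated weight distributions plus model averaging.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class HyperConv2d(nn.Module):
        """Conv layer whose weight distribution is produced by a hypernetwork."""
        def __init__(self, in_ch, out_ch, k=3, z_dim=16, hidden=64):
            super().__init__()
            self.shape = (out_ch, in_ch, k, k)
            n_w = out_ch * in_ch * k * k
            self.z = nn.Parameter(torch.randn(z_dim))   # learned layer embedding
            self.hyper = nn.Sequential(                 # hypernetwork: z -> (mu, log_var)
                nn.Linear(z_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, 2 * n_w),
            )

        def forward(self, x):
            mu, log_var = self.hyper(self.z).chunk(2, dim=-1)
            if self.training:
                # Reparameterization trick: sample w ~ N(mu, sigma^2).
                w = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)
            else:
                w = mu                                  # deterministic mean weight
            return F.conv2d(x, w.view(self.shape), padding=1)

    def averaged_predict(model, x, n_samples=8):
        """Model averaging: mean of softmax outputs over sampled weights.

        Note: model.train() keeps weight sampling active; in a real model,
        layers such as batch norm would need separate handling.
        """
        model.train()
        with torch.no_grad():
            probs = torch.stack(
                [F.softmax(model(x), dim=-1) for _ in range(n_samples)]
            )
        return probs.mean(dim=0)

In this sketch, the hypernetwork's output parameterizes a Gaussian over the convolution weights, so each forward pass during training uses a fresh weight sample, and averaging the predictions of several samples plays the role of model averaging described above.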