IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Online ISSN : 1745-1337
Print ISSN : 0916-8508
Quantization Strategy for Achieving Low-Cost, Accurate, Pareto-Optimal Convolutional Neural Networks Based on Analysis of Quantized Weight Parameters
Kengo NAKATA, Daisuke MIYASHITA, Asuka MAKI, Fumihiko TACHIBANA, Shinichi SASAKI, Jun DEGUCHI, Ryuichi FUJIMOTO
Advance online publication (free access)

Article ID: 2025EAP1034

Abstract

Quantization is an effective way to reduce memory and computational costs in the inference of convolutional neural networks. However, it remains unclear which model can achieve higher recognition accuracy while minimizing memory and computational costs: a large model (with a large number of parameters) quantized to an extremely low bit width (1 or 2 bits) or a small model (with a small number of parameters) quantized to a moderately low bit width (3, 4, or 5 bits). In this paper, we define a metric that combines the numbers of parameters and computations with the bit widths of quantized weight parameters. By utilizing this metric, we demonstrate that Pareto-optimal performance, where the best accuracy is attained at a given memory or computational cost, is achieved when a small model is moderately quantized, not when a large model is extremely quantized. Based on this finding, we empirically show that the Pareto frontier is improved by 4.3× in a post-training quantization scenario for a quantized ResNet-50 model using the ImageNet dataset.
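The abstract does not spell out the exact form of the proposed metric, only that it combines parameter count, computation count, and weight bit width. A minimal sketch of one plausible instantiation is given below, assuming a memory cost of parameters × weight bit width and a BOP-style compute cost of MACs × operand bit widths; the function names, the 8-bit activation assumption, and the example model sizes are illustrative, not the authors' published definition.

```python
# Illustrative cost metric combining parameter count, MAC count, and
# quantized weight bit width (hypothetical sketch, not the paper's metric).

def memory_cost_bits(num_params: int, weight_bits: int) -> int:
    """Bits needed to store the quantized weight parameters."""
    return num_params * weight_bits

def compute_cost_bops(num_macs: int, weight_bits: int, activation_bits: int = 8) -> int:
    """BOP-style compute cost: MACs weighted by the operand bit widths."""
    return num_macs * weight_bits * activation_bits

# Example comparison in the spirit of the abstract: a large model quantized
# to 2 bits versus a small model quantized to 4 bits (rough, assumed counts).
models = {
    "large, 2-bit": {"params": 25_500_000, "macs": 4_100_000_000, "bits": 2},
    "small, 4-bit": {"params": 11_700_000, "macs": 1_800_000_000, "bits": 4},
}

for name, m in models.items():
    print(name,
          "memory [bits]:", memory_cost_bits(m["params"], m["bits"]),
          "compute [BOPs]:", compute_cost_bops(m["macs"], m["bits"]))
```

Under such a metric, each (model size, bit width) pair maps to a cost-accuracy point, and the Pareto frontier is traced by the configurations with the best accuracy at each cost.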

© 2025 The Institute of Electronics, Information and Communication Engineers