IEICE Electronics Express
Online ISSN : 1349-2543
ISSN-L : 1349-2543

A formally published version of this article is available. Please refer to, and cite, the published version.

A Precision-Scalable Sparse CNN Accelerator with Fine-Grained Mixed Bitwidth Configurability
Rongfeng Li, Xueming Li, Chaoming Yang, Xianghong Hu, Yuanmiao Lin, Shansen Fu, Hongmin Huang, Shuting Cai, Xiaoming Xiong
Journal Free Access, Advance online publication

Article ID: 22.20240601

Abstract

Designing specialized accelerators for mixed-precision and sparse convolutional neural networks (CNNs) is an effective way to improve computational efficiency. However, few accelerators support mixed bitwidth and sparsity at the same time, and most of them support only layer-wise mixed bitwidth or element-wise sparsity, which makes limited use of compressed CNNs. To fully exploit the benefits of model compression, it is therefore important to design accelerators that support fine-grained mixed-bitwidth and bit-wise sparse computation. This brief first proposes a hardware-efficient, precision-scalable sparse processing element (PE) that supports mixed-bitwidth multiply-and-accumulate (MAC) operations and bit-wise sparse zero-skipping. Second, a fine-grained convolution acceleration method is proposed, which quantifies and encodes the valid bits of each weight group to exploit the bit-wise sparsity of both the high and low bits. Finally, the fine-grained precision-scalable sparse accelerator is implemented on a Xilinx ZC706 FPGA device, achieving better accuracy and performance. When tested on VGG16, the proposed accelerator improves DSP efficiency by 1.77× to 3.84× compared with state-of-the-art accelerators.
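
The abstract gives only a high-level description of the PE, so the following is a minimal behavioral sketch, not the authors' hardware design: it models a bit-serial MAC over a weight group in which all-zero weight bit planes are encoded away and skipped, which is one way to realize bit-wise zero-skipping with precision scalability. The function names, group size, and bitwidths are illustrative assumptions.

# Illustrative behavioral model only; group size, bitwidths, and the
# encode_valid_bitplanes/sparse_bit_serial_mac names are assumptions,
# not the paper's architecture.
import numpy as np

def encode_valid_bitplanes(weights, bitwidth):
    """Return (plane_index, bit_vector) pairs for bit planes of the weight
    group that contain at least one set bit; all-zero planes are skipped."""
    planes = []
    for b in range(bitwidth):
        bits = (weights >> b) & 1          # extract bit-plane b of the group
        if bits.any():                      # bit-wise sparsity: drop zero planes
            planes.append((b, bits))
    return planes

def sparse_bit_serial_mac(activations, weights, bitwidth):
    """Multiply-accumulate one weight group with its activations,
    processing only the encoded (non-zero) weight bit planes."""
    acc = 0
    for b, bits in encode_valid_bitplanes(weights, bitwidth):
        # Partial product of plane b: sum of activations whose weight bit is 1,
        # shifted by the plane position. Lower bitwidths simply yield fewer
        # planes, so the same loop serves mixed-precision weights.
        acc += int(np.dot(activations, bits)) << b
    return acc

# Usage: an 8-bit weight group and a 4-bit group share the same datapath model.
acts = np.array([3, 1, 4, 1, 5, 9, 2, 6])
w8   = np.array([0, 17, 0, 0, 64, 0, 0, 5])   # sparse 8-bit weights
w4   = np.array([0, 3, 0, 0, 2, 0, 0, 1])     # sparse 4-bit weights
print(sparse_bit_serial_mac(acts, w8, 8), int(np.dot(acts, w8)))  # 367 367
print(sparse_bit_serial_mac(acts, w4, 4), int(np.dot(acts, w4)))  # 19 19

In this sketch, skipping a bit plane removes an entire group-wide partial-product step, which is the kind of saving that element-wise zero-skipping alone cannot capture.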

© 2025 by The Institute of Electronics, Information and Communication Engineers