Designing specialized accelerators for mixed-precision and sparse convolutional neural networks (CNNs) is an effective way to improve computational efficiency. However, few accelerators support mixed bitwidth and sparsity at the same time, and most support only layer-wise mixed bitwidth or element-wise sparsity, which makes limited use of compressed CNNs. To fully exploit the benefits of model compression, it is important to design accelerators that support fine-grained mixed-bitwidth and bit-wise sparse computation. Therefore, this brief first proposes a hardware-efficient, precision-scalable sparse processing element (PE) that supports mixed-bitwidth multiply-and-accumulate (MAC) operations and bit-wise zero-skipping. Second, a fine-grained convolution acceleration method is proposed, which quantifies and encodes the valid bits of each weight group to exploit the bit-wise sparsity of both the high and low bits. Finally, the resulting fine-grained precision-scalable sparse accelerator is implemented on a Xilinx ZC706 FPGA, achieving better accuracy and performance. When tested on VGG16, the proposed accelerator improves DSP efficiency by 1.77× to 3.84× compared with state-of-the-art accelerators.
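As a conceptual illustration only (not the authors' PE design or RTL), the bit-wise zero-skipping MAC idea can be sketched in software: each weight is decomposed into the positions of its nonzero bits, and the accumulation iterates only over those positions, so the work scales with the number of valid bits rather than the nominal bitwidth. The helper names `encode_valid_bits` and `sparse_mac` below are hypothetical and introduced purely for this sketch.

```python
# Minimal software sketch of bit-wise sparse zero-skipping MAC.
# Hypothetical illustration; the actual PE is a hardware design and
# these function names do not come from the paper.

def encode_valid_bits(weight: int, bitwidth: int):
    """Return the sign and the positions of nonzero magnitude bits of a weight."""
    sign = -1 if weight < 0 else 1
    magnitude = abs(weight)
    positions = [b for b in range(bitwidth) if (magnitude >> b) & 1]
    return sign, positions


def sparse_mac(activations, weights, bitwidth=8):
    """Accumulate sum(a * w) by shifting activations only at valid bit positions.

    A zero bit in a weight costs no work, which is the essence of
    bit-wise zero-skipping; changing `bitwidth` models mixed precision.
    """
    acc = 0
    for a, w in zip(activations, weights):
        sign, positions = encode_valid_bits(w, bitwidth)
        for b in positions:                # skip all zero bits of the weight
            acc += sign * (a << b)         # shift-and-add replaces a full multiply
    return acc


# Example: 4-bit weights with many zero bits need few shift-add steps,
# and a zero weight contributes no work at all.
acts = [3, -2, 5, 7]
wts = [4, 0, -6, 1]
assert sparse_mac(acts, wts, bitwidth=4) == sum(a * w for a, w in zip(acts, wts))
```

In hardware, the same principle lets the PE spend cycles (or bit-serial slices) only on valid bits, which is how bit-wise sparsity in both the high and low bits of grouped weights translates into higher effective DSP utilization.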