Designing specialized accelerators for mixed-precision and sparse convolutional neural networks (CNNs) is an effective way to improve computational efficiency. However, few accelerators support mixed bitwidth and sparsity at the same time, and most support only layer-wise mixed bitwidth or element-wise sparsity, which makes limited use of compressed CNNs. To fully exploit the benefits of model compression, it is important to design accelerators that support fine-grained mixed-bitwidth and bit-wise sparse computation. Therefore, this brief first proposes a hardware-efficient, precision-scalable sparse processing element (PE) that supports mixed-bitwidth multiply-and-accumulate (MAC) operations and bit-wise zero-skipping. Second, a fine-grained convolution acceleration method is proposed, which quantifies and encodes the valid bits of each weight group to exploit the bit-wise sparsity of both the high and low bits. Finally, the resulting fine-grained precision-scalable sparse accelerator is implemented on a Xilinx ZC706 FPGA, achieving better accuracy and performance. When tested on VGG16, the proposed accelerator improves DSP efficiency by 1.77× to 3.84× compared with state-of-the-art accelerators.
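As a conceptual illustration only (not the authors' PE design or RTL), the bit-wise zero-skipping MAC idea can be sketched in software: each weight is decomposed into the positions of its nonzero bits, and the accumulation iterates only over those positions, so the work scales with the number of valid bits rather than the nominal bitwidth. The helper names `encode_valid_bits` and `sparse_mac` below are hypothetical and introduced purely for this sketch.

```python
# Minimal software sketch of bit-wise sparse zero-skipping MAC.
# Hypothetical illustration; the actual PE is a hardware design and
# these function names do not come from the paper.

def encode_valid_bits(weight: int, bitwidth: int):
    """Return the sign and the positions of nonzero magnitude bits of a weight."""
    sign = -1 if weight < 0 else 1
    magnitude = abs(weight)
    positions = [b for b in range(bitwidth) if (magnitude >> b) & 1]
    return sign, positions


def sparse_mac(activations, weights, bitwidth=8):
    """Accumulate sum(a * w) by shifting activations only at valid bit positions.

    A zero bit in a weight costs no work, which is the essence of
    bit-wise zero-skipping; changing `bitwidth` models mixed precision.
    """
    acc = 0
    for a, w in zip(activations, weights):
        sign, positions = encode_valid_bits(w, bitwidth)
        for b in positions:                # skip all zero bits of the weight
            acc += sign * (a << b)         # shift-and-add replaces a full multiply
    return acc


# Example: 4-bit weights with many zero bits need few shift-add steps,
# and a zero weight contributes no work at all.
acts = [3, -2, 5, 7]
wts = [4, 0, -6, 1]
assert sparse_mac(acts, wts, bitwidth=4) == sum(a * w for a, w in zip(acts, wts))
```

In hardware, the same principle lets the PE spend cycles (or bit-serial slices) only on valid bits, which is how bit-wise sparsity in both the high and low bits of grouped weights translates into higher effective DSP utilization.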