Quantization is a well-known method for deep neural network (DNN) compression and acceleration. In this work, we propose the Sample-Wise Dynamic Precision (SWDP) quantization scheme, which switches the bit-width of weights and activations at runtime according to the difficulty of each input sample. Running easy input images through the network at low precision reduces both computation and energy consumption. We also propose an adaptive hardware design that efficiently implements SWDP networks. Experimental results on various networks and datasets demonstrate that SWDP achieves an average 3.3× speedup and 3.0× energy saving over BitFusion, a bit-level dynamically composable architecture.
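To make the per-sample precision switching concrete, below is a minimal PyTorch sketch of the general idea. The abstract does not specify SWDP's difficulty predictor or quantizer, so everything here is an illustrative assumption: `quantize` is plain uniform symmetric fake-quantization, the 4-bit/8-bit pair, the confidence threshold of 0.9, and the "re-run low-confidence samples at high precision" policy are all hypothetical stand-ins for the paper's actual mechanism.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def quantize(x: torch.Tensor, bits: int) -> torch.Tensor:
    """Uniform symmetric fake-quantization to the given bit-width
    (an assumed quantizer, not necessarily the paper's)."""
    qmax = 2 ** (bits - 1) - 1
    scale = x.abs().max().clamp(min=1e-8) / qmax
    return torch.round(x / scale).clamp(-qmax - 1, qmax) * scale


class QuantLinear(nn.Module):
    """Linear layer whose weights and activations are quantized at a
    bit-width chosen per forward call."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x: torch.Tensor, bits: int) -> torch.Tensor:
        return F.linear(quantize(x, bits), quantize(self.weight, bits), self.bias)


class SampleWiseDynamicPrecisionNet(nn.Module):
    """Toy classifier that runs a cheap low-bit pass for every sample and
    re-runs only low-confidence ('hard') samples at high precision."""

    def __init__(self, dim=784, hidden=256, classes=10,
                 low_bits=4, high_bits=8, threshold=0.9):
        super().__init__()
        self.fc1 = QuantLinear(dim, hidden)
        self.fc2 = QuantLinear(hidden, classes)
        self.low_bits, self.high_bits = low_bits, high_bits
        self.threshold = threshold  # assumed confidence gate

    def _run(self, x: torch.Tensor, bits: int) -> torch.Tensor:
        return self.fc2(torch.relu(self.fc1(x, bits)), bits)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        logits = self._run(x, self.low_bits)         # cheap low-precision pass
        conf = logits.softmax(dim=-1).amax(dim=-1)   # per-sample confidence
        hard = conf < self.threshold                 # 'difficult' samples only
        out = logits.clone()
        if hard.any():                               # selective high-bit re-run
            out[hard] = self._run(x[hard], self.high_bits)
        return out


if __name__ == "__main__":
    net = SampleWiseDynamicPrecisionNet().eval()
    with torch.no_grad():
        print(net(torch.randn(32, 784)).shape)  # torch.Size([32, 10])
```

In this sketch the savings come from the gate: most samples finish after the 4-bit pass, and only the fraction falling below the confidence threshold pay for the 8-bit recompute. Realizing those savings in hardware requires an accelerator whose datapath can switch bit-widths at runtime, which is the role of the adaptive hardware design proposed alongside SWDP.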