IEICE Electronics Express
Online ISSN : 1349-2543
ISSN-L : 1349-2543
LETTER
Sample-wise dynamic precision quantization for neural network acceleration
Bowen Li, Dongliang Xiong, Kai Huang, Xiaowen Jiang, Hao Yao, Junjian Chen, Luc Claesen
JOURNAL FREE ACCESS

2022 Volume 19 Issue 16 Pages 20220229

Abstract

Quantization is a well-known method for compressing and accelerating deep neural networks (DNNs). In this work, we propose the Sample-Wise Dynamic Precision (SWDP) quantization scheme, which switches the bit-width of weights and activations in the model at runtime according to the task difficulty of each input sample. Using low-precision networks for easy input images improves computational and energy efficiency. We also propose an adaptive hardware design for the efficient implementation of SWDP networks. Experimental results on various networks and datasets show that SWDP achieves, on average, a 3.3× speedup and a 3.0× energy saving over BitFusion, a bit-level dynamically composable architecture.
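The idea of choosing a bit-width per input sample can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how task difficulty is estimated, so this example assumes, purely for illustration, that the softmax confidence of a low-precision pass serves as a difficulty proxy. The function names, the 4-bit and 8-bit settings, the 0.8 threshold, and the toy one-layer classifier are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def quantize(values: Sequence[float], bits: int) -> List[float]:
    """Symmetric uniform quantization to `bits` bits; returns dequantized floats."""
    if bits < 2:
        raise ValueError("bits must be at least 2 for a signed symmetric grid")
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return [0.0 for _ in values]
    levels = 2 ** (bits - 1) - 1  # e.g. 7 positive levels at 4 bits
    scale = max_abs / levels
    return [round(v / scale) * scale for v in values]


def forward(weights: Sequence[Sequence[float]], x: Sequence[float], bits: int) -> List[float]:
    """Toy one-layer classifier with weights and activations quantized to `bits`."""
    q_x = quantize(x, bits)
    return [sum(w * a for w, a in zip(quantize(row, bits), q_x)) for row in weights]


def softmax_confidence(logits: Sequence[float]) -> float:
    """Largest softmax probability, computed with the usual max-shift for stability."""
    m = max(logits)
    return 1.0 / sum(math.exp(l - m) for l in logits)


def predict_dynamic(
    weights: Sequence[Sequence[float]],
    x: Sequence[float],
    low_bits: int = 4,         # assumed low-precision setting
    high_bits: int = 8,        # assumed high-precision setting
    threshold: float = 0.8,    # assumed confidence threshold for an "easy" sample
) -> Tuple[int, int]:
    """Return (predicted class, bit-width used) for one input sample.

    Assumption: a confident low-precision result marks the sample as easy;
    otherwise the sample is treated as hard and re-evaluated at high precision.
    """
    logits = forward(weights, x, low_bits)
    bits_used = low_bits
    if softmax_confidence(logits) < threshold:
        logits = forward(weights, x, high_bits)
        bits_used = high_bits
    predicted = max(range(len(logits)), key=lambda i: logits[i])
    return predicted, bits_used


weights = [[1.0, 0.0], [0.0, 1.0]]
print(predict_dynamic(weights, [4.0, 0.0]))  # clear margin between classes
print(predict_dynamic(weights, [1.0, 0.9]))  # near-tie between classes
```

In this sketch a hard sample costs a second pass, so any saving depends on how many inputs are easy. The speedup and energy figures quoted in the abstract come from the authors' hardware design and cannot be inferred from this example.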

© 2022 by The Institute of Electronics, Information and Communication Engineers