IEICE Transactions on Electronics
Online ISSN : 1745-1353
Print ISSN : 0916-8524
High-Reuse Quantized Bit-Serial Convolutional Neural Network Processing Element Arrays with Weight Ring Dataflow
Xiaoshu CHENG, Yiwen WANG, Hongfei LOU, Weiran DING, Ping LI
JOURNAL FREE ACCESS Advance online publication

Article ID: 2024LHP0001

Abstract

With the growth of deep learning and machine learning applications, efficient processing element arrays (PEAs) have become increasingly important. To address this need, this paper introduces a quantized bit-serial PEA that improves data reuse by integrating a weight ring (WR) dataflow mechanism and raises the operating frequency through the use of bit-serial circuits. This design substantially reduces the number of feature map accesses, thereby improving data processing efficiency. A key aspect of our approach is quantization: by converting floating-point values to signed 8-bit fixed-point numbers, we reduce computational complexity and ease memory bandwidth pressure. We also briefly discuss how omitting bias terms may leave model inference accuracy unaffected when the neural network type and dataset are chosen appropriately. Our proposed WR dataflow, inspired by the weight stationary (WS) dataflow, replaces only the outdated row with a new row. This both raises the data reuse rate and reduces costly data access operations. Notably, the 3×3 WR PEA requires only 38.54% of the off-chip accesses per second of the 3×3 WS PEA and merely 11.25% of those of its no local reuse (NLR) PEA counterpart. Empirical results show an excellent trade-off between area, power, and speed while maintaining robust data reuse efficiency. By combining quantization and the WR dataflow, our high-reuse, quantized bit-serial PEA offers a fresh perspective on deep learning hardware design.
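
The quantization and bit-serial arithmetic described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' circuit: the function names, the choice of 6 fractional bits, and the saturating rounding scheme are assumptions made purely for illustration. The sketch converts a floating-point value to a signed 8-bit fixed-point integer and then multiplies by consuming one weight bit per step, which is the general idea behind a bit-serial multiplier.

```python
def quantize_q8(x: float, frac_bits: int = 6) -> int:
    """Map x to a signed 8-bit fixed-point integer.

    frac_bits (assumed, not taken from the paper) sets how many of the
    8 bits sit to the right of the binary point.
    """
    scaled = round(x * (1 << frac_bits))   # Python rounds halves to even
    return max(-128, min(127, scaled))     # saturate to the int8 range


def bit_serial_mul(a: int, w: int, bits: int = 8) -> int:
    """Multiply a by a signed `bits`-wide weight w, one weight bit per step."""
    w_pattern = w & ((1 << bits) - 1)      # two's-complement bit pattern of w
    acc = 0
    for i in range(bits):
        if (w_pattern >> i) & 1:
            partial = a << i               # shifted copy of the activation
            # In two's complement the most significant bit has negative weight.
            acc += -partial if i == bits - 1 else partial
    return acc


def dequantize(q: int, frac_bits: int) -> float:
    """Convert a fixed-point integer back to a float for inspection."""
    return q / (1 << frac_bits)


# Hypothetical activation and weight values, chosen only for the demo.
a_q = quantize_q8(0.75)
w_q = quantize_q8(-1.5)
product = bit_serial_mul(a_q, w_q)
# The product of two values with 6 fractional bits carries 12 fractional bits.
approx = dequantize(product, 12)
```

The loop trades latency for hardware simplicity: each step needs only a shift and an add, which is one common argument for why bit-serial datapaths can run at a higher clock frequency than bit-parallel multipliers.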

© 2024 The Institute of Electronics, Information and Communication Engineers