IEICE Electronics Express
Online ISSN : 1349-2543
ISSN-L : 1349-2543
BitFleX: Exploiting Extreme Bit-Level Sparsity via BTD Encoding and Dynamic Pruning
Chendong Xia, Huidong Zhao, Qiang Li, Shushan Qiao
JOURNAL FREE ACCESS Advance online publication

Article ID: 22.20250409

Abstract

Deep neural networks (DNNs) have achieved remarkable success in critical domains such as computer vision. However, their substantial model scale and computational demands hinder deployment on resource-constrained edge devices. Bit-serial accelerators (BSAs) leverage the significant bit-level sparsity (BLS) in weights and activations to accelerate inference. Yet unstructured BLS causes hardware inefficiency, and existing static pruning methods cannot adapt to real-time activations. To address these challenges, we propose BitFleX, a BSA that enables runtime semi-structured pruning of both weights and activations. Specifically, we introduce Bit-Term Decomposition (BTD) encoding to enhance inherent BLS and reduce pruning complexity. Additionally, a pruning-error predictor dynamically selects operands for sparsification with minimal error. Experiments show that BitFleX achieves 87.5% BLS in ViT-B with <1% Top-1 accuracy loss on ImageNet, yielding a 5.86× speedup over the baseline and a peak energy efficiency of 23.61 TOPS/W.
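
As a rough illustration of the general idea behind bit-level sparsity and bit-serial computation, the sketch below counts zero bits in a set of integer weights and computes a dot product one weight bit at a time, skipping zero bits. It assumes 8-bit sign-magnitude weights, which is a choice made here for illustration only. It is not the BTD encoding, the pruning-error predictor, or any other part of BitFleX, whose details the abstract does not give. The function names `bit_level_sparsity` and `bit_serial_dot` are hypothetical.

```python
def bit_level_sparsity(values, bits=8):
    """Fraction of zero bits when `values` are stored in sign-magnitude form.

    Assumption for illustration: 1 sign bit plus (bits - 1) magnitude bits.
    """
    if not values:
        raise ValueError("values must be non-empty")
    limit = (1 << (bits - 1)) - 1
    ones = 0
    for v in values:
        if abs(v) > limit:
            raise ValueError(f"{v} does not fit in {bits}-bit sign-magnitude")
        sign_bit = 1 if v < 0 else 0
        ones += sign_bit + bin(abs(v)).count("1")
    return 1.0 - ones / (len(values) * bits)


def bit_serial_dot(weights, activations, bits=8):
    """Dot product that walks over weight magnitude bits and skips zero bits.

    Returns the result and the number of effectual (nonzero) weight bits,
    which is the amount of shift-and-add work actually performed.
    """
    acc = 0
    effectual_bits = 0
    for w, a in zip(weights, activations):
        sign = -1 if w < 0 else 1
        magnitude = abs(w)
        for b in range(bits - 1):
            if (magnitude >> b) & 1:
                # Each nonzero weight bit contributes a shifted activation.
                acc += sign * (a << b)
                effectual_bits += 1
    return acc, effectual_bits


weights = [3, -4, 0, 16]
activations = [2, 5, 7, -1]
result, work = bit_serial_dot(weights, activations)
sparsity = bit_level_sparsity(weights)
print(result, work, sparsity)
```

In this toy model, the higher the fraction of zero bits, the fewer shift-and-add steps the bit-serial loop performs, which is the effect that BLS-oriented accelerators aim to exploit.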

© 2025 by The Institute of Electronics, Information and Communication Engineers