IEICE Transactions on Electronics
Online ISSN : 1745-1353
Print ISSN : 0916-8524
A Scalable 1L2D Multi-Core Near-DRAM Computing Accelerator Based on 3D Hybrid Bonding for AI Models
Zhongze Han, Yue Cao, Xuanzhi Liu, Jinhui Cheng, Jianguo Yang
Author information
Journal: Free access / Advance online publication

Article ID: 2025ECP5020

Abstract

Many memory-bound AI applications, including natural language processing, transformer-based visual recognition, and multi-task online inference, rely heavily on large-scale general matrix-vector multiplication (GEMV), which exhibits strong data locality. However, existing hardware architectures for AI model inference incur significant data-transfer overheads and fail to fully exploit the data locality inherent in these algorithms. We propose a scalable one-logic-two-DRAM (1L2D) multi-core near-DRAM computing accelerator based on 3D hybrid bonding for AI models. Our 3D integration of RISC-V processors with vector accelerators and DRAM significantly boosts bandwidth while reducing energy consumption. A memory-access circuit supporting a page-hit mechanism and a prefetching strategy is designed to maximize the utilization of the data locality achieved by the algorithm's partitioning and rearrangement of data. An interleaved memory-address-mapping scheme is designed to effectively enhance the bank-level parallelism of data access. Compared with the high-performance Intel Xeon-6230 CPU and the state-of-the-art commercially available UPMEM-PIM, the proposed architecture improves computational efficiency for large-scale GEMV by 3.4× and 2.2×, respectively. The architecture achieves a 3.07× improvement in bandwidth and a 76% reduction in energy consumption over the HBM2-PIM.
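As a rough illustration of the interleaved address-mapping idea mentioned in the abstract (the bank count, row size, and function names below are hypothetical, not the paper's actual scheme): taking the bank-select bits from just above the row offset spreads consecutive rows across banks, so a streaming GEMV sweep can keep multiple banks busy at once.

```python
# Sketch of bank-interleaved DRAM address mapping (hypothetical parameters).
NUM_BANKS = 8
ROW_SIZE = 1024  # bytes per DRAM row

def linear_map(addr):
    """Coarse mapping: many consecutive rows fill one bank before
    the next bank is touched (64 rows per bank slice, assumed)."""
    return (addr // (ROW_SIZE * 64)) % NUM_BANKS

def interleaved_map(addr):
    """Interleaved mapping: bank bits sit just above the row offset,
    so consecutive rows alternate across banks."""
    return (addr // ROW_SIZE) % NUM_BANKS

# A streaming sweep over 8 consecutive rows: count distinct banks hit.
addrs = [i * ROW_SIZE for i in range(8)]
print(len({linear_map(a) for a in addrs}))       # 1 bank -> accesses serialize
print(len({interleaved_map(a) for a in addrs}))  # 8 banks -> bank-level parallelism
```

With the interleaved mapping, each of the eight row-sized accesses lands in a different bank, which is the bank-level parallelism the abstract's mapping scheme aims to exploit.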

Related information
© 2025 The Institute of Electronics, Information and Communication Engineers