IEICE Electronics Express
Online ISSN : 1349-2543
ISSN-L : 1349-2543
AsymFly: an area-efficient GPGPU NoC for LLM applications
Ningan Chao, Lanxiang Lv, Peiyong Zhang
JOURNAL FREE ACCESS Advance online publication

Article ID: 22.20250590

Abstract

This paper introduces a novel area-efficient asymmetric butterfly (AsymFly) network-on-chip (NoC) architecture, specifically designed to manage the communication-intensive traffic patterns generated by large language models (LLMs) on GPGPUs. The proposed architecture strategically places memory and compute nodes on opposite sides of the network fabric. This physical arrangement, combined with node consolidation, reduces the router count by 17%. Furthermore, we employ pipeline-stage time-division multiplexing (TDM) to enhance resource utilization and achieve protocol-level deadlock avoidance within a unified physical network. To counteract throughput degradation induced by TDM, we propose a local adaptive scheduling strategy that dynamically balances resource occupancy across network regions. Our evaluations show that, compared with a conventional mesh baseline, AsymFly improves instructions per cycle (IPC) by 26% while reducing both area and power consumption by 64%.
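
As a rough illustration of the TDM idea mentioned in the abstract, the Python sketch below models a single shared link whose cycles alternate between two protocol classes, so that reply traffic always has a reserved slot and cannot be blocked indefinitely behind requests. The two-class split, the even/odd slot schedule, the `run_link` function, and the `steal_idle_slots` option (a simple stand-in for adaptive scheduling) are all assumptions made for illustration; the abstract does not specify the authors' actual mechanism.

```python
from collections import deque

# Hypothetical protocol classes sharing one physical link.
REQUEST, REPLY = 0, 1


def run_link(requests, replies, cycles, steal_idle_slots=False):
    """Simulate one shared link for `cycles` cycles.

    Even cycles are reserved for REQUEST and odd cycles for REPLY
    (a fixed TDM schedule, assumed here purely for illustration).
    If `steal_idle_slots` is True, a slot whose owner has nothing
    to send may be used by the other class.

    Returns a list of (cycle, class, payload) tuples that were sent.
    """
    queues = {REQUEST: deque(requests), REPLY: deque(replies)}
    sent = []
    for cycle in range(cycles):
        owner = cycle % 2   # class that owns this time slot
        other = 1 - owner   # class that may borrow an idle slot
        if queues[owner]:
            sent.append((cycle, owner, queues[owner].popleft()))
        elif steal_idle_slots and queues[other]:
            sent.append((cycle, other, queues[other].popleft()))
    return sent


# Strict TDM: the reply slot at cycle 3 goes unused.
strict = run_link(["q0", "q1", "q2", "q3"], ["r0"], cycles=4)

# With slot borrowing: the idle reply slot carries a request instead.
adaptive = run_link(["q0", "q1", "q2", "q3"], ["r0"], cycles=4,
                    steal_idle_slots=True)
```

In both runs the reply `r0` is sent in its reserved slot at cycle 1, regardless of how many requests are queued. The borrowing variant recovers the bandwidth that strict slotting leaves idle, which is the kind of throughput loss the abstract attributes to TDM.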

© 2025 by The Institute of Electronics, Information and Communication Engineers