IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Regular Section
Enhancing GPU Performance Through Complexity-Effective Out-of-Order Execution Using Distance-Based ISA
Reoma MATSUO, Toru KOIZUMI, Hidetsugu IRIE, Shuichi SAKAI, Ryota SHIOYA
JOURNAL FREE ACCESS

2025 Volume E108.D Issue 6 Pages 558-569

Abstract

Graphics processing units (GPUs) have been adopted in various fields because of their high parallel computing performance. A key feature of GPUs is multi-threaded execution, in which a GPU executes many threads simultaneously to hide various latencies. However, even with multi-threaded execution, the number of threads that can be launched is limited, and long-latency instructions eventually stall the GPU core. Out-of-order execution can hide long latencies, but it requires expensive circuits such as rename logic and load-store queues, so it is not typically adopted on GPUs with massively multi-threaded execution. We propose the TURBULENCE architecture for very low-cost out-of-order execution on GPUs. TURBULENCE consists of a novel ISA that references operands by inter-instruction distance instead of register numbers, and a novel microarchitecture that executes this ISA. Distance-based operands do not cause false dependencies. By exploiting this property, we achieve complexity-effective out-of-order execution on GPUs without introducing any expensive hardware. Simulation results show that TURBULENCE improves performance by 20.4% while reducing energy consumption compared with an existing GPU.
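The property that distance-based operands avoid false dependencies can be illustrated with a toy model. The sketch below is not the TURBULENCE ISA or its encoding; the instruction formats, the register names, and the helper functions `false_deps_register` and `true_deps_distance` are hypothetical, chosen only for illustration. It assumes that every instruction produces exactly one result that is never overwritten, so a source operand given as "the result produced d instructions earlier" can only ever create a true (read-after-write) dependency.

```python
# Toy model only: the formats and names below are illustrative assumptions,
# not the actual TURBULENCE instruction encoding.

# Register-based program: (dest, src1, src2).
reg_prog = [
    ("r1", "r2", "r3"),  # 0: r1 = r2 op r3
    ("r4", "r1", "r1"),  # 1: r4 = r1 op r1  (true dependency on 0)
    ("r1", "r5", "r6"),  # 2: r1 = r5 op r6  (reuses r1: WAW with 0, WAR with 1)
]


def false_deps_register(prog):
    """Return (earlier, later, kind) for every WAW/WAR pair caused by name reuse."""
    hazards = []
    for j, (dest_j, *_srcs_j) in enumerate(prog):
        for i in range(j):
            dest_i, *srcs_i = prog[i]
            if dest_i == dest_j:
                hazards.append((i, j, "WAW"))
            if dest_j in srcs_i:
                hazards.append((i, j, "WAR"))
    return hazards


# Distance-based version of the same program: each instruction lists only how
# many instructions back each source value was produced. There are no
# destination names, so there is nothing to reuse or overwrite.
# (Live-in inputs of instructions 0 and 2 are omitted for brevity.)
dist_prog = [
    [],      # 0: produces a value
    [1, 1],  # 1: reads the result of instruction 0 twice
    [],      # 2: produces a fresh value
]


def true_deps_distance(prog):
    """Return (producer, consumer) pairs; in this model these are the only dependencies."""
    return [(i - d, i) for i, dists in enumerate(prog) for d in dists]


if __name__ == "__main__":
    print("register-based false dependencies:", false_deps_register(reg_prog))
    print("distance-based dependencies:", true_deps_distance(dist_prog))
```

In the register-based form, instruction 2 cannot be reordered ahead of instructions 0 and 1 without renaming, whereas in the distance-based form it depends on nothing earlier, which is the kind of reordering freedom the abstract attributes to distance-based operands.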

© 2025 The Institute of Electronics, Information and Communication Engineers