Journal of Information Processing
Online ISSN : 1882-6652
ISSN-L : 1882-6652
 
An Efficient Execution Mechanism on a GPU for Fine-Grained Parallel Programs With the Fork-Join Model
Kosuke Kiuchi, Yudai Tanabe, Hidehiko Masuhara

2025 Volume 33 Pages 840-851

Abstract

In general-purpose computing on graphics processing units (GPGPU), the CPU manages the number and type of parallel tasks. This execution model makes it difficult to efficiently execute fine-grained parallel programs whose nested parallel tasks have nonhomogeneous granularity. This work addresses the problem by managing parallel tasks on the GPU itself, using a fast memory allocation mechanism. As a preliminary implementation, it proposes a method that splits the computation of a fine-grained parallel fork-join program at the fork point and allocates each resulting computation in GPU memory as a parallel task. Kernel fusion, parallel task reuse, and parallel throttling are explored as optimizations of the proposed method. To evaluate the feasibility and performance of the method, a fine-grained parallel fork-join program is implemented in CUDA, and its scalability and execution speed are investigated.
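
The idea of splitting a fork-join computation at the fork point and storing each piece as a task can be illustrated with a small sketch. The code below is a CPU-side C++ simulation written for this page, not the authors' CUDA implementation. All names (`Task`, `TaskPool`, `run_fib`) are hypothetical, the Fibonacci workload is chosen only for illustration, and the bump-style pool is an assumed stand-in for the paper's fast memory allocation mechanism. Each pass of the outer loop processes all ready tasks; on a GPU, such a pass would presumably map to threads working in parallel, with the allocation counter advanced atomically.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical task record. A fork-join call is split at the fork point into
// a Fork task (the part before the fork) and a Join task (the continuation
// that runs once both children have finished).
enum class Kind { Fork, Join };

struct Task {
    Kind kind;
    int n;             // Fork: argument of the call
    int parent;        // index of the Join task that receives the result, -1 for the root
    int slot;          // which operand of the parent to fill (0 or 1)
    int pending;       // Join: number of children still running
    long long val[2];  // Join: results delivered by the children
};

// Assumed bump-style pool: allocation is a bounds check plus an index increment.
class TaskPool {
public:
    explicit TaskPool(std::size_t capacity) : tasks_(capacity), next_(0) {}

    int alloc(const Task& t) {
        if (next_ >= tasks_.size()) throw std::length_error("task pool exhausted");
        tasks_[next_] = t;
        return static_cast<int>(next_++);
    }

    Task& operator[](int i) { return tasks_[static_cast<std::size_t>(i)]; }

private:
    std::vector<Task> tasks_;
    std::size_t next_;
};

// Computes fib(n) by repeatedly running every ready task in the pool.
long long run_fib(int n, std::size_t capacity = 1 << 16) {
    TaskPool pool(capacity);
    long long result = 0;
    std::vector<int> ready{pool.alloc({Kind::Fork, n, -1, 0, 0, {0, 0}})};

    while (!ready.empty()) {
        std::vector<int> next_ready;

        // Hand a finished value to the waiting Join task, or record the final result.
        auto deliver = [&](int parent, int slot, long long v) {
            if (parent < 0) { result = v; return; }
            Task& p = pool[parent];
            p.val[slot] = v;
            if (--p.pending == 0) next_ready.push_back(parent);  // join is now ready
        };

        for (int id : ready) {
            const Task t = pool[id];  // work on a copy of the task record
            if (t.kind == Kind::Join) {
                deliver(t.parent, t.slot, t.val[0] + t.val[1]);
            } else if (t.n < 2) {
                deliver(t.parent, t.slot, t.n);  // base case, no fork
            } else {
                // Fork point: allocate the continuation and the two child tasks.
                int join = pool.alloc({Kind::Join, 0, t.parent, t.slot, 2, {0, 0}});
                next_ready.push_back(pool.alloc({Kind::Fork, t.n - 1, join, 0, 0, {0, 0}}));
                next_ready.push_back(pool.alloc({Kind::Fork, t.n - 2, join, 1, 0, {0, 0}}));
            }
        }
        ready.swap(next_ready);
    }
    return result;
}
```

Calling `run_fib(10)` evaluates the tenth Fibonacci number through Fork and Join tasks only, with no recursion on the call stack. The sketch never frees or reuses task records and places no limit on how many tasks are created, which is the kind of overhead that optimizations such as parallel task reuse and parallel throttling would target.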

© 2025 by the Information Processing Society of Japan