Transactions of the Japan Society for Computational Engineering and Science
Online ISSN : 1347-8826
ISSN-L : 1344-9443
Acceleration of large scale high accurate advection calculation for multiple GPUs and its strong scalability
JOURNAL FREE ACCESS

2010 Volume 2010 Pages 20100018

Abstract
Recent GPUs (Graphics Processing Units) offer great advantages in performance and memory bandwidth for general-purpose computing. The CUDA programming environment makes it easy to use the GPU as a SIMT (single-instruction, multiple-thread) accelerator. High-order finite difference methods (FDMs) have been applied to computational fluid dynamics (CFD), and the advection equation is examined here as a typical benchmark. We study how computational performance depends on arithmetic intensity for several high-accuracy FDMs. The GPU implementation of the 5th-order WENO scheme is described in detail, with particular attention to the use of shared memory and registers. Multiple-GPU computing is required for further speedup and for large-scale computations that exceed the memory capacity of a single graphics card. The computational domain is decomposed three-dimensionally, so the overall performance depends not only on the computation but also on the GPU-to-GPU communication. Computation and communication are overlapped by reordering the GPU kernels. Strong scalability is demonstrated on the TSUBAME grid cluster, and a performance of 7.8 TFlops is achieved with 60 GPUs when computing the advection equation with the 5th-order WENO scheme.
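For readers unfamiliar with the benchmark, the following is a minimal single-threaded sketch of a 5th-order WENO discretization of the one-dimensional advection equation u_t + c u_x = 0 with c > 0 and periodic boundaries. It assumes the standard Jiang-Shu smoothness indicators and linear weights and a third-order SSP Runge-Kutta time integrator; the abstract does not state which WENO variant or time integrator the authors use, so these choices, and all function names (`weno5_face`, `rhs`, `step_rk3`, `advect`, `sine_profile`), are illustrative assumptions and not the paper's GPU implementation.

```python
import math

EPS = 1e-6  # assumed small constant that keeps the weight denominators nonzero


def weno5_face(fm2, fm1, f0, fp1, fp2):
    """Upwind-biased WENO5 value at face i+1/2 from cells i-2 .. i+2."""
    # Three third-order candidate reconstructions on the sub-stencils.
    q0 = (2.0 * fm2 - 7.0 * fm1 + 11.0 * f0) / 6.0
    q1 = (-fm1 + 5.0 * f0 + 2.0 * fp1) / 6.0
    q2 = (2.0 * f0 + 5.0 * fp1 - fp2) / 6.0
    # Jiang-Shu smoothness indicators of the three sub-stencils.
    b0 = 13.0 / 12.0 * (fm2 - 2.0 * fm1 + f0) ** 2 + 0.25 * (fm2 - 4.0 * fm1 + 3.0 * f0) ** 2
    b1 = 13.0 / 12.0 * (fm1 - 2.0 * f0 + fp1) ** 2 + 0.25 * (fm1 - fp1) ** 2
    b2 = 13.0 / 12.0 * (f0 - 2.0 * fp1 + fp2) ** 2 + 0.25 * (3.0 * f0 - 4.0 * fp1 + fp2) ** 2
    # Nonlinear weights built from the linear weights 0.1, 0.6, 0.3.
    a0 = 0.1 / (EPS + b0) ** 2
    a1 = 0.6 / (EPS + b1) ** 2
    a2 = 0.3 / (EPS + b2) ** 2
    return (a0 * q0 + a1 * q1 + a2 * q2) / (a0 + a1 + a2)


def rhs(u, c, dx):
    """Right-hand side -d(c u)/dx on a periodic grid, valid for c > 0."""
    n = len(u)
    # Flux at the right face of every cell; the modulo wraps the stencil.
    face = [
        c * weno5_face(u[(i - 2) % n], u[(i - 1) % n], u[i],
                       u[(i + 1) % n], u[(i + 2) % n])
        for i in range(n)
    ]
    # face[i - 1] with i == 0 reads the last face, which is the periodic neighbor.
    return [-(face[i] - face[i - 1]) / dx for i in range(n)]


def step_rk3(u, c, dx, dt):
    """One third-order SSP Runge-Kutta step (assumed time integrator)."""
    k = rhs(u, c, dx)
    u1 = [a + dt * b for a, b in zip(u, k)]
    k = rhs(u1, c, dx)
    u2 = [0.75 * a + 0.25 * (b + dt * d) for a, b, d in zip(u, u1, k)]
    k = rhs(u2, c, dx)
    return [a / 3.0 + 2.0 / 3.0 * (b + dt * d) for a, b, d in zip(u, u2, k)]


def advect(u, c, dx, dt, nsteps):
    """Advance the profile u by nsteps time steps of size dt."""
    for _ in range(nsteps):
        u = step_rk3(u, c, dx, dt)
    return u


def sine_profile(n):
    """One sine period sampled at the cell centers of a unit-length domain."""
    return [math.sin(2.0 * math.pi * (i + 0.5) / n) for i in range(n)]
```

For example, `advect(sine_profile(64), c=1.0, dx=1.0/64, dt=1.0/160, nsteps=160)` carries the sine profile once around the periodic domain at a CFL number of 0.4, so the result can be compared directly with the initial profile. Each cell update reads five neighboring values and performs several dozen floating-point operations, which is why the arithmetic intensity of the scheme, and the reuse of neighboring values through fast on-chip memory, matters for GPU performance.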
© 2010 The Japan Society For Computational Engineering and Science