Breaking the performance bottleneck of sparse matrix-vector multiplication on SIMD processors

Kai Zhang; Shuming Chen; Yaohua Wang; Jianghua Wan

doi:10.1587/elex.10.20130147

Abstract

Sparse matrix-vector multiplication (SpMV) is one of the most important kernels in many scientific and engineering applications. On SIMD processors, its main performance bottleneck is the low utilization of SIMD units and memory bandwidth. This paper proposes a hybrid optimization method to break that bottleneck. The method combines a new compressed sparse matrix format, a block SpMV algorithm, and a vector write buffer. Experimental results show that the hybrid method achieves an average speedup of 2.09× over the CSR vector kernel across all tested matrices, with a maximum speedup of 3.24×.
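For readers unfamiliar with the baseline, the following is a minimal sketch of SpMV over the standard CSR (compressed sparse row) layout. It is not the paper's new format, block algorithm, or write buffer, none of which the abstract specifies; the function name `spmv_csr` and the 3×3 example matrix are assumptions made for illustration. The sketch shows where SIMD utilization tends to suffer: rows have different numbers of nonzeros, and the input vector is read through an index array.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[i]:row_ptr[i+1] delimits the nonzeros of row i,
    col_idx[k] is the column of the k-th nonzero, values[k] its value.
    """
    num_rows = len(row_ptr) - 1
    y = [0.0] * num_rows
    for row in range(num_rows):
        acc = 0.0
        # The trip count varies per row, so fixed-width SIMD lanes
        # are often left partly idle on short rows.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # x is accessed indirectly through col_idx (a gather),
            # which makes memory accesses irregular.
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Illustrative matrix (hypothetical data):
# [[4, 0, 1],
#  [0, 2, 0],
#  [3, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(row_ptr, col_idx, values, x)
print(y)  # [7.0, 4.0, 18.0]
```

In this layout, a "CSR vector" style kernel would vectorize the inner loop across the nonzeros of one row, so its efficiency depends on row length and on how well the gathered elements of `x` sit in memory.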
