IEICE Electronics Express
Online ISSN : 1349-2543
ISSN-L : 1349-2543
LETTER
Decoupled iteration mapping: improving dependency-loop performance on SIMD processors
Hui Yang, Shuming Chen, Jianghua Wan, Huanyao Dai
JOURNAL FREE ACCESS

2013 Volume 10 Issue 21 Pages 20130798

Abstract
Wide Single Instruction Multiple Data (SIMD) architectures are important for compute-intensive applications, but they are less efficient for applications containing loops with cross-iteration dependencies, which are difficult to parallelize and vectorize. This paper introduces Decoupled Iteration Mapping (DIM), a technique that dynamically maps a cross-iteration dependency loop onto an improved SIMD architecture to achieve multicore-like thread-parallel performance. The modification to the baseline architecture is minor and consists of a Prefetch Unit & Instruction Buffer Array (PU&IBA), a Loop Control Unit & Instruction Dispatch Unit (LCU&IDU), and a Data Buffer Chain (DBC). Experimental results show that the proposed DIM scheme achieves an average 3.04x performance speedup at a cost of only 6.44% area overhead.
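To illustrate the kind of loop the abstract refers to, the following minimal sketch contrasts a loop whose iterations are independent with one that carries a dependency from each iteration to the next. It is not taken from the paper: the function names, the recurrence, and the `seed` parameter are hypothetical and chosen only for illustration, and the sketch does not model DIM or its hardware units.

```python
def independent_loop(a, b):
    # Each iteration reads only its own inputs, so the iterations
    # could be executed in any order, or side by side in SIMD lanes.
    return [x + y for x, y in zip(a, b)]


def dependent_loop(a, seed=0):
    # Hypothetical recurrence: iteration i needs the value produced by
    # iteration i-1, so the iterations cannot simply be spread across
    # SIMD lanes. This is a cross-iteration (loop-carried) dependency.
    out = []
    prev = seed
    for x in a:
        prev = prev * 2 + x
        out.append(prev)
    return out
```

In the second function, every pass through the loop must wait for `prev` from the previous pass, which is the property that makes such loops hard to vectorize.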
© 2013 by The Institute of Electronics, Information and Communication Engineers