Abstract
Most modern microprocessors are equipped with SIMD extended instruction sets, intended mainly for media data processing. Each of these can be regarded as a special form of vector instruction set, but their features and limitations differ from those of conventional vector instruction sets, so their potential cannot be fully exploited with conventional compiler optimization techniques alone. In the COINS project, optimization techniques for SIMD instruction sets are classified into two categories. The first consists chiefly of vectorization-centric transformations that can be performed at the source-code level; the second is compiler optimization that generates appropriate SIMD instructions for the transformed code. We call the latter “SIMD parallelization” and concentrate on it in this research. In this article, we report on SIMD parallelization in COINS.