Abstract
Mapping matrix operations onto SIMD processors requires a large amount of data rearrangement, which degrades system performance. This paper proposes a Configurable Matrix Register File (CMRF) that supports both row-wise and column-wise accesses. The CMRF can be dynamically configured into different operating modes, in which one or several sub-matrices can be accessed in parallel. Experimental results show that, compared with the traditional Vector Register File (VRF) and the Matrix Register File (MRF), the CMRF achieves average performance improvements of about 2.21x and 1.6x, respectively. Compared with the TMS320C64x+, our SIMD processor achieves about 5.65x to 7.71x performance improvement by employing the CMRF.