2025 Volume 17 Pages 17-20
We present a fast method, together with implementation techniques, for computing the matrix inverse square root. We demonstrate that the rational approximation method requires fewer floating-point operations than competing methods, and that its rounding error, when the approximation is evaluated as a sum of resolvents, is similar to that of the eigenvalue decomposition method. Moreover, both the internal and the external parallelism of the sum-of-resolvents computation can be exploited effectively in implementations for CPU and GPU.
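To make the phrase "sum of resolvents" concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a symmetric positive definite matrix and uses a simple midpoint-rule discretization of the identity $A^{-1/2} = \frac{2}{\pi}\int_0^{\pi/2} (\sin^2\theta\, I + \cos^2\theta\, A)^{-1}\, d\theta$, which yields a rational approximation written as a sum of shifted inverses. The function names, the number of terms `m`, and the test matrix are all illustrative assumptions; the rational approximation used in the paper may differ.

```python
import numpy as np

def inv_sqrt_resolvent_sum(A, m=16):
    """Approximate A^{-1/2} for symmetric positive definite A as
    (1/m) * sum_j (sin^2(t_j) I + cos^2(t_j) A)^{-1},
    i.e. the m-point midpoint rule applied to the integral identity above.
    (Illustrative choice of quadrature, not necessarily the paper's.)"""
    n = A.shape[0]
    I = np.eye(n)
    # Midpoint nodes on [0, pi/2]; the quadrature weight pi/(2m) times
    # the prefactor 2/pi gives the overall factor 1/m.
    theta = (2.0 * np.arange(1, m + 1) - 1.0) * np.pi / (4.0 * m)
    X = np.zeros((n, n))
    for t in theta:
        s2, c2 = np.sin(t) ** 2, np.cos(t) ** 2
        # Each term is an independent linear solve, so the terms could be
        # computed concurrently; here they are evaluated sequentially.
        X += np.linalg.solve(s2 * I + c2 * A, I)
    return X / m

def inv_sqrt_eig(A):
    """Reference A^{-1/2} via the symmetric eigenvalue decomposition."""
    w, V = np.linalg.eigh(A)
    return (V / np.sqrt(w)) @ V.T

# Hypothetical well-conditioned test matrix with eigenvalues in [0.5, 2].
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))
A = (Q * np.linspace(0.5, 2.0, 6)) @ Q.T
A = 0.5 * (A + A.T)  # remove rounding-level asymmetry

X = inv_sqrt_resolvent_sum(A)
err_ref = np.linalg.norm(X - inv_sqrt_eig(A))     # vs. eigendecomposition
err_def = np.linalg.norm(X @ A @ X - np.eye(6))   # X A X should equal I
```

Because the shifted solves are mutually independent, they can be distributed across cores or devices, while each individual solve can itself use a parallel linear algebra kernel. For ill-conditioned matrices this simple quadrature needs many more terms, so scaling the matrix or using a better-tuned rational approximation would be advisable.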