2010 Volume 1 Issue 1 Pages 2-24
Given a vector $p_i$ of floating-point numbers with exact sum $s$, we present a new algorithm with the following property: either the result is a faithful rounding of $s$, or otherwise the result has a relative error not larger than $\mathrm{eps}^K \,\mathrm{cond}\left(\sum p_i\right)$ for $K$ to be specified. These statements also hold in the presence of underflow; the computing time does not depend on the exponent range, and no extra memory is required. Our algorithm is fast in terms of measured computing time because it allows good instruction-level parallelism. A special version for $K=2$, i.e., quadruple precision, is also presented. Computational results show that this algorithm is more accurate and faster than competitors such as XBLAS.
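To make the quantities in the abstract concrete, the sketch below shows a classical compensated summation built on an error-free transformation. It is not the algorithm presented in this paper; it is a well-known simpler scheme whose result behaves roughly as if computed in twice the working precision, which is the flavor of the $K=2$ case. The names `two_sum`, `compensated_sum`, `naive_sum`, and `cond` are illustrative assumptions, the condition number is assumed to be $\sum |p_i| / |\sum p_i|$, and Python's `math.fsum` is used only as a reference for the correctly rounded sum.

```python
import math

def two_sum(a, b):
    """Error-free transformation: returns (s, e) with s = fl(a + b)
    and a + b = s + e exactly, barring overflow."""
    s = a + b
    z = s - a
    e = (a - (s - z)) + (b - z)
    return s, e

def naive_sum(p):
    """Plain recursive summation in working precision."""
    s = 0.0
    for x in p:
        s = s + x
    return s

def compensated_sum(p):
    """Illustrative compensated summation (NOT the paper's algorithm):
    accumulate the rounding error of every addition and add it back once."""
    s = 0.0
    c = 0.0
    for x in p:
        s, e = two_sum(s, x)
        c += e
    return s + c

def cond(p):
    """Assumed condition number of the sum: sum(|p_i|) / |sum(p_i)|."""
    return math.fsum(abs(x) for x in p) / abs(math.fsum(p))

# An ill-conditioned example: heavy cancellation hides the small term.
p = [1e16, 1.0, -1e16]
print(naive_sum(p), compensated_sum(p), math.fsum(p), cond(p))
```

On this input the plain loop loses the small summand entirely, while the compensated version recovers it. The large value of `cond(p)` is what makes the problem hard for working-precision summation, and it is the factor that appears in the error bound stated above.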