Abstract
Recently, Manochehri et al. proposed a modified radix-2 Montgomery modular multiplication with a new recording method. In this letter, we present an improvement to their scheme that makes it simpler and faster. Manochehri et al.’s algorithm requires n + 2 iterations, whereas the proposed (non-pipelined) algorithm requires n + 2 iterations. The proposed algorithm needs neither post-processing to obtain the correct output nor a non-standard operation such as bitwise subtraction. The area/time complexity of our pipelined multiplier is approximately 24.36% lower than that of Manochehri et al.’s multiplier. The proposed architecture is simple, modular, and regular, with low complexity and low propagation delay, and is therefore well suited for VLSI implementation.
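For orientation, the kind of computation discussed above can be illustrated with a minimal sketch of the textbook bit-serial radix-2 Montgomery multiplication. This is neither Manochehri et al.’s recoded algorithm nor the proposed one; the function name `montgomery_radix2` and its parameters are assumptions made for illustration only. The sketch relies on the standard bound that, for an n-bit odd modulus N, operands below 2N, and n + 2 iterations, the result stays below 2N, so no final subtraction is required.

```python
def montgomery_radix2(x: int, y: int, modulus: int, iterations: int) -> int:
    """Textbook radix-2 Montgomery product (illustrative sketch only).

    Returns s with s * 2**iterations congruent to x * y (mod modulus).
    Assumes an odd modulus and 0 <= x < 2**iterations.
    """
    if modulus % 2 == 0:
        raise ValueError("modulus must be odd")
    s = 0
    for i in range(iterations):
        x_i = (x >> i) & 1          # i-th bit of the multiplier
        s += x_i * y                # conditionally add the multiplicand
        q = s & 1                   # quotient bit: makes s + q*modulus even
        s = (s + q * modulus) >> 1  # exact division by 2
    return s


# Example with an 8-bit modulus (n = 8) and n + 2 = 10 iterations.
n = 8
modulus = 239
k = n + 2
a, b = 100, 200
s = montgomery_radix2(a, b, modulus, k)
assert (s << k) % modulus == (a * b) % modulus  # s = a*b*2^(-k) mod N, up to one extra N
assert s < 2 * modulus                          # bounded without a final subtraction
```

Because the output is again below 2N, it can be fed directly into another multiplication of the same form, which is why this iteration count is commonly used in exponentiation chains.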