Abstract
When implementing non-multiplier linear systems, delay-aware common subexpression elimination (DACSE) is a critical algorithm for optimizing area efficiency under a given timing constraint. In this paper, we propose an optimized DACSE algorithm for the hardware implementation of the binary-field linear transform (BFLT). To achieve the shortest critical path delay (CPD), the proposed algorithm first implements the BFLT circuit with a fast-binary-tree structure, before any common subexpressions (CSs) are shared. Because the signals involved have different delays once CSs are shared, a delay-driven-binary-tree (DDBT) structure is then adopted to further optimize the critical path of the BFLT logic. For each candidate CS elimination, the CPD of the resulting DDBT-based circuit is evaluated, and the elimination is abandoned if the circuit cannot meet the given timing constraint. Moreover, the proposed algorithm provides designers with the full range of design trade-offs, from the shortest feasible CPD to the smallest area, giving them the largest possible design space. Experiments verify the proposed algorithm, and the results show that the proposed DACSE reduces area more effectively than previous works, especially under a stringent timing constraint.
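To make the delay-driven idea concrete, the following minimal Python sketch estimates the output delay of an XOR tree whose inputs arrive at different times, and uses that estimate to accept or reject a candidate CS elimination. It is an illustration only and is not taken from the paper: the unit gate delay `GATE_DELAY`, the function names `fast_binary_tree_delay`, `ddbt_delay`, and `meets_timing`, and the greedy "combine the two earliest signals first" rule are all assumptions made for this sketch.

```python
import heapq

GATE_DELAY = 1  # assumed: every two-input XOR gate adds one unit of delay


def fast_binary_tree_delay(n_inputs):
    """Depth of a balanced XOR tree when all n_inputs signals arrive at time 0."""
    if n_inputs < 1:
        raise ValueError("need at least one input signal")
    depth = 0
    width = n_inputs
    while width > 1:
        width = (width + 1) // 2  # each tree level roughly halves the signal count
        depth += GATE_DELAY
    return depth


def ddbt_delay(arrival_times):
    """Output arrival time when the two earliest signals are always XORed first."""
    if not arrival_times:
        raise ValueError("need at least one input signal")
    heap = list(arrival_times)
    heapq.heapify(heap)
    while len(heap) > 1:
        a = heapq.heappop(heap)  # earliest available signal
        b = heapq.heappop(heap)  # second-earliest available signal
        heapq.heappush(heap, max(a, b) + GATE_DELAY)
    return heap[0]


def meets_timing(arrival_times, timing_constraint):
    """Hypothetical check: keep a CS elimination only if the delay fits the constraint."""
    return ddbt_delay(arrival_times) <= timing_constraint


# Four signals arriving together: same depth as the balanced tree.
print(ddbt_delay([0, 0, 0, 0]))  # 2
# One signal is a shared CS that arrives two gate delays late.
print(ddbt_delay([0, 0, 0, 2]))  # 3
```

In the second call, a balanced tree that pairs the late signal at the first level would finish at time 4, whereas combining the early signals first finishes at time 3. Under these assumptions, `meets_timing([0, 0, 0, 2], 3)` accepts the sharing and `meets_timing([0, 0, 0, 2], 2)` rejects it.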