2008 Volume E91.A Issue 4 Pages 987-997
This paper presents a hardware-efficient, high-speed architecture for variable block size motion estimation (VBSME) in H.264. By improving the pipeline structure and the processing element (PE) circuits, the system latency and hardware cost are reduced, which makes this structure more hardware efficient than the original Propagate Partial SAD architecture. For coding pictures of small and medium frame size, the proposed structure saves 12.1% of the hardware cost compared with the original Propagate Partial SAD structure. In the case of HDTV, since small inter modes contribute little to the coding quality, modes below 8×8 are removed from our design. With this mode reduction technique, when the number of PE array sets is less than 8, the proposed mode-reduction-based Propagate Partial SAD structure can run at a higher clock speed and at a lower hardware cost than the widely used SAD Tree architecture. It is also more robust to high-speed timing constraints when parallel processing is considered. With TSMC 0.18μm technology under worst-case operating conditions (1.62V, 125°C), the peak throughput of the 8-set PE array structure is 720p@30Hz with a 128×64 search range and 5 reference frames. Compared with the parallel SAD Tree architecture, our design reduces the hardware cost by 12k gates.
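As an illustration of what VBSME computes and what mode reduction removes, the following is a minimal software sketch, not the hardware pipeline described in the paper. It assumes the usual H.264 inter partitions of a 16×16 macroblock (written width×height) and builds every block SAD by reusing the sixteen 4×4 SADs. The names `sad_4x4_grid`, `vbs_sads`, `ALL_MODES`, and `REDUCED_MODES` are hypothetical and chosen only for this example.

```python
# Hypothetical mode lists; names are width x height in pixels.
ALL_MODES = ["16x16", "16x8", "8x16", "8x8", "8x4", "4x8", "4x4"]
# Mode reduction as described in the abstract: drop modes below 8x8.
REDUCED_MODES = ["16x16", "16x8", "8x16", "8x8"]


def sad_4x4_grid(cur, ref):
    """Return a 4x4 grid holding the SAD of each 4x4 sub-block
    of a 16x16 macroblock (cur and ref are 16x16 lists of ints)."""
    grid = [[0] * 4 for _ in range(4)]
    for y in range(16):
        for x in range(16):
            grid[y // 4][x // 4] += abs(cur[y][x] - ref[y][x])
    return grid


def vbs_sads(grid, modes):
    """Combine 4x4 SADs into the SADs of every block of each mode."""
    result = {}
    for mode in modes:
        # Block width and height measured in 4x4 sub-blocks.
        w, h = (int(v) // 4 for v in mode.split("x"))
        result[mode] = [
            sum(grid[r][c]
                for r in range(top, top + h)
                for c in range(left, left + w))
            for top in range(0, 4, h)
            for left in range(0, 4, w)
        ]
    return result


if __name__ == "__main__":
    # Toy data: every pixel differs by 1 between current and reference.
    cur = [[1] * 16 for _ in range(16)]
    ref = [[0] * 16 for _ in range(16)]
    grid = sad_4x4_grid(cur, ref)
    full = vbs_sads(grid, ALL_MODES)
    reduced = vbs_sads(grid, REDUCED_MODES)
    print(sum(len(v) for v in full.values()))     # number of block SADs, all modes
    print(sum(len(v) for v in reduced.values()))  # number after mode reduction
```

In this sketch the full mode set yields 41 block SADs per search candidate, while the reduced set yields 9, which gives a rough sense of why dropping the sub-8×8 modes can shorten the adder chain a hardware design has to implement.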