Two-Step Physical Register Deallocation for Data Prefetching and Address Pre-Calculation

Akihiro Yamamoto; Yusuke Tanaka; Hideki Ando; Toshio Shimada

doi:10.11185/imt.3.755

抄録

This paper proposes an instruction pre-execution scheme for a high performance processor, that reduces latency and early scheduling of loads. Our scheme exploits the difference between the amount of instruction-level parallelism available with an unlimited number of physical registers and that available with an actual number of physical registers. We introduce the two-step physical register deallocation scheme, which deallocates physical registers at the renaming stage as a first step, and eliminates pipeline stalls caused by a shortage of physical registers. Instructions wait for the final deallocation as a second step in the instruction window. While waiting, the scheme allows pre-execution of instructions, that enables prefetching of load data and early calculation of memory effective addresses. Our evaluation results show that our scheme improves the performance significantly, and achieves a 1.26 times speedup over a processor without a prefetcher. If combined with a stride prefetcher, it achieves a 1.18 times speedup over a processor with a stride prefetcher.

著者関連情報

お気に入り & アラート

閲覧履歴

責任著者(Corresponding author)

J-STAGEへの登録はこちら（無料）