2024 Volume 31 Issue 3 Pages 868-893
This paper describes the development of a large-scale English-Japanese simultaneous interpretation corpus named NAIST-SIC and presents analyses of it. We collected the recordings of simultaneous interpreting sentences (SIsent). To understand the characteristics of simultaneous interpreting by human simultaneous interpreters (SIers), we analyzed a subset of this corpus. Samples of speech were interpreted by three SIers having different levels of experience and can be used to compare SIsent attributes in terms of the SIers’ experience. Using this corpus subset, we analyzed the differences in latency, quality, and word order. The results show that (1) SIers with more experience tended to generate a higher quality of SIsent, and (2) they better controlled the latency and quality. We also observed that (3) a large latency degraded the SIsent quality.