NAIST Simultaneous Interpretation Corpus: Development and Analyses of Data from Interpreters of Different Levels

Kosuke Doi; Katsuhito Sudoh; Satoshi Nakamura

doi:10.5715/jnlp.31.868

Abstract

This paper describes the development of NAIST-SIC, a large-scale English-Japanese simultaneous interpretation corpus, and presents analyses of it. We collected recordings of simultaneous interpreting sentences (SI^sent). To understand the characteristics of simultaneous interpreting by human simultaneous interpreters (SIers), we analyzed a subset of this corpus. The speech samples in this subset were interpreted by three SIers with different levels of experience, which allows SI^sent attributes to be compared across experience levels. Using this subset, we analyzed differences in latency, quality, and word order. The results show that (1) SIers with more experience tended to produce higher-quality SI^sent, and (2) they better controlled the latency and quality. We also observed that (3) large latency degraded SI^sent quality.

Licensed under CC BY 4.0
https://creativecommons.org/licenses/by/4.0/
