Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
General Paper (Peer-Reviewed)
NAIST Simultaneous Interpretation Corpus: Development and Analyses of Data from Interpreters of Different Levels
Kosuke Doi, Katsuhito Sudoh, Satoshi Nakamura
JOURNAL FREE ACCESS

2024 Volume 31 Issue 3 Pages 868-893

Abstract

This paper describes the development of NAIST-SIC, a large-scale English-Japanese simultaneous interpretation corpus, and presents analyses of it. We collected recordings of simultaneous interpreting sentences (SIsent). To understand the characteristics of simultaneous interpreting by human simultaneous interpreters (SIers), we analyzed a subset of this corpus. The speech samples in this subset were interpreted by three SIers with different levels of experience, so they can be used to compare SIsent attributes across levels of SIer experience. Using this subset, we analyzed differences in latency, quality, and word order. The results show that (1) SIers with more experience tended to produce higher-quality SIsent, and (2) they controlled latency and quality better. We also observed that (3) large latency degraded SIsent quality.

© 2024 The Association for Natural Language Processing