通訳研究 (Interpretation Studies)
Online ISSN : 2436-861X
Print ISSN : 1346-8715
ISSN-L : 1346-8715
Research Note
Design and Construction of a Simultaneous Interpreting Corpus
松原 茂樹, 相澤 靖之, 河口 信夫, 外山 勝彦, 稲垣 康善

2001, Volume 1, pp. 85-102

Abstract
This paper describes a large-scale spoken language corpus of simultaneous interpreting constructed at the Center for Integrated Acoustic Information Research (CIAIR), Nagoya University. Among other features, the corpus has the following characteristics: (1) English and Japanese speech is recorded in parallel, (2) the data contain both monologues and dialogues, and (3) exact start and end times are provided for each utterance. We have collected about 65 hours of speech data in total and transcribed them into ASCII text files (about 367,000 morphemes in 22,000 utterance units). This paper also outlines the software tools that we have developed for investigating the corpus. The corpus will be made publicly available in the near future.
© 2001 日本通訳翻訳学会 (The Japan Association for Interpreting and Translation Studies)