Design of the Corpus of Spontaneous Japanese (<Feature> Trends in Speech Research Databases)

前川 喜久雄; 籠宮 隆之; 小磯 花絵; 小椋 秀樹; 菊池 英明

doi:10.24467/onseikenkyu.4.2_51

Abstract

A large-scale corpus of spontaneous Japanese monologue is being compiled as a joint project of the National Language Research Institute (under the Agency for Cultural Affairs) and the Communications Research Laboratory (under the Ministry of Posts and Telecommunications). The corpus will contain about 700 hours of digitized speech (about 7 million morphemes), its transcription, and various annotations such as part-of-speech (POS) information. Phonological labels (segmental as well as prosodic) will be provided for a subset of the corpus. The corpus will become publicly available in the spring of 2004.
