On recent speech corpora activities in Japan

Shuichi Itahashi

doi:10.1250/ast.20.163

Abstract

This paper describes a range of Japanese projects concerned with speech corpora. ETL is credited with initiating research on speech databases in 1973, while Tohoku University played a pioneering role in speech corpus development. The JEIDA Japanese Common Speech Data Corpus was reported in 1986 and later converted to DAT form. Subsequently, in 1990, the JEIDA Noise Database was released to the public. Other important contributions are due to ATR, which has developed a wide variety of speech corpora, and to the so-called priority area projects funded by MESSC. Among these, the “Spoken Language” project yielded data on continuous speech, the “Spoken Japanese” project yielded data on dialectal speech from all over Japan, and the “Spoken Dialogue” project yielded data on various spoken dialogues. Six CD-ROMs were produced by a committee of the Acoustical Society of Japan: three contain speech of phonetically balanced isolated sentences, while the remaining three contain continuous speech obtained for various guide tasks. Finally, this paper refers to the new ASJ corpus and the “Real World Computing Program” formulated in 1992 by the Japanese Government.


Successor journal

Acoustical Science and Technology
