自然言語処理
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
A Statistical Approach to Automatic Phonetic Transcription of Japanese Orthographic Words
Wei-Bin ChangSachiko Morishita
著者情報
ジャーナル フリー

2003 年 10 巻 4 号 p. 55-63

詳細
抄録

We address the problem of automatically transcribing Japanese orthographic words into symbols representing their pronunciations. Such a function is necessary for commercial continuous speech recognition systems since there are constant needs to create new recognition lexica for new applications or purposes. Simple look-up schemes are not adequate to deal with Japanese, while methods based on morphological analysis require in-depth linguistic knowledge and development effort. In this paper, we propose a statistical approach which is based on an N-gram language model. It is assumed that the pronunciation of a character only depends on the previous one to two characters and their pronunciations. Given an orthographic word, our method outputs the most likely phonetic transcription. It is shown that our approach provides superior performance to the public-domain conversion tool KAKASI on ten out of twelve test sets.

著者関連情報
© The Association for Natural Language Processing
前の記事 次の記事
feedback
Top