A Statistical Approach to Automatic Phonetic Transcription of Japanese Orthographic Words

Wei-Bin Chang; Sachiko Morishita

doi:10.5715/jnlp.10.4_55

Abstract

We address the problem of automatically transcribing Japanese orthographic words into symbols that represent their pronunciations. Commercial continuous speech recognition systems need this capability because new recognition lexica must constantly be created for new applications or purposes. Simple look-up schemes are inadequate for Japanese, while methods based on morphological analysis require in-depth linguistic knowledge and development effort. In this paper, we propose a statistical approach based on an N-gram language model. We assume that the pronunciation of a character depends only on the preceding one or two characters and their pronunciations. Given an orthographic word, our method outputs the most likely phonetic transcription. Our approach outperforms the public-domain conversion tool KAKASI on ten of the twelve test sets.
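The modeling assumption in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the simplest case (each character's pronunciation conditioned on one preceding character and its pronunciation), assumes training words are already aligned character by character with romanized readings, and uses add-k smoothing and a Viterbi search as stand-ins for whatever estimation and decoding the paper actually uses. The names `train`, `transcribe`, `TOY`, and the `"?"` placeholder for unseen characters are all hypothetical.

```python
import math
from collections import defaultdict

START = ("<s>", "<s>")  # hypothetical word-start marker

def train(aligned_words):
    """Count bigrams over (character, reading) pairs.

    aligned_words: list of words, each a list of (char, reading) tuples.
    Assumes the character-to-reading alignment is given.
    """
    bigram = defaultdict(int)    # (prev_pair, cur_pair) -> count
    context = defaultdict(int)   # prev_pair -> count as a left context
    readings = defaultdict(set)  # char -> readings seen in training
    for word in aligned_words:
        prev = START
        for char, pron in word:
            cur = (char, pron)
            bigram[(prev, cur)] += 1
            context[prev] += 1
            readings[char].add(pron)
            prev = cur
    return bigram, context, readings

def transcribe(word, model, k=0.1):
    """Return the most likely reading sequence under the bigram model."""
    bigram, context, readings = model
    vocab = sum(len(r) for r in readings.values())  # distinct pairs seen

    def logp(prev, cur):
        # Add-k smoothed estimate of P(cur pair | prev pair).
        num = bigram.get((prev, cur), 0) + k
        den = context.get(prev, 0) + k * vocab
        return math.log(num / den)

    # Viterbi: best[pair] = (log-probability, readings so far)
    best = {START: (0.0, [])}
    for char in word:
        candidates = sorted(readings.get(char, {"?"}))  # "?" = unseen char
        nxt = {}
        for pron in candidates:
            cur = (char, pron)
            score, path = max(
                ((s + logp(prev, cur), p) for prev, (s, p) in best.items()),
                key=lambda t: t[0],
            )
            nxt[cur] = (score, path + [pron])
        best = nxt
    return max(best.values(), key=lambda t: t[0])[1]

# Toy aligned data (illustrative only, not from the paper).
TOY = [
    [("日", "ni"), ("本", "hon")],      # 日本
    [("本", "hon"), ("日", "jitsu")],   # 本日
    [("日", "hi")],                     # 日
    [("本", "hon")],                    # 本
    [("毎", "mai"), ("日", "nichi")],   # 毎日
]
model = train(TOY)
print("".join(transcribe("日本", model)))
print("".join(transcribe("本日", model)))
```

On this toy data the same character 日 receives different readings depending on its neighbor, which is the context sensitivity that a plain look-up table cannot capture. Extending the context to two preceding characters would mean keying the counts on the two previous pairs instead of one.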
