A Kana-to-Kanji Conversion Method Using a Markov Chain Model of Kana Words

加藤 省三; 荒木 睦大; 小越 康宏; 谷口 秀次; 森 幹男

doi:10.1541/ieejeiss.130.1054

Abstract

Kana-to-kanji conversion involves two kinds of processing: detecting word boundaries in non-segmented kana strings, and selecting candidate kanji-kana words. Conversion methods can accordingly be classified into two main types. One performs the two kinds of processing simultaneously (Method-A). The other performs them sequentially (Method-B): it first detects the boundaries of kana words using a Markov chain model of kana words, then converts the kana words to kanji-kana words and selects the maximum-likelihood candidates using a Markov chain model of kanji-kana words. This paper evaluates both types of kana-to-kanji conversion method (Method-A and Method-B) using 2nd-order Markov chain models of words. In experiments using statistical data from a daily Japanese newspaper, Method-A and Method-B are evaluated on conversion accuracy, conversion processing time, and memory capacity. The results show that Method-B is superior to Method-A in conversion processing time and memory capacity, and that it is effective for kana-to-kanji conversion of bunsetsu.
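The sequential pipeline described for Method-B can be illustrated with a minimal Python sketch. It is not the authors' implementation: for brevity it uses first-order (bigram) word models where the paper uses 2nd-order Markov chain models, and the vocabulary, the candidate table, every probability, and the names `segment` and `convert` are hypothetical values chosen only to make the example run.

```python
import math

BOS = "<s>"  # sentence-start marker (assumed convention, not from the paper)

# Hypothetical toy statistics; a real system would estimate these from a corpus.
KANA_VOCAB = {"あつい", "あ", "つい", "おちゃ", "お", "ちゃ"}
KANA_BIGRAM = {
    (BOS, "あつい"): 0.02, ("あつい", "おちゃ"): 0.1,
    ("あつい", "お"): 0.05, ("お", "ちゃ"): 0.2,
    (BOS, "あ"): 0.01, ("あ", "つい"): 0.01, ("つい", "おちゃ"): 0.01,
}
CANDIDATES = {"あつい": ["暑い", "熱い", "厚い"], "おちゃ": ["お茶"]}
KANJI_BIGRAM = {
    (BOS, "暑い"): 0.05, (BOS, "熱い"): 0.03, (BOS, "厚い"): 0.02,
    ("暑い", "お茶"): 0.01, ("熱い", "お茶"): 0.2, ("厚い", "お茶"): 0.005,
}


def segment(kana, vocab, bigram, floor=1e-6):
    """Stage 1: find the most probable split of `kana` into kana words."""
    # best[i][w] = (log-probability, previous word) for splits of kana[:i]
    # whose last word is w.
    best = [dict() for _ in range(len(kana) + 1)]
    best[0][BOS] = (0.0, None)
    for i in range(len(kana)):
        for prev, (score, _) in best[i].items():
            for w in vocab:
                if kana.startswith(w, i):
                    j = i + len(w)
                    cand = score + math.log(bigram.get((prev, w), floor))
                    if w not in best[j] or cand > best[j][w][0]:
                        best[j][w] = (cand, prev)
    if not best[len(kana)]:
        return None  # no split covers the whole string
    # Backtrack from the best final word to the start marker.
    w = max(best[-1], key=lambda k: best[-1][k][0])
    i, words = len(kana), []
    while w != BOS:
        words.append(w)
        prev = best[i][w][1]
        i -= len(w)
        w = prev
    return words[::-1]


def convert(words, candidates, bigram, floor=1e-6):
    """Stage 2: pick the most probable kanji-kana word for each kana word."""
    # paths[c] = (log-probability, word sequence) for the best path ending in c.
    paths = {BOS: (0.0, [])}
    for w in words:
        new_paths = {}
        # A kana word with no listed candidates is passed through unchanged.
        for c in candidates.get(w, [w]):
            score, hist = max(
                ((s + math.log(bigram.get((p, c), floor)), h)
                 for p, (s, h) in paths.items()),
                key=lambda t: t[0],
            )
            new_paths[c] = (score, hist + [c])
        paths = new_paths
    return max(paths.values(), key=lambda t: t[0])[1]
```

Calling `segment("あついおちゃ", KANA_VOCAB, KANA_BIGRAM)` and passing its result to `convert` with `CANDIDATES` and `KANJI_BIGRAM` runs the two stages in sequence. Because stage 2 only considers candidates for the single segmentation chosen in stage 1, its search space is smaller than that of a joint search over segmentations and candidates, which is consistent with the time and memory advantage the abstract reports for Method-B.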
