IEEJ Transactions on Electronics, Information and Systems (電気学会論文誌C)
Online ISSN : 1348-8155
Print ISSN : 0385-4221
ISSN-L : 0385-4221
<Soft Computing and Learning>
Kana-to-Kanji Conversion Method Using Markov Chain Model of Kana Words
加藤 省三, 荒木 睦大, 小越 康宏, 谷口 秀次, 森 幹男
Journal, free access

2010, Vol. 130, No. 6, pp. 1054-1060

Abstract

Kana-to-kanji conversion involves two processes: detecting word boundaries in a non-segmented kana string, and selecting kanji-kana word candidates. Conversion methods fall into two main types according to how they combine these processes. Method-A performs both simultaneously. Method-B performs them sequentially: it first detects the boundaries of kana words using a Markov chain model of kana words, then converts the kana words to kanji-kana words and selects the most likely candidates using a Markov chain model of kanji-kana words. This paper evaluates both methods using second-order Markov chain models of words. In experiments based on statistical data from a daily Japanese newspaper, Method-A and Method-B are compared in terms of conversion accuracy, conversion processing time, and memory capacity. The results show that Method-B is superior to Method-A in conversion processing time and memory capacity, and that it is effective for kana-to-kanji conversion of bunsetsu.
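
The sequential scheme (Method-B) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses first-order (bigram) Markov chains rather than the second-order models evaluated in the paper, so that the example stays short. All probabilities, the lexicon, and the names `segment`, `convert`, `KANA_BIGRAM`, `WORD_BIGRAM`, and `LEXICON` are made up for illustration.

```python
import math

BOS = "<s>"  # hypothetical start-of-string marker

# Toy statistics, invented for this sketch (not taken from the paper).
KANA_BIGRAM = {  # P(kana word | previous kana word)
    (BOS, "かれ"): 0.4,
    ("かれ", "は"): 0.5,
    ("は", "いしゃ"): 0.2,
    ("かれ", "はい"): 0.05,
    ("はい", "しゃ"): 0.05,
}
LEXICON = {  # kana word -> kanji-kana candidates
    "かれ": ["彼", "枯れ"],
    "は": ["は"],
    "いしゃ": ["医者", "慰謝"],
}
WORD_BIGRAM = {  # P(kanji-kana word | previous kanji-kana word)
    (BOS, "彼"): 0.3,
    (BOS, "枯れ"): 0.01,
    ("彼", "は"): 0.5,
    ("枯れ", "は"): 0.05,
    ("は", "医者"): 0.1,
    ("は", "慰謝"): 0.01,
}


def segment(kana, bigram, max_len=4):
    """Stage 1: find the most likely kana-word segmentation (Viterbi search).

    Returns a list of kana words, or None if no segmentation is covered
    by the bigram table.
    """
    # best[(end position, last word)] = (log probability, back pointer)
    best = {(0, BOS): (0.0, None)}
    for end in range(1, len(kana) + 1):
        for start in range(max(0, end - max_len), end):
            word = kana[start:end]
            for (pos, prev), (score, _) in list(best.items()):
                if pos != start:
                    continue
                p = bigram.get((prev, word))
                if p is None:  # unseen transition: not allowed in this toy model
                    continue
                cand = score + math.log(p)
                key = (end, word)
                if key not in best or cand > best[key][0]:
                    best[key] = (cand, (pos, prev))
    finals = [(s, k) for k, (s, _) in best.items() if k[0] == len(kana)]
    if not finals:
        return None
    _, key = max(finals)
    words = []
    while best[key][1] is not None:  # follow back pointers to the start
        words.append(key[1])
        key = best[key][1]
    return words[::-1]


def convert(kana_words, lexicon, bigram, floor=1e-6):
    """Stage 2: pick the most likely kanji-kana word for each kana word."""
    paths = {BOS: (0.0, [])}  # last word -> (log probability, word sequence)
    for kw in kana_words:
        new_paths = {}
        for cand in lexicon.get(kw, [kw]):  # unknown words stay in kana
            for prev, (score, words) in paths.items():
                s = score + math.log(bigram.get((prev, cand), floor))
                if cand not in new_paths or s > new_paths[cand][0]:
                    new_paths[cand] = (s, words + [cand])
        paths = new_paths
    return max(paths.values(), key=lambda t: t[0])[1]


kana_words = segment("かれはいしゃ", KANA_BIGRAM)
result = convert(kana_words, LEXICON, WORD_BIGRAM)
```

Because the boundary search in stage 1 only ranges over kana words, the candidate lattice in stage 2 is restricted to one segmentation. This is one plausible reading of why a sequential scheme could need less time and memory than a joint search, though the sketch itself measures neither.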

© 2010 The Institute of Electrical Engineers of Japan (電気学会)