IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Regular Section
Phrase-Based Statistical Model for Korean Morpheme Segmentation and POS Tagging
Seung-Hoon NA, Young-Kil KIM
Journal, Free access

2018 Volume E101.D Issue 2 Pages 512-522


In this paper, we propose a novel phrase-based model for Korean morphological analysis that treats a phrase as the basic processing unit, which generalizes all the other existing processing units. The use of phrases in this way is largely motivated by the success of phrase-based statistical machine translation (SMT), which convincingly shows that the larger the processing unit, the better the performance. Experimental results using the SEJONG dataset show that the proposed phrase-based models outperform the morpheme-based models used as baselines. In particular, when combined with the conditional random field (CRF) model, our model leads to statistically significant improvements over the state-of-the-art CRF method.

© 2018 The Institute of Electronics, Information and Communication Engineers