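Case Particle Transformation in Converting Japanese Passive Sentences into Active Sentences Using Machine Learning with Training Data Separated by Input Case Particle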

村田 真樹; 金丸 敏幸; 白土 保; 井佐原 均

doi:10.5687/iscie.21.165

Abstract

We developed a new method for transforming Japanese case particles when converting Japanese passive sentences into active sentences. The method partitions the training data by input particle and trains a separate machine-learning model for each particle. It also uses a much richer feature set for learning. In a previous study on transforming Japanese passive sentences into active sentences, Murata et al. [2] used machine learning but neither partitioned the training data by input particle nor used a rich feature set; they achieved an accuracy rate of 89.77%. Adding many rich features to those used in that study raised the accuracy rate to 92.00%. Additionally partitioning the training data by input particle and training a model for each particle raised it to 94.30%. A statistical test confirmed that these improvements are significant. We also ran experiments with traditional methods based on verb dictionaries and manually prepared heuristic rules, and confirmed that our method achieves much higher accuracy rates than those methods.
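The partitioning idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the learner or the features, so `CountingClassifier` is a toy stand-in learner, and the function names, romanized particles, feature strings, and training examples are all hypothetical. The only point shown is that the training data is grouped by input particle and one model is trained per group.

```python
from collections import Counter, defaultdict


class CountingClassifier:
    """Toy stand-in learner (an assumption, not the paper's learner).

    Scores each output particle by how often it co-occurred with the
    given features in the training examples.
    """

    def __init__(self):
        self.label_counts = Counter()
        self.feature_label_counts = defaultdict(Counter)

    def fit(self, examples):
        for features, label in examples:
            self.label_counts[label] += 1
            for feature in features:
                self.feature_label_counts[feature][label] += 1
        return self

    def predict(self, features):
        scores = Counter()
        for feature in features:
            scores.update(self.feature_label_counts.get(feature, {}))
        if not scores:
            # No known feature: fall back to the most frequent label.
            return self.label_counts.most_common(1)[0][0]
        # Break score ties by overall label frequency.
        return max(scores, key=lambda lab: (scores[lab], self.label_counts[lab]))


def train_per_particle(data):
    """Group examples by input particle and train one model per group."""
    groups = defaultdict(list)
    for input_particle, features, output_particle in data:
        groups[input_particle].append((features, output_particle))
    return {p: CountingClassifier().fit(ex) for p, ex in groups.items()}


def convert_particle(models, input_particle, features):
    """Predict the active-sentence particle using that particle's own model."""
    model = models.get(input_particle)
    if model is None:
        # Assumption: leave a particle unchanged if no model was trained for it.
        return input_particle
    return model.predict(features)


# Hypothetical toy data: (particle in passive sentence, features, particle in active sentence)
TOY_DATA = [
    ("ni", {"verb=kamu", "noun=inu"}, "ga"),
    ("ga", {"verb=kamu", "noun=watashi"}, "wo"),
    ("ni", {"verb=homeru", "noun=sensei"}, "ga"),
    ("ga", {"verb=homeru", "noun=gakusei"}, "wo"),
    ("ni", {"verb=okuru", "noun=tomodachi"}, "ni"),
]

models = train_per_particle(TOY_DATA)
print(convert_particle(models, "ni", {"verb=kamu"}))
print(convert_particle(models, "ga", {"verb=kamu"}))
```

Because each input particle has its own model, the feature `verb=kamu` can vote for different output particles depending on whether the input particle is `ni` or `ga`; a single model trained on the pooled data would need an extra input-particle feature to make that distinction.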
