Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Word Order Acquisition from Corpora
KIYOTAKA UCHIMOTO, MASAKI MURATA, QING MA, SATOSHI SEKINE, HITOSHI ISAHARA

2000 Volume 7 Issue 4 Pages 163-180

Abstract
In this paper, we propose a method for acquiring word order from corpora. We define word order as the order of modifiers, that is, the order of bunsetsus that depend on the same modifiee. The method uses a model that automatically discovers word order tendencies in Japanese from various kinds of information in and around the target bunsetsus. The model shows to what extent each piece of information contributes to deciding the word order, and which word order tends to be selected when several kinds of information conflict. The contribution rate of each piece of information is learned efficiently by a model within a maximum entropy (ME) framework. The performance of the trained model can be evaluated by checking how many instances of word order selected by the model agree with those in the original text. A raw corpus can be used to train the model instead of a tagged corpus, provided that it is first analyzed by a parser. This is possible because the text in the corpus is already in the correct word order. We show that training on such a parsed raw corpus is indeed feasible.
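The evaluation described in the abstract (counting how often the order selected by the model agrees with the order in the original text) can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the model exposes a pairwise probability that one modifier precedes another, and that an order is scored by the product of those pairwise probabilities; neither assumption is stated in the abstract. The names `order_score`, `best_order`, `agreement_rate`, and `toy_pair_prob` are hypothetical, and `toy_pair_prob` is a hand-written stand-in for a trained ME model.

```python
from itertools import permutations
from math import log


def order_score(order, pair_prob):
    """Log of the product of pairwise 'a precedes b' probabilities (assumed scoring)."""
    score = 0.0
    for i in range(len(order)):
        for j in range(i + 1, len(order)):
            score += log(pair_prob(order[i], order[j]))
    return score


def best_order(modifiers, pair_prob):
    """Return the highest-scoring permutation of modifiers sharing one modifiee."""
    # Sort first so that tie-breaking does not depend on the original order.
    candidates = permutations(sorted(modifiers))
    return max(candidates, key=lambda o: order_score(o, pair_prob))


def agreement_rate(instances, pair_prob):
    """Fraction of instances where the selected order equals the original order."""
    hits = sum(
        1 for original in instances
        if best_order(original, pair_prob) == tuple(original)
    )
    return hits / len(instances)


def toy_pair_prob(a, b):
    """Hypothetical stand-in for a trained model: favors longer modifiers first."""
    if len(a) > len(b):
        return 0.7
    if len(a) < len(b):
        return 0.3
    return 0.5


# Each instance lists modifiers in the order found in the original text.
instances = [
    ("long-modifier", "mid-mod", "x"),  # toy model agrees with this order
    ("x", "long-modifier"),             # toy model prefers the reverse order
]
rate = agreement_rate(instances, toy_pair_prob)
print(rate)
```

Enumerating all permutations is only practical for the small number of modifiers that typically share a modifiee; a real system would replace `toy_pair_prob` with probabilities estimated from features of the bunsetsus.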
© The Association for Natural Language Processing