Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Extraction of Paraphrasing Pattern by Aligned Corpora of Web and Mobile Terminal News Articles
MORITAKA IWAKOSHI, HIDETAKA MASUDA, HIROSHI NAKAGAWA

2005 Volume 12 Issue 5 Pages 157-183

Abstract

Over three years, we collected Web newspaper articles of several hundred characters each, together with their counterparts distributed to mobile terminals, which consist of fifty to a hundred characters. From these, we automatically extracted candidate paraphrases of sentence-final expressions. First, we aligned the two corpora at the article level and then at the sentence level. Next, we extracted the final part of each mobile article sentence using a morphological analyzer and collected the corresponding expressions from the Web article sentences. Finally, we extracted candidate morpheme sequences from the final part of the Web article sentences, and we propose a combination of two methods to improve the extraction accuracy of these candidate sets: 1) ranking based on frequency, branching factor, and string length, and 2) filtering to remove inappropriate expressions that would eliminate semantically indispensable nouns.
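The ranking step can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so everything below is an assumption made for illustration: candidates are taken to be sentence-final morpheme suffixes, the branching factor is taken to be the number of distinct morphemes that immediately precede a candidate, and the three factors are simply multiplied. The function name `rank_candidates`, the `max_len` parameter, and the romanized toy morphemes are hypothetical.

```python
from collections import Counter, defaultdict


def rank_candidates(sentences, max_len=4):
    """Rank sentence-final morpheme sequences.

    sentences: list of morpheme lists (already segmented by a
    morphological analyzer). Returns (candidate, score) pairs,
    highest score first.

    Assumed score (not the paper's formula):
        frequency * branching factor * length in morphemes
    """
    freq = Counter()
    left_neighbors = defaultdict(set)

    for morphemes in sentences:
        n = len(morphemes)
        # Enumerate suffixes of the sentence up to max_len morphemes.
        for length in range(1, min(max_len, n) + 1):
            start = n - length
            cand = tuple(morphemes[start:])
            freq[cand] += 1
            # The morpheme just before the candidate, or a boundary mark.
            left = morphemes[start - 1] if start > 0 else "<BOS>"
            left_neighbors[cand].add(left)

    scores = {
        cand: freq[cand] * len(left_neighbors[cand]) * len(cand)
        for cand in freq
    }
    return sorted(scores.items(), key=lambda kv: -kv[1])


if __name__ == "__main__":
    # Toy romanized morphemes, invented for this example.
    toy = [
        ["shusho", "ga", "happyo", "shi", "ta"],
        ["shi", "ga", "happyo", "shi", "ta"],
        ["kaisha", "ga", "kohyo", "shi", "ta"],
    ]
    for cand, score in rank_candidates(toy, max_len=3):
        print(" ".join(cand), score)
```

In this toy input, the sequence "shi ta" follows two different morphemes, so it outranks both the shorter "ta" and the longer but less frequent sequences. The second method, filtering out candidates that drop semantically indispensable nouns, would require part-of-speech information and is not sketched here.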

© The Association for Natural Language Processing