Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Probabilistic Passage Categorization and its Application
MAKOTO IWAYAMA, TAKENOBU TOKUNAGA
JOURNAL FREE ACCESS

1999 Volume 6 Issue 3 Pages 181-198

Abstract
Long documents are difficult to process because of the variety of topics they contain. Long documents such as technical papers and reports include more topics than short documents such as news articles. Since each topic in a long document tends to be relevant to only a small portion of the document, conventional text categorization, which tries to assign predefined topics to the entire document, has limited effectiveness. In this paper we study the use of probabilistic passage categorization, which assigns predefined topics to each passage contained in a document. We show that passage categorization outperforms conventional text categorization, especially for long documents. We also discuss the possibility of applying passage categorization to topic-dependent text summarization, and show some preliminary experimental results.
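The contrast between labeling a whole document and labeling each passage can be illustrated with a minimal sketch. The abstract does not specify the probabilistic model or how passages are delimited, so everything below is an assumption made for illustration: a multinomial naive Bayes scorer with add-one smoothing and a uniform topic prior stands in for the authors' model, passages are fixed-size word windows, and all function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def train(labeled_docs):
    """Collect per-topic word counts from (topic, text) pairs (toy training step)."""
    counts = {}
    vocab = set()
    for topic, text in labeled_docs:
        words = text.lower().split()
        counts.setdefault(topic, Counter()).update(words)
        vocab.update(words)
    return counts, vocab


def log_score(words, topic_counts, vocab_size):
    """Log-likelihood of the words under one topic, with add-one smoothing."""
    total = sum(topic_counts.values())
    return sum(
        math.log((topic_counts[w] + 1) / (total + vocab_size)) for w in words
    )


def classify(words, counts, vocab):
    """Return the single most likely topic, assuming a uniform topic prior."""
    return max(sorted(counts), key=lambda t: log_score(words, counts[t], len(vocab)))


def split_passages(text, size):
    """Assumption: a passage is a fixed-size window of `size` words."""
    words = text.lower().split()
    return [words[i:i + size] for i in range(0, len(words), size)]


def categorize_passages(text, counts, vocab, size=4):
    """Assign one topic to each passage instead of one topic to the document."""
    return [classify(p, counts, vocab) for p in split_passages(text, size)]


training = [
    ("sports", "match goal team player score"),
    ("finance", "stock market bank price trade"),
]
counts, vocab = train(training)
document = "team player score goal stock market bank price"

# Document-level categorization returns one label for the whole text.
whole_document_label = classify(document.lower().split(), counts, vocab)

# Passage-level categorization returns one label per passage.
passage_labels = categorize_passages(document, counts, vocab, size=4)
print(passage_labels)  # ['sports', 'finance']
```

In this toy document the two halves are about different topics, so a single document-level label necessarily misses one of them, while the passage-level labels recover both. How much this helps on real long documents depends on the model and the passage boundaries, which the sketch does not attempt to reproduce.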
© The Association for Natural Language Processing