Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Statistical Summarization for Creating Headline from Web Pages
NOBUAKI HIROSHIMA, TAKAAKI HASEGAWA, MASAHIRO OKU

2005 Volume 12 Issue 6 Pages 113-128

Abstract
We propose a statistical method for generating headlines that outline the content of a Web page. The requirements for such headlines are completeness, readability, and high compressibility. Our method combines two models: a keyword selection model, trained with a Support Vector Machine on several word features, and a sentence generation model, based on both word N-gram probability and the style of the original sentences. To achieve high compressibility, we create headlines by choosing words from the original text using these two models. Our experimental results show that, compared with conventional methods, the keyword selection model yields higher completeness and the sentence generation model yields higher readability.
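
The two-model idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no scoring formula, so the additive score, the `weight` parameter, the add-one smoothed bigram model, the exhaustive search, and all function names below are assumptions made for illustration. The hand-written `scores` dictionary is a hypothetical stand-in for the output of an SVM keyword classifier, and keeping the selected words in their original order is only a crude proxy for following the style of the original sentences.

```python
import math
from collections import Counter
from itertools import combinations


def train_bigram(corpus):
    """Return an add-one smoothed bigram log-probability function."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    vocab_size = len(unigrams)

    def logprob(prev, word):
        return math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))

    return logprob


def generate_headline(words, keyword_score, logprob, length, weight=1.0):
    """Pick `length` words from `words`, keeping their original order.

    Assumed score: sum of keyword scores + weight * sum of bigram
    log-probabilities. Exhaustive search, practical only for short inputs.
    """
    best, best_score = None, -math.inf
    for idx in combinations(range(len(words)), length):
        chosen = [words[i] for i in idx]
        score = sum(keyword_score.get(w, 0.0) for w in chosen)
        prev = "<s>"
        for w in chosen:
            score += weight * logprob(prev, w)
            prev = w
        if score > best_score:
            best, best_score = chosen, score
    return best


# Toy data, invented for this example.
corpus = [
    "new phone released",
    "phone released today",
    "company announces new phone",
]
logprob = train_bigram(corpus)
words = "the company announced that a new phone was released today".split()
# Hypothetical keyword scores standing in for SVM decision values.
scores = {"company": 1.0, "new": 0.5, "phone": 2.0, "released": 1.5, "today": 0.3}
headline = generate_headline(words, scores, logprob, length=3)
print(" ".join(headline))
```

Because every headline word is drawn from the original text, the output length is fixed by `length`, which is how this sketch controls compression. A practical system would replace the exhaustive search with dynamic programming or beam search.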
© The Association for Natural Language Processing