IEEJ Transactions on Electronics, Information and Systems
Online ISSN : 1348-8155
Print ISSN : 0385-4221
ISSN-L : 0385-4221
<Software and Information Processing>
Extraction of Profile Information from Newspaper Articles Using Support Vector Machines
Hitoshi Yoshitani, Koichi Kise, Keinosuke Matsumoto
JOURNAL FREE ACCESS

2004 Volume 124 Issue 11 Pages 2260-2266

Abstract

This paper presents a method for extracting profile information in tabular form, built on two existing technologies: named entity extraction and information integration. Named entity extraction supplies the elements of the profile tables. Information integration unifies tables to enrich the profile information, but it requires predetermined initial tables. We propose a complete system for extracting profile information that bridges the gap between the two technologies. To this end, we group named entities to build the initial tables, and we use support vector machines for both the extraction and the grouping of named entities. Initial tables are then integrated if they share the same name. In experiments on 7085 newspaper articles, the method achieved 53.8% precision and 58.7% recall. Although the proposed method is insufficient for fully automated information extraction, it provides a good starting point for extracting profile information.
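The integration step described in the abstract, in which initial tables with the same name are unified, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `integrate_tables`, the dictionary layout of a table, and the sample names and attributes are all hypothetical, chosen only to show the idea of merging by name.

```python
from collections import defaultdict


def integrate_tables(initial_tables):
    """Merge initial tables that share the same "name" value.

    Assumption (not from the paper): each table is a dict with a "name"
    key plus attribute -> value entries produced by the grouping step.
    """
    merged = defaultdict(lambda: defaultdict(set))
    for table in initial_tables:
        profile = merged[table["name"]]  # creates the entry even if empty
        for attribute, value in table.items():
            if attribute != "name":
                profile[attribute].add(value)
    # Convert to plain dicts with sorted value lists for stable output.
    return {
        name: {attr: sorted(values) for attr, values in attrs.items()}
        for name, attrs in merged.items()
    }


# Hypothetical initial tables, as if grouped from separate articles.
tables = [
    {"name": "Taro Yamada", "age": "45", "organization": "ABC Corp."},
    {"name": "Taro Yamada", "position": "president"},
    {"name": "Hanako Suzuki", "organization": "XYZ University"},
]
profiles = integrate_tables(tables)
```

In this sketch the two tables for the same name collapse into one profile, while conflicting values for an attribute are kept side by side rather than resolved. Matching on the exact name string is a simplification, since different people can share a name.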

© 2004 by the Institute of Electrical Engineers of Japan