Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Paper
Entity Set Expansion based on Bootstrapping Methods using Topic Information
Kugatsu Sadamitsu, Kuniko Saito, Kenji Imamura, Yoshihiro Matsuo, Genichiro Kikui

2012 Volume 19 Issue 2 Pages 89-106

Abstract

This paper proposes three modules, based on the latent topics of documents, for alleviating “semantic drift” in bootstrapping-based entity set expansion. The modules are added to a discriminative bootstrapping algorithm and provide topic feature generation, negative example selection, and positive example disambiguation. Latent topics are modeled without supervision using LDA (Latent Dirichlet Allocation). Experiments show that the accuracy of the extracted entities improves by 6.7 to 28.2%, depending on the domain.
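
The abstract does not spell out how the modules work, so the following is only a minimal sketch of one plausible reading of topic-based negative example selection, not the authors' implementation. It assumes that per-document topic distributions have already been estimated (for instance by LDA) and are given as plain lists. The function names, the toy data, and the `max_weight` threshold are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


def entity_topic_profile(mentions, doc_topics):
    """Average the topic distributions of the documents that mention each entity."""
    sums = {}
    counts = defaultdict(int)
    for entity, doc_id in mentions:
        theta = doc_topics[doc_id]
        if entity not in sums:
            sums[entity] = [0.0] * len(theta)
        for k, p in enumerate(theta):
            sums[entity][k] += p
        counts[entity] += 1
    return {e: [v / counts[e] for v in vec] for e, vec in sums.items()}


def seed_topic(profiles, seeds):
    """Return the index of the topic with the largest total weight over the seeds."""
    num_topics = len(next(iter(profiles.values())))
    totals = [sum(profiles[s][k] for s in seeds) for k in range(num_topics)]
    return max(range(num_topics), key=totals.__getitem__)


def select_negatives(profiles, seeds, topic, max_weight=0.2):
    """Pick non-seed entities with low weight on the seed topic (assumed heuristic)."""
    return sorted(
        e for e, vec in profiles.items()
        if e not in seeds and vec[topic] <= max_weight
    )


# Toy data (hypothetical): two topics, four documents, a car-maker seed set.
doc_topics = {
    "d1": [0.9, 0.1],
    "d2": [0.8, 0.2],
    "d3": [0.1, 0.9],
    "d4": [0.5, 0.5],
}
mentions = [
    ("Toyota", "d1"), ("Honda", "d2"),
    ("Nissan", "d1"), ("Nissan", "d2"),
    ("Mozart", "d3"),
    ("Jaguar", "d4"), ("Jaguar", "d3"),
]
seeds = {"Toyota", "Honda"}

profiles = entity_topic_profile(mentions, doc_topics)
topic = seed_topic(profiles, seeds)
negatives = select_negatives(profiles, seeds, topic)
print(topic, negatives)
```

In this sketch, an entity that appears only in documents far from the seed topic becomes a negative example, while an ambiguous entity with moderate weight on the seed topic is left unlabeled rather than forced into either class.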

© 2012 The Association for Natural Language Processing