JSAI Technical Report, Type 2 SIG
Online ISSN : 2436-5556
Wikipedia Retrieval using Multitype Topic Models
Koji EGUCHI, Hitohiro SHIOZAKI
RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

2009 Volume 2009 Issue SWO-020 Pages 14-

Abstract

Recently, topic model-based retrieval methods have produced good results using the Latent Dirichlet Allocation (LDA) model or its variants in the language modeling framework. However, for the task of retrieving annotated documents, LDA-based methods cannot directly make use of the multiple attribute types specified by the annotations. In this paper, we explore a new retrieval method using a multitype topic model that can directly handle multiple word types, such as the annotated entities, category labels, and other words that typically appear in Wikipedia. We investigate how to effectively apply the multitype topic model to retrieve documents from a type-annotated collection, and then show through experiments on an entity ranking task using a Wikipedia collection that our proposed method significantly outperforms several state-of-the-art methods.

© 2009 Authors