Journal of Japan Society for Fuzzy Theory and Intelligent Informatics
Online ISSN : 1881-7203
Print ISSN : 1347-7986
ISSN-L : 1347-7986
Original Papers
Latent Topic Estimation using Pairs of Word as Features
Risa KITAJIMA, Ichiro KOBAYASHI
JOURNAL FREE ACCESS

2013 Volume 25 Issue 1 Pages 501-510

Abstract

Latent Dirichlet Allocation (LDA) has been widely used for analyzing the latent topics of documents. It assigns a probability distribution over topics to each individual word of a document, and characterizes each latent topic by the words that tend to appear in it. In this method, documents are treated as bags of words, so relations between words, which could express the contents of documents more precisely, are not taken into account. In this study, to estimate latent topics more precisely, we propose a method that assigns a probability distribution over topics to pairs of words. Through document retrieval tasks, we investigate what constraints should be placed on pairs of words so that they are useful for extracting latent topics, and show that LDA can be improved by assigning probability distributions to pairs of words.
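
As an illustration of what "pairs of words as features" could look like in practice, the following minimal sketch turns each document into a bag of word pairs instead of a bag of words. It is not the authors' procedure: the abstract does not state which constraints on pairs were used, so the co-occurrence window (`window`), the exclusion of identical-word pairs, and the function names `word_pairs` and `pair_bag` are all assumptions made for this example. The resulting pair counts could then be passed to any standard LDA implementation in place of word counts.

```python
from collections import Counter


def word_pairs(tokens, window=2):
    """Return unordered pairs of distinct words that co-occur within
    `window` positions of each other.

    The window constraint is a hypothetical choice for illustration;
    the abstract does not specify the constraints used in the paper.
    """
    pairs = []
    for i, left in enumerate(tokens):
        # Look only at the next `window` tokens to the right of position i.
        for right in tokens[i + 1 : i + 1 + window]:
            if left != right:
                # Sort so that (a, b) and (b, a) count as the same feature.
                pairs.append(tuple(sorted((left, right))))
    return pairs


def pair_bag(documents, window=2):
    """Represent each document as a bag (multiset) of word pairs."""
    return [
        Counter(word_pairs(doc.lower().split(), window))
        for doc in documents
    ]


# Toy documents, invented for this example only.
docs = [
    "topic models estimate latent topics",
    "latent topics describe documents",
]
bags = pair_bag(docs)
```

Here each key of `bags[k]` is a word pair and each value is its count in document `k`, so the pair plays the role that a single word plays in the usual bag-of-words input to LDA.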

© 2013 Japan Society for Fuzzy Theory and Intelligent Informatics