JSAI Technical Report, SIG-KBS
Online ISSN : 2436-4592
87th (Jan, 2010)

Extracting Bursty Latent Topics from Document Streams Using LDA and Time Filters
Mizuta MASATAKA, Masahito KUMANO, Masahiro KIMURA
CONFERENCE PROCEEDINGS FREE ACCESS

Pages 05-

Abstract

We propose a method for extracting bursty latent topics from a document stream, that is, a time series of documents. We use Latent Dirichlet Allocation (LDA), a probabilistic generative model of documents, to extract latent topics, and introduce a time filter to identify bursty topics. On the basis of LDA and the time filter, we construct a similarity measure between two time-stamped documents, and extract bursty latent topics from a document stream by applying hierarchical agglomerative clustering. Using real document-stream data, we experimentally demonstrate the effectiveness of the proposed method.
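
The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not give the form of the time filter, the similarity measure, or the linkage rule, so everything below is an assumption made for illustration only: the filter is taken to be an exponential decay in the time gap (with a hypothetical scale parameter `tau`), the topic similarity is taken to be the cosine between per-document topic distributions, the two are combined by multiplication, and clusters are merged by average linkage until the best similarity drops below a hypothetical `threshold`. The topic distributions are assumed to come from an LDA model fitted beforehand; here they are hand-written toy values.

```python
import math

def time_filter(t1, t2, tau=1.0):
    # Assumed filter shape: exponential decay in the time gap.
    return math.exp(-abs(t1 - t2) / tau)

def topic_similarity(p, q):
    # Cosine similarity between two topic distributions (assumed choice).
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm > 0 else 0.0

def similarity(doc_a, doc_b, tau=1.0):
    # A document is a pair (topic_distribution, timestamp).
    (theta_a, t_a), (theta_b, t_b) = doc_a, doc_b
    return topic_similarity(theta_a, theta_b) * time_filter(t_a, t_b, tau)

def agglomerative(docs, threshold, tau=1.0):
    # Average-linkage agglomerative clustering over document indices.
    # Merging stops once the most similar pair of clusters falls below threshold.
    clusters = [[i] for i in range(len(docs))]

    def linkage(c1, c2):
        total = sum(similarity(docs[i], docs[j], tau) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, pair = -1.0, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                s = linkage(clusters[a], clusters[b])
                if s > best:
                    best, pair = s, (a, b)
        if best < threshold:
            break
        a, b = pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Toy data: (topic distribution over 2 topics, timestamp).
docs = [
    ([0.9, 0.1], 0.0),   # topic 0, early
    ([0.8, 0.2], 0.5),   # topic 0, close in time to the first
    ([0.9, 0.1], 10.0),  # topic 0, but far away in time
    ([0.1, 0.9], 0.2),   # topic 1, early
]
clusters = agglomerative(docs, threshold=0.5)
print(clusters)
```

Under these assumptions, two documents end up in the same cluster only if they are close both in topic and in time, which is one way a cluster could be read as a candidate bursty topic. The third toy document shares its topic with the first two but is kept apart by the time filter.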

© 2010 The Japanese Society for Artificial Intelligence