Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
26th (2012)
Session ID: 3B1-R-2-1

A Distance Between Text Documents based on Topic Models and Ground Metric Learning
*金 涛, Marco Cuturi, 山本 章博
Conference proceedings and abstracts: free access

Abstract

We propose a new distance between text documents that builds upon two techniques. We first represent each document in a database as a histogram of topics using the Latent Dirichlet Allocation (LDA) topic model. We then compare two documents by computing the Earth Mover's Distance (EMD) between their topic histograms. The EMD is parameterized by a ground metric, in this case a matrix of distances between topics, which we estimate using Ground Metric Learning. Experiments on several text databases illustrate the merits of our approach.
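
The comparison step can be illustrated with a minimal sketch, which is not the authors' implementation. It computes the earth mover's distance between two topic histograms as a minimum-cost flow, using successive shortest paths found with Bellman-Ford. The function name `emd`, the three-topic histograms, and the ground matrix are all hypothetical. In particular, the ground matrix is written by hand here, whereas the abstract estimates it with Ground Metric Learning, and the histograms are made up rather than produced by LDA.

```python
INF = float("inf")

def emd(p, q, ground, eps=1e-12):
    """Earth mover's distance between histograms p and q (same total mass).

    ground[i][j] is the cost of moving one unit of mass from topic i to
    topic j. Hypothetical helper: solves the transport problem as a
    minimum-cost flow with successive shortest paths.
    """
    n, m = len(p), len(q)
    src, snk = n + m, n + m + 1
    num_nodes = n + m + 2
    edges = []                      # each edge: [head, residual capacity, cost]
    adj = [[] for _ in range(num_nodes)]

    def add_edge(u, v, cap, cost):
        # Edge k and edge k ^ 1 form a forward/backward residual pair.
        adj[u].append(len(edges)); edges.append([v, cap, cost])
        adj[v].append(len(edges)); edges.append([u, 0.0, -cost])

    for i in range(n):
        add_edge(src, i, p[i], 0.0)             # supply of topic i in p
        for j in range(m):
            add_edge(i, n + j, INF, ground[i][j])
    for j in range(m):
        add_edge(n + j, snk, q[j], 0.0)         # demand of topic j in q

    total_cost = 0.0
    while True:
        # Bellman-Ford: cheapest residual path from src to snk.
        dist = [INF] * num_nodes
        prev_edge = [-1] * num_nodes
        dist[src] = 0.0
        for _ in range(num_nodes - 1):
            changed = False
            for u in range(num_nodes):
                if dist[u] == INF:
                    continue
                for k in adj[u]:
                    v, cap, cost = edges[k]
                    if cap > eps and dist[u] + cost < dist[v] - eps:
                        dist[v] = dist[u] + cost
                        prev_edge[v] = k
                        changed = True
            if not changed:
                break
        if prev_edge[snk] == -1:
            break                               # no mass left to move

        # Find the bottleneck capacity along the path, then push that flow.
        flow, v = INF, snk
        while v != src:
            k = prev_edge[v]
            flow = min(flow, edges[k][1])
            v = edges[k ^ 1][0]
        v = snk
        while v != src:
            k = prev_edge[v]
            edges[k][1] -= flow
            edges[k ^ 1][1] += flow
            v = edges[k ^ 1][0]
        total_cost += flow * dist[snk]
    return total_cost

# Made-up topic histograms for two documents (3 topics each).
doc_a = [0.5, 0.5, 0.0]
doc_b = [0.0, 0.5, 0.5]
# Placeholder ground metric between topics: |i - j|.
ground = [[0.0, 1.0, 2.0],
          [1.0, 0.0, 1.0],
          [2.0, 1.0, 0.0]]
print(emd(doc_a, doc_b, ground))
```

With this placeholder metric, half of the mass has to travel a total of two topic steps, so the printed distance is 1.0. A learned ground metric would change the entries of `ground` and, through them, which documents count as close. In practice a dedicated optimal-transport or linear-programming solver would likely replace the hand-written flow routine.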

© 2012 The Japanese Society for Artificial Intelligence