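Organizer: The Japanese Society for Artificial Intelligence
Conference: The 26th Annual Conference of the Japanese Society for Artificial Intelligence, 2012
Edition: 26
Venue: Yamaguchi Prefectural Education Hall and other venues, Yamaguchi City, Yamaguchi Prefecture
Dates: 2012/06/12 - 2012/06/15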
We propose a new distance between text documents that combines two techniques. We first represent each document in a database as a histogram over topics using the Latent Dirichlet Allocation (LDA) topic model. We then compare two documents by computing the Earth Mover's Distance (EMD) between their topic histograms. The EMD takes a ground metric as its parameter, which in this setting is a matrix of distances between topics; we estimate it using Ground Metric Learning. Experiments on several text databases illustrate the value of our approach.
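
The comparison step can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes the topic histograms have already been produced by some LDA model and have equal total mass. The function name `earth_movers_distance`, the three-topic histograms, and the hand-written ground metric `topic_metric` are all hypothetical; in the proposed method the ground metric would be learned rather than fixed by hand. The EMD is computed here as a small min-cost flow problem solved with successive shortest paths, which is adequate for a handful of topics but not tuned for speed.

```python
def earth_movers_distance(p, q, ground_metric, eps=1e-12):
    """EMD between histograms p and q (equal total mass).

    ground_metric[i][j] is the cost of moving one unit of mass
    from topic i of p to topic j of q.
    """
    n, m = len(p), len(q)
    source, sink = n + m, n + m + 1
    num_nodes = n + m + 2
    # Each edge is [target, residual capacity, cost, index of reverse edge].
    graph = [[] for _ in range(num_nodes)]

    def add_edge(u, v, capacity, cost):
        graph[u].append([v, capacity, cost, len(graph[v])])
        graph[v].append([u, 0.0, -cost, len(graph[u]) - 1])

    for i in range(n):
        add_edge(source, i, float(p[i]), 0.0)
        for j in range(m):
            add_edge(i, n + j, float("inf"), float(ground_metric[i][j]))
    for j in range(m):
        add_edge(n + j, sink, float(q[j]), 0.0)

    total_cost = 0.0
    while True:
        # Bellman-Ford: cheapest residual path from source to sink.
        dist = [float("inf")] * num_nodes
        prev = [None] * num_nodes
        dist[source] = 0.0
        for _ in range(num_nodes - 1):
            updated = False
            for u in range(num_nodes):
                if dist[u] == float("inf"):
                    continue
                for k, (v, capacity, cost, _rev) in enumerate(graph[u]):
                    if capacity > eps and dist[u] + cost < dist[v] - eps:
                        dist[v] = dist[u] + cost
                        prev[v] = (u, k)
                        updated = True
            if not updated:
                break
        if dist[sink] == float("inf"):
            break  # no mass left to move

        # Find the bottleneck capacity along the path, then push that flow.
        flow = float("inf")
        v = sink
        while v != source:
            u, k = prev[v]
            flow = min(flow, graph[u][k][1])
            v = u
        v = sink
        while v != source:
            u, k = prev[v]
            edge = graph[u][k]
            edge[1] -= flow
            graph[v][edge[3]][1] += flow
            v = u
        total_cost += flow * dist[sink]
    return total_cost


# Hypothetical topic histograms for two documents (three topics each).
doc_a = [0.7, 0.2, 0.1]
doc_b = [0.1, 0.2, 0.7]
# Hypothetical hand-picked ground metric: topics 0 and 2 are the farthest apart.
topic_metric = [[0.0, 1.0, 2.0],
                [1.0, 0.0, 1.0],
                [2.0, 1.0, 0.0]]

# About 1.2: a mass of 0.6 moves from topic 0 to topic 2 at cost 2.
print(earth_movers_distance(doc_a, doc_b, topic_metric))
```

Changing the entries of `topic_metric` changes which documents count as close, which is why estimating that matrix from data, rather than fixing it by hand as above, matters for the resulting distance.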