Journal of Information Processing
Online ISSN : 1882-6652
ISSN-L : 1882-6652
Top-k Similarity Search over Gaussian Distributions Based on KL-Divergence
Tingting Dong, Yoshiharu Ishikawa, Chuan Xiao

2016 Volume 24 Issue 1 Pages 152-163


Similarity search is a crucial task in many real-world applications such as multimedia databases, data mining, and bioinformatics. In this work, we investigate similarity search on uncertain data modeled as Gaussian distributions. Using Kullback-Leibler divergence (KL-divergence) to measure the dissimilarity between two Gaussian distributions, we aim to search a database for the top-k Gaussian distributions most similar to a given query Gaussian distribution. In particular, we consider non-correlated Gaussian distributions, in which the dimensions are uncorrelated and the covariance matrices are therefore diagonal. To support query processing, we propose two types of novel approaches that utilize the notions of rank aggregation and skyline queries. The efficiency and effectiveness of our approaches are demonstrated through a comprehensive experimental performance study.
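
To make the problem setting concrete, the following is a minimal sketch of a naive linear-scan baseline, not the rank-aggregation or skyline-based approaches proposed in the paper. It relies on the standard closed form of the KL-divergence between two Gaussians with diagonal covariance, which decomposes into a sum over dimensions: KL(p || q) = 1/2 * sum_i [ ln(s_q,i / s_p,i) + (s_p,i + (m_p,i - m_q,i)^2) / s_q,i - 1 ], where m denotes a mean and s a variance. The function names, the (mean, variance) tuple representation, and the choice of direction KL(database entry || query) are assumptions made for illustration; KL-divergence is asymmetric, and the abstract does not state which direction is used.

```python
import heapq
import math


def kl_diag_gauss(mu_p, var_p, mu_q, var_q):
    """KL(p || q) for two Gaussians with diagonal covariance matrices.

    Each argument is a sequence with one entry per dimension; the
    variances must be strictly positive.
    """
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        # Per-dimension term of the closed-form KL-divergence.
        total += 0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
    return total


def top_k_linear_scan(database, query, k):
    """Return the k (divergence, index) pairs with the smallest divergence.

    `database` is a list of (mean, variance) pairs and `query` is a single
    (mean, variance) pair. Assumption: divergence is measured as
    KL(database entry || query).
    """
    mu_q, var_q = query
    scored = (
        (kl_diag_gauss(mu, var, mu_q, var_q), idx)
        for idx, (mu, var) in enumerate(database)
    )
    # nsmallest keeps only k candidates in memory while scanning.
    return heapq.nsmallest(k, scored)


if __name__ == "__main__":
    db = [
        ([0.0, 0.0], [1.0, 1.0]),
        ([1.0, 0.0], [1.0, 1.0]),
        ([3.0, 3.0], [1.0, 1.0]),
    ]
    q = ([0.0, 0.0], [1.0, 1.0])
    for divergence, idx in top_k_linear_scan(db, q, k=2):
        print(idx, divergence)
```

A scan like this evaluates the divergence for every database entry, so its cost grows linearly with the database size; avoiding that full evaluation is the kind of work that dedicated query-processing approaches are meant to do.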

© 2016 by the Information Processing Society of Japan