2017 Volume 25 Pages 366-375
Fast similarity search over high-dimensional feature vectors has become increasingly important as the volume of multimedia data grows. Ordinary similarity search is slow, however, because it requires a number of floating-point operations proportional to the number of records. Many recent studies propose speeding up similarity search by converting feature vectors to bit vectors, so that the search over bit vectors serves as an approximation of the search over the original data. Some of these approximations lack a theoretical guarantee, because no direct approximate relation between the Euclidean and Hamming distances is given. We propose a novel hashing method that uses inverse stereographic projection and gives a direct approximate relation between the Euclidean and Hamming distances as a closed-form expression. Although some studies have discussed the relationship between the two distances, to the best of our knowledge our hashing method is the first to give a direct approximate relation between them. We also propose values for the parameters that the proposed method requires. Furthermore, we show through experiments that, on many datasets, the proposed method yields more accurate approximations than existing random projection-based and Hamming distance-based methods.
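As a rough illustration of the general idea, the following Python sketch maps feature vectors onto a unit sphere by inverse stereographic projection and then hashes them to bit vectors with random hyperplanes. It is a minimal sketch under stated assumptions, not the method of this paper: the abstract gives neither the closed-form relation nor the parameter values, so the sign-of-random-projection hashing step and all function names (`inverse_stereographic`, `make_hyperplanes`, `hash_bits`, `hamming`, `angle_on_sphere`) are hypothetical choices made for illustration. The sketch relies only on two standard facts: the inverse stereographic image of a point in R^d has unit norm in R^(d+1), and for random-hyperplane hashing the expected fraction of differing bits equals the angle between the two projected points divided by pi.

```python
import math
import random


def inverse_stereographic(x):
    """Map a point in R^d onto the unit sphere in R^(d+1)."""
    s = sum(v * v for v in x)  # squared Euclidean norm of x
    return [2.0 * v / (s + 1.0) for v in x] + [(s - 1.0) / (s + 1.0)]


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits Gaussian normal vectors in R^(dim+1) (assumed hashing step)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim + 1)] for _ in range(num_bits)]


def hash_bits(x, hyperplanes):
    """Project x onto the sphere, then keep one sign bit per hyperplane."""
    y = inverse_stereographic(x)
    return [1 if sum(w_i * y_i for w_i, y_i in zip(w, y)) >= 0.0 else 0
            for w in hyperplanes]


def hamming(a, b):
    """Number of positions at which two bit vectors differ."""
    return sum(p != q for p, q in zip(a, b))


def angle_on_sphere(x1, x2):
    """Angle (radians) between the spherical images of x1 and x2."""
    y1, y2 = inverse_stereographic(x1), inverse_stereographic(x2)
    dot = sum(p * q for p, q in zip(y1, y2))
    return math.acos(max(-1.0, min(1.0, dot)))  # clamp against rounding error


if __name__ == "__main__":
    planes = make_hyperplanes(num_bits=4096, dim=3, seed=1)
    a, b = [0.3, -1.2, 2.0], [0.1, -0.8, 1.5]
    estimate = math.pi * hamming(hash_bits(a, planes), hash_bits(b, planes)) / len(planes)
    print("angle estimated from Hamming distance:", estimate)
    print("exact angle on the sphere:           ", angle_on_sphere(a, b))
```

In this sketch the Hamming distance estimates an angle on the sphere, and the chord between the two projected points has length 2‖x₁ − x₂‖ / √((1 + ‖x₁‖²)(1 + ‖x₂‖²)), which ties that angle back to the Euclidean distance between the original vectors. How the data should be scaled before projection, and how many bits to use, are left open here.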