Abstract
The chemical space of drug candidates is vast, and the volume of data in chemical databases continues to grow. Mining such data efficiently requires faster analysis. This paper validates an approach to reducing computational cost by compressing topological fragment spectra (TFS), a descriptor of chemical graphs proposed by Takahashi et al. We first show that TFS is a periodic signal with a cycle length of about 12 (the mass number of carbon). We then apply two compression methods for periodic signals: the Fourier transform and the wavelet transform. Experimental results on structural similarity search and pharmaceutical activity prediction show that the wavelet transform gives more effective compression than the Fourier transform.
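As a rough illustration of the two steps outlined above (checking periodicity, then compressing in a transform domain), the following Python sketch may help. It is not the procedure used in this paper. The toy spectrum, the helper names (`toy_spectrum`, `estimate_period`, `keep_largest`, `fourier_compress`, `haar_compress`), the choice of the Haar wavelet, and the rule of keeping the k largest coefficients are all assumptions made for illustration; the abstract does not specify the wavelet or the selection rule, and a synthetic signal says nothing about the reported results.

```python
import numpy as np

def toy_spectrum(n=256, period=12, center=126.0, width=60.0):
    """Hypothetical stand-in for a TFS: peaks every `period` bins under a smooth envelope."""
    x = np.arange(n)
    envelope = np.exp(-((x - center) / width) ** 2)
    return np.where(x % period == 0, envelope, 0.0)

def estimate_period(signal):
    """Estimate the cycle length as the lag (from 2 to n/2 - 1) with the largest autocorrelation."""
    centered = signal - signal.mean()
    n = centered.size
    ac = np.correlate(centered, centered, mode="full")[n - 1:]  # lags 0..n-1
    return int(2 + np.argmax(ac[2:n // 2]))

def keep_largest(coeffs, k):
    """Zero all but the k largest-magnitude coefficients (k >= 1)."""
    idx = np.argsort(np.abs(coeffs))[-k:]
    out = np.zeros_like(coeffs)
    out[idx] = coeffs[idx]
    return out

def haar_forward(signal):
    """Full orthonormal Haar decomposition; the length must be a power of two."""
    out = np.asarray(signal, dtype=float).copy()
    n = out.size
    while n > 1:
        pairs = out[:n].reshape(-1, 2)
        avg = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2)
        diff = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2)
        out[:n // 2] = avg
        out[n // 2:n] = diff
        n //= 2
    return out

def haar_inverse(coeffs):
    """Invert haar_forward."""
    out = np.asarray(coeffs, dtype=float).copy()
    n = 1
    while n < out.size:
        avg = out[:n].copy()
        diff = out[n:2 * n].copy()
        out[0:2 * n:2] = (avg + diff) / np.sqrt(2)
        out[1:2 * n:2] = (avg - diff) / np.sqrt(2)
        n *= 2
    return out

def fourier_compress(signal, k):
    """Reconstruct the signal from its k largest real-FFT coefficients."""
    coeffs = np.fft.rfft(signal)
    return np.fft.irfft(keep_largest(coeffs, k), n=signal.size)

def haar_compress(signal, k):
    """Reconstruct the signal from its k largest Haar coefficients."""
    return haar_inverse(keep_largest(haar_forward(signal), k))

def rel_error(original, approx):
    """Relative L2 reconstruction error."""
    return np.linalg.norm(original - approx) / np.linalg.norm(original)

spectrum = toy_spectrum()
print("estimated period:", estimate_period(spectrum))
for k in (16, 32, 64):
    print(k,
          "fourier:", rel_error(spectrum, fourier_compress(spectrum, k)),
          "haar:", rel_error(spectrum, haar_compress(spectrum, k)))
```

The reconstruction error is only a proxy here. In a setting like the one described above, the compressed coefficients would more plausibly be judged by how well similarity rankings or activity predictions are preserved.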