2019 Volume 20 Pages 76-83
A number of studies have investigated the relationship between the structures and activities of metabolites, and it has been proposed that structurally similar metabolites tend to have similar activities. On this basis, we propose a method for predicting the activities of secondary metabolites based on the association philosophy. First, we determined structural similarity scores between targeted metabolite pairs using the COMPLIG algorithm. To increase the likelihood of obtaining clusters rich in known metabolites, we calculated structural similarity only for metabolite pairs in which the activity of at least one metabolite is known, and then selected the pairs whose similarity score exceeded a threshold (s > 0.95). The network of such metabolite pairs was then clustered using the DPClusO algorithm. Statistically significant cluster-activity pairs were selected using the hypergeometric test. Finally, the biological activities of unannotated metabolites were predicted from the activities of the metabolites in the statistically overrepresented clusters.
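The workflow described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made for illustration: similarity scores are taken as precomputed input (COMPLIG is not reimplemented), plain connected components stand in for DPClusO (which produces overlapping clusters and would give different results), the background population for the hypergeometric test is assumed to be the annotated metabolites in the filtered network, and the significance level `alpha` and all function names are hypothetical.

```python
from collections import defaultdict
from math import comb


def filter_pairs(scores, known, threshold=0.95):
    """Keep pairs scoring above the threshold in which at least one
    metabolite has a known activity. `scores` maps (a, b) -> similarity."""
    return [(a, b) for (a, b), s in scores.items()
            if s > threshold and (a in known or b in known)]


def connected_components(pairs):
    """Stand-in for DPClusO (assumption): group metabolites into
    non-overlapping clusters via connected components."""
    adj = defaultdict(set)
    for a, b in pairs:
        adj[a].add(b)
        adj[b].add(a)
    seen, clusters = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        clusters.append(comp)
    return clusters


def hypergeom_pvalue(N, K, n, k):
    """Upper-tail probability P(X >= k) when drawing n items without
    replacement from N items, K of which carry the activity."""
    tail = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


def predict_activities(clusters, known, alpha=0.05):
    """Assign each significantly overrepresented activity of a cluster to
    its unannotated members. `known` maps metabolite -> set of activities.
    No multiple-testing correction is applied in this sketch."""
    annotated = {m for c in clusters for m in c if m in known}
    totals = defaultdict(int)  # annotated metabolites per activity
    for m in annotated:
        for act in known[m]:
            totals[act] += 1
    predictions = defaultdict(set)
    for cluster in clusters:
        members = cluster & annotated
        counts = defaultdict(int)  # activity counts inside this cluster
        for m in members:
            for act in known[m]:
                counts[act] += 1
        for act, k in counts.items():
            p = hypergeom_pvalue(len(annotated), totals[act], len(members), k)
            if p < alpha:
                for m in cluster - annotated:
                    predictions[m].add(act)
    return dict(predictions)
```

Given a dictionary of pairwise scores and a dictionary of known activities, calling `predict_activities(connected_components(filter_pairs(scores, known)), known)` returns the predicted activities for the unannotated metabolites. In practice the clustering step would be replaced by an actual DPClusO run, and the p-values would likely need a multiple-testing correction.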