2021 Volume 20 Issue 1 Pages 1-9
The Ames test, which detects mutagenicity in vitro, is of crucial importance in drug discovery and development as an early alerting system for the potential carcinogenicity and/or teratogenicity of drug candidates. Machine learning approaches are the main means of in silico prediction for such alerting systems, and a concept called the applicability domain (AD) has a significant influence on their prediction accuracy. In drug discovery, predictions may be required for candidate compounds with low structural similarity to the training data; such compounds have a high probability of lying outside the AD, and predictions for them tend to be less accurate. In this study, we evaluated the performance of several machine learning methods separately for compounds with a high probability of lying inside the AD and for compounds with a high probability of lying outside it. The results showed that the graph convolutional network (GCN) method, which has achieved superior performance in a wide range of tasks, performed better than conventional methods. In particular, the accuracy of the GCN method differs significantly from that of the conventional methods for the prediction of molecules with low structural similarity (Figure 4).
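As an illustration of how query compounds might be grouped by their likelihood of falling inside the AD, the sketch below splits compounds by their highest Tanimoto similarity to any training compound. This is an assumption made for illustration only: the abstract does not state which similarity measure, fingerprint, or cutoff was used. The fingerprints are represented here as plain sets of feature indices, and the names `tanimoto`, `nearest_training_similarity`, `split_by_similarity`, and the threshold of 0.5 are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical representation: a fingerprint is the set of "on" feature indices.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def nearest_training_similarity(query: Fingerprint,
                                training: List[Fingerprint]) -> float:
    """Highest similarity between the query and any training compound."""
    return max((tanimoto(query, t) for t in training), default=0.0)


def split_by_similarity(queries: Dict[str, Fingerprint],
                        training: List[Fingerprint],
                        threshold: float = 0.5) -> Tuple[List[str], List[str]]:
    """Split query names into (likely inside AD, likely outside AD).

    The threshold is an assumed, illustrative cutoff, not a value from the study.
    """
    inside: List[str] = []
    outside: List[str] = []
    for name, fp in queries.items():
        if nearest_training_similarity(fp, training) >= threshold:
            inside.append(name)
        else:
            outside.append(name)
    return inside, outside


if __name__ == "__main__":
    training_set = [frozenset({1, 2, 3, 4}), frozenset({5, 6, 7})]
    query_set = {
        "similar": frozenset({1, 2, 3}),
        "dissimilar": frozenset({8, 9}),
    }
    likely_inside, likely_outside = split_by_similarity(query_set, training_set)
    print("likely inside AD:", likely_inside)
    print("likely outside AD:", likely_outside)
```

Model performance could then be reported separately for the two groups, which mirrors the kind of comparison the abstract describes without claiming to reproduce the authors' procedure.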