Proceedings of the Symposium on Chemoinformatics
42nd Symposium on Chemoinformatics, Tokyo

Poster Session
Development of predictive models of synthetic accessibility using machine learning
*Masaki Wakasugi, Hiroshi Kaneko, Atushi Kawasaki, Yosihiko Nishibata

Pages 1P11-

Abstract
Estimating synthetic accessibility is an important part of computer-aided drug design. Several methods for predicting synthetic accessibility have been reported, based on retrosynthetic analysis, molecular complexity, or fragment contributions. However, almost none of them use machine learning. Here we report a machine learning method for predicting synthetic accessibility. Because synthetic accessibility is a subjective judgment, it is difficult to prepare a large-scale training set. We therefore assume that the compounds remaining after the ZINC15 compounds (purchasable “drug-like” compounds) are removed from the GDB-17 compounds (the chemical universe database of compounds with up to 17 atoms of C, N, O, S, and halogens) are likely to be difficult to synthesize, and that ZINC15 compounds are easier to synthesize than these. Based on this hypothesis, we created a data set and used it to train a neural network classifier. We then evaluated the model on a validation set obtained from the literature. The results show that the model was able to distinguish compounds that are difficult to synthesize from easier ones. We are developing models with other machine learning methods and expect to report a comparison with the neural network model.
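
The labeling hypothesis described in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It assumes that both databases are available as lists of canonical SMILES strings so that set membership is meaningful, and that every ZINC15 compound is used as an "easier" example; the abstract states neither detail. The function name `build_labeled_set` and the toy inputs are hypothetical.

```python
def build_labeled_set(gdb17_smiles, zinc15_smiles):
    """Return (smiles, label) pairs: 1 = presumed hard to synthesize, 0 = presumed easier.

    Assumption: inputs are canonical SMILES strings, so identical
    compounds compare equal as plain strings.
    """
    zinc = set(zinc15_smiles)
    # GDB-17 compounds that do not appear in ZINC15 are presumed hard.
    hard = sorted(set(gdb17_smiles) - zinc)
    # Assumption: all ZINC15 compounds serve as the "easier" class.
    easy = sorted(zinc)
    return [(s, 1) for s in hard] + [(s, 0) for s in easy]


# Toy strings standing in for database entries; not real database extracts.
gdb17_toy = ["C1CC2CC1N2", "CCO", "C1OC2CC1S2"]
zinc15_toy = ["CCO", "c1ccccc1O"]
labeled = build_labeled_set(gdb17_toy, zinc15_toy)
```

The resulting pairs would then be converted to molecular descriptors or fingerprints and passed to a classifier. The abstract does not specify the features or the network architecture, so they are left out here.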