Proceedings of the Symposium on Chemoinformatics
30th Symposium on Chemical Information and Computer Sciences, Kyoto

Joint Session
Biological Activity Classification of Chemicals using 100 class SVM
*Kentaro Kawai, Satoshi Fujishima, Yoshimasa Takahashi
CONFERENCE PROCEEDINGS FREE ACCESS

Pages JK02

Abstract
We have investigated multiple-category classification of the pharmacological activity classes of chemical compounds from their chemical structures using a multi-class support vector machine (SVM). For the input to the SVM, every chemical structure was represented by a multidimensional pattern vector obtained by the Topological Fragment Spectra (TFS) method. In this work, we adopted the "one against the rest" method, which combines two-class SVMs to solve the multi-class classification problem. For the computational trial, we employed 98,634 compounds that belong to 100 different activity classes. The data set was divided into three groups: a training set of 59,180 compounds, a validation set of 29,590 compounds, and a test set of 9,864 compounds. The SVM model was trained using the training set and the validation set. For the test set, which consists of 100 classes, the best model correctly classified 80.8% of the drugs into their own activity classes, including the multi-labeled compounds. The resulting classifier would be useful in drug discovery and also helpful in estimating the risk of adverse effects.
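The "one against the rest" scheme named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the six toy vectors stand in for TFS pattern vectors, the function names are hypothetical, and each two-class model is an unregularized hinge-loss linear classifier trained by subgradient steps, used here as a simplified stand-in for a two-class SVM. One binary model is trained per class (that class against all others), and a compound is assigned to the class whose model returns the largest decision value.

```python
# Toy "one against the rest" sketch. Data and names are hypothetical;
# this is not the authors' code and does not use real TFS vectors.

def train_binary(X, targets, lr=1.0, max_epochs=1000):
    """Fit (w, b) by subgradient steps on the unregularized hinge loss.

    targets holds +1.0 for the class of interest and -1.0 for the rest.
    Training stops early once every sample has a margin of at least 1.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(max_epochs):
        updated = False
        for x, t in zip(X, targets):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if t * score < 1.0:  # margin violated: take a hinge-loss step
                w = [wi + lr * t * xi for wi, xi in zip(w, x)]
                b += lr * t
                updated = True
        if not updated:
            break
    return w, b


def train_one_vs_rest(X, y):
    """Train one two-class model per class: that class against all others."""
    models = {}
    for c in sorted(set(y)):
        targets = [1.0 if label == c else -1.0 for label in y]
        models[c] = train_binary(X, targets)
    return models


def decision_value(model, x):
    """Signed score of one binary model for the vector x."""
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(models, x):
    """Assign x to the class whose binary model gives the largest score."""
    return max(models, key=lambda c: decision_value(models[c], x))


# Hypothetical 3-dimensional pattern vectors for three activity classes.
X = [
    [3.0, 0.0, 0.0], [2.0, 1.0, 0.0],  # class 0
    [0.0, 3.0, 0.0], [0.0, 2.0, 1.0],  # class 1
    [0.0, 0.0, 3.0], [1.0, 0.0, 2.0],  # class 2
]
y = [0, 0, 1, 1, 2, 2]

models = train_one_vs_rest(X, y)
predictions = [predict(models, x) for x in X]
```

With 100 activity classes, the same scheme would train 100 binary models. Taking the single largest decision value yields one label per compound, so handling multi-labeled compounds, as the abstract mentions, would need a different decision rule that this sketch does not attempt.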
© 2007 The Chemical Society of Japan