Journal of Computer Aided Chemistry
Online ISSN : 1345-8647
ISSN-L : 1345-8647
Prediction of Mutagenicity of Organic Molecules by Ensemble Learning
Masamoto Arakawa, Kimito Funatsu

2011 Volume 12 Pages 26-36

Abstract

This paper describes the construction of classification models for the mutagenicity of organic molecules. The objective of this study is to construct a model that can predict the results of the reverse mutation test with high accuracy. To this end, we propose a novel ensemble modeling method in which many support vector machine (SVM) models are constructed as sub-models and integrated to predict mutagenicity. Each sub-model is constructed from a randomly selected part of the original data matrix, using randomly determined SVM parameters. After the sub-models are constructed, a certain number of those with high accuracy rates are selected and integrated to predict mutagenicity. To evaluate the proposed method, we constructed an ensemble model using the reverse mutation test data set collected by Hansen et al. [K. Hansen, et al., J. Chem. Inf. Model., 49, 2077-2081]. As a result, an ensemble model with an accuracy of 79.6% was obtained. The area under the ROC curve (AUC) is 0.866, which is slightly better than that of Hansen et al. We therefore conclude that ensemble modeling with SVM sub-models is a promising method for predicting the mutagenicity of organic molecules.
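
The ensemble procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: the sub-model is a linear SVM trained by Pegasos-style stochastic subgradient descent (the kernel used in the paper is not stated here), the "part of the data matrix" is taken to mean a random subset of both rows and columns, sub-models are ranked by accuracy on the rows they were not trained on, and integration is a simple majority vote. All function names, the toy data, and the parameter ranges are hypothetical.

```python
import random

def train_linear_svm(X, y, lam, epochs, rng):
    """Pegasos-style stochastic subgradient descent; labels are -1 or +1."""
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # regularization shrinkage
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def predict_one(w, cols, x):
    """Classify one full-length sample using only the sub-model's columns."""
    score = sum(wj * x[c] for wj, c in zip(w, cols))
    return 1 if score >= 0.0 else -1

def build_ensemble(X, y, n_models=30, n_keep=10, row_frac=0.7, n_cols=3, seed=0):
    """Train many randomized sub-models and keep the most accurate ones."""
    rng = random.Random(seed)
    n_rows, n_feats = len(X), len(X[0])
    candidates = []
    for _ in range(n_models):
        # Assumption: a random block of rows AND columns of the data matrix.
        rows = rng.sample(range(n_rows), int(row_frac * n_rows))
        cols = rng.sample(range(n_feats), n_cols)
        # Assumption: regularization strength drawn log-uniformly from [0.01, 1].
        lam = 10.0 ** rng.uniform(-2.0, 0.0)
        X_sub = [[X[r][c] for c in cols] for r in rows]
        y_sub = [y[r] for r in rows]
        w = train_linear_svm(X_sub, y_sub, lam, epochs=20, rng=rng)
        # Assumption: accuracy is measured on the rows left out of training.
        in_rows = set(rows)
        held_out = [r for r in range(n_rows) if r not in in_rows]
        hits = sum(predict_one(w, cols, X[r]) == y[r] for r in held_out)
        candidates.append((hits / len(held_out), cols, w))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates[:n_keep]

def ensemble_predict(ensemble, x):
    """Integrate the selected sub-models by majority vote (ties go to +1)."""
    votes = sum(predict_one(w, cols, x) for _, cols, w in ensemble)
    return 1 if votes >= 0 else -1

def make_toy_data(n, seed):
    """Synthetic stand-in for descriptor data: 4 informative + 2 noise columns."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.choice([-1, 1])
        informative = [label + rng.gauss(0.0, 0.5) for _ in range(4)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(2)]
        X.append(informative + noise)
        y.append(label)
    return X, y

X_train, y_train = make_toy_data(200, seed=1)
X_test, y_test = make_toy_data(100, seed=2)
ensemble = build_ensemble(X_train, y_train)
accuracy = sum(
    ensemble_predict(ensemble, x) == t for x, t in zip(X_test, y_test)
) / len(y_test)
print(f"kept {len(ensemble)} sub-models, test accuracy = {accuracy:.3f}")
```

The toy data only exercises the mechanics; it says nothing about the accuracy or AUC reported above. In practice the linear sub-model would likely be replaced by a kernel SVM from an established library, with the random parameter draw covering that model's own hyperparameters.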

© 2011 The Chemical Society of Japan