2011 Volume 10 Issue 4 Pages 115-121
In our previous study, we examined the performance of various QSAR models for predicting the carcinogenicity of diverse chemicals from their structures, as an alternative to animal testing. We found that a parallel model combining support vector machine (SVM) models constructed for twenty substructure groups predicts the carcinogenicity of a wide variety of chemicals with a satisfactory overall accuracy of approximately 80%. In this study, we tested variable selection methods in SVM modeling to improve the performance of this model by raising the accuracy for the N-nitroso-, nitroso- and nitroaromatic group (89 chemicals), which showed the lowest accuracy (70.8%) among the twenty substructure groups. We examined the accuracy of SVM models trained with descriptors selected by the correlation coefficient method, the F-score method and the sensitivity analysis method. The sensitivity analysis method improved the accuracy for the N-nitroso-, nitroso- and nitroaromatic group from 70.8% to 77.5%. Thus, among these variable selection methods, it is the most appropriate for constructing a model to predict the carcinogenicity of chemicals.
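As an illustration of one of the variable selection methods named above, the following is a minimal sketch of F-score descriptor ranking for a two-class problem. It assumes the commonly used definition of the F-score (squared deviations of the class means from the overall mean, divided by the sum of the within-class sample variances); the abstract does not state which formulation the authors used. The function names `f_score` and `top_k`, the 0/1 label encoding, and the synthetic data are hypothetical and are not taken from the study.

```python
import numpy as np


def f_score(X, y):
    """Per-descriptor F-score for binary labels (1 = positive, 0 = negative).

    Assumed definition (not confirmed by the source):
    ((mean_pos - mean_all)^2 + (mean_neg - mean_all)^2) / (var_pos + var_neg)
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    pos, neg = X[y == 1], X[y == 0]
    mean_all = X.mean(axis=0)
    numer = (pos.mean(axis=0) - mean_all) ** 2 + (neg.mean(axis=0) - mean_all) ** 2
    # Sample variances within each class (ddof=1).
    denom = pos.var(axis=0, ddof=1) + neg.var(axis=0, ddof=1)
    # Guard against division by zero for constant descriptors.
    denom = np.where(denom == 0, np.finfo(float).eps, denom)
    return numer / denom


def top_k(scores, k):
    """Indices of the k highest-scoring descriptors, best first."""
    return np.argsort(scores)[::-1][:k]


# Synthetic data for illustration only: 40 samples, 5 descriptors,
# with descriptor 2 made informative by shifting it for the positive class.
rng = np.random.default_rng(0)
y = np.array([1] * 20 + [0] * 20)
X = rng.normal(size=(40, 5))
X[:, 2] += 3.0 * y

scores = f_score(X, y)
selected = top_k(scores, 2)
print("F-scores:", np.round(scores, 3))
print("Selected descriptor indices:", selected)
```

In a workflow of the kind described, the selected descriptor columns would then be used to train the SVM model, and the accuracy would be compared with that of a model trained on the full descriptor set.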