Proceedings of the Symposium on Chemoinformatics
30th Symposium on Chemical Information and Computer Sciences, Kyoto

Poster Session
Development of the variable selection method using rough set theory
*Michio Koyama, Masamoto Arakawa, Kimito Funatsu

Pages JP23

Abstract
In QSAR studies, it is known that a model using variables irrelevant to the objective variable often has low predictive ability. To build a model with high predictive ability, it is therefore important to select the minimal number of variables necessary for prediction. Many variable selection methods have been developed, and we propose a new method using rough set theory (RST). RST is a theory used to classify inaccurate or incomplete data sets. By applying this theory to variable selection, we can find minimal subsets of variables that give the same partition as the whole set of variables (reduction sets). In this study, the method was applied to QSAR analyses of inhibitors of DHFR and HIV-1. We built PLS models using reduction sets and using randomly selected variables. The PLS models built from reduction sets gave predictive power no lower than that of the models built from randomly selected variables. Furthermore, it became clear that chemically meaningful variables tend to be selected by RST, which shows that RST is superior to random selection. In future work, we will continue to examine the advantages of RST by comparing it with other methods.
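The idea of a reduction set can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes descriptors that have already been discretized, uses a small hypothetical table of five compounds and four descriptors, and searches all subsets by brute force, which is only feasible for a handful of variables. The function names `partition` and `reducts` are illustrative.

```python
from itertools import combinations


def partition(rows, cols):
    """Group row indices by their values on the chosen columns."""
    blocks = {}
    for i, row in enumerate(rows):
        key = tuple(row[c] for c in cols)
        blocks.setdefault(key, []).append(i)
    return frozenset(frozenset(b) for b in blocks.values())


def reducts(rows, n_cols):
    """Return all minimal column subsets giving the same partition as all columns."""
    full = partition(rows, range(n_cols))
    found = []
    # Search by increasing size so that smaller subsets are found first.
    for size in range(1, n_cols + 1):
        for cols in combinations(range(n_cols), size):
            if any(set(r) <= set(cols) for r in found):
                continue  # contains a smaller reduct, so it is not minimal
            if partition(rows, cols) == full:
                found.append(cols)
    return found


# Hypothetical table: 5 compounds x 4 discretized descriptors.
# Column 3 is the complement of column 0, so the two are interchangeable.
table = [
    (0, 1, 0, 1),
    (0, 1, 1, 1),
    (1, 0, 0, 0),
    (1, 0, 1, 0),
    (1, 1, 1, 0),
]

print(reducts(table, 4))  # [(0, 1, 2), (1, 2, 3)]
```

In this toy case two reduction sets exist, each dropping one of the two interchangeable columns. Under the same assumptions, either set could then be passed to a regression step such as PLS in place of the full descriptor set.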
© 2007 The Chemical Society of Japan