Volume 8 (2015) Pages 25-32
Feature selection is widely used in various fields. In particular, sparse estimation has the advantage that its computational cost is polynomial in the number of features. However, it has the drawback that the obtained solution can change when the dataset changes only slightly. The goal of this paper is to exhaustively search for the solutions that minimize the generalization error in the feature selection problem, in order to investigate this drawback of sparse estimation. We compute the generalization error for all combinations of features using cross-validation, which yields a histogram of the generalization error. Using this histogram, we propose a method to verify whether the given data contain information for binary classification by comparing it with the histogram of predictive error for random guessing. Moreover, we propose a statistical-mechanical method to efficiently calculate the histogram of generalization error using the exchange Monte Carlo (EMC) method and the multiple histogram method. We apply the proposed method to the feature selection problem of selecting the neurons relevant to face identification.
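The exhaustive search over feature combinations can be illustrated with a minimal sketch. It is not the implementation used in the paper. The nearest-centroid classifier, the synthetic data, and the function names (`make_toy_data`, `cv_error`, `exhaustive_search`, `error_histogram`) are all assumptions chosen to keep the example self-contained. The sketch enumerates every non-empty feature subset directly and does not use the EMC or multiple histogram methods.

```python
import itertools
import numpy as np


def make_toy_data(n_samples=100, n_features=6, seed=0):
    """Synthetic binary data (an assumption): only feature 0 is informative."""
    rng = np.random.default_rng(seed)
    y = np.arange(n_samples) % 2            # alternating labels, both classes in every fold
    X = rng.normal(size=(n_samples, n_features))
    X[:, 0] += 4.0 * y                      # shift class 1 along feature 0
    return X, y


def cv_error(X, y, features, n_folds=5):
    """K-fold cross-validation error of a nearest-centroid classifier
    restricted to the given feature columns."""
    idx = np.arange(len(y))
    n_wrong = 0
    for test in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, test)
        X_train = X[np.ix_(train, features)]
        X_test = X[np.ix_(test, features)]
        mu0 = X_train[y[train] == 0].mean(axis=0)
        mu1 = X_train[y[train] == 1].mean(axis=0)
        d0 = ((X_test - mu0) ** 2).sum(axis=1)
        d1 = ((X_test - mu1) ** 2).sum(axis=1)
        pred = (d1 < d0).astype(int)        # predict the class of the closer centroid
        n_wrong += int((pred != y[test]).sum())
    return n_wrong / len(y)


def exhaustive_search(X, y, n_folds=5):
    """Cross-validation error for every non-empty subset of features."""
    n_features = X.shape[1]
    results = {}
    for k in range(1, n_features + 1):
        for subset in itertools.combinations(range(n_features), k):
            results[subset] = cv_error(X, y, list(subset), n_folds)
    return results


def error_histogram(results, n_bins=10):
    """Histogram of the cross-validation errors over all feature subsets."""
    errors = np.array(list(results.values()))
    return np.histogram(errors, bins=np.linspace(0.0, 1.0, n_bins + 1))


if __name__ == "__main__":
    X, y = make_toy_data()
    results = exhaustive_search(X, y)
    best = min(results, key=results.get)
    counts, edges = error_histogram(results)
    print("best subset:", best, "CV error:", results[best])
    print("histogram counts:", counts)
```

With n features the loop visits 2^n - 1 subsets, so direct enumeration is only practical for small n. This cost is what motivates a sampling-based estimate of the histogram for larger problems.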