Predictor variables selected for a multiple regression equation by generalized cross-validation (GCV) are commonly assumed to have a linear relationship with the target variable. However, some predictor variables may be selected by chance even though they have no linear relationship with the target variable. To perform predictor variable selection that accounts for this possibility, a new statistic, “GCVf” (“f” stands for “flexible”), is proposed. GCVf allows the strictness of the condition for predictor variable selection to be adjusted. For example, GCVf can be constructed so that the probability of erroneously selecting predictor variables is 5 percent when none of the predictor variables has a linear relationship with the target variable. The predictor variables selected by this GCVf almost certainly have linear relationships with the target variable.
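
The chance-selection problem described above can be illustrated with a small simulation. The sketch below is not GCVf, whose definition is not given in this abstract. It only uses the standard GCV formula for ordinary least squares, GCV = (RSS/n) / (1 - p/n)^2, with p counting the intercept, and estimates how often best-subset selection by GCV keeps at least one predictor when the target is pure noise. The function names, sample size, number of predictors, and trial count are all illustrative assumptions.

```python
import itertools

import numpy as np


def gcv(X, y, cols):
    """Standard GCV score of an OLS fit using an intercept plus columns `cols` of X."""
    n = len(y)
    # Design matrix: intercept column followed by the chosen predictors.
    A = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    p = A.shape[1]  # number of fitted coefficients, intercept included
    return (rss / n) / (1.0 - p / n) ** 2


def best_subset_by_gcv(X, y):
    """Return (cols, score) for the predictor subset with the smallest GCV."""
    k = X.shape[1]
    best_cols, best_score = (), gcv(X, y, ())  # intercept-only baseline
    for r in range(1, k + 1):
        for cols in itertools.combinations(range(k), r):
            score = gcv(X, y, cols)
            if score < best_score:
                best_cols, best_score = cols, score
    return best_cols, best_score


def null_selection_rate(n=50, k=4, trials=200, seed=0):
    """Fraction of trials in which GCV keeps at least one pure-noise predictor."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        X = rng.standard_normal((n, k))
        y = rng.standard_normal(n)  # target is unrelated to every predictor
        cols, _ = best_subset_by_gcv(X, y)
        hits += bool(cols)
    return hits / trials


if __name__ == "__main__":
    print("estimated erroneous selection rate:", null_selection_rate())
```

Comparing the printed rate with a nominal level such as 5 percent shows how far plain GCV is from the adjustable strictness that GCVf is meant to provide.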