Ouyou toukeigaku


The two-stage case-control study is a common means of reducing the cost of covariate measurement in epidemiologic studies. Under this design, complete covariate data are collected only on cases and controls randomly sampled in the second stage. In many applications, certain covariates are readily measured on all first-stage subjects, and surrogate measurements of the expensive covariates may also be available. Using the covariate data collected outside the second-stage sample can substantially improve the relative risk estimators. In this study, we propose applying multiple imputation, one of the well-established methods for incomplete-data analysis. Multiple imputation is now available in many standard software packages and is familiar to practitioners in epidemiology. In addition, it uses all the available data and approximates the fully efficient maximum likelihood estimator. Simulation studies demonstrated that the multiple imputation estimators had greater precision than many existing estimators in realistic settings. An illustration with data from Wilms’ tumor studies is provided.
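As a rough illustration of how multiple imputation results are usually combined, the sketch below pools estimates from several imputed data sets using Rubin's rules. The abstract does not describe the authors' pooling step, so treating it as Rubin's rules is an assumption; the function name `pool_rubin` and the numbers in the usage example are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m completed-data estimates with Rubin's rules.

    estimates: point estimates (e.g. log relative risks), one per imputed data set
    variances: the corresponding squared standard errors
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)          # pooled point estimate
    within = mean(variances)         # average within-imputation variance
    between = variance(estimates)    # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, total


if __name__ == "__main__":
    # Hypothetical log relative risks and variances from three imputed data sets.
    q, t = pool_rubin([0.42, 0.51, 0.47], [0.010, 0.012, 0.011])
    print(f"pooled estimate = {q:.3f}, standard error = {t ** 0.5:.3f}")
```

The total variance adds a between-imputation term to the average within-imputation variance, which is how the uncertainty due to the unmeasured second-stage covariates enters the pooled standard error.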



Predictor variables of a multiple regression equation selected by GCV (generalized cross-validation) are commonly considered to have a linear relationship with the target variable. However, some predictor variables may be selected by chance even though they have no linear relationship with the target variable. To select predictor variables with this possibility taken into account, a new statistic, “GCV_f” (“f” stands for “flexible”), is proposed. GCV_f allows the strictness of the predictor variable selection criterion to be adjusted. For example, GCV_f can be constructed so that the probability of erroneously selecting predictor variables is 5 percent when none of the predictor variables has a linear relationship with the target variable. The predictor variables selected by this GCV_f almost certainly have linear relationships with the target variable.
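For orientation, the sketch below computes the ordinary GCV score for a least-squares fit and picks the candidate model with the smallest score. It shows only standard GCV; the abstract does not give the definition of GCV_f, so it is not reproduced here. The helper names `gcv_score` and `select_by_gcv` and the numbers in the usage example are hypothetical.

```python
def gcv_score(rss, n_obs, n_params):
    """Ordinary GCV for a linear least-squares fit.

    rss: residual sum of squares of the fitted model
    n_obs: number of observations
    n_params: number of fitted coefficients (including the intercept)
    """
    if not 0 <= n_params < n_obs:
        raise ValueError("n_params must satisfy 0 <= n_params < n_obs")
    return (rss / n_obs) / (1.0 - n_params / n_obs) ** 2


def select_by_gcv(candidates, n_obs):
    """Return the name of the candidate model with the smallest GCV.

    candidates: dict mapping a model name to (rss, n_params)
    """
    return min(
        candidates,
        key=lambda name: gcv_score(candidates[name][0], n_obs, candidates[name][1]),
    )


if __name__ == "__main__":
    # Hypothetical fits on 20 observations: adding x2 barely lowers the RSS.
    fits = {"x1": (10.0, 2), "x1 + x2": (9.9, 3)}
    for name, (rss, p) in fits.items():
        print(name, round(gcv_score(rss, 20, p), 4))
    print("selected:", select_by_gcv(fits, 20))
```

Because the penalty for an extra coefficient is fixed, plain GCV can still admit a predictor whose small reduction in RSS arose by chance, which is the situation the adjustable criterion in the abstract is meant to address.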

