2013, Volume 33, Issue 2, pp. 101-124
The nested case-control and case-cohort designs are common means of reducing the cost of covariate measurements in large failure-time studies. Under these designs, complete covariate data are collected only on the cases (i.e., subjects whose failure times are uncensored) and on either matched controls selected by risk-set sampling (nested case-control design) or a subcohort randomly selected from the whole cohort (case-cohort design). In many applications, certain covariates are readily measured on all cohort members, and surrogate measurements of the expensive covariates may also be available. Using the covariate data collected outside the selected samples, the relative risk estimators can be improved substantially. In this study, we discuss a unified framework for the analysis of these designs using the multiple imputation method, which is a well-established method for incomplete data analyses. The multiple imputation method is currently available in many standard software packages and is familiar to practitioners in epidemiologic studies. In addition, this multiple imputation method uses all the data available and approximates the fully efficient maximum likelihood estimator. We also discuss parametric and nonparametric approaches for modeling the distributions of missing covariates: the Markov chain Monte Carlo method for the Cox regression model by Chen et al. (Biometrika 2006; 93: 791-807) and the approximate Bayesian bootstrap. Simulation studies demonstrated that in realistic settings, the multiple imputation estimators had greater precision than existing estimators. Illustrations with data taken from Wilms’ tumor studies are provided.
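As a rough illustration of the nonparametric route mentioned above, the following Python sketch pairs an approximate Bayesian bootstrap draw with the standard multiple-imputation combining rules (Rubin's rules). It is a deliberately simplified sketch under stated assumptions, not the procedure used in the study: the target here is a simple mean rather than a Cox regression coefficient, no surrogate or fully observed covariates are used to define imputation classes, and the function names `abb_impute`, `rubin_combine`, and `mi_mean` are hypothetical.

```python
import random
import statistics


def abb_impute(observed, n_missing, rng):
    """One approximate Bayesian bootstrap draw for n_missing values."""
    # Step 1: resample the observed values with replacement (donor pool).
    donors = rng.choices(observed, k=len(observed))
    # Step 2: draw the imputed values with replacement from the donor pool.
    return rng.choices(donors, k=n_missing)


def rubin_combine(estimates, variances):
    """Pool m point estimates and their variances with Rubin's rules."""
    m = len(estimates)
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)        # average within-imputation variance
    between = statistics.variance(estimates)    # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


def mi_mean(observed, n_missing, m=20, seed=0):
    """Multiple-imputation estimate of a mean (toy target, assumed for illustration)."""
    rng = random.Random(seed)
    estimates, variances = [], []
    for _ in range(m):
        completed = list(observed) + abb_impute(observed, n_missing, rng)
        estimates.append(statistics.fmean(completed))
        # Variance of the sample mean in the completed data set.
        variances.append(statistics.variance(completed) / len(completed))
    return rubin_combine(estimates, variances)


if __name__ == "__main__":
    observed = [1.2, 0.7, 2.4, 1.9, 0.3, 1.1]  # made-up values for the sketch
    print(mi_mean(observed, n_missing=4))
```

In an actual analysis of these designs, the donor pool would typically be restricted to subjects who resemble the one being imputed (for example, by case status and surrogate values), and each completed data set would be analyzed with a Cox model before pooling.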