2016, Vol. 37, No. 1, pp. 23-44
In clinical investigations designed to demonstrate the efficacy of a diagnostic procedure, the procedure is usually evaluated by multiple independent raters. Sensitivity and specificity can be estimated from consensus evaluations, which treat the results from multiple raters as if they came from a single rater, but the raters are then no longer independent. Typically, estimation methods account for multiple raters by using an “average rater” or a “majority rater”. In this paper, we propose a method for summarizing sensitivities and specificities evaluated by multiple independent raters, based on a bivariate random effects model (BVRM) that accounts for between-rater variance and for the correlation between sensitivity and specificity. In addition, we propose methods for constructing joint confidence regions for sensitivity and specificity based on the BVRM. Simulation results show that the differences in bias between the proposed method and the average rater method are small and that the empirical coverage probabilities of the proposed joint confidence regions are close to the nominal level. The proposed methods are illustrated using data from florbetapir F 18 positron emission tomographic imaging to predict the presence of β-amyloid in the brains of subjects with Alzheimer’s disease.
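As an illustration of the two conventional summaries mentioned above, the following minimal sketch computes "average rater" and "majority rater" estimates of sensitivity and specificity. It assumes one common reading of these terms: the average rater takes the mean of the per-rater estimates, and the majority rater scores each subject by a strict majority vote across raters. The function names and the toy data are hypothetical and are not taken from the paper; the sketch does not implement the proposed BVRM.

```python
import numpy as np

def sens_spec(truth, calls):
    """Sensitivity and specificity of binary calls against a binary reference standard."""
    truth = np.asarray(truth, dtype=bool)
    calls = np.asarray(calls, dtype=bool)
    sens = (calls & truth).sum() / truth.sum()        # true positives / all positives
    spec = (~calls & ~truth).sum() / (~truth).sum()   # true negatives / all negatives
    return float(sens), float(spec)

def average_rater(truth, ratings):
    """Assumed definition: mean of the per-rater sensitivities and specificities."""
    per_rater = np.array([sens_spec(truth, r) for r in ratings])
    sens, spec = per_rater.mean(axis=0)
    return float(sens), float(spec)

def majority_rater(truth, ratings):
    """Assumed definition: each subject is called positive if a strict majority of raters say so."""
    ratings = np.asarray(ratings, dtype=bool)
    majority = ratings.sum(axis=0) * 2 > ratings.shape[0]  # ties (even rater counts) count as negative
    return sens_spec(truth, majority)

# Hypothetical toy data: 8 subjects (4 positive, 4 negative), 3 raters.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
ratings = [
    [1, 1, 1, 0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 1, 0, 0],
]

avg_sens, avg_spec = average_rater(truth, ratings)
maj_sens, maj_spec = majority_rater(truth, ratings)
print(f"average rater:  sensitivity={avg_sens:.3f}, specificity={avg_spec:.3f}")
print(f"majority rater: sensitivity={maj_sens:.3f}, specificity={maj_spec:.3f}")
```

Neither summary carries information about how much the raters differ from one another or about how sensitivity and specificity vary together across raters, which is the gap a random effects formulation is meant to address.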