Prediction models are usually developed through model construction and validation. Especially for binary or time-to-event outcomes, risk prediction models should be evaluated on several aspects of predictive accuracy. With unified algebraic notation, we present evaluation measures for model validation from five statistical viewpoints that are frequently reported in the medical literature: 1) the Brier score for prediction error; 2) sensitivity, specificity, and the C-index for discrimination; 3) calibration-in-the-large, the calibration slope, and the Hosmer-Lemeshow statistic for calibration; 4) the net reclassification improvement and integrated discrimination improvement indexes for reclassification; and 5) net benefit for clinical usefulness. Graphical representations such as a receiver operating characteristic curve, a calibration plot, or a decision curve help researchers interpret these evaluation measures. The interrelationships among the measures are discussed, and their definitions and estimators are extended to time-to-event data subject to outcome censoring. We illustrate their calculation on example datasets, with SAS code provided in the web appendix.
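As a rough illustration of four of the measures named above, the following minimal Python sketch computes the Brier score, sensitivity and specificity at a risk threshold, the C-index, and net benefit for a binary, uncensored outcome. It uses common textbook definitions rather than the paper's own notation or its SAS code; the function names, the toy data, and the thresholds are all assumptions made for the example.

```python
# Minimal sketch for a binary outcome without censoring.
# y: observed outcomes (1 = event, 0 = no event); p: predicted risks in [0, 1].
# Function names and toy data are hypothetical, not taken from the paper.

def brier_score(y, p):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)

def sensitivity_specificity(y, p, threshold):
    """Classify as positive when predicted risk >= threshold."""
    tp = sum(1 for yi, pi in zip(y, p) if yi == 1 and pi >= threshold)
    fn = sum(1 for yi, pi in zip(y, p) if yi == 1 and pi < threshold)
    tn = sum(1 for yi, pi in zip(y, p) if yi == 0 and pi < threshold)
    fp = sum(1 for yi, pi in zip(y, p) if yi == 0 and pi >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def c_index(y, p):
    """Share of event/non-event pairs where the event has the higher
    predicted risk; tied predictions count as one half."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    score = 0.0
    for pe in events:
        for pn in nonevents:
            if pe > pn:
                score += 1.0
            elif pe == pn:
                score += 0.5
    return score / (len(events) * len(nonevents))

def net_benefit(y, p, threshold):
    """True-positive rate per subject minus false-positive rate per subject,
    the latter weighted by the odds of the risk threshold."""
    n = len(y)
    tp = sum(1 for yi, pi in zip(y, p) if yi == 1 and pi >= threshold)
    fp = sum(1 for yi, pi in zip(y, p) if yi == 0 and pi >= threshold)
    return tp / n - (fp / n) * threshold / (1 - threshold)

# Toy data, invented for illustration only.
y = [1, 1, 0, 0]
p = [0.9, 0.4, 0.6, 0.1]

print(brier_score(y, p))
print(sensitivity_specificity(y, p, 0.5))
print(c_index(y, p))
print(net_benefit(y, p, 0.2))
```

Evaluating `net_benefit` over a grid of thresholds gives the points of a decision curve, and evaluating `sensitivity_specificity` over a grid gives the points of a receiver operating characteristic curve. Censored time-to-event data would need modified estimators, which this sketch does not attempt.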
Fisher’s randomization rule has been widely viewed as a revolutionary invention in experimental design. The three rationales for randomization in clinical trials are that (i) randomization ensures that known and unknown confounders are asymptotically controlled, (ii) the use of randomization itself provides the basis for statistical inference, even when the patients in a clinical trial are a non-random sample of a population, and (iii) the act of randomization mitigates selection bias by making treatment allocation unpredictable. Randomized controlled trials have been the gold standard for more than five decades, although such trials may be costly, inconvenient, and ethically challenging. Some Fisherian statisticians have emphasized the importance of design-based inference based on randomization tests; however, other statisticians disagree. From the Bayesian point of view, the randomization sequence is ancillary for a parameter of interest, and randomization itself is not absolutely essential, although it may sometimes be helpful. In this review, I provide an overview of the rationales for randomization and related topics, and discuss the significance and limitations of randomization in clinical trials.
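To make rationale (ii) concrete, a randomization test can be sketched in a few lines of Python. The sketch assumes a two-arm trial with complete randomization and fixed arm sizes, a difference in means as the test statistic, and a sharp null hypothesis of no treatment effect for any patient. These design choices, the function name, and the toy outcomes are assumptions made for illustration and are not taken from the review.

```python
from itertools import combinations

def randomization_test(outcomes, treated_idx):
    """Exact two-sided randomization test for a difference in means.

    Under the sharp null, outcomes are fixed and only the allocation is
    random, so the reference distribution is obtained by enumerating every
    allocation with the same number of treated patients.
    """
    n = len(outcomes)
    k = len(treated_idx)

    def mean_difference(idx):
        chosen = set(idx)
        treated = [outcomes[i] for i in range(n) if i in chosen]
        control = [outcomes[i] for i in range(n) if i not in chosen]
        return sum(treated) / len(treated) - sum(control) / len(control)

    observed = mean_difference(treated_idx)
    allocations = list(combinations(range(n), k))
    # Count allocations at least as extreme as the observed one
    # (small tolerance guards against floating-point ties).
    extreme = sum(
        1 for a in allocations
        if abs(mean_difference(a)) >= abs(observed) - 1e-12
    )
    return observed, extreme / len(allocations)

# Toy data, invented for illustration: patients 0-2 received treatment.
outcomes = [5, 6, 7, 1, 2, 3]
observed, p_value = randomization_test(outcomes, [0, 1, 2])
print(observed, p_value)
```

The p-value here depends only on the allocation mechanism, not on any assumption that the patients were sampled at random from a population. Full enumeration is feasible only for small trials; larger trials would typically draw a random sample of allocations instead, and restricted schemes such as permuted blocks would require enumerating only the allocations that the scheme permits.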