The nested case-control and case-cohort designs are common means of reducing the cost of covariate measurement in large failure-time studies. Under these designs, complete covariate data are collected only on the cases (i.e., subjects whose failure times are uncensored) and on either matched controls selected by risk-set sampling or a subcohort randomly selected from the whole cohort. In many applications, certain covariates are readily measured on all cohort members, and surrogate measurements of the expensive covariates may also be available. Using the covariate data collected outside the selected samples can substantially improve the relative risk estimators. In this study, we discuss a unified framework for the analysis of these designs based on multiple imputation, a well-established method for incomplete data analysis. Multiple imputation is available in many standard software packages and is familiar to practitioners in epidemiologic studies. In addition, it uses all the available data and approximates the fully efficient maximum likelihood estimator. We also discuss parametric and nonparametric approaches for modeling the distributions of the missing covariates: the Markov chain Monte Carlo method for the Cox regression model by Chen et al. (Biometrika 2006; 93: 791-807) and the approximate Bayesian bootstrap. Simulation studies demonstrated that, in realistic settings, the multiple imputation estimators had greater precision than existing estimators. Illustrations with data from Wilms’ tumor studies are provided.
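As a rough illustration of the nonparametric approach named in the abstract above, the following sketch shows one approximate Bayesian bootstrap draw and the standard Rubin's rules for combining estimates across imputed data sets. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `abb_impute` and `pool_rubin`, the donor values, and the log hazard ratios are hypothetical, and a real analysis would typically impute within strata defined by the fully observed covariates and refit a Cox model on each completed data set.

```python
import random
from statistics import mean, variance


def abb_impute(observed, n_missing, rng):
    """One approximate Bayesian bootstrap draw (hypothetical helper).

    Step 1: resample the observed donor values with replacement.
    Step 2: draw the imputed values with replacement from that resample.
    """
    pool = [rng.choice(observed) for _ in observed]
    return [rng.choice(pool) for _ in range(n_missing)]


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates and variances with Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates with matching variances")
    q_bar = mean(estimates)            # pooled point estimate
    within = mean(variances)           # average within-imputation variance
    between = variance(estimates)      # between-imputation variance (m - 1 divisor)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Hypothetical donor values for an expensive covariate measured only on sampled subjects.
donors = [1.2, 0.8, 1.5, 0.9, 1.1, 1.4]
rng = random.Random(0)
imputed_sets = [abb_impute(donors, n_missing=3, rng=rng) for _ in range(5)]

# Hypothetical log hazard ratios and variances, standing in for five Cox model fits.
log_hr = [0.40, 0.50, 0.45, 0.55, 0.60]
var_log_hr = [0.010, 0.010, 0.010, 0.010, 0.010]
pooled, total_var = pool_rubin(log_hr, var_log_hr)
```

The two-stage resampling in `abb_impute` is what distinguishes the approximate Bayesian bootstrap from a simple hot-deck draw: the extra resampling step propagates uncertainty about the donor distribution into the imputations.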
A definition of similarity is required for clustering co-expressed genes or estimating gene regulatory networks from gene expression data. The Pearson correlation coefficient and mutual information are popular measures of similarity between gene expression profiles. To investigate which measure is more appropriate for this purpose, we compared the two using Gene Ontology annotation similarity. Genes with similar Gene Ontology annotations can be interpreted as sharing biological processes or molecular functions. The results showed that the better similarity measure depends on the purpose of the analysis and on the organism from which the data were derived. For evaluating similarities among more than three genes, mutual information was the better measure for data derived from multicellular organisms, whereas the Pearson correlation coefficient was the better measure for data derived from unicellular organisms. For finding genes whose transcripts have similar functions or genes that participate in similar processes, the Pearson correlation coefficient was always the better measure.
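The two similarity measures compared in the abstract above can be sketched as follows. This is an illustrative example only: the function names, the equal-width binning used to estimate mutual information, the choice of four bins, and the toy expression profiles are all assumptions and are not taken from the study, which does not specify its estimator here.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation coefficient between two expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def _discretize(values, bins):
    """Assign each value to one of `bins` equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid division by zero for constant profiles
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=4):
    """Plug-in mutual information estimate (in bits) from binned profiles."""
    bx, by = _discretize(x, bins), _discretize(y, bins)
    n = len(x)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log2(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


# Toy profiles: gene_b depends on gene_a, but not linearly.
gene_a = [i / 4 for i in range(-8, 9)]
gene_b = [v ** 2 for v in gene_a]

r = pearson(gene_a, gene_b)
mi = mutual_information(gene_a, gene_b)
```

In this toy case the correlation is essentially zero while the mutual information is clearly positive, which illustrates why the two measures can rank gene pairs differently. Binned estimates of mutual information are sensitive to the number of bins and to sample size, so the bin count here should be treated as a tuning choice.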
Imaging techniques have been used effectively to study the brain non-invasively in several fields, such as psychiatry and psychology. In this review, we focus on two imaging techniques that provide different views of brain structure and function. Structural magnetic resonance imaging (sMRI) provides information about the various tissue types in the brain, such as gray matter, white matter, and cerebrospinal fluid. Functional MRI (fMRI) measures brain activity by detecting changes in cerebral blood flow. These techniques enable high-quality visualization of brain activity and of the location of atrophy; moreover, they facilitate the study of disease mechanisms in the healthy brain and might lead to the development of effective therapies or drugs against such diseases. However, raw MRI data must be statistically analyzed to obtain objective answers to clinical questions, so statistical methods play a very important role in brain research. Here, we briefly review the most commonly used statistical analyses, namely, data pre-processing, the general linear model, random field theory, mixed-effects models, independent component analysis, network analysis, and discriminant analysis. We also describe the structure of brain imaging data and introduce useful software for implementing these methods.
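To make the general linear model mentioned in the abstract above more concrete, the sketch below fits a single-voxel model with an intercept and one task regressor by ordinary least squares. It is a deliberately simplified illustration: the function name `fit_voxel_glm`, the boxcar design, and the synthetic signal are assumptions, and practical fMRI analyses usually convolve the design with a hemodynamic response function, model temporal autocorrelation, and correct for multiple comparisons across voxels.

```python
import math


def fit_voxel_glm(signal, regressor):
    """OLS fit of signal = b0 + b1 * regressor + error for one voxel.

    Returns the intercept, the task effect, and the t statistic for the task effect.
    """
    n = len(signal)
    mx, my = sum(regressor) / n, sum(signal) / n
    sxx = sum((x - mx) ** 2 for x in regressor)
    sxy = sum((x - mx) * (y - my) for x, y in zip(regressor, signal))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    rss = sum((y - b0 - b1 * x) ** 2 for x, y in zip(regressor, signal))
    sigma2 = rss / (n - 2)                 # residual variance, two fitted parameters
    se_b1 = math.sqrt(sigma2 / sxx)
    t_stat = b1 / se_b1 if se_b1 > 0 else math.inf
    return b0, b1, t_stat


# Synthetic example: two rest/task cycles of five scans each (hypothetical values).
boxcar = ([0.0] * 5 + [1.0] * 5) * 2
noise = [0.3, -0.2, 0.1, -0.4, 0.2] * 4    # fixed, zero-sum within each block
voxel = [100.0 + 2.0 * x + e for x, e in zip(boxcar, noise)]

baseline, task_effect, t_value = fit_voxel_glm(voxel, boxcar)
```

Repeating this fit at every voxel yields a statistical map of task effects, which is the object that random field theory is then used to threshold.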