Classification methods typically applied to the Invader assay include k-means clustering and the normal mixture model, applied either to the original two-dimensional data or to angle data. Combining the normal mixture model with angle data might yield an improved method. Such an approach has two advantages: it can be used to evaluate the goodness of classification for each individual, and angle data are easily handled. However, the method requires that the data have an origin, which implies that one cluster must be specified before clustering. Therefore, an alternative method using the normal mixture model is desirable. We propose a mathematical model with a latent time variable. Optimization is based mainly on a one-dimensional normal mixture model with two components, which provides stable computational results more quickly than a bivariate normal mixture model.
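As a rough illustration of the building block named in this abstract, the following sketch fits a one-dimensional normal mixture with two components by plain EM. It is not the authors' latent-time model: the function names, the initialisation rule, and the simulated "angle" values are all assumptions made for the example. The posterior probabilities it returns are one way to express the goodness of classification for each individual.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_component_mixture(xs, n_iter=200, min_sd=1e-3):
    """Plain EM for a 1-D normal mixture with two components.

    Returns (weights, means, sds, posteriors), where posteriors[i][k] is the
    posterior probability that observation i belongs to component k.
    """
    xs = list(xs)
    lo, hi = min(xs), max(xs)
    # Crude initialisation (an assumption): means at 1/4 and 3/4 of the range.
    means = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]
    sds = [max((hi - lo) / 4.0, min_sd)] * 2
    weights = [0.5, 0.5]
    posteriors = []
    for _ in range(n_iter):
        # E-step: posterior membership probabilities for each observation.
        posteriors = []
        for x in xs:
            p = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = p[0] + p[1]
            if total == 0.0:  # numerical underflow: fall back to an even split
                posteriors.append([0.5, 0.5])
            else:
                posteriors.append([p[0] / total, p[1] / total])
        # M-step: weighted updates of mixing weights, means and SDs.
        for k in range(2):
            nk = max(sum(r[k] for r in posteriors), 1e-12)
            weights[k] = nk / len(xs)
            means[k] = sum(r[k] * x for r, x in zip(posteriors, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(posteriors, xs)) / nk
            sds[k] = max(math.sqrt(var), min_sd)
    return weights, means, sds, posteriors


# Simulated angle data in radians (hypothetical values, not assay data).
random.seed(0)
angles = [random.gauss(0.3, 0.05) for _ in range(200)]
angles += [random.gauss(1.2, 0.05) for _ in range(200)]

weights, means, sds, posteriors = fit_two_component_mixture(angles)
```

An individual whose larger posterior probability is close to 0.5 is poorly classified, while a value near 1 indicates a confident call.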
For square contingency tables with nominal categories, Tomizawa (1994) and Tomizawa, Seo and Yamamoto (1998) considered measures that reflect the degree of departure from symmetry. This paper proposes a generalized measure for T-way (T ≥ 3) tables. The proposed measure is expressed by using the power divergence of Cressie and Read (1984) or the diversity index of Patil and Taillie (1982). The measure could be useful for comparing the degrees of departure from symmetry in several multi-way tables. Some examples of analysis using biomedical data are shown.
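To make the role of the power divergence concrete, here is a minimal sketch for an ordinary two-way square table. It computes the Cressie-Read power divergence between the off-diagonal cell proportions and their symmetrised counterparts. This is only an illustration: it is not the generalized T-way measure proposed in the paper, it omits any normalising constant, and the function names and example counts are hypothetical.

```python
import math


def power_divergence(p, q, lam):
    """Cressie-Read power divergence I^(lam)(p; q) for lam >= 0.

    For lam == 0 the limiting form (Kullback-Leibler divergence) is used.
    Cells with p_i == 0 contribute nothing for lam >= 0 and are skipped.
    """
    if lam < 0:
        raise ValueError("this sketch only handles lam >= 0")
    if lam == 0:
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    total = sum(pi * ((pi / qi) ** lam - 1.0) for pi, qi in zip(p, q) if pi > 0)
    return total / (lam * (lam + 1.0))


def offdiag_asymmetry(table, lam=1.0):
    """Divergence of off-diagonal proportions from their symmetrised values."""
    r = len(table)
    cells = [(i, j) for i in range(r) for j in range(r) if i != j]
    total = sum(table[i][j] for i, j in cells)
    p = [table[i][j] / total for i, j in cells]
    q = [(table[i][j] + table[j][i]) / (2.0 * total) for i, j in cells]
    return power_divergence(p, q, lam)


# Hypothetical 3 x 3 tables of counts.
symmetric_table = [[10, 5, 2], [5, 8, 3], [2, 3, 7]]
asymmetric_table = [[10, 9, 2], [1, 8, 3], [2, 3, 7]]

d_sym = offdiag_asymmetry(symmetric_table, lam=1.0)
d_asym = offdiag_asymmetry(asymmetric_table, lam=1.0)
```

The divergence is zero when the table is exactly symmetric and grows as the paired cells (i, j) and (j, i) move apart, which is the behaviour a departure-from-symmetry measure needs.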
Missing data is a prevalent complication in the analysis of data from longitudinal studies, and remains an active area of research for biostatisticians and other quantitative methodologists. This paper reviews several statistical methods that are used to address outcome-related drop-out. First, we review important concepts such as missing data patterns, missing data mechanisms, ignorability and likelihood-based inference, which were originally proposed by Rubin (1976, Biometrika 63, 581-592). Secondly, we review simple analysis methods for handling drop-outs, such as complete-case analysis, available data analysis and last observation carried forward analysis, and describe their limitations. Thirdly, we review more sophisticated approaches for handling drop-outs, which take the missing data mechanism into account in the analysis. Inverse probability weighted methods and multiple imputation methods, which represent two distinct paradigms for handling missing data, are reviewed. Analysis methods for non-ignorable drop-outs are also reviewed, and three approaches are presented: selection models, pattern mixture models and latent variable models. We illustrate the analysis techniques using the longitudinal clinical trial of contracepting women reported by Machin et al. (1988, Contraception 38, 165-179). We also briefly review analysis methods for data with missing covariates. Finally, we note some points of caution in the analysis of missing data.
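The three simple analyses mentioned in this abstract can be sketched on a toy data set as follows. The data values and function names are hypothetical and are not taken from the trial cited above; `None` marks a visit missed after drop-out.

```python
def complete_cases(records):
    """Keep only subjects observed at every visit (complete-case analysis)."""
    return [r for r in records if all(v is not None for v in r)]


def locf(record):
    """Last observation carried forward for one subject's visit sequence."""
    filled, last = [], None
    for value in record:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def visit_means(records):
    """Per-visit means over whatever is observed (available data analysis)."""
    means = []
    for visit in zip(*records):
        observed = [v for v in visit if v is not None]
        means.append(sum(observed) / len(observed) if observed else None)
    return means


# Hypothetical outcomes for three subjects at three visits.
visits = [
    [5.0, 4.0, 3.0],    # completer
    [6.0, 5.5, None],   # drops out after visit 2
    [7.0, None, None],  # drops out after visit 1
]

cc_means = visit_means(complete_cases(visits))
available_means = visit_means(visits)
locf_means = visit_means([locf(r) for r in visits])
```

In this toy example the final-visit mean is 3.0 under both the complete-case and available data analyses, but about 5.17 under LOCF, because the two drop-outs are held at their last observed values. None of the three answers is trustworthy when drop-out depends on the outcome, which is the limitation that motivates the more sophisticated methods.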
Recently, many methods and tools for bioinformatics have been developed rapidly owing to technological progress and the success of the Genome Project. However, new methodologies for detecting useful and important biomarkers or causal disease genes still need to be developed in order to establish tailor-made medical treatments or personalized medicine. This paper discusses high-throughput genome-related data with clinical information and reviews methods of analysis for finding biomarkers or causal genes. We point out problems in the data for statistical analysis and in methods widely used as standards. The data discussed here include SNP (single nucleotide polymorphism) data, microarray data and mass-spectrometry proteome data. For the analysis of high-throughput data, we discuss study design issues, problems in multidimensional data, and the false discovery rate (FDR) in multiple testing problems. For SNP data analysis, we describe how haplotype-block-based research has replaced separate SNP-based analysis as the main research style in association studies. For microarray data analysis, we introduce the usefulness of AdaBoost for finding biomarkers, and also for analyzing proteome data.
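As one concrete example of FDR control in multiple testing, the sketch below implements the Benjamini-Hochberg step-up procedure. The abstract does not say which FDR procedure the paper discusses, so this choice is an assumption, and the p-values are invented for illustration.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, True where the corresponding hypothesis is
    rejected while controlling the FDR at level q (assuming independent or
    positively dependent tests).
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for ten genes (not from any real experiment).
p_values = [0.039, 0.001, 0.205, 0.008, 0.041, 0.06, 0.042, 0.074, 0.216, 0.212]
rejected = benjamini_hochberg(p_values, q=0.05)
```

With these ten values only the two smallest p-values (0.001 and 0.008) fall under their thresholds of 0.005 and 0.010, so two hypotheses are rejected. Testing each at 0.05 without correction would have flagged five.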