2004, Volume 25, Issue 2, pp. 117-134
Methods and tools for bioinformatics have developed rapidly in recent years, driven by technological progress and the success of the Genome Project. However, new methodologies for detecting useful and important biomarkers or causal disease genes still need to be developed in order to establish tailor-made medical treatment, or personalized medicine. This paper discusses high-throughput genome-related data accompanied by clinical information and reviews analytical methods for finding biomarkers or causal genes. We point out problems in the data that affect statistical analysis, as well as problems in methods widely used as standards. The data covered here include SNP (single nucleotide polymorphism) data, microarray data, and mass-spectrometry proteome data. For the analysis of high-throughput data, we discuss study design issues, problems with multidimensional data, and the false discovery rate (FDR) in multiple testing. For SNP data, we describe how haplotype-block-based research has replaced the analysis of individual SNPs as the main approach in association studies. For microarray data, we show the usefulness of AdaBoost in searching for biomarkers, and also in analyzing proteome data.