Profile of Participants and Genotype Distributions of 108 Polymorphisms in a Cross-Sectional Study of Associations of Genotypes With Lifestyle and Clinical Factors: A Project in the Japan Multi-Institutional Collaborative Cohort (J-MICC) Study

Background Most diseases are thought to arise from interactions between environmental factors and the host genotype. To detect gene–environment interactions in the development of lifestyle-related diseases, and especially cancer, the Japan Multi-institutional Collaborative Cohort (J-MICC) Study was launched in 2005. Methods We initiated a cross-sectional study to examine associations of genotypes with lifestyle and clinical factors, as assessed by questionnaires and medical examinations. The 4519 subjects were selected from among participants in the J-MICC Study in 10 areas throughout Japan. In total, 108 polymorphisms were chosen and genotyped using the Invader assay. Results The study group comprised 2124 men and 2395 women with a mean age of 55.8 ± 8.9 years (range, 35–69 years) at baseline. Among the 108 polymorphisms examined, 4 were not polymorphic in our study population. Among the remaining 104 polymorphisms, most variations were common (minor allele frequency ≥0.05 for 96 polymorphisms). The allele frequencies in this population were comparable with those in the HapMap-JPT data set for 45 Japanese from Tokyo. Only 5 of 88 polymorphisms showed allele-frequency differences greater than 0.1. Of the 108 polymorphisms, 32 showed a highly significant difference in minor allele frequency among the study areas (P < 0.001). Conclusions This comprehensive data collection on lifestyle and clinical factors will be useful for elucidating gene–environment interactions. In addition, it is likely to be an informative reference tool, as free access to genotype data for a large Japanese population is not readily available.


INTRODUCTION
Although the etiology of many diseases is not completely understood, most are likely to be caused by interactions between hazardous environmental factors and the host genome. Recent advances in genotyping techniques have allowed many epidemiologic studies to investigate gene-environment interactions in chronic diseases. [1][2][3][4] Cohort and case-control studies focusing on such interactions are ongoing worldwide, and these investigations use DNA from established and new cohorts. [5][6][7][8] Understanding gene-environment interactions requires long-term cohort studies to clarify the temporality of associations and to avoid information and selection biases that are inevitable in cross-sectional and case-control studies. 9 For most multifactorial diseases, such cohort studies must be conducted on a large scale to ensure significant results.
The Japan Multi-institutional Collaborative Cohort (J-MICC) Study is a new cohort study that was launched in 2005 to examine gene-environment interactions in lifestyle-related diseases, especially cancers. It is supported by a research grant for Scientific Research on Special Priority Areas of Cancer from the Japanese Ministry of Education, Culture, Sports, Science and Technology (MEXT). 10, 11 The J-MICC Study group is composed of 10 cohorts surveyed by 10 independent research teams. 12,13 Although the long-term aim of this study is to elucidate gene-environment interactions in the whole cohort, some of its research objectives will be achieved by cross-sectional studies. In 2009, we started a cross-sectional study to examine correlations between lifestyle and medical factors, as assessed using questionnaires and medical examinations, and the distribution of possible related genotypes. Here we describe the recruitment and profile of the participants, including their genotype analysis, and selected demographic, lifestyle, and medical characteristics.

METHODS
Study participants, data collection, and blood sampling
The participants in this study completed questionnaires on lifestyle factors and diseases and donated blood samples at the time of the baseline survey for the J-MICC Study. The details of the J-MICC Study have been described elsewhere. 10 The participants were enrolled in 9 study areas throughout Japan between 2005 and 2008, and in 1 area in 2004, under the supervision of an associate member of the J-MICC Study. The study participants were enrolled from the community by mailing invitation letters or distributing leaflets (3 areas), or by recruiting patients at their first visit to a cancer hospital (1 area) or at health checkups (6 areas). The response rates for the baseline survey were 7.0%, 36.5%, 25.9%, 58.4%, 60.1%, 37.6%, 14.0%, 24.0%, 19.7%, and 65.5% for the Chiba, Shizuoka, Okazaki, Aichi Cancer Center, Takashima, Kyoto, Tokushima, Fukuoka, Saga, and Amami areas, respectively. For cohorts in which the baseline survey is still ongoing, the latest response rate (as of 30 September 2010 or later) was used. Anthropometry, blood pressure, and blood chemistry data obtained from health checkups were available in 8 of the study areas. The subjects for the cross-sectional study comprised 500 to 600 participants enrolled consecutively in each area of the J-MICC Study, except in 2 areas, where fewer participants had been recruited. The recruitment period for the present study, however, was arbitrarily defined by the researchers in each area after the enrollment.
Of the 5108 men and women initially selected, we excluded participants for whom we did not have sufficient DNA (n = 442), appropriate informed consent (n = 8), questionnaire data (n = 9), or local government registration of residence in the study area (n = 7), as well as anyone who had declined follow-up (n = 2) or had withdrawn from the study (n = 1), and the 120 participants who were younger than 35 years or older than 69 years. Thus, our final study group comprised 4519 participants aged 35 to 69 years.
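The participant flow described above can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in the text; the dictionary labels are illustrative, and it assumes that the exclusion categories are mutually exclusive (ie, each excluded person is counted once).

```python
# Counts are taken from the text; the labels are illustrative shorthand.
initially_selected = 5108

exclusions = {
    "insufficient DNA": 442,
    "no appropriate informed consent": 8,
    "no questionnaire data": 9,
    "no local government registration of residence": 7,
    "declined follow-up": 2,
    "withdrew from the study": 1,
    "younger than 35 or older than 69 years": 120,
}

# Assumption: categories do not overlap, so the counts can simply be summed.
total_excluded = sum(exclusions.values())
final_n = initially_selected - total_excluded

print(f"excluded: {total_excluded}, final study group: {final_n}")
```

Under that assumption, the remainder matches the 4519 participants reported for the final study group.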
All the participants included in this analysis had provided written informed consent. The ethics committees of Nagoya University School of Medicine (the affiliation of the former principal investigator, Nobuyuki Hamajima) and the other participating institutions approved the protocol for the J-MICC Study.

Genotyping
We chose 107 single nucleotide polymorphisms (SNPs) and 1 insertion/deletion polymorphism for genotyping, based on their potential relevance to the lifestyle and medical factors described in the next section ("Lifestyle and clinical characteristics"). Researchers from all participating cohorts proposed potentially relevant polymorphisms, and those selected for inclusion in the present study were determined through discussion among the members of the J-MICC Study Group.
In all study areas except Fukuoka, buffy coat fractions were prepared from blood samples and stored at −80°C at the central J-MICC Study office. DNA was extracted from all buffy coat fractions using a BioRobot M48 Workstation (Qiagen Group, Tokyo, Japan) at the central study office. For the samples from the Fukuoka area, DNA was extracted locally from samples of whole blood, using an automatic nucleic acid isolation system (NA-3000, Kurabo, Co., Ltd, Osaka, Japan). The buffy coat fractions or DNA samples were anonymized in a linkable manner 14 and then sent to the central office.
The selected polymorphisms were genotyped using the multiplex polymerase chain reaction (PCR)-based Invader assay 15 (Third Wave Technologies, Madison, WI, USA) at the Laboratory for Genotyping Development, Center for Genomic Medicine, RIKEN.

Lifestyle and clinical characteristics
The lifestyle factors considered were smoking and drinking habits, coffee consumption, sleep, and mental stress, while the clinical characteristics were height, weight, blood pressure, blood glucose, glycated hemoglobin (HbA1c), serum triglyceride, total and high-density lipoprotein (HDL) cholesterol, uric acid, aspartate aminotransferase (AST), alanine aminotransferase (ALT), gamma-glutamyltransferase (γ-GT), C-reactive protein (CRP), creatinine, and bone mineral density. Ages at menarche and menopause were also ascertained.
We used a standard questionnaire in all study areas except the Fukuoka area, where some questions are slightly different from those of other areas. Furthermore, a validated food-frequency questionnaire was used for the dietary assessment. [16][17][18][19] We were unable to directly control the quality of information from health examinations because most data were obtained at routine health checkups offered by other institutions. However, the J-MICC Study Group is now collecting information on participation in the Japan Medical Association's quality control program for clinical laboratories and the instruction manuals used for measurement of blood pressure, height, and weight. For the current report, participants whose blood was drawn less than 3 hours after their last meal were excluded from the analysis of serum lipids and blood glucose.

Statistical analysis
We tabulated selected baseline characteristics by sex and 10-year age group or by sex and study area. In this analysis, body mass index (BMI; kg/m²) was calculated on the basis of self-reported height and body weight, as independent measurements were not available in some study areas. In the case of educational attainment, participants from the Fukuoka area were excluded from the analysis because the questionnaire used there had not included this item. Participants who consumed alcohol at least once a week were classified as drinkers. To compare characteristics among participating cohorts, we attempted to adjust for age by using the direct method (for proportions) or the general linear model (for means). The variations among study areas, however, were not significantly altered after adjusting for age. Thus, in this report, we present only crude figures by sex and study area. The difference in the minor allele frequency (MAF) among the cohorts was tested by the chi-square test for contingency tables. The MAF of the ABCC11 Arg180Gly (T/C) polymorphism by study area is not presented here because the inter-area variation in the distribution of this genotype will be reported in a separate article.
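As an illustration of the chi-square test for contingency tables mentioned above, the following minimal sketch compares the MAF between areas using a table of allele counts (rows are areas; columns are minor and major allele counts). It is not the SAS procedure used in the study. The function names and the example counts are hypothetical, and the sketch assumes that the test is applied to allele counts (2 per participant) rather than to genotype counts, which the text does not specify.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of the chi-square distribution (closed-form series)."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2-1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= x / (2 * r)
            total += term
        return math.exp(-x / 2) * total
    # Odd df: normal tail plus a finite series in odd powers of sqrt(x).
    chi = math.sqrt(x)
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(chi / math.sqrt(2)) + 2 * density * total


def maf_chi_square(allele_counts):
    """Pearson chi-square test on a list of (minor, major) allele counts per area."""
    row_totals = [minor + major for minor, major in allele_counts]
    col_totals = [sum(row[j] for row in allele_counts) for j in range(2)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for row, row_total in zip(allele_counts, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = len(allele_counts) - 1  # (rows - 1) * (2 columns - 1)
    return statistic, df, chi2_sf(statistic, df)


# Hypothetical allele counts for 3 areas (not data from the study).
example = [(120, 880), (150, 850), (210, 790)]
statistic, df, p_value = maf_chi_square(example)
print(f"chi-square = {statistic:.2f}, df = {df}, P = {p_value:.3g}")
```

With 10 study areas, such a table would have 10 rows and the test would have 9 degrees of freedom.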
Departures of genotype distributions from the Hardy-Weinberg equilibrium were assessed using the exact test 20 with the genhwi command of Stata version 8.0 (Stata Corp, College Station, TX, USA). Other statistical analyses were performed using Statistical Analysis System version 9.1 (SAS Institute Inc, Cary, NC, USA). 21 To compare the allele frequencies of genotypes in our study with those in another Japanese population, we used data from HapMap, which is an open access database that includes allele frequencies for 45 Japanese in Tokyo (HapMap-JPT, http://www.ncbi.nlm.nih.gov/snp). Of the 108 polymorphisms of interest, we made comparisons for 88. The 20 polymorphisms excluded from this comparison showed no minor alleles in our study group (n = 4), were not represented in the HapMap-JPT data set (n = 15), or had invalid data in that data set (n = 1, 100% heterozygotes).
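For readers unfamiliar with exact tests of the Hardy-Weinberg equilibrium, the sketch below shows one common formulation: conditional on the observed allele counts, the probability of every possible number of heterozygotes is computed, and the P value is the sum of the probabilities that do not exceed that of the observed table. This is an illustration only. It is not claimed to reproduce the genhwi command, and the function name and example counts are hypothetical.

```python
import math


def hwe_exact_p(n_aa, n_ab, n_bb):
    """Exact Hardy-Weinberg test P value, conditional on the allele counts."""
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab  # copies of allele A
    n_b = 2 * n - n_a      # copies of allele B

    def log_prob(het):
        # log P(het heterozygotes | n, n_a) under Hardy-Weinberg equilibrium
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (
            math.lgamma(n + 1)
            - math.lgamma(hom_a + 1)
            - math.lgamma(het + 1)
            - math.lgamma(hom_b + 1)
            + het * math.log(2)
            + math.lgamma(n_a + 1)
            + math.lgamma(n_b + 1)
            - math.lgamma(2 * n + 1)
        )

    # The heterozygote count must have the same parity as the allele counts.
    possible_hets = range(n_a % 2, min(n_a, n_b) + 1, 2)
    p_observed = math.exp(log_prob(n_ab))
    # Small relative tolerance so that ties with the observed table are included.
    p_value = sum(
        prob
        for prob in (math.exp(log_prob(h)) for h in possible_hets)
        if prob <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical genotype counts (not data from the study).
print(hwe_exact_p(1400, 1900, 700))
```

Working with log-factorials (`math.lgamma`) keeps the calculation stable for samples of several thousand participants, where the factorials themselves would be far too large to handle directly.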

RESULTS
Our analysis included 2124 men (47.0%) and 2395 women (53.0%) with a mean age ± standard deviation at baseline of 55.8 ± 8.9 years (range, 35-69 years). There were considerable differences in the age and sex distributions of different study areas (Table 1). In Fukuoka and Saga, the participants originally enrolled in the J-MICC Study were limited to adults aged 50 years or older and 40 years or older, respectively. Table 2 summarizes selected demographic, lifestyle, and medical characteristics of the participants by sex and age. Within our sample, 29.1% of men and 7.1% of women were current smokers. More than two thirds (71.4%) of men drank alcoholic beverages at least once a week, as did 27.7% of the women. Table 3 presents data on selected lifestyle and medical variables of the participants by sex and study area. Considerable variations were found among the participating cohorts.
The P value for departures from the Hardy-Weinberg equilibrium was less than 0.05 for 19 polymorphisms. However, the only genotypes for which the difference between the observed and expected frequencies exceeded 3% were the CETP Ile405Val (A/G) heterozygote and the SLC30A8 Arg325Trp (C/T) heterozygote. As shown in Table 5, some polymorphisms demonstrated a considerable difference in MAF among the participating cohorts; for 32 of the 108 polymorphisms, including ABCC11 Arg180Gly (T/C), there was a highly significant difference in MAF among study areas (P < 0.001).
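The comparison of observed and expected genotype frequencies can be sketched as follows. Expected frequencies under the Hardy-Weinberg equilibrium are p², 2pq, and q², where p and q are the allele frequencies estimated from the sample. The sketch assumes that "3%" refers to an absolute difference in genotype proportions, which the text does not state explicitly; the function name and counts are hypothetical.

```python
def hardy_weinberg_gap(n_aa, n_ab, n_bb):
    """Observed minus expected genotype proportions (AA, AB, BB) under HWE."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (p * p, 2 * p * q, q * q)
    observed = (n_aa / n, n_ab / n, n_bb / n)
    return tuple(o - e for o, e in zip(observed, expected))


# Hypothetical genotype counts (not data from the study).
gaps = hardy_weinberg_gap(1400, 1900, 700)
# Assumed reading of the 3% criterion: absolute difference in proportions.
exceeds_3_percent = any(abs(gap) > 0.03 for gap in gaps)
print([round(gap, 4) for gap in gaps], exceeds_3_percent)
```

In a sample of more than 4000 people, a departure can reach P < 0.05 even when every gap is well below this threshold, which is why the absolute difference is a useful complement to the P value.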
The Figure shows a comparison of the allele frequencies in our study population and the HapMap-JPT data set. Among 88 polymorphisms, only 5 (ABCA1 rs2230808, COMT rs4680, IL-6 rs1800796, NOS3 rs2070744, and VDR rs2228570) showed a difference in allele frequencies of more than 0.1 between the 2 populations.

DISCUSSION
The present report describes the profiles of participants in a cross-sectional study within the J-MICC Study data set and the allele frequencies of 108 polymorphisms, with potential relevance to lifestyle and clinical factors, in their genomes. The allele frequencies for most polymorphisms in our study population were comparable to those in the HapMap-JPT data set.
It has been suggested that polymorphisms for APOA1 184Pro (C), ESR1 IVS1-351G, LCAT/SLC12A4 232Thr (A), and SCARB1 135Ile (A) do not exist in the Japanese population (http://www.ncbi.nlm.nih.gov/snp and personal communication); however, we included them in the present study to test this notion in a large sample (>4000 people). Our results confirmed that these minor alleles were indeed absent among Japanese.
Of the remaining 104 polymorphisms, 19 showed departures from the Hardy-Weinberg equilibrium with P values <0.05. In most cases, however, the absolute differences between the actual and expected frequencies were minimal. Thus, these apparently small P values could be accounted for by the large sample size and the multiple tests used in our study, and any errors in genotyping seemed unlikely to have resulted in substantial misclassification.
Although genotype data for only 45 people, at most, are available in the HapMap-JPT data set, the allele frequencies in the HapMap-JPT population and our study population were remarkably similar for most of the polymorphisms examined (Figure). For 45 individuals, the 95% confidence intervals
A major strength of the current study was that it provided a comprehensive collection of data on lifestyle and clinical factors. Because it is not easy to gain access to data on genotype distributions in a large Japanese population, our data might also be useful as a reference tool. However, because the participants in this study were recruited from various sources throughout Japan, associations of genotypes with lifestyle and clinical factors might vary between populations. There might also have been differences between institutions in terms of the measurement methods used in the clinical examinations, because we could not directly control the quality of the health examinations. These differences must be taken into consideration when analyzing and interpreting the data. In addition, some polymorphisms showed a substantial difference in MAF among the participating cohorts (Table 5). Yamaguchi-Kabata et al suggested that individuals from the Ryukyu Islands, including the Amami Islands, had genetic characteristics that differed considerably from those of individuals from the main islands of Japan, 22 which was consistent with our present results. Genetic variations among study areas should be taken into account in the data analysis. Furthermore, the generalizability of the study findings should be considered because the response rates were low in some areas. In most cases, however, the underlying biological mechanisms are unlikely to differ between the respondents and members of the general population.
The low response rate might have been due to the recruitment methods (mailing invitation letters or distributing leaflets to the general populations of 3 areas) or the strict procedures used to obtain informed consent.
In conclusion, this comprehensive data collection on lifestyle and clinical factors will be useful in elucidating gene-environment interactions and could provide an informative reference tool, particularly because free access to genotype data for a large Japanese population is not readily available.

Plus-minus values are means ± SDs. a Participants in the Fukuoka area were excluded from the analysis because they were not asked about educational attainment in the questionnaire. b Individuals who drank alcoholic beverages ≥1 day/week. c Not available in some study areas, as shown in Table 3.