Factor Analysis of Digestive Cancer Mortality and Food Consumption in 65 Chinese Counties

Dietary factors were analyzed in relation to regional differences in gastrointestinal (GI) tract cancer mortality rates in China. Sixty-five rural counties were selected from a total of 2,392 counties to represent a range of rates for the seven most prevalent cancers. Dietary data for the 65 selected counties were obtained from three-day household dietary records in 1983. Mortality rates for four digestive cancers (annual cases per 100,000, standardized truncated rates for ages 35-64) and per capita food consumption were analyzed by principal components factor analysis. Esophageal cancer was associated with poor areas and with a dietary pattern rich in starchy tubers and salt and low in meat, eggs, vegetables and rice. Stomach cancer appeared to be less associated with diet in this study, as indicated by its small model Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy, suggesting that other carcinogenic factors play a more important role in the development of this cancer in China. Colon and rectal cancer showed a close relation to diet: a pattern rich in sea vegetables, eggs, soy sauce, meat and fish, and low in milk and dairy products. Rapeseed oil was a more important risk factor for colon cancer than for rectal cancer. Rice, processed starch and sugar were closely associated with colon cancer, supporting the insulin/colon cancer hypothesis. J Epidemiol, 1999; 9: 275-284.

Between 1973 and 1975, cancer was a major cause of death in China, and digestive cancers were the most frequent. Carcinomas of the gastrointestinal (GI) tract are among the most common malignancies in both incidence and mortality. Nutritional factors play an important role in tumor development, and the strength of their influence varies with the location in the GI tract. Epidemiological studies that use the incidence or mortality of GI cancers, such as cancers of the esophagus, stomach, colon and rectum, as an endpoint require large numbers of subjects to achieve significant results 1-4). In the present study, dietary factors were analyzed in relation to regional differences in GI tract cancer mortality in China.
The occurrence of most human cancers is usually the result of exposure to many factors over a long period of time 5). In any assessment of etiological factors in the development of cancer, multiple dietary exposures should be considered. Multivariable statistical procedures are valuable for a cross-sectional study of this kind, because various hypotheses can be considered together and because the large number of pairwise correlation coefficients that such a study generates is difficult to interpret directly. Factor analysis is often used in exploratory data analysis to study the correlations among a large number of interrelated quantitative variables by grouping the variables into a few factors. After grouping, the variables within each factor are more highly correlated with one another than with variables in other factors, which helps explain the pattern of correlation coefficients within a set of observed variables.

MATERIALS AND METHODS
The cancer mortality data (annual cases per 100,000, standardized truncated rates for ages 35-64, for cancers of the esophagus, stomach, colon and rectum between 1973 and 1975) and the food consumption figures for 1983 were taken from the report published by People's Medical Publishing House, Beijing 6). Sixty-five rural counties were selected from a total of 2,392 counties on the basis of mortality rates for the seven most prevalent cancers: nasopharynx, esophagus, stomach, liver, colorectal, lung, and leukemia. After the range of male nasopharyngeal cancer mortality rates was divided into 20 segments, the county with the highest mortality in each segment was selected from among counties with a population over 100,000. Thus, 20 counties were selected for this site. This procedure was repeated for esophageal cancer, except that counties were taken only from segments in which no county had previously been selected. It was then repeated for liver cancer, and so on. Successively fewer counties were required for each new type of cancer, since more segments tended to contain counties that had already been selected, so the total number of counties selected was not 140 but only 65. When the selection resulted in geographic clustering of counties in a few provinces, such counties were replaced with the county with the second highest mortality in that segment. The county selection procedure was not intended to be random, but simply to cover a wide range of cancer death rates and a wide geographic scatter. The truncated cancer mortality rates are summarized in Table 1.
Dietary data were obtained from the 1983 three-day household diet survey in the same 65 rural counties. A random cluster sampling procedure was used to select the survey subjects. If the selected subject in a household was absent or declined to participate (less than 1 per cent of those initially identified), another subject in a neighboring household was selected. This procedure yielded about 13,000 subjects who participated in the three-day food intake measurements. In each of the 65 counties, 30 households were selected for the dietary survey. Food consumption was measured over three successive days, with one surveyor responsible for 4-6 households. The number of persons who ate meals was recorded each day. Pregnant or lactating women were noted, and children under two years of age were excluded from the survey. All subjects were classified into five grades of physical activity. All raw and processed foods remaining before the start of the survey and after the last meal were weighed and recorded. Food items purchased each day during the three-day survey period, as well as all snack foods, were measured and recorded. Food consumption was standardized per capita by the so-called reference man procedure.
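The county selection procedure described earlier in this section can be sketched in code. The Python sketch below is illustrative only and is not the survey's actual implementation: the `County` record, the function names, and the choice to compute each site's range over the eligible counties are assumptions, and the replacement step for geographic clustering is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class County:
    # Hypothetical record; field names are assumptions for illustration.
    name: str
    population: int
    rates: dict = field(default_factory=dict)  # cancer site -> mortality rate


def segment_index(rate, lo, hi, n_segments):
    """0-based index of the equal-width segment of [lo, hi] containing rate."""
    if hi == lo:
        return 0
    idx = int((rate - lo) / (hi - lo) * n_segments)
    return min(idx, n_segments - 1)  # the maximum rate falls in the last segment


def select_counties(counties, sites, n_segments=20, min_population=100_000):
    """Pick, site by site, the highest-mortality county in each uncovered segment."""
    eligible = [c for c in counties if c.population > min_population]
    selected = []
    for site in sites:
        rates = [c.rates[site] for c in eligible]
        lo, hi = min(rates), max(rates)
        # Segments of this site's range that already contain a selected county.
        covered = {segment_index(c.rates[site], lo, hi, n_segments)
                   for c in eligible if c.name in selected}
        best = {}
        for c in eligible:
            seg = segment_index(c.rates[site], lo, hi, n_segments)
            if seg in covered:
                continue
            if seg not in best or c.rates[site] > best[seg].rates[site]:
                best[seg] = c
        selected.extend(best[seg].name for seg in sorted(best))
    return selected
```

Because segments that already contain a selected county are skipped, each additional cancer site contributes fewer new counties, which is why seven sites with 20 segments each yield far fewer than 140 counties.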
First, a 'person-day', defined as the at-home consumption of all three meals by a single person on one day, was calculated using proportions of 0.2, 0.4, and 0.4 for the share of daily energy consumed at breakfast, lunch, and supper, respectively. Next, the number of person-days over the three-day survey period was calculated for each person. This number was then standardized per 'reference man', defined as an adult man 19-59 years of age, with a body weight of 65 kg, undertaking very light physical work (category 1 of physical activity), using conversion factors. Summation of the standardized person-days for each member of the household gave the three-day total reference-man-days for the household. The quantity of food consumed was then divided by the total number of standardized person-days to give the daily intake per reference man. The food intake data represent the average intake per reference man for the 30 households in each county and are summarized in Table 2.
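The reference man standardization can be illustrated with a short calculation. The Python sketch below uses the meal proportions given above (0.2, 0.4, 0.4); the function names and the conversion factor of 0.8 in the usage note are hypothetical, since the actual conversion factors are not reproduced in this paper.

```python
# Share of daily energy attributed to each meal (from the text above).
MEAL_WEIGHTS = {"breakfast": 0.2, "lunch": 0.4, "supper": 0.4}


def person_days(meals_by_day):
    """Person-days for one person: one entry per survey day,
    each entry listing the meals that person ate at home."""
    return sum(MEAL_WEIGHTS[meal] for day in meals_by_day for meal in day)


def reference_man_days(members):
    """Household total: each member's person-days times a conversion factor.

    members is a list of (meals_by_day, conversion_factor) pairs; the factor
    expresses the member relative to the reference man (factor 1.0).
    """
    return sum(person_days(meals) * factor for meals, factor in members)


def intake_per_reference_man(food_quantity, members):
    """Daily intake per reference man for one food over the survey period."""
    return food_quantity / reference_man_days(members)
```

As a usage example under these assumptions, a household with one reference man who eats every meal for three days (3.0 person-days) and a second member with a hypothetical factor of 0.8 who misses one lunch (2.6 person-days, or 2.08 standardized) totals 5.08 reference-man-days, so 5,080 g of a food consumed over the survey corresponds to 1,000 g per reference man per day.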
The correlation coefficients among the four digestive cancer mortality rates and per capita food consumption were analyzed by principal components factor analysis with Varimax rotation, using SPSS 8.0.1 J for Windows. Several rotation methods are available in SPSS. Varimax rotation is an orthogonal rotation, meaning that the resulting factors are uncorrelated. When factors are uncorrelated, a factor loading is the correlation of a variable with an underlying factor, and a squared factor loading is a coefficient of determination that specifies the proportion of the variance of a variable accounted for by a particular factor. Because factor analysis is used to obtain interpretable factors, each factor analysis model needs a comparatively large value of the model Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy. This was achieved by eliminating, one at a time, the variables with a small variable measure of sampling adequacy, which is printed on the diagonal of the anti-image correlation matrix. Kaiser (1974) divided the model KMO measure of sampling adequacy into six levels: excellent when the value is in the 0.90 to 1.00 range, good in the 0.80 to 0.90 range, middle in the 0.70 to 0.80 range, average in the 0.60 to 0.70 range, not sufficient in the 0.50 to 0.60 range, and poor when less than 0.50. The average level (0.60-0.70) was used as the standard in this study.
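For readers who wish to reproduce the adequacy check outside SPSS, the KMO measure can be computed from a correlation matrix as the sum of squared off-diagonal correlations divided by that sum plus the sum of squared off-diagonal partial correlations. The NumPy sketch below follows this standard definition; the function names are our own, and the elimination loop is a simplified assumption that drops whichever variable has the smallest measure, without protecting the cancer mortality variable.

```python
import numpy as np


def kmo(corr):
    """Return (model KMO, per-variable measures of sampling adequacy)."""
    corr = np.asarray(corr, dtype=float)
    inv = np.linalg.inv(corr)
    scale = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(scale, scale)  # partial correlations
    off = ~np.eye(corr.shape[0], dtype=bool)  # mask of off-diagonal cells
    r2 = np.where(off, corr ** 2, 0.0)
    a2 = np.where(off, partial ** 2, 0.0)
    msa = r2.sum(axis=0) / (r2.sum(axis=0) + a2.sum(axis=0))
    overall = r2.sum() / (r2.sum() + a2.sum())
    return overall, msa


def kaiser_level(value):
    """Label a KMO value with the six levels listed in the text."""
    for lower, label in [(0.90, "excellent"), (0.80, "good"), (0.70, "middle"),
                         (0.60, "average"), (0.50, "not sufficient")]:
        if value >= lower:
            return label
    return "poor"


def eliminate_until_adequate(corr, names, target=0.60):
    """Drop the variable with the smallest measure, one at a time,
    until the model KMO reaches the target (or two variables remain)."""
    corr = np.asarray(corr, dtype=float)
    names = list(names)
    overall, msa = kmo(corr)
    while overall < target and len(names) > 2:
        worst = int(np.argmin(msa))
        keep = [i for i in range(len(names)) if i != worst]
        corr = corr[np.ix_(keep, keep)]
        names.pop(worst)
        overall, msa = kmo(corr)
    return overall, names
```

In this sketch `kaiser_level(0.609)` returns "average" and `kaiser_level(0.510)` returns "not sufficient", matching the levels applied in this study.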

RESULTS
Pearson correlations among the four digestive cancer mortality rates and per capita food consumption are presented in Table 3.
Variables with a small variable measure of sampling adequacy were eliminated one at a time until the model KMO value reached the 0.60-0.70 range. In the factor analysis model of esophageal cancer mortality and food consumption, the KMO measure of sampling adequacy was 0.609 after eliminating consumption of wheat flour, legume and legume products, salt preserved and dried vegetables, and rapeseed oil. In the colon cancer model, the KMO value was 0.618 after eliminating wheat flour, legume and legume products, and salt preserved and dried vegetables. In the rectal cancer model, the KMO value was also 0.618 after eliminating wheat flour, starchy tubers, legume and legume products, salt preserved and dried vegetables, and rapeseed oil. In the factor analysis model of stomach cancer mortality and per capita food consumption, the model KMO measure of sampling adequacy was 0.510 after eliminating consumption of legume and legume products. At that point the variable measure of sampling adequacy for stomach cancer mortality was the smallest, which meant that stomach cancer mortality itself should be eliminated from the model; that is, stomach cancer mortality was less associated with per capita food consumption in this study.
The locations of the selected areas are shown in Fig. 1. Linxian county had the highest esophageal cancer mortality, and Jiashan county had the highest colon and rectal cancer mortality. To understand why a county is unusual, its per capita food consumption can be compared with the mean of all observed counties (Fig. 2).
Table 4 presents the rotated factor loadings of esophageal cancer mortality and food consumption. The factor loadings printed in bold italic type (the largest for each factor) indicate the variables most representative of each factor. In this model, 62% (0.784²) of the variance of esophageal cancer mortality was accounted for by factor 6, in which esophageal cancer mortality and consumption of starchy tubers and salt showed positive coefficients, while consumption of meat, eggs, vegetables and rice had negative coefficients. The other foods, having small loadings, did not significantly identify any dietary pattern.
The component plot in rotated space showed a close relationship between esophageal cancer and starchy tubers (Fig. 3). Starchy tubers and salt lay along the same dimension, and meat, eggs, vegetables, and rice lay on the opposite side. Linxian county consumed more starchy tubers and salt than average and less meat, eggs, and rice, though not fewer vegetables (Fig. 2).
Table 5 presents the rotated factor loadings of colon cancer mortality and food consumption. In this model, 62.4% (0.79²) of the variance of colon cancer mortality was accounted for by factor 2. Colon cancer mortality and consumption of sea vegetables, eggs, soy sauce, and rapeseed oil showed large coefficients. Colon cancer, sea vegetables, and rapeseed oil were plotted close together, while fruits and milk and dairy products lay on the opposite side (Fig. 4). Jiashan county had higher than average consumption of sea vegetables, eggs, soy sauce, rapeseed oil, meat, and rice, and lower than average consumption of salt and of milk and dairy products (Fig. 2).
Table 6 presents the rotated factor loadings of rectal cancer mortality and food consumption. In this model, 88% (0.94²) of the variance of rectal cancer mortality was accounted for by factor 2, in which the variable most representative of rectal cancer mortality was sea vegetable consumption. Rectal cancer and sea vegetables were plotted close together, and vegetables, milk and dairy products, fruits and other cereals were plotted on the opposite side (Fig. 5). Jiashan county also consumed fewer vegetables, fruits and other cereals than average (Fig. 2).
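The percentages of variance quoted in this section are squared rotated loadings. As a rough illustration of how such loadings arise, the NumPy sketch below extracts principal component loadings from a correlation matrix and applies a plain Varimax rotation. It is a minimal sketch with function names of our own choosing, and it omits the Kaiser normalization that SPSS applies by default, so it should not be expected to reproduce the values in Tables 4-6 exactly.

```python
import numpy as np


def pca_loadings(corr, n_factors):
    """Unrotated loadings: eigenvectors scaled by sqrt of their eigenvalues."""
    values, vectors = np.linalg.eigh(np.asarray(corr, dtype=float))
    order = np.argsort(values)[::-1][:n_factors]  # largest eigenvalues first
    return vectors[:, order] * np.sqrt(values[order])


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal Varimax rotation (no Kaiser normalization)."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated ** 3 - rotated @ np.diag((rotated ** 2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt  # product of orthogonal matrices stays orthogonal
        if s.sum() < criterion * (1 + tol):
            break
        criterion = s.sum()
    return loadings @ rotation


def variance_explained(loading):
    """Share of a variable's variance accounted for by one orthogonal factor."""
    return loading ** 2
```

For the loadings quoted above, `variance_explained(0.79)` gives about 0.624 and `variance_explained(0.94)` about 0.884, in line with the reported 62.4% and 88%. Because the rotation is orthogonal, the sum of squared loadings for each variable (its communality) is unchanged by the rotation.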

DISCUSSION
Epidemiological evidence, based on statistical associations, points to the major importance of indirect carcinogenesis caused by specific nutritional deficiencies and excesses 7).

Although there were limitations in the data analyzed, the use of factor analysis produced meaningful results, especially when [...]. The cancer mortality rates between 1973 and 1975 might depend on the pattern of disease causes that existed some years previously, and these earlier patterns would be somewhat different from the pattern of dietary characteristics in 1983. The dietary pattern in these rural areas of China, however, was simple in food variety and had probably remained simple and similar for many years, since the foods consumed in each area were produced locally and depended on reasonably stable local crop conditions.
The major risk factors for esophageal cancer are smoking and the habit of drinking strong alcohol 2). A dietary pattern characterized by starchy tuber consumption and high salt intake together with low consumption of meat, eggs, vegetables, and rice was suggested to be a risk factor in this study. Counties with high esophageal cancer mortality tended to be located in poor areas where starchy tubers were a major energy source at that time in China. This is consistent with the observation that regions with a high incidence of esophageal cancer are generally located in poor parts of the world, where inhabitants subsist on a diet high in starch and almost without fresh fruits and vegetables. A tendency to eat meat less frequently and a dependence on salted foods were also common in these areas. Intake of meat and eggs and a higher proportion of rice in the grain ration seemed to be protective factors 3,8-11).
Among the correlations between stomach cancer mortality and per capita food consumption, only the correlation with processed starch and sugar consumption was significant. Some reports have found that stomach cancer is associated with starch consumption and that added sugar also increases the risk of gastric cancer 12,13). In our study, stomach cancer mortality did not show a significant relation to salt intake, which has been confirmed to be positively associated with stomach cancer death in Japan 14). This may point to a problem in evaluating the level of salt intake from the food composition table in China 15). The results of this factor analysis suggested that some other carcinogenic factors, for example Helicobacter pylori infection, a positive family history, smoking, and nitrite in foods, would play a more important role than diet in the development of stomach cancer 16-19).
The variance of colon and rectal cancer mortality accounted for by one factor in each factor analysis was 62.4% and 88%, respectively. The proportion of colorectal cancer attributed to dietary habits is high 20). The two dietary patterns were similarly rich in sea vegetables, eggs, soy sauce, meat, and fish, and low in milk and dairy products. Howell et al. 21) reported that cancer rates of the colon and rectum were associated with high consumption of meats, eggs and fats. Frequent consumption of eggs and fat-rich foods such as meats would be risk factors for colorectal cancer. Consumption of fish has also been positively associated with colorectal cancer 14), and colorectal cancer risk was weakly inversely associated with the consumption of milk and dairy products 22). It was difficult to explain the consumption of sea vegetables and soy sauce as a risk for colorectal cancer.
This study showed some differences between the dietary factors for colon and rectal cancer at the same KMO level. Several inconsistencies remain in colorectal cancer risk factors, especially with respect to the influence of some food groups 20). Frequent intake of eggs and meats, which are sources of animal fat rich in saturated fatty acids, was an important risk factor for colon cancer but only slightly related to rectal cancer, while low consumption of vegetables was closely associated with an increased risk of rectal cancer; these findings are consistent with the results of a case-control study in Shanghai, China 23). Rapeseed oil (rich in monounsaturated fatty acids but poor in polyunsaturated ones) was also a risk for colon cancer compared with oil other than rapeseed (peanut, sesame, cotton seed and soybean oil, which are abundant in polyunsaturated fatty acids). Liu K et al. 24) reported that intakes of total fat, saturated fat, and monounsaturated fat were highly correlated with colon cancer incidence rates. Other studies showed that the incidence of colon cancer was significantly higher in rats fed oil rich in saturated and monounsaturated fatty acids than in rats fed oil rich in polyunsaturated fatty acids 25). Polyunsaturated fat was not associated with colon cancer 26). In addition, high-temperature cooking with unrefined Chinese rapeseed oil resulted in higher emissions of 1,3-butadiene, benzene, acrolein, formaldehyde, and other related compounds than cooking with Chinese soybean oil, peanut oil, or linolenic, linoleic, and erucic fatty acids. Combining experimental and epidemiological findings, it has been suggested that heated rapeseed oil increases lung cancer risk through inhalation of mutagenic chemicals in the emissions 27,28). Similarly, the genotoxicity of these chemicals would also increase colon cancer risk when they are eaten. Consumption of rice, processed starch and sugar was closely associated with colon cancer, leading us to reconsider the role of starchy foods and refined sugar in the insulin/colon cancer hypothesis 20).

ACKNOWLEDGEMENT
We are grateful to LI JUNYAO (Department of Cancer Epidemiology, China Cancer Institute, Chinese Academy of Medical Sciences, Beijing, People's Republic of China), who gave us the opportunity to analyze the present data, by sending the late Dr. Hirayama the book (Chen Junshi, et al, Diet, Lifestyle and Mortality in China: A Study of the Characteristic of 65 Chinese Counties, People's Medical Publishing House, Beijing, 1991).
The national retrospective 1973-1975 mortality survey was undertaken in China in 1976 by the China National Office of Cancer Control and Research. It included 850 million people residing in about 2,400 counties and represented over 96 per cent of the population at the time of the survey 6).

Table 1.
Cancer mortality rates of four digestive cancer sites a.
a Annual cases per 100,000; standardized truncated rates for ages 35-64, standardized by the average annual population of China between 1973 and 1975.
b Sixteen counties did not separately identify colon and rectal cancers; only one county had no deaths from colon cancer in the mortality investigation.

Table 2.
Intake of foods in 19 food categories.
a Oil other than rapeseed included peanut, sesame, cotton seed and soybean oil.
b Vegetables included light colored and green vegetables.

Table 3.
Pearson correlations between digestive cancer mortality rates and per capita food consumption a.

Table 4.
Rotated component matrix of esophageal cancer and foods analysis a.
a Principal components factor analysis with Varimax rotation based on Pearson correlations among esophageal cancer mortality and food consumption; cases excluded pairwise.
Figure 3. Component plot in rotated space.

Table 5.
Rotated component matrix of colon cancer and foods analysis a.
a Principal components factor analysis with Varimax rotation based on Pearson correlations among colon cancer mortality and food consumption; cases excluded pairwise.

Table 6.
Rotated component matrix of rectal cancer and foods analysis a.
a Principal components factor analysis with Varimax rotation based on Pearson correlations among rectal cancer mortality and food consumption; cases excluded pairwise.
Figure 5. Component plot in rotated space. Rectal cancer and foods analysis.