A Simple Food Frequency Questionnaire for Japanese Diet - Part I. Development of the Questionnaire, and Reproducibility and Validity for Food Groups

We developed a simple food frequency questionnaire (FFQ) based on one-day dietary records (DRs) among 1001 subjects in Nagoya, Japan. A total of 97 foods and dishes were selected through a two-step procedure: first by ranking food items according to their contribution to the population intake of nutrient variables, and second by stepwise multiple regression analyses with individual food items as the independent variables and total nutrient intake as the dependent variable. For simplicity, questions on portion sizes were not included except for a few selected food items, which resulted in a short time (about 20 minutes) to complete the questionnaire. This FFQ was validated for food groups by referring to four 4-day DRs among 88 men and women in central Japan, from 1996 to 1997. The energy-, sex- and age-adjusted test-retest correlation coefficients between the two FFQs administered at a one-year interval ranged from 0.34 to 0.78. The de-attenuated, energy-, sex- and age-adjusted correlation coefficients between the second FFQ and the DRs were larger than 0.40 for most food groups, indicating that this simple FFQ is sufficiently valid for use in epidemiological surveys. J Epidemiol, 1999; 9 : 216-226.

expensive color picture booklet of food samples. Detailed questionnaires might be required in Japan, in particular, to evaluate food/nutrient intakes accurately, because the contemporary Japanese diet includes a large variety of foods or dishes: traditional Japanese, Western and Chinese. Nevertheless, complex questionnaires would be expensive to apply and would also place a greater burden on subjects, which would certainly discount the merits of the food frequency method. Recently, some attempts have been made to develop much simpler questionnaires 5). We therefore also tried to develop a simple, self-administered FFQ that would be able to assess individual diet with reasonable validity. This questionnaire was primarily developed for a case-control study of diet and bladder cancer in Japan.
Most studies on the reproducibility and validity of FFQs have been analyzed on the basis of nutrient intakes [6][7][8][9][10][11][12]. The reproducibility and validity for food groups 13-17) are, however, also important, since findings expressed by food group might be more directly useful for formulating dietary recommendations than those expressed by nutrient. Such analyses may also detect shortcomings in FFQs, thus suggesting specific areas that should be improved at subsequent revisions of a questionnaire v.
We therefore conducted a validation study for both food groups and nutrients.
In this communication, we describe the process of developing our FFQ, and its reproducibility and validity for food groups in particular. The reproducibility and validity for nutrients will be discussed in an accompanying paper 14. The validation study was scheduled as illustrated in Figure 1.

Development
The study started in June 1996, when the first FFQ (FFQ1) was distributed to the subjects. Four 4-day DRs (DR1, DR2, DR3 and DR4) were then conducted at intervals of three months, and the subjects were asked to fill in the second FFQ (FFQ2) after the final DRs. Responses to FFQ1 were compared with those to FFQ2 to assess the reproducibility, and the questionnaire was validated by referring to the 16-day (four 4-day) DRs as the standard.
The dietary recording was carried out by each subject with the assistance of the responsible students/graduates, following a specific standardized procedure. Foods and beverages consumed (excluding water and Japanese tea) were weighed and recorded on a dietary form specifically designed for this validation study. When foods or beverages could not be weighed (for example, when eaten out), the subjects were instructed to describe the foods/beverages in detail, and the portion sizes were estimated from the description. The food records were initially coded by the students/graduates, but all of the records were thoroughly reviewed independently by two other dietitians. A dietitian telephoned the responsible students/graduates to resolve ambiguities. Days with special events such as New Year's holidays were excluded from the dietary recording period.
Statistical Analysis
The food records were coded according to the Japanese food composition table 22,23). Foods were categorized into 18 food groups, and vegetables were further divided into green-yellow vegetables and others. The food composition table, supplemented by another source 24), was used to compute the energy intake of the subjects.
We used Pearson and intraclass correlation coefficients between the two FFQs (FFQ1 and FFQ2) to assess the reproducibility for food groups. The correlations adjusted for energy intake, sex and age were also calculated. This adjustment was performed by computing residuals from regression models. All values were loge-transformed in advance to improve their normality.
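The residual-based adjustment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names (solve_linear, residuals, pearson, adjusted_correlation) are hypothetical, each questionnaire is adjusted by its own log energy estimate, and all intakes are assumed to be strictly positive so that logarithms are defined.

```python
import math


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def residuals(y, covariates):
    """Residuals of y after least-squares regression on an intercept and covariates.

    `covariates` is a list of columns, each as long as y.
    Assumes the columns are not collinear.
    """
    rows = [[1.0] + list(values) for values in zip(*covariates)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    return [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]


def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def adjusted_correlation(ffq1, ffq2, energy1, energy2, sex, age):
    """Pearson r between two log intakes, each adjusted for energy, sex and age.

    Assumption: positive intakes, and each FFQ is adjusted by its own energy.
    """
    res1 = residuals([math.log(v) for v in ffq1],
                     [[math.log(e) for e in energy1], sex, age])
    res2 = residuals([math.log(v) for v in ffq2],
                     [[math.log(e) for e in energy2], sex, age])
    return pearson(res1, res2)
```

Because the residuals are uncorrelated with energy, sex and age by construction, the resulting coefficient reflects agreement in food group intake that is not explained by these covariates.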
The questionnaire was validated by referring to the 16-day DRs. Pearson correlation coefficients between the FFQ and the DRs adjusted for energy intake, sex and age were computed as well as crude ones. The crude and the energy- and age-adjusted coefficients were also presented by sex.
Within-person day-to-day variation in individual intake can be quite large for many food groups and attenuates correlations between the FFQ and the DRs. We therefore statistically adjusted the Pearson correlation coefficients between the two methods for this attenuation. If one-day values were treated as random units of observation, as was done in the analyses for nutrients la), population distributions of the consumption levels for food categories could not be easily approximated to normality. This is because the one-day values from DRs can be considered a mixture of two different types of distribution, that is, a binomial distribution and a right-skewed (approximately log-normal) one for non-zero values 1). Only after averaging the amounts of intake for food groups over 8 days could the population distributions reasonably be approximated to normality by natural logarithmic transformation. Two 4-day DRs, the first and the second or the third and the final ones, were thus treated as a unit of observation when analyzing food groups. The natural logarithms of these 8-day averages were adjusted for total energy intake, sex and age by using regression models 25). We assessed the within-person and between-person components of variance in 8-day food group intakes obtained from the DRs by one-way analysis of variance, and corrected (de-attenuated) the Pearson correlation coefficients between food group consumption based on the FFQs and that based on the DRs to take within-person variability into consideration 26). The 95% confidence intervals of the de-attenuated coefficients were computed using the formula proposed by Rosner and Willett 26). This de-attenuation was not made for the correlation coefficients by sex, since the small sample size precluded us from estimating the sex-specific, de-attenuated coefficients with reasonable precision.
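The correction for within-person variation can be illustrated with the commonly used form r_true = r_obs * sqrt(1 + lambda / n), where lambda is the ratio of within-person to between-person variance and n is the number of replicate units per subject (two 8-day units, on our reading of the design above). The sketch below assumes balanced data with the same number of replicates for every subject; the function names are hypothetical, and the confidence-interval formula of Rosner and Willett is not reproduced here.

```python
import math


def variance_components(replicates):
    """Within- and between-person variance from a balanced one-way ANOVA.

    `replicates` holds one list per subject, each with the same number n of
    replicate values (for example, two log-transformed 8-day averages).
    """
    k = len(replicates)          # number of subjects
    n = len(replicates[0])       # replicates per subject
    means = [sum(r) / n for r in replicates]
    grand = sum(means) / k
    ms_between = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    ms_within = sum((v - m) ** 2
                    for r, m in zip(replicates, means)
                    for v in r) / (k * (n - 1))
    s2_within = ms_within
    s2_between = (ms_between - ms_within) / n
    return s2_within, s2_between


def deattenuate(r_obs, s2_within, s2_between, n):
    """Correct an observed correlation for within-person variation.

    Assumes s2_between > 0; a non-positive estimate means the data cannot
    support the correction.
    """
    lam = s2_within / s2_between
    return r_obs * math.sqrt(1.0 + lam / n)
```

For example, with a within-person variance of 2, a between-person variance of 8 and two replicate units, the variance ratio is 0.25 and an observed coefficient of 0.40 is scaled by sqrt(1.125), that is, by about 6%.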

Development of the Simple Food Frequency Questionnaire
A total of 647 foods and 168 dishes were listed from the one-day DRs. Of them, the number of food items required to cover over 90% of the total population intake for each nutrient variable was as follows: energy 186, protein 208, fat 142, carbohydrate 119, calcium 159, iron 210, potassium 200, vitamin A 59, retinol 27, carotene 37, vitamin D 36, vitamin C 66, SFA 110, MUFA 107, PUFA 109, cholesterol 76, vitamin E 147, dietary fiber 94, magnesium 122, zinc 104, isoleucine 149, leucine 144, tryptophan 149 and valine 147. In all, 357 food items were needed to cover 90% of the total intake for energy and the nutrients.
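The first selection step, ranking items by their contribution to population intake and keeping those needed to reach the coverage target, can be sketched as follows. The function names, the food names and the numbers in the usage example are hypothetical; the 90% target is taken from the text.

```python
def items_for_coverage(contributions, target=0.90):
    """Foods needed, in descending order of contribution, to reach the target.

    `contributions` maps each food to the total amount of one nutrient that
    the whole study population obtained from that food.
    """
    total = sum(contributions.values())
    ranked = sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)
    selected, covered = [], 0.0
    for food, amount in ranked:
        if covered >= target * total:
            break
        selected.append(food)
        covered += amount
    return selected


def candidate_list(by_nutrient, target=0.90):
    """Union of the foods selected for each nutrient, as a sorted list."""
    candidates = set()
    for contributions in by_nutrient.values():
        candidates.update(items_for_coverage(contributions, target))
    return sorted(candidates)


# Hypothetical toy data: population intake of two nutrients by food.
example = {
    "energy": {"rice": 50, "bread": 30, "noodles": 15, "potato": 5},
    "vitamin C": {"orange": 60, "spinach": 35, "rice": 5},
}
print(candidate_list(example))
```

Taking the union across nutrients is why the combined list is longer than the list for any single nutrient: a food that matters little for energy may still be a major source of another nutrient.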
By collapsing several similar food items into one, 281 food items were listed and included in the stepwise multiple regression analyses. The multiple regression models, accounting for R² of 80%, identified 209 foods or dishes to predict 80% of the between-person variability for all the 24 nutrients considered. We further combined some similar foods and dishes, and final-
Table 2. Pearson (r) and intraclass (r) correlation coefficients for food groups between the two food frequency questionnaires (n=86) a).
a) Two subjects were excluded, because their consumption frequency was missing or less than once a month for more than 2/3 of the food items in FFQ1. All values were loge-transformed to improve normality.
The Pearson correlation coefficients between daily consumption of food groups based on the FFQs and the four 4-day DRs are summarized in Table 3. The correlations between FFQ2 and the DRs were stronger than those between FFQ1 and the food records. When adjusting for energy intake, sex and age, the correlations were slightly improved only for those between the second FFQ and the DRs. The de-attenuated, energy-, sex- and age-adjusted coefficients between FFQ1 and the DRs ranged from 0.19 for sugars and sweeteners or vegetables other than green-yellow ones to 0.71 for milk and dairy products or fruits (median=0.45), while those between FFQ2 and the DRs ranged from 0.16 for potatoes and starches to 0.83 for milk and dairy products (median=0.56).
Table 4 presents the sex-specific Pearson correlation coefficients between daily consumption of food groups based on the FFQs and the four 4-day DRs. In males, the energy- and age-adjusted (but not de-attenuated) coefficients between FFQ2 and the DRs ranged from 0.09 for potatoes and starches or seaweeds to 0.75 for milk and dairy products (median=0.43). In females, they ranged from 0.09 for potatoes and starches to 0.69 for milk and dairy products (median=0.45). The coefficient for breads was higher in males than in females, while those for vegetables and meats were lower in males.

Development of the Simple Food Frequency Questionnaire
To keep the food list in our FFQ as simple as possible, we used a two-step procedure to select food items. The foods and dishes initially selected on the basis of their percent contribution to the population intake of energy or nutrients could be further reduced in number by about 25% by using the stepwise regression analyses. As many as 97 food items, however, had to be included in the final FFQ, though much more limited food lists have been developed for FFQs in Western countries 6,7,27). This might be primarily due to the large variety of modern Japanese diets. In particular, the sources of energy, protein, iron and potassium were widely distributed over many foods and dishes in the present analyses.
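The second step of the procedure can be illustrated with plain forward selection: at each round, add the food whose inclusion raises R² the most, and stop once the target share of between-person variability is explained. This is a simplified stand-in, since the entry and removal criteria of the stepwise analyses are not given here; the function names and the toy data in the usage example are hypothetical, and the food columns are assumed not to be collinear.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(y, columns):
    """R² of y regressed on an intercept plus the given predictor columns."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
                   for r, yi in zip(rows, y))
    return 1.0 - ss_resid / ss_total


def forward_select(y, foods, target_r2=0.80):
    """Add foods one at a time until the model explains target_r2 of y.

    `y` is each subject's total intake of one nutrient; `foods` maps a food
    name to that nutrient's intake from the food, per subject.
    """
    chosen, best_r2 = [], 0.0
    remaining = sorted(foods)
    while remaining and best_r2 < target_r2:
        scores = {name: r_squared(y, [foods[c] for c in chosen] + [foods[name]])
                  for name in remaining}
        best = max(remaining, key=lambda name: scores[name])
        chosen.append(best)
        remaining.remove(best)
        best_r2 = scores[best]
    return chosen, best_r2


# Hypothetical toy data: six subjects, three candidate foods.
foods = {
    "a": [10, 0, 5, 8, 2, 7],
    "b": [1, 2, 1, 2, 1, 2],
    "c": [0, 3, 1, 0, 2, 1],
}
total = [3 * a + c for a, c in zip(foods["a"], foods["c"])]
print(forward_select(total, foods))
```

In the toy data one food alone explains nearly all of the between-person variability, so selection stops after a single item. This mirrors how the regression step can shorten a list that was built from percent contribution alone.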
Adding specific questions on portion sizes would double the number of questions in the FFQ and impose a greater burden on subjects. Among Western populations, individual estimation of portion sizes has not necessarily improved the validity of FFQs to a substantial degree ). This might indicate that portion sizes themselves are of minor significance compared with frequencies, or that individual portion sizes could not be estimated correctly. In fact, a simple FFQ with no portion size questions, which was developed by Pietinen et al. 7), could estimate energy-adjusted nutrient intake with reasonable validity.
For these reasons, we did not include questions on portion sizes in our FFQ for most of the food items, which shortened both the FFQ itself and the time required to complete it.

Reproducibility and Validity for Food Groups
The correlations between FFQ2 and the DRs were stronger than those between FFQ1 and the food records. This is to be expected, because the FFQ refers to the diet during the preceding year and the DRs were carried out between FFQ1 and FFQ2.
In our validation study for food groups, the correlation coefficients between intakes estimated by the FFQs and the DRs were adjusted for energy intake. It has been extensively argued that nutrient intakes should be adjusted for energy in epidemiological analyses. Intakes of most nutrients tend to be positively associated with total caloric intake. Specific nutrients may be associated with disease simply due to their correlation with energy intake. If only crude values are used, therefore, it may be unclear whether an association of a given nutrient intake with a disease is attributable to the nutrient per se. These arguments seem to hold also for food group intakes, since most of them are positively related to the total amount of food consumption, which is well represented by total energy intake. Thus, food group intakes adjusted for caloric intake would be useful in epidemiological investigations and therefore also in validation studies for FFQs. Dietary advice often begins with determining one's optimal energy intake and is followed by considering the best composition of his/her diet. Thus, calorie-adjusted food group intakes might be relevant also from a practical point of view. We also adjusted the coefficients for sex and age. This is because these variables are almost always controlled in epidemiological analyses 1). Between-person variation in dietary intake due to sex and age would increase the observed correlations between diet and disease in epidemiological studies, but this increase in correlation is removed in analyses adjusting for sex and age. It will be necessary, therefore, to present the coefficients adjusted for these covariates to validate an FFQ for nutritional epidemiology. We also presented sex-specific correlations, since epidemiological data are often analyzed by sex.
We averaged food group intakes over 8 days, and treated their natural logarithms parametrically, that is, regarded them as normally distributed variables. Most of the food group intakes could reasonably be approximated to normality after these transformations, but some deviations from the normal distribution were still observed for breads, confectioneries, nuts and seeds, and alcoholic beverages. We therefore computed Spearman's rank correlation coefficients between daily intakes based on the FFQs and the four 4-day DRs for these four food groups, and obtained figures similar to the crude Pearson correlation coefficients shown in Table 3. This methodological limitation, however, should be kept in mind when interpreting the adjusted or de-attenuated coefficients for these food groups.
The correlation coefficients for reproducibility and validity for food groups were not as high as those observed in previous studies 13-17). The correlation between the FFQs and the DRs was found to be considerably weak for such food groups as potatoes and starches, sugars and sweeteners, and seaweeds. This might be ascribable to the small number of food items included in the questionnaire for these food groups. Only a small proportion of these food groups was seemingly covered by our FFQ, as indicated by the low mean intake (less than 50% of the DRs) based on the FFQ. In general, intakes of food groups, as well as those of nutrient variables, should be considered when selecting food items to be included in FFQs in order to increase the validity at the food group level. Some foods were collapsed into one question, in particular for potatoes and starches, fishes, and seaweeds. For example, we asked how often "potatoes (white potato, taro and sweet potato)" were eaten during the preceding year. These "combined" questions might be difficult to answer) and might result in lower reproducibility and validity.
Table 4. Pearson correlation coefficients (r) between daily consumption of food groups based on food frequency questionnaires (FFQs) and four 4-day dietary records (DRs) by sex a).
a) Two 4-day dietary records, the first and second ones or the third and final ones, were treated as a unit of observation in the analysis for food groups. All values were loge-transformed to improve normality.
b) Two subjects were excluded, because their consumption frequency was missing or less than once a month for more than 2/3 of the food items in FFQ1.
Consumption of rice or milk and dairy products was overestimated by more than 30% (Table 1), though estimates derived from the FFQ showed excellent correlations with intakes based on the DRs for these food groups. Contrary to our expectation, it was not easy to standardize portion sizes of rice and milk. Four serving sizes for rice (small, medium and large rice bowls ("chawan"), and a China bowl ("donburi")) were listed in our FFQ. These serving sizes corresponded well to the actual distribution of portions that appeared in the DRs. Nevertheless, the participants did not necessarily select appropriate portion sizes in the FFQ; the responses were overly concentrated on medium rice bowls in males and on small ones in females. Intelligible descriptions of portions, including pictures, would be required to estimate rice consumption more precisely. Rice in some mixed dishes may be more difficult to quantify with FFQs than plain rice, and therefore might also be overestimated.
The standard portion size for milk had been set at 200 g. In fact, milk was most frequently consumed in 200 g portions in the validation study, reflecting the Japanese size of bottled milk. Milk was, however, also often drunk in portions of less than 200 g, perhaps in a glass or cup, while it was only rarely consumed in portions of more than 200 g. The resulting overestimation of intake may be ascribable to the limitations of FFQs without questions on portion sizes. Other dairy products had more varied portions, and it would not be easy to determine "standard" serving sizes.
The validity of our FFQ for vegetables and meats was poorer in males than in females. This may be because these foods are frequently included in mixed dishes. Men are less likely to cook their own meals in Japan, and it would be difficult for those who do not cook to tell how often vegetables and meats are used in mixed dishes. Pure dish-based FFQs 2) would be required to improve the validity for these groups of foods in males.
Another issue in our FFQ is that the standard recipes for mixed dishes were defined by dietitians, and therefore might differ from those for dishes eaten by the target populations. Dish databases based on actual dietary records or recalls should be prepared to improve the validity of the FFQ.
The above-mentioned weaknesses of our FFQ, which were suggested by the validation study, should be taken into account when developing subsequent questionnaires. Nevertheless, the de-attenuated, energy-, sex- and age-adjusted correlation coefficients between FFQ2 and the four 4-day DRs were larger than 0.40 for most food groups, indicating the usefulness of our simple FFQ for ranking respondents according to food group consumption in epidemiological surveys among the middle-aged and the elderly.
In summary, we developed a simple FFQ based on one-day DRs. A total of 97 foods and dishes were selected through a two-step procedure: first by ranking food items according to their contribution to the population intake of energy and nutrients, and second by stepwise multiple regression analyses with individual food items as the independent variables and total nutrient intake as the dependent variable. For simplicity, questions on portion sizes were not included except for a few selected food items, resulting in a short time to complete the questionnaire. The FFQ was validated for food groups by referring to four 4-day DRs. The correlation coefficients between the FFQ and the DRs were larger than 0.40 for most food groups, indicating that our FFQ is sufficiently valid for use in epidemiological studies among the middle-aged and the elderly in Japan.

Figure 1. Schedule of the validation study.

Table 1. Mean daily consumption of food groups (g/day) based on four 4-day dietary records (DRs) and the first/second food frequency questionnaires (FFQ1/FFQ2).

Table 3. Pearson correlation coefficients (r) between daily consumption of food groups based on food frequency questionnaires (FFQs) and four 4-day dietary records (DRs) a).