Food Science and Technology Research
Online ISSN : 1881-3984
Print ISSN : 1344-6606
ISSN-L : 1344-6606
Original papers
A milk coffee flavor lexicon developed based on the perceptions of Japanese consumers and its application to check-all-that-apply questions
Shinichiro Hatakeyama, Toshiyoshi Kawaguchi, Takuya Yamaguchi, Daisho Yoshihara, Kana Takahashi, Masayuki Akiyama, Reiko Koizumi, Kazuhiro Miyaji, Yasuhiro Takeda, Mito Kokawa, Yutaka Kitamura

2023 Volume 29 Issue 3 Pages 197-209

Abstract

A flavor lexicon of milk coffee was developed based on the perceptions of Japanese consumers. To collect an exhaustive set of sensory terms for milk coffee, sixty types of samples were prepared. The samples were presented to 203 untrained panelists, and 456 sensory terms were collected. Following qualitative and quantitative screening, 53 terms were selected. The validity of the 53 terms was confirmed by check-all-that-apply (CATA) questions for six samples. The results indicated that the frequency of use of 42 terms was significantly different for the samples, and all terms were used at least once. In addition, the configuration of the six samples in the correspondence analysis of the CATA results was similar to that for a quantitative descriptive analysis conducted on the same samples. The developed flavor lexicon and its application to CATA will provide a new means for evaluating milk coffee from the perspective of Japanese consumers.

Introduction

Coffee is one of the most widely consumed beverages in the world, as people enjoy its flavor. Flavor lexicons or wheels are useful for describing the flavor characteristics of coffee. Several coffee flavor lexicons have been developed, which differ depending on the country, purpose, materials, and methods used (Seo et al., 2009; Hayakawa et al., 2010; Chambers et al., 2016; Spencer et al., 2016), and are widely used by the coffee industry.

However, the current lexicon (Hayakawa et al., 2010) is mainly applicable to black coffee, whereas coffee is often consumed after adding milk and/or sugar. Since the flavor characteristics of milk coffee are influenced by the amount and balance of milk and/or sugar, flavor lexicons optimized for the evaluation of black coffee are less useful for the evaluation of milk coffee, yet there are no studies regarding the terminology to describe the flavor characteristics of milk coffee.

Furthermore, most conventional flavor lexicons were developed using a highly trained panel (Lawless and Civille, 2013). The resulting lexicons may thus contain terms that are difficult to understand and are unfamiliar to consumers. Milk coffee is usually provided to the end consumer in coffee shops or supermarkets as ready-to-drink (RTD) products, and thus it is preferable that a flavor lexicon for milk coffee consists of terms that are easy to understand and use. Therefore, the first objective of this study was to develop a flavor lexicon that can be used by consumers for evaluating milk coffee.

RTD milk coffee products are popular in the Japanese beverage market. To develop increasingly attractive products, manufacturers try to analyze the flavor characteristics of products and their relationship to consumer preferences. A descriptive analysis, such as a quantitative descriptive analysis (QDA) (Stone and Sidel, 2004), is useful for describing the flavor characteristics of products. However, the enrollment and retention of a trained panel are costly and time-consuming for manufacturers (Ares et al., 2011), and a QDA does not provide information about consumer preferences because QDA panels typically consist of relatively small numbers of trained panelists (Ares and Varela, 2017; Mello et al., 2019). Analyzing the correlations between flavor characteristics and consumer preferences, therefore, requires both a QDA and a survey of preferences, followed by statistically calculating the effects of flavor characteristics on hedonic data. Such analyses are time-consuming and are thus difficult to apply to product development.

In this respect, check-all-that-apply (CATA) questions are a useful method. Consumer CATA panelists are presented with a list of attributes and asked to check which words appropriately describe their experience with the samples (Meyners and Castura, 2014). Since CATA can be combined with hedonic questions, correlations between described flavor characteristics and hedonic data can be calculated quickly (Meyners and Castura, 2014), allowing CATA to contribute to product development by saving time and increasing efficiency. Several comparative studies showed that CATA can describe flavor characteristics (Ares et al., 2015; Mello et al., 2019), but there are few studies on the terminology for CATA regarding milk coffee. Appropriate evaluation of the flavor characteristics of milk coffee with CATA and obtaining consumer perceptions require the development of consumer-oriented optimized sensory terms to evaluate the flavor characteristics of milk coffee.

Given the above, this study aimed to: 1) develop a flavor lexicon that can be used by consumers to evaluate milk coffee using terms provided by consumers, and 2) assess the utility of the lexicon terms in CATA questions and compare the CATA results with those of a QDA.

Materials and Methods

Coffee samples and roasting  Green coffee beans from Brazil, Colombia, Ethiopia, Indonesia, and Vietnam were used; the commodity names were Brazil no. 2, Colombia Supremo, Ethiopia Sidamo grade 4, Indonesia Mandheling grade 1, and Vietnam Robusta grade 1, respectively. All beans were roasted to an L value of 18 or 23 using a Probat L-5 roaster (Emmerich, Germany). The L value of the ground roasted coffee beans (particle size < 500 µm) was measured using a ZE-2000 color meter (Nippon Denshoku Industries Co., Ltd., Tokyo, Japan). For extraction, roasted beans were ground by Tokyo Allied Coffee Roasters Co., Ltd. (Tokyo, Japan) to a particle size in the range of 1 000 to 2 000 µm using a GRN-1041 grinder (Nippon Granulator Co., Ltd., Shizuoka, Japan), and then 2 100 g of ground roasted coffee beans was extracted using a 10.5 L column extractor (Towa Techno Co., Ltd., Tokyo, Japan). Coffee extract (8 000 g, about 6.3 Brix) was obtained using reverse osmosis water at 100 °C and immediately cooled to 10 °C or lower.

Sample preparation  The lexicon was developed by preparing 60 types of milk coffee samples by combinations of five parameters: coffee beans (Brazil no. 2, Colombia Supremo, Ethiopia Sidamo grade 4, Indonesia Mandheling grade 1, or Vietnam Robusta grade 1), degree of roasting (L value 18 or 23), sugar (with or without), composition (milk-rich or coffee-rich type), and milk fat (full or low fat) (Table 1). Sugar (Hokkaido Sugar Co., Ltd., Tokyo, Japan) was dissolved in reverse osmosis water. Full-fat milk (nonfat milk solids: 8.3 %, milk fat: 3.5 %) and low-fat milk (nonfat milk solids: 8.4 %, milk fat: 1.5 %) were obtained commercially (Morinaga Milk Industry Co., Ltd., Tokyo, Japan) and were pasteurized at 130 °C for 2 s with a plate-type heat exchanger. The materials and reverse osmosis water were mixed and stored in the dark at 5 °C or lower until use.

Table 1 Samples used to develop the lexicon and their composition.
                    Full-fat milk, without sugar   Low-fat milk, without sugar     Full-fat milk, with sugar
Full-fat milk (a)      70.0   40.0   70.0   40.0        -      -      -      -     70.0   40.0   70.0   40.0
Low-fat milk (b)          -      -      -      -     70.0   40.0   70.0   40.0        -      -      -      -
Sugar                     -      -      -      -        -      -      -      -      2.3    4.5    2.3    4.5
Coffee extract (c)
  L value 18           20.0   40.0      -      -     20.0   40.0      -      -     20.0   40.0      -      -
  L value 23              -      -   20.0   40.0        -      -   20.0   40.0        -      -   20.0   40.0
Water                  10.0   20.0   10.0   20.0     10.0   20.0   10.0   20.0      7.7   15.5    7.7   15.5
Total                 100.0  100.0  100.0  100.0    100.0  100.0  100.0  100.0    100.0  100.0  100.0  100.0
(Units: g; "-" indicates that the ingredient was not added)
a  Milk solids-nonfat: 8.3 %, milk fat: 3.5 %.

b  Milk solids-nonfat: 8.4 %, milk fat: 1.5 %.

c  Prepared from Brazil no. 2, Colombia Supremo, Ethiopia Sidamo grade 4, Indonesia Mandheling grade 1, or Vietnam Robusta grade 1.

For CATA and QDA, the number of samples provided to panelists per day was limited to six, considering fatigue. The six types of milk coffee samples shown in Table 2 were selected from the above 60 types, as these covered the range of parameters that were investigated in this study, namely, coffee beans, degree of roasting, with or without sugar, and composition.

Table 2 Samples for CATA and QDA and their composition.
Sample                   BL18  BL18-2    BL23   BL18S    EL18    EL23
Full-fat milk (a)        70.0    40.0    70.0    70.0    70.0    70.0
Coffee extract
  B#2 (b), L value 18    20.0    40.0       -    20.0       -       -
  B#2 (b), L value 23       -       -    20.0       -       -       -
  EG4 (c), L value 18       -       -       -       -    20.0       -
  EG4 (c), L value 23       -       -       -       -       -    20.0
Sugar                       -       -       -     2.3       -       -
Water                    10.0    20.0    10.0     7.7    10.0    10.0
Total                   100.0   100.0   100.0   100.0   100.0   100.0
(Units: g; "-" indicates that the ingredient was not added)
a  Milk solids-nonfat: 8.3 %, milk fat: 3.5 %.

b  Brazil no. 2

c  Ethiopia Sidamo grade 4

Lexicon development  The 60 kinds of samples shown in Table 1 were evaluated, and flavor characteristics were collected by verbal description. Although the combination of 5 coffee beans, 2 degrees of roasting, 2 levels of sugar, 2 levels of composition (milk-rich or coffee-rich), and 2 levels of milk fat results in 80 samples, samples combining low-fat milk with sugar were omitted because this combination was uncommon in milk coffee products, resulting in the 60 types of samples listed in Table 1. The samples were presented to untrained panelists in white plastic cups with lids and straws, since most milk coffee products on the market are provided in plastic cups with straws. The panel comprised 203 untrained consumers (78 males and 125 females, ranging in age from 25 to 57 years). Although the frequency of coffee consumption was not asked during recruitment, people who dislike coffee were excluded. The cups were randomly labeled. Samples were provided one by one, and panelists were advised to rinse their mouths with drinking water between sample evaluations. There was no time limit for evaluation and panelists were able to evaluate samples at their own pace. The sample volume was approximately 80 ml and the temperature was below 10 °C. It was not mandatory for the panelists to drink all 80 ml of the samples. The evaluation was conducted in a quiet room. The room temperature was 24 °C and the humidity was 55 %. The participants were briefed on the contents of the cups before the evaluation, and after acknowledging that participation in the evaluation was voluntary, written informed consent was obtained.

It was not practical for all 203 panelists to evaluate all 60 samples. Instead, the number of panelists per sample was uniformly set at 40, while different numbers of samples were evaluated by each panelist; the fewest was three samples, the highest was 21, and the median was 12. Considering fatigue, the number of samples provided to panelists per day was limited to six.
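The sample counts above can be checked with a short enumeration. This is only an illustrative sketch: the helper name `build_design` and the tuple layout are assumptions made here, and the total of 60 does not depend on which sugar level is dropped for low-fat milk.

```python
from itertools import product

BEANS = ["Brazil no. 2", "Colombia Supremo", "Ethiopia Sidamo grade 4",
         "Indonesia Mandheling grade 1", "Vietnam Robusta grade 1"]
ROASTS = [18, 23]                        # L values
SUGAR = [False, True]                    # without / with sugar
COMPOSITION = ["milk-rich", "coffee-rich"]
MILK_FAT = ["full", "low"]


def build_design(omitted_sugar_for_low_fat):
    """Full factorial design minus the low-fat samples at one sugar level.

    Hypothetical helper: each sample is a tuple
    (bean, roast, sugar, composition, milk_fat).
    """
    full = list(product(BEANS, ROASTS, SUGAR, COMPOSITION, MILK_FAT))
    kept = [s for s in full
            if not (s[4] == "low" and s[2] == omitted_sugar_for_low_fat)]
    return full, kept


# The count is the same whichever low-fat sugar level is left out.
for omitted in (False, True):
    full, kept = build_design(omitted)
    print(len(full), len(kept))          # 80 60

# With 40 panelists per sample, the total number of evaluations and the
# average load per panelist follow directly.
evaluations = len(kept) * 40
print(evaluations, round(evaluations / 203, 1))   # 2400 11.8
```

The average of about 11.8 samples per panelist is in line with the reported median of 12.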

Following the collection of sensory terms, qualitative screening was conducted by discussion with 12 experts in the development of beverage products. The criteria for qualitative screening were: 1) terms with the same meaning were unified (for example, “milk flavor”, “milky flavor”, and “milk like flavor”), 2) abstract or unclear terms were excluded (for example, “flat”, “tasty”, and “pleasant”), and 3) terms considered unnecessary for describing the flavor characteristics of milk coffee were excluded (for example, “beer”, “metal”, and “mushroom”). These steps were conducted with reference to existing coffee lexicons (Seo et al., 2009; Hayakawa et al., 2010; Chambers et al., 2016; Spencer et al., 2016). Next, quantitative screening was conducted, excluding terms used by only one panelist out of the 203 (i.e., when the frequency of use was less than 0.5 %).
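The quantitative screening rule can be sketched as follows. The function name `quantitative_screen` and the usage counts in the example are hypothetical; only the panel size (203) and the rule (drop terms used by a single panelist) come from the text.

```python
N_PANELISTS = 203


def quantitative_screen(term_users, n_panelists=N_PANELISTS, min_users=2):
    """Keep terms used by at least `min_users` panelists.

    term_users maps each term to the number of distinct panelists who used it.
    Returns the kept terms with their frequency of use in percent.
    """
    kept = {}
    for term, users in term_users.items():
        if users >= min_users:
            kept[term] = users / n_panelists * 100
    return kept


# Hypothetical counts, for illustration only.
example = {"milk": 120, "caramel": 15, "mint": 1, "pepper": 1}
kept = quantitative_screen(example)
print(sorted(kept))                    # ['caramel', 'milk']

# A single user out of 203 corresponds to a frequency just under 0.5 %.
print(round(1 / N_PANELISTS * 100, 2))  # 0.49
```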

Collected terms were arranged into a hierarchical structure through discussion with the 12 experts, following the method described by Lawless and Civille (2013). The similarity of the terms was discussed and the terms were grouped into “subcategories”. The order of the terms within a subcategory was decided according to the frequency of the term. The name of a subcategory was represented by the term appearing with the highest frequency in that subcategory (data not shown). The subcategories were grouped into “categories”. The order of the subcategories was decided according to their frequency of use.

Check-all-that-apply  The obtained terms were used in the CATA questions for evaluating milk coffee samples with distinct characteristics. The six milk coffee samples shown in Table 2 were presented in white plastic cups with lids and straws to 74 untrained panelists comprising 34 males and 40 females in the same age bracket as above. Although the frequency of coffee consumption was not asked during recruitment, people who dislike coffee were excluded. Most of the panelists had participated in the lexicon development. The cups were randomly labeled and provided one by one in random order. The panelists were instructed to drink the sample using a straw and were advised to rinse their mouths with drinking water; adequate breaks between evaluations were provided. The sample volume was approximately 80 ml and the temperature was below 10 °C. It was not mandatory for the panelists to drink all 80 ml of the samples. The panelists were first instructed to indicate their impression of the samples using a 7-point structured hedonic scale ranging from “dislike extremely” (1) to “like extremely” (7). Then, they were asked to check all attributes they considered appropriate to describe their perception of the samples in a questionnaire composed of the terms from the lexicon. The hedonic scale and questionnaire were provided using FIZZ software version 2.51 (Biosystemes, Couternon, France), and the position of each term within the questionnaire was randomized. The evaluation was conducted in a dedicated sensory evaluation booth. The booth temperature was 25 °C and the humidity was 55 %. The participants were briefed on the contents of the cups before the evaluation, and after acknowledging that participation in the evaluation was voluntary, informed consent was obtained.

Quantitative descriptive analysis  A QDA panel was established as reported in a previous study (Ikeda et al., 2019). Following its establishment, the panel was periodically trained following the methods of Stone and Sidel (2004). The same six milk coffee samples used in the CATA questions were presented with white plastic cups, lids, and straws to the trained panel, which comprised 13 females ranging in age from 40 to 60 years. The panelists evaluated the sensory attributes of the milk coffee samples based on 22 attributes (Table 3) selected through the establishment process, using a 15 cm line scale from “weak” (0) to “strong” (15) (FIZZ software version 2.51). The attributes consisted of 4 categories, which were “aroma” (orthonasal aroma), “flavor” (consisting of retronasal aroma and taste), “mouthfeel”, and “aftertaste”. The sample volume was approximately 100 ml, and it was not mandatory for the panelists to drink all 100 ml. Samples were provided one by one, and the panelists were recommended to refresh their mouth with drinking water between sample evaluations. The panelists were instructed to drink the sample using a straw. The cups were randomly labeled and provided in random order. There was no time limit for evaluation, and adequate breaks between evaluations were provided. The evaluation was conducted in a dedicated sensory evaluation booth. The evaluation session was performed three times and the panelists evaluated the six milk coffee samples every session, resulting in triplicate results per panelist for each sample. The booth temperature was 22 °C and the humidity was 27 %. The participants were briefed on the contents of the cups before the evaluation, and after acknowledging that participation in the evaluation was voluntary, written informed consent was obtained.

Table 3 Sensory attributes for QDA.
Category       No.  Sensory attribute             Definition
1 Aroma         1   Sweet aroma                   Sweet aroma
                2   Coffee aroma                  Coffee aroma
                3   Bitter aroma                  Bitter aroma
                4   Caramel-like aroma            Aroma with a hint of caramel
                5   Milk aroma                    Milk aroma
2 Flavor        6   Sweet flavor                  Sweet flavor derived from milk, coffee, and sugar
                7   Milk flavor                   Milk flavor
                8   Caramel-like flavor           Flavor with a hint of caramel
                9   Richness of coffee flavor     Richness of coffee flavor
               10   Bitter flavor                 Bitter flavor derived from coffee
               11   Acidic flavor                 Acidic flavor derived from coffee
               12   Mild flavor                   Well-balanced and round flavor
               13   Rich flavor                   Strong flavor of fat
3 Mouthfeel    14   Creamy mouthfeel              Creamy feeling derived from fat
               15   Rich mouthfeel                Thick and strong feeling
               16   Fresh mouthfeel               Light feeling derived from low fat
4 Aftertaste   17   Aftertaste of refreshness     Lightness and short-lasting aftertaste
               18   Aftertaste of milk flavor     Aftertaste of milk flavor
               19   Aftertaste of sweetness       Aftertaste of sweetness
               20   Aftertaste of coffee flavor   Aftertaste of coffee flavor
               21   Aftertaste of bitterness      Aftertaste of bitterness
               22   Aftertaste of astringency     Aftertaste of astringency

Statistical analysis  The data were analyzed to answer the following questions: 1) Are there significant differences in preference (hedonic evaluation) among the six samples used in CATA (Table 2)? 2) Which of the collected terms are effective in discriminating between samples? 3) Which terms are positively or negatively related to preference? 4) How can the six samples be explained by the CATA data, and does this explanation differ from that obtained from the QDA? The following analyses were performed to address each question. 1) ANOVA was performed on the hedonic data acquired for each sample, and significant differences between samples were evaluated by the Tukey-Kramer honestly significant difference (HSD) test. 2) To evaluate which terms were effective in discriminating between samples, Cochran's Q test was applied to the frequency of terms selected for each sample. 3) The relationship between the frequencies of terms selected and the hedonic data for each sample was analyzed using penalty analysis. 4) The frequencies of terms selected for each sample were used as explanatory variables in the correspondence analysis (CA), while the QDA results were analyzed using principal component analysis (PCA). The sample configurations obtained by CA of the CATA data and by PCA of the QDA results were compared by calculating the RV coefficient to understand the similarities and differences between the two sensory analysis methods. All statistical analyses were performed using XLSTAT (Ver. 2020.5.1.1060, Mindware Inc., Okayama, Japan).
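For readers unfamiliar with the RV coefficient, the sketch below shows one common way to compute it for two sample configurations (rows are samples in the same order, columns are dimensions). It is a plain-Python illustration under the usual definition, RV = tr(Sx Sy) / sqrt(tr(Sx Sx) tr(Sy Sy)) with Sx and Sy the cross-product matrices of the column-centered configurations. It is not the XLSTAT routine used in the study, and the coordinates in the example are made up.

```python
from math import sqrt


def center_columns(m):
    """Subtract the column mean from every entry."""
    n = len(m)
    means = [sum(row[j] for row in m) / n for j in range(len(m[0]))]
    return [[v - means[j] for j, v in enumerate(row)] for row in m]


def cross_product(m):
    """Return the samples-by-samples matrix M M^T."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in m] for r1 in m]


def rv_coefficient(x, y):
    """RV coefficient between two configurations of the same samples."""
    sx = cross_product(center_columns(x))
    sy = cross_product(center_columns(y))
    n = len(sx)
    # Both matrices are symmetric, so tr(Sx Sy) is the sum of
    # element-wise products.
    num = sum(sx[i][j] * sy[i][j] for i in range(n) for j in range(n))
    den = sqrt(sum(v * v for row in sx for v in row)
               * sum(v * v for row in sy for v in row))
    return num / den


# Made-up coordinates for three samples on two dimensions.
config_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
# The same configuration rotated by 90 degrees and doubled in size.
config_b = [[0.0, 2.0], [-2.0, 0.0], [2.0, -2.0]]
# A configuration in which the first two samples coincide.
config_c = [[1.0, 1.0], [1.0, 1.0], [-2.0, -2.0]]

rv_same_shape = rv_coefficient(config_a, config_b)
rv_other = rv_coefficient(config_a, config_c)
print(round(rv_same_shape, 6))   # 1.0
print(rv_other < 1.0)            # True
```

Because the coefficient depends only on the relative positions of the samples, it is unaffected by rotation or uniform scaling, which is why it suits a comparison of CA and PCA maps whose axes are not directly comparable.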

Results and Discussion

Development of the flavor lexicon  The coffees used in this study encompassed the major coffees consumed in Japan, including the Arabica and Robusta varieties, as well as “dry” and “wet” processing (Smith, 1985). The 60 kinds of milk coffee samples used to collect the sensory terms were prepared to cover a wide range of flavors. The samples were presented to 203 untrained panelists, and 456 terms were initially collected by verbal description. Qualitative screening excluded 394 terms with the same meaning, such as “flavor of milk”, “sweet flavor of milk”, “milk flavor”, “sweetness of milk”, “milk taste”, and “taste of milk”. As a result of qualitative screening, 62 terms were obtained. Of the 62 terms, 9 terms used by only one panelist out of the 203, such as “mint”, “spice”, “pepper”, “cocoa”, “forest”, and “medicine” were excluded (quantitative screening). As a result, 53 terms were selected, and the lexicon was developed by arranging the terms into a hierarchical structure (Table 4).

Table 4 Flavor lexicon for milk coffee for Japanese consumers.
Category      Subcategory      No.  Term (in Japanese)                           Similar lexicons
1 Taste        1 Sweetness      1   Sweet (amami)                                a, b, c, d
                                2   Sugar (satou-no-amasa)
                                3   Lightly sweet (saratto-shita-amasa)
                                4   Refreshing sweet (sawayaka-na-amasa)
               2 Bitterness     5   Bitter (nigami)                              a, b, c, d
                                6   Rounded bitter (horonigasa)                  b
               3 Acidity        7   Acidic (sanmi)                               a, b, c, d
               4 Mildness       8   Mild (maroyaka)                              b
               5 Astringency    9   Astringent (shibumi)                         a, b
               6 Rich          10   Rich (koku)                                  a, b
               7 Uncleanness   11   Unclean (zatsumi)                            b
               8 Harshness     12   Harsh (egumi)                                b
               9 Salty         13   Salty (enmi)                                 a, b, c, d
              10 Umami         14   Umami (umami)                                a
2 Aroma       11 Coffee        15   Coffee (coffee)
                               16   Coffee beans (coffee beans)
                               17   Espresso (espresso)
                               18   Instant coffee (instant coffee)
                               19   Ground coffee (coffee-hunmatsu)
              12 Milk          20   Milk (milk)                                  a
                               21   Condensed milk (rennyu)
              13 Roast         22   Roasted (roast)                              b, c, d
                               23   Burnt (koge)                                 a, b, c, d
                               24   Smoky (smoke)                                a, b, c, d
                               25   Carbony (sumi)                               b
                               26   Tarry (tobacco)                              a, b, c, d
              14 Fruity        27   Fruity (fruity)                              a, b, c, d
                               28   Flower (hana)                                a, b, c, d
                               29   Citrus (kankitsu)                            b, c, d
                               30   Honey (hachimitsu)                           d
                               31   Cherry (sakuranbo)
              15 Fatty         32   Fatty (sibou)
                               33   Cream (cream)
                               34   Coffee whitener (potion cream)
              16 Artificial    35   Artificial (jinkouteki)
                               36   Chemical (kagakuteki)                        b, c, d
              17 Soybean       37   Soybean (daizu)                              b
                               38   Barley (mugi)                                a, b
              18 Caramel       39   Caramel (caramel)                            a, b, c, d
              19 Nuts          40   Nutty (nuts)                                 a, b, c, d
                               41   Almond (almond)                              c, d
              20 Tea           42   Black tea (koucha)                           b, d
              21 Chocolate     43   Chocolate (chocolate)                        b, c, d
                               44   Cacao (cacao)
                               45   Bitter chocolate (bitter chocolate)          a, c, d
              22 Green         46   Green (aoi)                                  b, c, d
                               47   Grass (kusa)                                 b, b
              23 Wood          48   Wood (ki)                                    a, b, c, d
              24 Soil          49   Earthy (tsuchi)                              a, b, d
3 Mouthfeel   25 Mouthfeel     50   Crisp mouthfeel (kire-no-aru-kuchi-atari)    b
                               51   Rich mouthfeel (noukou-na-kuchi-atari)       a, b
                               52   Creamy mouthfeel (creamy-na-kuchi-atari)
                               53   Coarse mouthfeel (zaratsuita-kuchi-atari)    a, b
a  Indicates that the term is similar to one in the list of Seo et al. (2009).

b  Indicates that the term is similar to one in the list of Hayakawa et al. (2010).

c  Indicates that the term is similar to one in the list of Spencer et al. (2016).

d  Indicates that the term is similar to one in the list of Chambers et al. (2016).

Although these terms were obtained from consumers, many of them were the same as or similar to the terms collected in previous studies, where terms were obtained from a trained panel or coffee professionals (Seo et al., 2009; Hayakawa et al., 2010; Chambers et al., 2016; Spencer et al., 2016) (Table 4). It is notable that 32 out of 53 terms were considerably similar to those of Hayakawa et al. (2010). This is not only a consequence of the qualitative screening step, where the terms considered unnecessary for describing the flavor characteristics of milk coffee were excluded with reference to these previous studies, but also because the terms were collected from people with the same cultural background (language and expressions).

Despite preparing a wide range (60 types) of milk coffee samples and collecting terms from a large number (203) of consumers, the final number of terms in this study was smaller than that of other coffee lexicons (Hayakawa et al., 2010; Chambers et al., 2016; Spencer et al., 2016). One of the main reasons for this is that the terms were collected from untrained panelists (i.e., consumers). There was a large number of ambiguous and abstract terms such as “flat”, “light”, “heavy”, “dry”, “pleasant”, “gentle”, “glamorous”, “unnatural”, resulting in the elimination of many of the 456 terms in the qualitative screening. As pointed out by Chollet and Valentin (2001), the description of flavor characteristics by the untrained panel tended to be less precise and specific.

The addition of milk may also have affected the terms used. The addition of milk suppresses flavor release (Akiyama et al., 2009), and these changes are also affected by the content of milk fat and the lipophilicity of each flavor compound (Akiyama et al., 2016). Thus, the addition of milk would affect flavor release from the coffees, making it difficult for consumers to perceive some flavor characteristics of the coffee. In addition, the samples were provided to the panel in covered plastic cups with straws, and the sample temperature was below 10 °C. These conditions differ from those of previous studies (Seo et al., 2009; Hayakawa et al., 2010; Chambers et al., 2016; Spencer et al., 2016). Using covered cups and straws may have suppressed the orthonasal aroma. Moreover, the amount of flavor release from beverages generally decreases at lower temperatures. These factors could also have affected consumer perception.

Flavor lexicons are typically developed using a highly trained panel (Lawless and Civille, 2013). In contrast, the lexicon developed in this study used terms collected from consumers, since milk coffee is usually provided to the end consumer and our aim was to develop a flavor lexicon that is easy for consumers to understand. The lexicon developed in this study therefore differs from lexicons developed by trained panels, and its smaller number of terms does not necessarily reduce its usefulness for consumer evaluation.

Confirmation of the validity of the terms by CATA  In this study, CATA was conducted using the flavor lexicon consisting of the 53 terms described above. The number of terms used was large compared to several other studies using CATA (Dooley et al., 2010; Lee et al., 2013; Alencar et al., 2019). Conducting CATA with too many terms may result in bias and fatigue effects (Meyners and Castura, 2014). In addition, the flavor must last sufficiently long (such as in the case of chewing gum) for the panel to go through the long list of terms (Meyners and Castura, 2014). These two points were not considered problematic in this study. The sample volume was 80 ml, which is sufficiently large to allow the panel to evaluate all the terms. Furthermore, the CATA terms were obtained from the consumers themselves, making their meaning easy to understand and thus reducing fatigue during the evaluation. Given these factors, we conducted CATA using all 53 terms.

All 53 terms were used at least once in CATA (Table 5) and there were significant differences among the samples in the frequency of use of 42 terms. There was no significant difference in the frequency of use of the remaining 11 terms among the different samples. It should be noted that the sample set being evaluated largely affects the use of terms; a less diverse sample set may lead to fewer terms with significant differences in the frequency of use among samples. The sample compositions used in this study were practical and applicable for product development, supporting the validity of the terms when used for a broad but not extreme sample range. In addition, 32 out of 53 terms were considerably similar to those in Hayakawa et al. (2010), which further supports the terms obtained in this study. Overall, the results suggested that the flavor lexicon contained appropriate terms for consumers to describe the flavor characteristics of milk coffee.

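Table 5 Frequency and percentage of use of the terms and preference means in CATA (a).
For each term, the first line gives the number of panelists who selected it for each sample, and the second line gives the percentage in parentheses.
No. Term / Sample (b): BL18 BL18-2 BL23 BL18S EL18 EL23 / p value (Cochran's Q test)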
1 Sweet 7 0 11 55 3 6 <0.05
(9.5) (0.0) (14.9) (74.3) (4.1) (8.1)
2 Sugar 0 0 5 38 2 1 <0.05
(0.0) (0.0) (6.8) (51.4) (2.7) (1.4)
3 Lightly sweet 14 1 8 20 14 21 <0.05
(18.9) (1.4) (10.8) (27.0) (18.9) (28.4)
4 Refreshing sweet 5 2 7 13 7 4 <0.05
(6.8) (2.7) (9.5) (17.6) (9.5) (5.4)
5 Bitter 48 72 40 22 41 29 <0.05
(64.9) (97.3) (54.1) (29.7) (55.4) (39.2)
6 Rounded bitter 44 17 46 41 39 48 <0.05
(59.5) (23.0) (62.2) (55.4) (52.7) (64.9)
7 Acidic 22 36 21 12 23 26 <0.05
(29.7) (48.6) (28.4) (16.2) (31.1) (35.1)
8 Mild 21 1 29 45 24 28 <0.05
(28.4) (1.4) (39.2) (60.8) (32.4) (37.8)
9 Astringent 35 56 28 18 30 29 <0.05
(47.3) (75.7) (37.8) (24.3) (40.5) (39.2)
10 Rich 27 5 19 28 25 18 <0.05
(36.5) (6.8) (25.7) (37.8) (33.8) (24.3)
11 Unclean 17 34 21 11 24 20 <0.05
(23.0) (45.9) (28.4) (14.9) (32.4) (27.0)
12 Harsh 21 49 22 11 24 19 <0.05
(28.4) (66.2) (29.7) (14.9) (32.4) (25.7)
13 Salty 8 2 5 3 8 10 <0.05
(10.8) (2.7) (6.8) (4.1) (10.8) (13.5)
14 Umami 6 0 7 8 10 11 <0.05
(8.1) (0.0) (9.5) (10.8) (13.5) (14.9)
15 Coffee 45 39 42 48 53 36 <0.05
(60.8) (52.7) (56.8) (64.9) (71.6) (48.6)
16 Coffee beans 21 26 16 13 21 17 0.084
(28.4) (35.1) (21.6) (17.6) (28.4) (23.0)
17 Espresso 24 17 13 13 18 17 <0.05
(32.4) (23.0) (17.6) (17.6) (24.3) (23.0)
18 Instant coffee 10 12 25 16 22 17 <0.05
(13.5) (16.2) (33.8) (21.6) (29.7) (23.0)
19 Ground coffee 15 22 14 12 16 13 0.161
(20.3) (29.7) (18.9) (16.2) (21.6) (17.6)
20 Milk 30 5 35 46 31 32 <0.05
(40.5) (6.8) (47.3) (62.2) (41.9) (43.2)
21 Condensed milk 0 0 1 11 0 1 <0.05
(0.0) (0.0) (1.4) (14.9) (0.0) (1.4)
22 Roasted 43 41 27 34 35 37 <0.05
(58.1) (55.4) (36.5) (45.9) (47.3) (50.0)
23 Burnt 26 57 32 16 38 31 <0.05
(35.1) (77.0) (43.2) (21.6) (51.4) (41.9)
24 Smoky 19 31 13 14 26 21 <0.05
(25.7) (41.9) (17.6) (18.9) (35.1) (28.4)
25 Carbony 20 41 13 11 18 12 <0.05
(27.0) (55.4) (17.6) (14.9) (24.3) (16.2)
26 Tarry 12 21 4 3 6 4 <0.05
(16.2) (28.4) (5.4) (4.1) (8.1) (5.4)
27 Fruity 5 3 4 6 5 12 0.062
(6.8) (4.1) (5.4) (8.1) (6.8) (16.2)
28 Flower 3 0 2 4 4 11 <0.05
(4.1) (0.0) (2.7) (5.4) (5.4) (14.9)
29 Citrus 1 1 0 1 0 2 0.594
(1.4) (1.4) (0.0) (1.4) (0.0) (2.7)
30 Honey 2 0 3 18 2 4 <0.05
(2.7) (0.0) (4.1) (24.3) (2.7) (5.4)
31 Cherry 0 1 1 0 1 1 0.818
(0.0) (1.4) (1.4) (0.0) (1.4) (1.4)
32 Fatty 18 3 18 19 19 23 <0.05
(24.3) (4.1) (24.3) (25.7) (25.7) (31.1)
33 Cream 11 1 16 12 9 11 <0.05
(14.9) (1.4) (21.6) (16.2) (12.2) (14.9)
34 Coffee whitener 10 2 16 7 12 16 <0.05
(13.5) (2.7) (21.6) (9.5) (16.2) (21.6)
35 Artificial 5 14 14 20 15 22 <0.05
(6.8) (18.9) (18.9) (27.0) (20.3) (29.7)
36 Chemical 1 7 6 7 6 14 <0.05
(1.4) (9.5) (8.1) (9.5) (8.1) (18.9)
37 Soybean 5 1 8 1 3 8 <0.05
(6.8) (1.4) (10.8) (1.4) (4.1) (10.8)
38 Barley 9 10 13 4 7 7 0.110
(12.2) (13.5) (17.6) (5.4) (9.5) (9.5)
39 Caramel 4 0 3 22 2 5 <0.05
(5.4) (0.0) (4.1) (29.7) (2.7) (6.8)
40 Nutty 12 5 17 10 15 19 <0.05
(16.2) (6.8) (23.0) (13.5) (20.3) (25.7)
41 Almond 7 2 7 6 6 7 0.572
(9.5) (2.7) (9.5) (8.1) (8.1) (9.5)
42 Black tea 1 1 1 3 2 4 0.509
(1.4) (1.4) (1.4) (4.1) (2.7) (5.4)
43 Chocolate 5 0 6 14 3 3 <0.05
(6.8) (0.0) (8.1) (18.9) (4.1) (4.1)
44 Cacao 16 14 18 17 10 11 0.331
(21.6) (18.9) (24.3) (23.0) (13.5) (14.9)
45 Bitter chocolate 18 18 16 22 23 12 0.194
(24.3) (24.3) (21.6) (29.7) (31.1) (16.2)
46 Green 6 9 9 2 9 13 <0.05
(8.1) (12.2) (12.2) (2.7) (12.2) (17.6)
47 Grass 4 4 4 3 4 5 0.975
(5.4) (5.4) (5.4) (4.1) (5.4) (6.8)
48 Wood 9 14 11 3 9 9 <0.05
(12.2) (18.9) (14.9) (4.1) (12.2) (12.2)
49 Earthy 17 24 14 6 10 9 <0.05
(23.0) (32.4) (18.9) (8.1) (13.5) (12.2)
50 Crisp mouthfeel 24 32 18 8 24 16 <0.05
(32.4) (43.2) (24.3) (10.8) (32.4) (21.6)
51 Rich mouthfeel 12 4 10 18 14 9 <0.05
(16.2) (5.4) (13.5) (24.3) (18.9) (12.2)
52 Creamy mouthfeel 19 1 16 34 19 22 <0.05
(25.7) (1.4) (21.6) (45.9) (25.7) (29.7)
53 Coarse mouthfeel 9 13 8 3 10 4 <0.05
(12.2) (17.6) (10.8) (4.1) (13.5) (5.4)
Preference mean (c) 4.20 2.68 3.86 4.78 4.12 3.57
Standard deviation 1.42 1.48 1.40 1.44 1.49 1.61
Tukey-Kramer HSD test (d) AB C B A AB B
a  n = 74

b  BL18: 20 % of coffee extract of Brazil no. 2 (L value 18) and 70 % of full-fat milk; BL18-2: 40 % of coffee extract of Brazil no. 2 (L value 18) and 40 % of full-fat milk; BL23: 20 % of coffee extract of Brazil no. 2 (L value 23) and 70 % of full-fat milk; BL18S: 20 % of coffee extract of Brazil no. 2 (L value 18), 70 % of full-fat milk, and 2.3 % of sugar; EL18: 20 % of coffee extract of Ethiopia Sidamo grade 4 (L value 18) and 70 % of full-fat milk; EL23: 20 % of coffee extract of Ethiopia Sidamo grade 4 (L value 23) and 70 % of full-fat milk.

c  Structured 7-point hedonic scale ranging from “dislike extremely” (1) to “like extremely” (7).

d  Tukey-Kramer honestly significant difference (HSD) test. There are significant differences between samples not having the same letters (p < 0.05).
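The p values in Table 5 come from Cochran's Q test, which needs each panelist's individual 0/1 responses and so cannot be recomputed from the totals in the table. The sketch below shows the standard Q statistic on a small made-up data set; the function name `cochran_q` and the responses are illustrative assumptions, and the statistic is compared with tabulated chi-square critical values instead of computing an exact p value.

```python
# Upper 5 % points of the chi-square distribution, keyed by degrees of freedom.
# df = 5 applies to six samples, as in Table 5; df = 2 fits the toy example.
CHI2_CRIT_05 = {2: 5.991, 5: 11.070}


def cochran_q(responses):
    """Cochran's Q for a panelists-by-samples table of 0/1 responses.

    responses: one row per panelist, one entry per sample
    (1 = term checked for that sample, 0 = not checked).
    """
    k = len(responses[0])
    col_totals = [sum(row[j] for row in responses) for j in range(k)]
    row_totals = [sum(row) for row in responses]
    grand_total = sum(row_totals)
    denom = k * grand_total - sum(t * t for t in row_totals)
    if denom == 0:
        # Happens when every panelist answers identically across all samples.
        raise ValueError("Cochran's Q is undefined for this table")
    numer = (k - 1) * (k * sum(c * c for c in col_totals) - grand_total ** 2)
    return numer / denom


# Made-up responses of four panelists to three samples for a single term.
toy = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
]
q = cochran_q(toy)
df = len(toy[0]) - 1
print(q)                          # 6.0
print(q > CHI2_CRIT_05[df])       # True
```

With only four panelists the chi-square approximation is rough; the toy data are meant to show the arithmetic, not a realistic panel size.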

Characteristics of the milk coffee flavor lexicon  The developed flavor lexicon consists of three categories: “taste” (mainly perceived in the mouth) described by 14 terms, “aroma” (orthonasal or retronasal) described by 35 terms, and “mouthfeel” described by 4 terms (Table 4). Terms that are the same as or similar to those of previous studies (Seo et al., 2009; Hayakawa et al., 2010; Chambers et al., 2016; Spencer et al., 2016) are annotated in Table 4. The “taste” category consisted of ten subcategories. Terms no. 1 to 4 were related to sweetness. Although sample BL18S was frequently described as “sweet” and “sugar” because it contained sugar, “lightly sweet” and “refreshing sweet” were used less often (Table 5). “Unclean” (no. 11) and “harsh” (no. 12) are terms regularly used by Japanese consumers. “Unclean” is used to describe something spoiling the original flavor, and “harsh” is used to describe an unpleasant feeling on the tongue or throat.

The “aroma” category contained the largest number of terms, with the largest subcategory consisting of terms related to coffee aroma. Terms no. 15 to 19 indicated images of coffee inspired by the samples and were somewhat abstract flavor characteristics that would not be used in lexicons for trained panels. However, one purpose of this study was to obtain consumer perceptions of the flavor characteristics of milk coffee, and consumers' impressions of coffee flavor are practical information for discussing the flavor characteristics of products during product development. Therefore, these terms were not deleted in the qualitative screening. In addition, the frequency of use of these terms in CATA was significantly different between samples, supporting their value.

In contrast, terms no. 22 to 31 and no. 37 to 49 were specific expressions of coffee flavor. Although many are present in existing coffee lexicons, “fruity” (no. 27) and “flower” (no. 28) are noteworthy. The frequency of use of “fruity” and “flower” for EL23 was higher than for the other samples, and there was a significant difference in “flower” between the samples (Table 5). The results suggest that “fruity” and “flower” were selected because panelists perceived the characteristic flavor of Ethiopian coffee in the milk coffee samples: Ethiopian coffee has a distinctive “mocha” flavor (Smith, 1985), and 4-(4'-hydroxyphenyl)-2-butanone (raspberry ketone, sweet-fruity odor) has been identified as a component contributing to the characteristic aroma of Ethiopian coffee (Akiyama et al., 2008). On the other hand, the term “chemical” (no. 36) was often used to describe EL23 (Table 5), although chemical ingredients such as flavor compounds were not used in this study. Some untrained panelists (i.e., consumers) unfamiliar with the flavor of medium roasted Ethiopian coffee may have perceived the flavor as unpleasant for coffee and expressed this as “chemical”.

“Soybean” (no. 37) is also noteworthy. Although the lexicons of Spencer et al. (2016) and Chambers et al. (2016) did not use terms indicating a soybean-like flavor, the lexicon of Hayakawa et al. (2010) contained “roasted soybean” and “soybean flour”. Soybean is a traditional Japanese food, and soybean descriptors are often generated by Japanese panels (Hayakawa et al., 2010). “Soybean” was used eight times each to describe BL23 and EL23, five times for BL18, three times for EL18, and once each for BL18-2 and BL18S, and there was a significant difference between samples (Table 5). These results indicate that “soybean” is a common expression for Japanese consumers and that this characteristic tended to be perceived when medium roasted coffee without sugar was mixed with milk.

The terms “milk” (no. 20), “condensed milk” (no. 21), “fatty” (no. 32), “cream” (no. 33), and “coffee whitener” (no. 34) represented the characteristics of milk flavor. Only five terms were related to milk flavor, a small number compared to previous studies in which evaluations were conducted using a trained panel and several types of milk samples (Chapman et al., 2001; Lee et al., 2017). Besides the use of a consumer panel, a likely reason is that the detailed flavor characteristics of milk are less readily perceived in milk coffee. The frequency of use of “condensed milk” (no. 21) and “caramel” (no. 39) for BL18S was high, suggesting that these terms are important for describing the flavor characteristics of sweetened milk coffee (Table 5).

Terms no. 50 to 53 were related to mouthfeel, and their frequency of use differed significantly between the samples (Table 5). BL18-2, with a strong coffee flavor, was evaluated as having a crisp and coarse mouthfeel, and BL18S, with added sugar, was evaluated as having a rich and creamy mouthfeel (Table 5), suggesting that the mouthfeel characteristics of milk coffee are considerably affected by the balance of coffee, milk, and sugar.

The flavor lexicon contains terms common in existing coffee lexicons and useful terms for describing the characteristic flavor and mouthfeel of milk coffee, including unique terms in Japanese. Japanese consumers will therefore find the lexicon developed in this study or its applications (e.g., a flavor wheel) useful for describing the variety of flavor characteristics of milk coffee.

Comparison of the results of CATA and QDA  To evaluate the effectiveness of CATA as a descriptive analysis method, its ability to separate samples based on different flavor characteristics was compared against that of the standard QDA method. CATA and QDA were used to evaluate the same set of samples (Table 2). CATA was performed with 74 untrained panelists using the 53 terms collected from consumers. In contrast, QDA was performed with a trained panel using 22 attributes that were selected through training and discussion among the panelists. Figure 1 shows the results of a correspondence analysis (CA) of the CATA data. The contribution ratios of axes F1 and F2 were 68.91 % and 21.18 %, respectively. BL18S was characterized by “chocolate”, “caramel”, and “honey”, and was located nearer to “sweet”, “sugar”, and “condensed milk” than the other samples. BL18-2 was characterized by “tarry” and “carbony”, which indicate a deep roasted, strong coffee flavor. The other four samples were located near the point of origin. EL18, made with Ethiopian coffee instead of Brazilian coffee, was located near BL18, indicating that the difference in flavor between BL18 and EL18 was not clearly distinguished. BL23 and EL23, made with medium roasted coffee (L value 23) instead of deep roasted coffee (L value 18), were located on the positive side of the F1 axis (i.e., the right side) relative to the group comprising BL18 and EL18.

Fig. 1.

Correspondence analysis of CATA data. Of the 53 terms, the terms that showed a significant difference between samples are shown (p < 0.05). Sample details are shown in Table 2.
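The statement that a sample lies "near the point of origin" of a correspondence analysis map can be illustrated with a small calculation. The following is a minimal sketch, not the analysis performed in this study: it uses only the two rows of counts quoted in this section (“soybean” and “coarse mouthfeel”), whereas the published map was built from all significant terms. The function names are hypothetical. It computes the squared chi-square distance of each sample profile from the average profile (the origin of the map) and the total inertia, which equals the chi-square statistic of the table divided by the grand total.

```python
# Sample order follows the columns of Table 5.
samples = ["BL18", "BL18-2", "BL23", "BL18S", "EL18", "EL23"]

# Citation counts quoted in the text and in Table 5 (n = 74 panelists).
counts = {
    "soybean": [5, 1, 8, 1, 3, 8],
    "coarse mouthfeel": [9, 13, 8, 3, 10, 4],
}

# Rows are samples, columns are terms.
table = [list(row) for row in zip(*counts.values())]


def chi_square_distances(table):
    """Return (row masses, squared chi-square distances from the centroid)."""
    total = sum(sum(row) for row in table)
    col_mass = [sum(col) / total for col in zip(*table)]
    masses, dist2 = [], []
    for row in table:
        row_sum = sum(row)
        masses.append(row_sum / total)
        profile = [v / row_sum for v in row]
        dist2.append(sum((p - c) ** 2 / c for p, c in zip(profile, col_mass)))
    return masses, dist2


def total_inertia(table):
    """Total inertia: mass-weighted sum of squared chi-square distances."""
    masses, dist2 = chi_square_distances(table)
    return sum(m * d for m, d in zip(masses, dist2))


masses, dist2 = chi_square_distances(table)
for name, d in zip(samples, dist2):
    print(f"{name}: squared distance from origin = {d:.3f}")
print(f"total inertia = {total_inertia(table):.3f}")
```

A full correspondence analysis would additionally decompose this inertia into axes (F1, F2, ...) by singular value decomposition, which is what yields the contribution ratios reported for the map.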

The same six samples were evaluated by a QDA panel, and the results of PCA are shown in Figure 2. The contribution ratios of axes F1 and F2 were 85.34 % and 11.03 %, respectively. The positive direction of the F1 axis indicates strong bitterness, acidity, and coffee flavor. The negative direction indicates strong sweetness, milk flavor, and caramel flavor. BL18S was located far in the negative direction of the F1 axis and was characterized by sweetness and milk flavor. BL18-2 was located far in the positive direction of the F1 axis and was characterized by bitterness and coffee flavor. BL18 and EL18 were located near the intersection of the axes. BL23 and EL23 were located far to the negative side of the F2 axis from BL18 and EL18, indicating the possibility that the different flavors between medium roast and deep roast were perceived by the panel.

Fig. 2.

Biplot of principal component scores and loadings for each sensory description of QDA. Sample details are shown in Table 2.

A comparison of the CA and PCA results showed that the positions of the samples were similar, although the F1 axis was inverted. The sample configurations were compared by calculating the RV coefficient based on the first two dimensions. The RV coefficient was 0.944 (p = 0.013), which was significant and high (close to 0.95), indicating that CATA discriminated differences between the samples in a similar manner to QDA (Ares et al., 2015; Mello et al., 2019) for this set of samples. However, it should be noted that the overall configuration would be affected by the sample set evaluated, especially by the more distinguishable samples (namely, BL18-2 and BL18S). Therefore, further studies are required to measure the discriminative power of the CATA panel relative to that of the QDA panel.
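The RV coefficient used for this comparison can be sketched as follows. This is a minimal illustration under stated assumptions: the coordinates are hypothetical stand-ins rather than the published CA and PCA scores, the function names are invented for this example, and the exact permutation test shown is only one common way to obtain a p-value (the method used in this study is not described in this section). The sketch also shows why the inverted F1 axis does not matter: the RV coefficient is unchanged by reflection or rotation of a configuration.

```python
from itertools import permutations
from math import sqrt


def center(config):
    """Column-center a configuration given as a list of coordinate rows."""
    n = len(config)
    means = [sum(col) / n for col in zip(*config)]
    return [[v - m for v, m in zip(row, means)] for row in config]


def gram(config):
    """Return the n x n cross-product matrix of the centered configuration."""
    c = center(config)
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in c] for r1 in c]


def frobenius_inner(a, b):
    """Sum of elementwise products; equals trace(A B) for symmetric matrices."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def rv_coefficient(x, y):
    """RV coefficient between two configurations of the same samples."""
    sx, sy = gram(x), gram(y)
    return frobenius_inner(sx, sy) / sqrt(
        frobenius_inner(sx, sx) * frobenius_inner(sy, sy)
    )


def permutation_p_value(x, y):
    """Share of row reorderings of y whose RV is at least the observed RV.

    With six samples there are 6! = 720 reorderings, so all are enumerated.
    """
    observed = rv_coefficient(x, y)
    hits = total = 0
    for order in permutations(range(len(y))):
        total += 1
        if rv_coefficient(x, [y[i] for i in order]) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical (F1, F2) coordinates, NOT the published ones.
# Row order: BL18, BL18-2, BL23, BL18S, EL18, EL23.
ca_map = [[0.05, 0.10], [-0.60, 0.30], [0.15, -0.20],
          [0.70, 0.35], [0.02, 0.08], [0.20, -0.25]]
# A similar layout with the F1 axis roughly inverted, rescaled, and perturbed.
pca_map = [[-0.10, 0.20], [1.90, 0.60], [-0.40, -0.70],
           [-2.30, 0.80], [0.05, 0.15], [-0.50, -0.90]]

rv = rv_coefficient(ca_map, pca_map)
p = permutation_p_value(ca_map, pca_map)
print(f"RV = {rv:.3f}, permutation p = {p:.3f}")
```

Because only 720 reorderings exist for six samples, the smallest p-value such a test can return is 1/720, which is one reason the sample set limits how strongly agreement between the two methods can be demonstrated.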

Although QDA did not provide hedonic information, it quantitatively evaluated the intensity of flavor attributes of milk coffee, such as “sweet” and “bitter”, in the four categories “aroma”, “taste”, “mouthfeel”, and “aftertaste”. On the other hand, CATA characterized the samples using multiple descriptive terms and simultaneously provided hedonic information. Therefore, the manner of sample description differs between QDA and CATA. The results suggest that whether a trained panel or a consumer panel should be used to evaluate milk coffee depends on the purpose of the study, consistent with the conclusions of previous studies (Ares and Varela, 2017; Mello et al., 2019).

Although an initial investment of time and cost is required to develop a new lexicon from consumer-derived terms, this is a necessary step for appropriate evaluation by an untrained panel. The alternative of using QDA attributes or other existing lexicons would have decreased the evaluation efficiency of untrained panelists (consumers). This is because QDA attributes contain similar terms (e.g., “sweet aroma”, “sweet taste”, and “aftertaste of sweetness”) that would be difficult for untrained panelists to differentiate. In addition, the QDA attributes were divided into four categories (aroma, taste, mouthfeel, and aftertaste) in this study, which may be problematic for untrained panelists who are not used to sensory evaluations. Although there is no rule for how to generate the terms used for CATA, and the choice of method depends on the researcher (Dooley et al., 2010), the results of this study show that the development of a new lexicon specifically for consumers was worthwhile. In the future, it is expected that CATA will be used in the development of milk coffee products and that a wide range of flavor characteristics perceived by consumers will be evaluated efficiently.

Effects of the flavor characteristics on preference  By analyzing the results of CATA, the effects of the flavor characteristics on milk coffee preference were determined. The mean preference was highest for BL18S at 4.78 (Table 5). The second highest were BL18 and EL18, with means of 4.20 and 4.12, respectively. The third highest were BL23 and EL23, with means of 3.86 and 3.57, respectively. However, there was no significant difference between the second and the third preferences. The mean for BL18-2 was 2.68, which was significantly lower than that for the other samples (p < 0.05, Tukey-Kramer HSD test). Deep roasted and sweetened samples tended to be preferred by the panelists.

Figure 3 shows the terms that had a significant effect on consumer preference, as determined by penalty analysis (p < 0.05). There were eight positive terms for preference and nine negative terms. The positive terms included “mild”, “rich”, “milk”, “creamy mouthfeel”, and “fatty”, which were related to mildness and milk flavor. “Rounded bitter” was positive, but “bitter” was negative. “Smoky”, “carbony”, and “burnt”, which are characteristics of a deep roasted, strong coffee flavor, were negative. These results suggest that a certain strength and richness of milk flavor, together with a moderate bitterness of coffee, are important for a highly preferred milk coffee. In other words, a good balance of milk and coffee is important for high preference. Further studies with appropriate sample variation are needed to confirm the effects on preference of the other specific flavor characteristics constituting the lexicon. In the future, investigation of statistical correlations among the terms will aid in understanding consumer perceptions of the flavor characteristics of milk coffee.

Fig. 3.

Effects of CATA terms on preference (horizontal axis) analyzed by penalty analysis (p < 0.05). Preference was indicated using a structured 7-point hedonic scale, from “dislike extremely” (1) to “like extremely” (7). The light grey bars show a positive effect on preference (nice to have) and the dark grey bars a negative effect (must not have).
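The idea behind a penalty analysis of CATA data can be sketched as follows. This is a minimal illustration of one common formulation, in which the effect of a term is the mean preference score of evaluations where the term was checked minus the mean score where it was not. The records are hypothetical toy data, the function name is invented for this example, and the significance testing applied in this study (p < 0.05) is omitted.

```python
from statistics import mean


def penalty_analysis(records, terms):
    """Mean preference when a term is checked minus mean when it is not.

    records: list of (preference_score, set_of_checked_terms), one entry
    per panelist-by-sample evaluation on the 7-point hedonic scale.
    """
    effects = {}
    for term in terms:
        checked = [score for score, chosen in records if term in chosen]
        unchecked = [score for score, chosen in records if term not in chosen]
        if checked and unchecked:  # both groups are needed for a difference
            effects[term] = mean(checked) - mean(unchecked)
    return effects


# Hypothetical evaluations (toy data, not from this study).
records = [
    (6, {"milk", "mild"}),
    (5, {"milk", "rounded bitter"}),
    (2, {"bitter", "burnt"}),
    (3, {"bitter"}),
    (5, {"mild"}),
    (2, {"burnt", "smoky"}),
    (4, {"milk", "bitter"}),
]

effects = penalty_analysis(records, ["milk", "mild", "bitter", "burnt"])
# Positive values correspond to "nice to have", negative to "must not have".
for term, effect in sorted(effects.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {effect:+.2f}")
```

In practice each difference would be accompanied by a significance test and a minimum citation frequency, so that terms checked by very few panelists do not dominate the plot.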

Conclusions

A flavor lexicon for milk coffee was developed based on the perceptions of Japanese consumers. The lexicon consisted of 53 terms, which included terms unique to Japanese consumers and terms for describing milk coffee. The application of the terms in the lexicon to CATA provided effective descriptions of the flavor characteristics of milk coffee perceived by consumers. The appropriateness of the CATA results was confirmed by comparison with QDA, showing that the flavor characteristics of milk coffee and their relationship with preference could be evaluated effectively by administering CATA questions in a single survey instead of conducting two separate evaluations (i.e., the combination of QDA and a consumer survey for preferences). An efficient evaluation process, such as the CATA used in this study, can be applied relatively easily to product development. The process of terminology development and validation used in this study is applicable not only to milk coffee but also to other products.

Conflict of interest  There are no conflicts of interest to declare.

References
 
© 2023 by Japanese Society for Food Science and Technology

This article is licensed under a Creative Commons [Attribution-NonCommercial-ShareAlike 4.0 International] license.
https://creativecommons.org/licenses/by-nc-sa/4.0/