Reliability and Validity of Wisconsin Upper Respiratory Symptom Survey, Korean Version

Background The Wisconsin Upper Respiratory Symptom Survey (WURSS) is a self-administered questionnaire developed in the United States to evaluate the severity of the common cold and its reliability has been validated. We developed a Korean language version of this questionnaire by using a sequential forward and backward translation approach. The purpose of this study was to validate the Korean version of the Wisconsin Upper Respiratory Symptom Survey (WURSS-K) in Korean patients with common cold. Methods This multicenter prospective study enrolled 107 participants who were diagnosed with common cold and consented to participate in the study. The WURSS-K includes 1 global illness severity item, 32 symptom-based items, 10 functional quality-of-life (QOL) items, and 1 item assessing global change. The SF-8 was used as an external comparator. Results The participants were 54 women and 53 men aged 18 to 42 years. The WURSS-K showed good reliability in 10 domains, with Cronbach’s alphas ranging from 0.67 to 0.96 (mean: 0.84). Comparison of the reliability coefficients of the WURSS-K and WURSS yielded a Pearson correlation coefficient of 0.71 (P = 0.02). Validity of the WURSS-K was evaluated by comparing it with the SF-8, which yielded a Pearson correlation coefficient of −0.267 (P < 0.001). The Guyatt’s responsiveness index of the WURSS-K ranged from 0.13 to 0.46, and the correlation coefficient with the WURSS was 0.534 (P < 0.001), indicating that there was close correlation between the WURSS-K and WURSS. Conclusions The WURSS-K is a reliable, valid, and responsive disease-specific questionnaire for assessing symptoms and QOL in Korean patients with common cold.


INTRODUCTION
The "common cold" is caused by a viral infection of the upper respiratory tract. 1 This syndrome is generally self-limiting with time, but its symptoms negatively influence quality of life (QOL), and its prevalence imposes huge social and economic burdens. 2 In Korea, upper respiratory tract infection is the most common reason for outpatient visits. 3 In addition, such infections can result in complications, such as secondary bacterial infections, exacerbation of asthma, and chronic obstructive airway diseases. 4,5 A number of clinical trials have investigated the common cold. [6][7][8][9][10] Cellular immune responses, antibody titers, and lymphocyte proliferation have been used for quantitative assessment of the common cold in these studies. [9][10][11][12] However, these are not suitable for this purpose because the common cold is caused by various viruses, including rhinovirus, which have more than 100 different serotypes. 13 Thus, it is difficult to find specific biomarkers of progression of common cold. For this reason, symptom change is considered a standard indicator of the effect of cold remedies in many clinical studies. 14 However, if symptom change is to serve as a reliable indicator of common cold in clinical studies, it is critical that quantitative scales of such change are established and validated.
The Wisconsin Upper Respiratory Symptom Survey (WURSS), developed by Barrett et al, 15 is a patient-oriented instrument that evaluates patient QOL in an illness-specific manner. Other scale systems, such as the Jackson Scale 16 for the common cold, include only a limited number items regarding symptoms [17][18][19] and have rarely been evaluated with respect to their validity. 16,20 In contrast, the WURSS includes questionnaires regarding global severity, symptoms, functional QOL, and global change 15 and has been verified for responsiveness, reliability, and validity through monitoring of participants with common cold. 21 The WURSS has been widely used in respiratory disease research 11,[22][23][24] and is now accepted as a useful tool in clinical trials of common cold. Because there is no Korean equivalent of the WURSS, we translated the WURSS into the Korean (WURSS-K) and evaluated the validity and reliability of this instrument in patients with upper respiratory tract infection.

METHODS
The WURSS and development of the Korean version The WURSS includes 44 items: 1 global severity item, 32 symptom-based items, 10 functional QOL items, and 1 global change item. All items are based on 7-point Likert-type severity scales. 21 The content of the WURSS is summarized in Table 1.
The first step in developing the WURSS-K was forward translation: a physician bilingual in English and Korean translated the WURSS into Korean. Then, a translation panel composed of internists, an English linguist, a Korean linguist, and a statistician reviewed the translation and prepared the first Korean draft version (WURSS-K version 1.0). The second step was backward translation: another bilingual physician, blinded to the WURSS, back-translated the WURSS-K version 1.0 into English, and then the panel compared the back-translated English version with the WURSS to identify any ambiguities or inaccuracies in the choice of vocabulary. During this step, we were particularly careful to retain the meaning of the original items, maintaining the subtle differences between descriptions such as "feeling run down"/"feeling tired" and "chest congestion"/"chest tightness"/"chest heaviness." The second draft version (WURSS-K version 1.1) was prepared after this step. Finally, a cognitive briefing was performed to test the comprehensibility of the WURSS-K version 1.1. Here, the translation panel interviewed 10 patients (age 21-58 years; 50% male) and determined if Korean words for "sinus" and "swollen glands" could be described by patients in layman terms. After completing the above 3 steps, the WURSS-K was submitted for statistical validation (described below) to evaluate its reliability, validity, and responsiveness. The WURSS-K results were compared with those of the WURSS. 22

Study design
The Daejeon Oriental Hospital Institutional Review Board approved the protocol (authorization number: DJOMC- 19), and all patients understood the purpose and method of this study and provided written informed consent to participate in this research. Participants were recruited from the 3rd to 21st of March 2008 (19 days) in the Daejeon Oriental Hospital of Daejeon University and the Health Center of Daejeon University.
Participants satisfying the following criteria were included in the study: a diagnosis of "common cold" by research physicians, with onset of symptoms within 48 hours. Exclusion criteria included age younger than 18 years or older than 60 years, allergic rhinitis, asthma, chronic obstructive pulmonary disease, sinusitis recurring more than twice per year, anatomical nasal obstruction or deformity, otitis, and exudative pharyngitis.
Participants were asked to complete the WURSS-K daily from the first day for 6 days and were examined with the SF-8 on the third day. The participants were allowed to take cold medicines during this study (Figure 1).
At least 30 cases were required to ensure normality of the estimators in each item of the WURSS-K. The number of participants was calculated based on this requirement. The occurrence rate of each item was estimated from the results of the WURSS, where the lowest occurrence rate was reported to be 36.9% for the item "sweats." 21 Therefore, at least 82 (= 30/ 0.369) participants were required. A total of 107 patients were enrolled to account for potential dropouts during the study period. ). 27 Here, MID indicates minimal important difference, namely, the mean value of dayto-day changes in each item corresponding to assessments of global change of "a little better" or "somewhat better." MSE is the mean squared error of the scores of stable subjects reporting "the same" on item 44. The Pearson correlation coefficient was also calculated between responsiveness indices of the WURSS-K and WURSS.  168.9 ± 7.9 Weight (mean ± SD), kg 62.7 ± 10.7 Blood pressure (mean ± SD), mm Hg Systolic blood pressure 125.6 ± 15.0 Diastolic blood pressure 73.9 ± 9.5 Pulse (mean ± SD), frequency/min 81.5 ± 13.1 Temperature (mean ± SD),°C 36.3 ± 0.5

Demographics of participants
A total of 107 participants were initially enrolled in the study. One participant changed his mind, and 7 participants did not visit the center. All participants who replied once or more were included in the analysis. Most participants were university students, and age ranged from 18 to 42 years (mean 21.38). There were 54 female and 53 male participants ( Table 2).

Reliability test
As for internal reliability, the Cronbach's alpha of the 10 domains (based on the average of scores for the first 3 days) ranged from 0.666 to 0.962. According to guidelines for interpreting coefficient alpha, 0.65 to 0.70 is minimally acceptable, 0.70 to 0.80 is respectable, and 0.80 to 0.90 is very good. 28,29 For a coefficient alpha greater than 0.90, it is recommended to consider shortening the scale by reducing the number of items. 28,29 The domains with a coefficient alpha of >0.90 were "activity and function," which was composed of 10 items (0.962); "chest," with 3 items (0.950); and "ears," with 2 items (0.935). These domains also had reliabilities greater than 0.90 in the WURSS study. 21 The reliability coefficients of the WURSS ranged from 0.624 to 0.934. The Pearson correlation between coefficient alphas of the WURSS-K and reliabilities of the WURSS 21 was 0.710 (P = 0.02; Table 3). These results indicate that the internal reliability of the WURSS-K was very similar to that of the WURSS. 21 Validity test Convergent validity was evaluated by correlating the total scores of the SF-8 and the total scores on the first day of the WURSS-K. The Pearson correlation coefficient was −0.267 (P = 0.007). The WURSS study 21 used a 24-hour recall version of the SF-8 and observed it from, at a minimum,   day 2 through day 5. The correlation coefficients for the WURSS data from day 2 through day 5 were −0.60 to −0.84 (P < 0.001), much higher than that of this research (Table 4, Figure 2).

Responsiveness test
The responsiveness of each item of the WURSS-K was checked by calculating the Guyatt's responsiveness index. 26,27 The responsiveness indices ranged from 0.13 to 0.46: the maximum index of 0.46 was obtained for the first question, "How sick do you feel today?", and the minimum index of 0.13 was obtained for the question on "sinus pressure near the nose." In the WURSS study, 21 the indices ranged from 0.139 for the question on "plugged ears" to 0.709 for the first question. The correlation coefficient between the responsive indices of the WURSS-K and those of the WURSS 21 was 0.534 (P < 0.001; Table 5), which showed that the responsiveness of these 2 instruments was definitely correlated.

DISCUSSION
Reliability refers to the consistency of a result, ie, if the scores from one experiment can be trusted. 25 Internal reliability defines the consistency of the results delivered in a test, ensuring that the various items measuring different constructs yield consistent scores. 30 Internal reliability can be measured between similar items in the same domain within the same test. In the internal reliability test, most domains of the WURSS-K had Cronbach's alphas indicating respectable or very good reliability. The smallest alpha was 0.666, for the "aches" domain. A potential reason for this result is that it includes 2 items, "body aches" and "swollen glands," which are not closely related. Similarly, the "aches" domain showed low consistency in the WURSS study 21 (0.624). In addition, the correlation coefficient between the WURSS-K and WURSS was 0.710 (P = 0.02), 21 which means that the WURSS-K and WURSS showed a similar trend in reliability coefficients across domains. In the development of an abbreviated version of the WURSS-K (similar to the WURSS-21, which used only 21 of the 44 items on the WURSS to improve its responsiveness 21 ), the "aches" domain and domains with a coefficient alpha greater than 0.90 might be cancelled or adjusted. The convergent validity did not meet expectations because the Pearson correlation between the 4-week recall of SF-8 and the WURSS-K was −0.267 (P = 0.007), indicating a low correlation. This convergent validity was lower than that between the 24-hour recall of SF-8 and the WURSS, which ranged from −0.60 to −0.84 (P < 0.001) from day 2 through day 5. This discrepancy is attributable to differences in the recall time of the SF-8, ie, 4-week recall (present study) vs 24hour recall (WURSS study), the small number of participants, and the narrow age range of the participants enrolled in this research. The evaluation of convergent validity is also limited because it was compared only with the SF-8. Participants from a broader range of age groups and additional instruments such as the Jackson scale will be necessary in future studies.
As for responsiveness, the maximum was 0.46, for the first item, which concerned global condition and was thus expected to display maximum responsiveness. The second maximum was 0.38, for the 32nd item (tiredness), and the minimum value was 0.13, for the 21st item (sinus pressure). The correlation coefficient between responsiveness of the Correlation coefficient between WURSS-K and WURSS. *P < 0.001.
WURSS-K and WURSS was 0.534 (P < 0.001), which indirectly proves that the WURSS-K sensitively reflected changes in symptoms. In summary, the Korean translation of the WURSS was almost equivalent to the WURSS in reliability, validity, and responsiveness and is therefore appropriate for assessing the severity of the common cold in Korean patients.