2015 Volume 57 Issue 4 Pages 353-358
Objectives: The aim of this study was to reexamine the dimensionality of the widely used 9-item Utrecht Work Engagement Scale using the maximum likelihood (ML) approach and Bayesian structural equation modeling (BSEM) approach. Methods: Three measurement models (1-factor, 3-factor, and bi-factor models) were evaluated in two split samples of 1,112 health-care workers using confirmatory factor analysis and BSEM, which specified small-variance informative priors for cross-loadings and residual covariances. Model fit and comparisons were evaluated by posterior predictive p-value (PPP), deviance information criterion, and Bayesian information criterion (BIC). Results: None of the three ML-based models showed an adequate fit to the data. The use of informative priors for cross-loadings did not improve the PPP for the models. The 1-factor BSEM model with approximately zero residual covariances displayed a good fit (PPP>0.10) to both samples and a substantially lower BIC than its 3-factor and bi-factor counterparts. Conclusions: The BSEM results demonstrate empirical support for the 1-factor model as a parsimonious and reasonable representation of work engagement.
(J Occup Health 2015; 57: 353–358)
The concept of work engagement, defined as “a positive, fulfilling, work-related state of mind that is characterized by vigor, dedication, and absorption1)”, is an active research topic in current applied psychology. A high level of work engagement represents positively oriented psychological capacities in the workplace such as high energy levels, mental resilience, enthusiasm, strong connection to the environment and being engrossed by one's work2). The 9-item Utrecht Work Engagement Scale (UWES-9)1, 3) is a concise and widely used measure of work engagement across countries4, 5). Though previous studies have demonstrated acceptable reliability and convergent validity for the scale6, 7), there remain unresolved issues about the scale's dimensionality.
The UWES-9 was originally hypothesized to assess three aspects of work engagement: vigor (3 items), dedication (3 items) and absorption (3 items). On one hand, results from previous studies3, 4, 8) reveal a better fit for a 3-factor model than a 1-factor model, which appears to support interpretation with the three subscale scores. On the other hand, very high correlations (r≥0.90) were consistently found among the three factors, suggesting potential model redundancy. In view of the inadequate fit for the 1-factor model and lack of discriminant validity for the 3-factor model, de Bruin and Henn9) examined a bi-factor model as an alternative factor structure for the UWES-910). The bi-factor model, which specified a general work engagement factor and two specific factors on dedication and absorption10), provided a superior fit to the 1-factor and 3-factor models. The general factor was found to be a dominant factor that accounts for significant portions of the variance in the UWES-9 items.
A number of methodological problems are noteworthy in the previous studies on the UWES-9 based on the traditional maximum-likelihood (ML) approach. The first problem relates to inappropriate evaluation of model fit. Researchers frequently ignore significant χ2 tests of exact fit by claiming that the χ2 test is oversensitive to trivial misspecification at large sample sizes. Despite the high power of the χ2 test to detect model misfit, significant χ2 tests do not routinely imply trivial misspecification11, 12). The use of approximate fit indices in evaluating and justifying model fit has been a contentious issue. Second, the ML approach fixes all cross-loadings and residual covariances at zero. This assumption may be unrealistic and overly restrictive13) and could lead to inadequate model fit and biased parameter estimates14). To locate the source of misfit, model diagnostics are performed based on modification indices to estimate a particular cross-loading or residual covariance one at a time. However, such a practice often lacks theoretical justification and likely capitalizes on idiosyncratic features of the sample.
Given the limitations of the ML approach, Muthén and Asparouhov13) proposed Bayesian structural equation modeling (BSEM) as an alternative modeling approach. This pioneering approach relaxes the restrictive assumptions of the ML approach via the use of zero-mean, small variance informative priors13). Specification of informative priors can better reflect prior knowledge and substantive theories by taking into account the plausible uncertainty over the approximately zero cross-loadings and residual covariances13). With reference to recent applications of the BSEM approach15, 16), the present study aimed to provide new insights on the latent structure of the UWES-9. This was done by reexamining the one-factor, three-factor, and bi-factor models under both the ML and BSEM approaches.
The participants were 1,112 Chinese adults working in the health-care service sector in Hong Kong. The participants provided written informed consent and completed a self-report questionnaire in Chinese. Ethical approval was obtained from the local institutional review board. The majority of the participants were female (81.7%), had a secondary education level (58.6%), and were middle-aged (41 to 55 years; 52.8%). All participants had at least a year's work experience. The sample comprised support workers (53.8%), professional workers (16.8%), administrative workers (14.1%) and medical workers (11.3%). The sample was randomly split into two samples, with Sample 1 being used for primary analysis and Sample 2 being used for cross-validation.
Measure: Work engagement was assessed by the 9-item Utrecht Work Engagement Scale (UWES-9)17). The UWES-9 was hypothesized to measure work engagement in three dimensions: vigor (3 items), dedication (3 items) and absorption (3 items). The items are scored on a 7-point Likert scale ranging from 0 (“never”) to 6 (“every day”). Previous studies reported good reliability for the UWES-9 total score (median Cronbach's α=0.92) and subscale scores (median α=0.77–0.85)3). Table 1 presents the descriptive statistics of the UWES-9 items for the two samples. The items were positively and significantly correlated (r=0.19–0.68, p<0.01) and displayed minor skewness or kurtosis (≤0.5).
Item | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | M | SD | S | K |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | – | 0.67 | 0.57 | 0.39 | 0.38 | 0.19 | 0.45 | 0.43 | 0.35 | 3.89 | 1.36 | −0.30 | 0.03 |
2 | 0.60 | – | 0.67 | 0.53 | 0.50 | 0.37 | 0.57 | 0.56 | 0.44 | 3.89 | 1.41 | −0.24 | −0.35 |
3 | 0.52 | 0.65 | – | 0.50 | 0.50 | 0.25 | 0.64 | 0.59 | 0.37 | 4.36 | 1.25 | −0.35 | −0.35 |
4 | 0.48 | 0.41 | 0.46 | – | 0.48 | 0.32 | 0.50 | 0.47 | 0.43 | 3.39 | 1.39 | −0.21 | 0.10 |
5 | 0.39 | 0.45 | 0.50 | 0.43 | – | 0.41 | 0.48 | 0.53 | 0.43 | 3.38 | 1.56 | −0.08 | −0.47 |
6 | 0.20 | 0.28 | 0.25 | 0.36 | 0.36 | – | 0.35 | 0.36 | 0.43 | 2.26 | 1.61 | 0.32 | −0.52 |
7 | 0.52 | 0.54 | 0.60 | 0.49 | 0.45 | 0.29 | – | 0.68 | 0.41 | 3.94 | 1.51 | −0.38 | −0.32 |
8 | 0.46 | 0.56 | 0.61 | 0.47 | 0.50 | 0.39 | 0.64 | – | 0.59 | 3.53 | 1.49 | −0.22 | −0.21 |
9 | 0.30 | 0.35 | 0.31 | 0.46 | 0.38 | 0.35 | 0.40 | 0.57 | – | 3.10 | 1.56 | −0.13 | −0.40 |
M | 3.82 | 3.92 | 4.33 | 3.35 | 3.27 | 2.24 | 3.87 | 3.49 | 3.10 | | | | |
SD | 1.32 | 1.42 | 1.30 | 1.46 | 1.59 | 1.58 | 1.53 | 1.52 | 1.60 | | | | |
S | −0.23 | −0.30 | −0.48 | −0.20 | 0.04 | 0.46 | −0.39 | −0.13 | −0.08 | | | | |
K | −0.07 | −0.24 | −0.06 | −0.03 | −0.58 | −0.19 | −0.21 | −0.34 | −0.45 | | | | |
M=mean; SD=standard deviation; S=skewness; K=kurtosis. Correlations and descriptive statistics for Sample 1 (N=556) are displayed below the diagonal and in the bottom rows, while those for Sample 2 (N=556) are displayed above the diagonal and in the right-hand columns. All correlations are statistically significant at p<0.01.
The 1-factor, 3-factor, and bi-factor models of the UWES-9 were examined using ML-based confirmatory factor analysis (CFA) and BSEM in Mplus (version 7.2)18). The 1-factor model specifies a single work engagement factor, and the 3-factor model assumes three factors: vigor, dedication and absorption. The bi-factor model specifies a general factor on which all items load and two specific factors on which three items each load9). The general and specific factors are uncorrelated with each other. The present study did not estimate the full bi-factor model (1 general factor + 3 specific factors for vigor, dedication, and absorption), as the extra specific factor would likely be fully accounted for by the general factor. The ML-based CFA models were estimated using a robust maximum likelihood estimator, and model fit was evaluated based on a χ2 test of exact fit and two fit indices19), namely, comparative fit index (CFI)≥0.95 and root mean square error of approximation (RMSEA)≤0.06. Factor loadings greater than 0.30 were taken as practically significant. In both samples, over 93.3% (N=519) of the participants provided complete responses for the UWES-9 items. The missing data were handled by full-information maximum likelihood estimation. McDonald's coefficient ω, which denotes the proportion of observed variance of the measured items explained by the factor, was used as a measure of composite reliability20).
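For readers who prefer equations, the three specifications can be written in generic factor-analytic notation. This is only a sketch: the symbols (ν_j for intercepts, λ for loadings, F, G and S for factors, ε_j for residuals) are illustrative choices and are not taken from the original analysis.

```latex
% 1-factor model: a single work engagement factor F
x_j = \nu_j + \lambda_j F + \varepsilon_j, \qquad j = 1, \dots, 9

% 3-factor model: item j loads only on its own factor F_{k(j)},
% k(j) \in \{\mathrm{Vig}, \mathrm{Ded}, \mathrm{Abs}\}; the factors may correlate
x_j = \nu_j + \lambda_j F_{k(j)} + \varepsilon_j

% Bi-factor model: general factor G plus specific factors for dedication and absorption only
x_j = \nu_j + \lambda^{G}_j G + \lambda^{S}_j S_{k(j)} + \varepsilon_j,
\qquad \lambda^{S}_j = 0 \text{ for vigor items},
\qquad \operatorname{Cov}(G, S_k) = \operatorname{Cov}(S_{\mathrm{Ded}}, S_{\mathrm{Abs}}) = 0

% ML-CFA restriction shared by all three models
\operatorname{Cov}(\varepsilon_i, \varepsilon_j) = 0 \quad (i \neq j), \qquad \text{all cross-loadings} = 0
```

The last line is the restriction that the BSEM approach relaxes by replacing exact zeros with approximate zeros.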
The BSEM models were estimated using the Bayes estimator with a series of prior specifications for the cross-loadings and residual correlations for the standardized item scores. The BSEM analysis followed Appendix 1 of the recent paper by Asparouhov, Muthén, and Morin21). First, the BSEM models specified diffuse priors for the hypothesized factor loadings and did not specify informative priors for the cross-loadings and residual covariances. Next, we specified small-variance informative priors for the cross-loadings, choosing prior variances of 0.01 in line with Muthén and Asparouhov13). Finally, informative inverse-Wishart priors, IW(dD, d), were added for the residual covariances. For a sample size near 500, a starting value of d=100 has been recommended for these informative priors21), and D refers to the residual variances of the Bayesian CFA models. Two independent Markov chain Monte Carlo chains were used for BSEM estimation using the Gibbs sampler22, 23). Model convergence was monitored by the potential scale reduction factor24) and posterior parameter trace plots.
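To give a feel for what a "small-variance" prior means in practice, the following minimal sketch computes the central 95% range of a zero-mean normal prior for a standardized cross-loading. The function name `prior_95_limits` is hypothetical, and the variances other than 0.01 are the values listed for the sensitivity analysis; nothing here reproduces the Mplus estimation itself.

```python
from statistics import NormalDist


def prior_95_limits(variance):
    """Return the central 95% interval of a N(0, variance) prior."""
    if variance <= 0:
        raise ValueError("variance must be positive")
    # 97.5th percentile of the standard normal, scaled by the prior SD
    half_width = NormalDist().inv_cdf(0.975) * variance ** 0.5
    return -half_width, half_width


# 0.01 is the main analysis value; the others are sensitivity-analysis values
for variance in (0.001, 0.01, 0.05, 0.1):
    lo, hi = prior_95_limits(variance)
    print(f"variance={variance}: 95% prior limits = [{lo:.2f}, {hi:.2f}]")
```

With a variance of 0.01, the prior places about 95% of its mass on cross-loadings between roughly −0.20 and 0.20, which is why such parameters are described as approximately zero rather than exactly zero.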
Model fit was evaluated using the posterior predictive p-value (PPP), associated 95% confidence interval and number of iterations needed for convergence13). A PPP<0.05 and a positive 95% lower limit imply a poor model fit. Sensitivity analysis was performed by varying the informative priors for cross-loadings (variances=0.001, 0.01, 0.05 and 0.1) and residual covariances (d=100, 200, 300 and 400), with the aim of arriving at BSEM models with good model identification (fast convergence), a PPP>0.05 and reasonable confidence interval limits. The deviance information criterion (DIC)25) or Bayesian information criterion (BIC)26) was used for comparison of BSEM models with different or the same specifications of informative priors, respectively. Both the DIC and BIC avoid model over-fitting by imposing a model complexity penalty based on the estimated and actual number of parameters, respectively. Models with a lower information criterion (a difference of 10 or above) were favored.
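The logic of the posterior predictive check can be sketched in a few lines. The example assumes that, for each MCMC iteration, a discrepancy statistic (such as a χ2-type value) is available for both the observed data and a replicated data set; the function names and the tiny input lists are hypothetical and serve only to show the arithmetic.

```python
def quantile(sorted_vals, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_vals) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_vals) - 1)
    frac = pos - lower
    return sorted_vals[lower] * (1 - frac) + sorted_vals[upper] * frac


def posterior_predictive_check(observed, replicated):
    """Return (PPP, 2.5% limit, 97.5% limit) from per-iteration discrepancies.

    PPP is the share of iterations in which the replicated discrepancy
    exceeds the observed one; the limits summarize observed - replicated.
    """
    if len(observed) != len(replicated) or not observed:
        raise ValueError("need two non-empty lists of equal length")
    pairs = list(zip(observed, replicated))
    ppp = sum(rep > obs for obs, rep in pairs) / len(pairs)
    diffs = sorted(obs - rep for obs, rep in pairs)
    return ppp, quantile(diffs, 0.025), quantile(diffs, 0.975)


# Hypothetical discrepancy values for four iterations
ppp, lower, upper = posterior_predictive_check([10, 10, 10, 10], [5, 15, 20, 8])
print(ppp, lower, upper)
```

A well-fitting model yields a PPP well above 0.05 and an interval for the difference that straddles zero, whereas a positive lower limit means the observed data are consistently more discrepant than data generated from the model.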
Table 2 reports the results of the ML-CFA models for the UWES-9. In both samples, all three models were rejected by the χ2 test (p=0.000), and the fit indices failed to meet the suggested cutoff (CFI<0.95 and RMSEA>0.06). Given the lack of theoretical justification and the possibility of capitalizing on chance features of the sample, model respecification was not carried out in ML-based models using model modification indices. Instead, we turned to BSEM diagnostic analysis to locate the source of model misfit.
Model | Sample | # | χ2 | p | CFI | RMSEA |
---|---|---|---|---|---|---|
1-factor | 1 | 27 | 131.7 | 0.000 | 0.903 | 0.084 |
1-factor | 2 | 27 | 177.3 | 0.000 | 0.884 | 0.100 |
3-factor | 1 | 30 | 93.5 | 0.000 | 0.936 | 0.072 |
3-factor | 2 | 30 | 138.7 | 0.000 | 0.912 | 0.093 |
Bi-factor | 1 | 33 | 85.4 | 0.000 | 0.941 | 0.074 |
Bi-factor | 2 | 33 | 87.4 | 0.000 | 0.949 | 0.075 |
N=556; ML=maximum likelihood; CFA=confirmatory factor analysis; #=number of free parameters; χ2=chi-square value; CFI=comparative fit index; RMSEA=root mean square error of approximation.
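The RMSEA values in Table 2 can be approximately recovered from the reported χ2 values. The sketch below rests on two assumptions that are not stated in the text: the degrees of freedom are taken as 54 sample moments (9 means plus 45 variances and covariances) minus the number of free parameters, and the conventional RMSEA formula is used rather than whatever robust variant the software applied.

```python
import math

N = 556
N_ITEMS = 9
# Assumption: means are modelled, so the data supply 9 means + 45 (co)variances
N_MOMENTS = N_ITEMS + N_ITEMS * (N_ITEMS + 1) // 2  # 54


def rmsea(chi2, df, n):
    """Conventional point estimate of RMSEA."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


# (model, sample, free parameters, chi-square, reported RMSEA) from Table 2
TABLE2 = [
    ("1-factor", 1, 27, 131.7, 0.084),
    ("1-factor", 2, 27, 177.3, 0.100),
    ("3-factor", 1, 30, 93.5, 0.072),
    ("3-factor", 2, 30, 138.7, 0.093),
    ("Bi-factor", 1, 33, 85.4, 0.074),
    ("Bi-factor", 2, 33, 87.4, 0.075),
]

for model, sample, n_free, chi2, reported in TABLE2:
    df = N_MOMENTS - n_free
    print(f"{model} (Sample {sample}): df={df}, "
          f"RMSEA={rmsea(chi2, df, N):.3f}, reported={reported:.3f}")
```

Under these assumptions every recomputed value exceeds the 0.06 cutoff, in line with the conclusion that none of the ML-based models fit adequately.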
Table 3 presents the fit statistics of the BSEM results with different priors. All three BSEM models without informative priors were rejected by the data (PPP=0.000) with a high 95% lower PP limit in both samples. Specification of cross-loading priors (variances=0.01) led to a lower DIC than the Bayesian CFA models and shifted the 95% PP limits closer to zero. However, the PPP for the BSEM models with cross-loading priors remained at or below 0.001 in both samples. All BSEM models with informative residual covariance priors (d=300 in Sample 1 and d=200 in Sample 2) consistently provided an adequate fit to the data, with PPP=0.113–0.193 and a negative 95% lower PP limit. These models showed a substantially lower DIC than the previous BSEM models with no informative priors or with cross-loading priors. It is worth noting that the three BSEM models with residual covariance priors showed comparable PPPs and DICs (difference<10). However, the 1-factor BSEM model provided a substantially lower BIC than the 3-factor and bi-factor BSEM models in both samples.
Prior specification | Sample | # | pD | 2.5% PP limit | 97.5% PP limit | PPP | DIC | BIC |
---|---|---|---|---|---|---|---|---|
No informative priors | | | | | | | | |
1-factor model | 1 | 27 | 26.6 | 166.5 | 221.3 | 0.000 | 12,126 | 12,244 |
1-factor model | 2 | 27 | 26.6 | 222.1 | 278.1 | 0.000 | 11,975 | 12,092 |
3-factor model | 1 | 30 | 30.8 | 102.5 | 160.1 | 0.000 | 12,065 | 12,192 |
3-factor model | 2 | 30 | 32.3 | 157.8 | 214.8 | 0.000 | 11,915 | 12,040 |
Bi-factor model | 1 | 33 | 31.9 | 103.2 | 157.0 | 0.000 | 12,044 | 12,216 |
Bi-factor model | 2 | 33 | 27.3 | 143.2 | 199.6 | 0.000 | 11,896 | 12,046 |
Cross-loading priors | | | | | | | | |
3-factor model | 1 | 48 | 32.9 | 59.0 | 117.0 | 0.000 | 12,022 | 12,260 |
3-factor model | 2 | 48 | 35.2 | 49.1 | 147.6 | 0.000 | 11,830 | 12,069 |
Bi-factor model | 1 | 45 | 23.4 | 16.9 | 75.6 | 0.001 | 11,969 | 12,197 |
Bi-factor model | 2 | 45 | 38.5 | 21.8 | 78.2 | 0.000 | 11,779 | 11,987 |
Residual covariance priors | | | | | | | | |
1-factor model | 1 | 63 | 39.2 | −12.1 | 49.3 | 0.113 | 11,957 | 12,277 |
1-factor model | 2 | 63 | 41.9 | −10.7 | 51.2 | 0.124 | 11,751 | 12,066 |
3-factor model | 1 | 66 | 41.0 | −16.1 | 47.5 | 0.160 | 11,955 | 12,290 |
3-factor model | 2 | 66 | 43.1 | −12.7 | 51.1 | 0.128 | 11,752 | 12,083 |
Bi-factor model | 1 | 69 | 42.3 | −14.9 | 48.6 | 0.149 | 11,957 | 12,309 |
Bi-factor model | 2 | 69 | 44.3 | −17.1 | 45.1 | 0.193 | 11,748 | 12,096 |
N=556; #=number of free parameters; pD=estimated number of parameters; PP limit=posterior predictive limit; PPP=posterior predictive p-value; DIC=deviance information criterion; BIC=Bayesian information criterion.
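The model comparison among the three models with residual covariance priors can be traced directly from the Table 3 values. The sketch below applies the stated rule (a difference of 10 or more counts as meaningful); the helper name `summarize` and the data layout are illustrative only.

```python
import math

N = 556
THRESHOLD = 10  # difference in an information criterion treated as meaningful

# Residual covariance prior rows of Table 3: model -> (free parameters, DIC, BIC)
TABLE3_RESIDUAL = {
    1: {"1-factor": (63, 11957, 12277),
        "3-factor": (66, 11955, 12290),
        "bi-factor": (69, 11957, 12309)},
    2: {"1-factor": (63, 11751, 12066),
        "3-factor": (66, 11752, 12083),
        "bi-factor": (69, 11748, 12096)},
}


def summarize(models):
    """Return the DIC spread, the lowest-BIC model and its margin over the runner-up."""
    dics = [dic for _, dic, _ in models.values()]
    ranked = sorted(models, key=lambda name: models[name][2])
    best, runner_up = ranked[0], ranked[1]
    return {
        "dic_spread": max(dics) - min(dics),
        "best_bic_model": best,
        "bic_margin": models[runner_up][2] - models[best][2],
    }


for sample, models in TABLE3_RESIDUAL.items():
    result = summarize(models)
    print(f"Sample {sample}: {result}, "
          f"DIC comparable: {result['dic_spread'] < THRESHOLD}, "
          f"BIC decisive: {result['bic_margin'] >= THRESHOLD}")

# BIC charges ln(N) per free parameter, so three extra parameters cost about 19 points
extra_penalty = 3 * math.log(N)
print(f"BIC penalty for 3 extra parameters: {extra_penalty:.1f}")
```

In both samples the DIC values differ by fewer than 10 points, whereas the 1-factor model leads on BIC by 13 and 17 points over the 3-factor model, which is the basis for preferring the more parsimonious model.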
Table 4 displays the factor loadings of the three well-fitting BSEM models with residual covariances in Sample 1. In the 1-factor model, all 9 items loaded substantially (λ=0.43 to 0.80) on the overall factor (ω=0.88, 95% C.I.=0.86 to 0.89). Though 11 out of the 36 specified residual correlations were statistically significant (the 95% C.I. did not cover zero), they were all less than 0.20 with a range of −0.14 to 0.16. In the 3-factor model, the three factors (ω=0.70 to 0.76, 95% C.I.=0.66 to 0.79) showed salient factor loadings (λ=0.44 to 0.91). However, vigor and dedication were found to be extremely highly correlated (r=0.98, 95% C.I.=0.94 to 0.99), suggesting model redundancy. In the bi-factor model, the overall factor (ω=0.89, 95% C.I.=0.87 to 0.90) had salient loadings (λ=0.41 to 0.79) on all 9 items. However, the specific factor of dedication was poorly defined by its indicators (λ=0.06 to 0.10) and the specific factor of absorption had low reliability (ω=0.39).
Item | 1-factor: WE | 3-factor: Vig | 3-factor: Ded | 3-factor: Abs | Bi-factor: WE | Bi-factor: Ded | Bi-factor: Abs |
---|---|---|---|---|---|---|---|
1 | 0.67† | 0.70† | | | 0.68† | | |
2 | 0.75† | 0.78† | | | 0.76† | | |
5 | 0.64† | 0.63† | | | 0.64† | | |
3 | 0.77† | | 0.78† | | 0.79† | 0.10 | |
4 | 0.64† | | 0.63† | | 0.64† | 0.06 | |
7 | 0.76† | | 0.76† | | 0.76† | 0.06 | |
6 | 0.43† | | | 0.44† | 0.41† | | 0.18† |
8 | 0.80† | | | 0.91† | 0.78† | | 0.31† |
9 | 0.56† | | | 0.61† | 0.52† | | 0.52† |
N=556; WE=work engagement; Vig=vigor; Ded=dedication; Abs=absorption. Factor loadings were freely estimated using diffuse priors. Daggers indicate that the 95% credibility interval does not contain zero.
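As a rough check on the composite reliability of the 1-factor model, McDonald's ω can be approximated from the standardized loadings in Table 4. This is a simplified sketch: it assumes standardized items (so each residual variance is 1 − λ²) and ignores the small residual covariances, so it need not match the software output exactly. The function name is hypothetical.

```python
# Standardized 1-factor loadings from Table 4 (items 1, 2, 5, 3, 4, 7, 6, 8, 9)
ONE_FACTOR_LOADINGS = [0.67, 0.75, 0.64, 0.77, 0.64, 0.76, 0.43, 0.80, 0.56]


def mcdonald_omega(loadings):
    """Approximate omega for a single factor with standardized loadings.

    Assumes residual variance 1 - loading**2 per item and uncorrelated residuals.
    """
    if not loadings:
        raise ValueError("at least one loading is required")
    common = sum(loadings) ** 2
    unique = sum(1 - lam ** 2 for lam in loadings)
    return common / (common + unique)


omega = mcdonald_omega(ONE_FACTOR_LOADINGS)
print(f"Approximate omega for the 1-factor model: {omega:.2f}")
```

Under these simplifying assumptions the result is about 0.88, close to the value reported for the overall factor.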
The present study performed a systematic examination of the dimensionality of the UWES-9 under the ML and BSEM approaches. Under the traditional ML approach, none of the 1-, 3-, and bi-factor CFA models provided an acceptable fit to the data, as judged by the highly significant χ2 test and the approximate fit indices. The mediocre fit may be attributed to the overly restrictive constraints of exactly zero cross-loadings and residual covariances. Though the ML models could be modified via estimation of cross-loadings or residual covariances, simultaneous estimation of all these parameters is not possible under this approach because the model would not be statistically identified.
On the other hand, the BSEM approach facilitates simultaneous estimation of all residual covariances via informative priors that permit slight deviations from zero if such deviations are warranted by the data. The present study applied BSEM analysis to locate the source of model misfit and identify possible model modifications. Without any informative priors, the BSEM models did not fit the data adequately. This poor fit was consistent with that of the ML-CFA models, which likewise fixed both cross-loadings and residual covariances exactly at zero. Despite some improvement in the 95% PP limits and DIC, the BSEM models with cross-loading priors were still rejected by the data. This implies that the model misfit is unlikely to be attributable to the absence of cross-loadings.
The BSEM models with residual covariance priors showed a good PPP, a negative 95% lower PP limit and a substantially lower DIC than the previous models. Despite equivalent PPPs and DICs for the BSEM models with residual covariances, the substantially lower BIC strongly favors the 1-factor model over the other two models. The 11 residual correlations that were found to be statistically significant were substantively negligible (absolute values <0.20). In line with previous studies3–5, 17), the exceptionally strong inter-factor correlation highlights excessive overlap among the factors and an absence of discriminant validity for the 3-factor model. Similarly, given the weak factor loadings and poor composite reliability for the specific factors, the bi-factor model was not supported in the present study.
The BSEM results suggest the model misfit is due to minor differences between the model and the data in the form of omitted minor residual covariances. We choose to treat these statistically significant but substantively insignificant parameters as approximately zero and interpret the 1-factor model as a sufficiently good and parsimonious approximation for the data. Instead of interpreting subscale scores that are potentially redundant, the present results demonstrate support for use of the total UWES-9 score as a measure of work engagement3, 9).
Despite the large sample size, the present study was based on a nonrandom sample of health-care workers. The potential selection bias limits the generalizability of the study results to other worker populations. The self-reported cross-sectional nature of the current study implies the possible existence of common method variance. Future studies that adopt a longitudinal design and incorporate objective measures to elucidate the degree of work engagement and its developmental trajectories are recommended.
In summary, this psychometric study was the first to apply the flexible BSEM approach in reevaluating the dimensionality of the UWES-9. The BSEM results demonstrate empirical support for the overall factor as an adequate and parsimonious representation of work engagement. Future research could investigate the measurement invariance of the UWES-9 across gender or cultural contexts using the Bayesian approach27). This innovative approach allows a test of approximate measurement invariance via zero-mean, small-variance informative priors for parameter differences between groups28, 29), thereby providing a useful means of identifying non-invariance in the case of multiple groups or time points.
Acknowledgment: The authors would like to sincerely thank Dr. Tihomir Asparouhov for his invaluable insights concerning the BSEM analysis results.