Impact of Rotavirus Vaccines on Gastroenteritis Hospitalizations in Western Australia: A Time-series Analysis

Background Rotavirus vaccination was introduced into the Australian National Immunisation Program in mid-2007. We aimed to assess the impact of the rotavirus vaccination program on the burden of hospitalizations associated with all-cause acute gastroenteritis (including rotavirus gastroenteritis and non-rotavirus gastroenteritis) in the Aboriginal and non-Aboriginal population in Western Australia. Methods We identified all hospital records, between July 2004 and June 2012, with a discharge diagnosis code for all-cause gastroenteritis. Age-specific hospitalization rates for rotavirus and non-rotavirus acute gastroenteritis before and after the introduction of the rotavirus vaccination program were compared. Interrupted time-series models were used to examine differences in the annual trends of all-cause gastroenteritis hospitalization between the two periods. Results Between July 2004 and June 2012, there were a total of 106,974 all-cause gastroenteritis-coded hospitalizations (1,381 rotavirus-coded [15% among Aboriginal] and 105,593 non-rotavirus gastroenteritis-coded [7% among Aboriginal]). Following rotavirus vaccination introduction, significant reductions in rotavirus-coded hospitalization rates were observed in all children aged <5 years (up to 79% among non-Aboriginal and up to 66% among Aboriginal). Among adults aged ≥65 years, rotavirus-coded hospitalizations were 89% (95% confidence interval, 16–187%) higher in the rotavirus vaccination program period. The time-series analysis suggested reductions in all-cause gastroenteritis hospitalizations in the post-vaccination period among both vaccinated and unvaccinated (age-ineligible) children, with increases observed in adults aged ≥45 years. Conclusions Rotavirus vaccination has been associated with a significant decline in gastroenteritis hospitalizations among children. The increase in the elderly requires further evaluation, including assessment of the cost-benefits of rotavirus vaccination in this population.


eMaterials 1. Modelling approach
We outline further details regarding the modelling approach used for the interrupted time series analysis.
We assume an impact model whereby both the pre-intervention trend and level would change following the rollout of rotavirus vaccination program (all of 2007) and post-rollout (from 2008 to the end of the dataset) period. This follows standard recommendations provided in the literature [1]. We adopt a segmented negative binomial distributional assumption when the data were over-dispersed, but Poisson distributional assumptions otherwise. The linear predictor includes an intercept, terms for the pre-intervention, rollout and post-rollout trends (the latter two specified as a change from the pre-intervention trend), terms for the stage of the intervention (reference level being pre-intervention, then rollout followed by post-rollout) and terms for seasonality (Winter, Spring, Summer) relative to Autumn, see Equation 1 below. Population size (exposure in the language of Generalised Linear models [GLMs]) was included as an offset variable to convert the outcome into a rate and adjust for any changes in population over time.
We use the Generalised Linear Autoregressive Moving Average (GLARMA) framework to fit the data. GLARMA models are observation-driven state-space models that extend GLMs and allow explicit modelling of temporal correlation via the inclusion of autoregressive and moving average terms in the model [6]. GLARMA models can be fit using the glarma function from the R glarma package. Briefly, we outline this model class. Following [6], let for = 1. . where be the number of observations in the discrete response series and be the regression vectors. Let ℱ = {Υ : < , : ≤ } denote the set of past information on the response and past and present information on the regressors. We take the distribution of Υ conditional on ℱ to be of the exponential family form: As outlined in the manuscript we take as the count of hospitalisations for all-cause gastroenteritis or rotavirus specific hospitalisation and represent as the impact model as detailed in Equation 1. We initially forego modelling temporal correlation as we prefer to be guided by residual diagnostics to determine whether serial dependence is present in the data. When we exclude ARMA terms, the fitted models degenerate to their GLM counterparts in that the parameter estimates and inference will be identical. In cases where residual correlation is apparent in the residual auto-correlation function (ACF) and partial auto-correlation function (PACF) plots, we apply heuristics [8] to interpret the order of the implied auto regressive (AR) and moving average (MA) process and refit the model with the additional ARMA terms. Using this approach, we can better account for variability in the data under a more parsimonious specification that includes explicit modelling of temporal dependence.
Equation 1 below provides the impact model specification of the component of the state vector (assumes time ordered data).

Equation 1
~NegBin( , ) , = log( ) = + log( ) = β0 + β1 timet + β2 time since rolloutt + β3 time post rolloutt + β4I (rollout staget) +β5I (post rollout staget) + β6I (wintert) + β7I (springt) + β8I (summert) + log (population estimatet) Under this specification, and are the location and scale parameters, is the exposure at time , = exp( ) and is the count of hospitalisations during discrete time interval with = 1 … being the time values included in the data. The other terms are as follows: is the discrete time (in months) starting from zero such that the intercept is aligned with July 2004 -the start of the series Holding all other terms constant, the parameters are interpreted (on the log scale) as: 0 the overall pre-intervention intercept (in the language of timeseries decomposition, the pre-intervention level [vertical displacement from zero]) 1 the change in the linear predictor for a month increase in time (the pre-intervention trend) 2 the change from the pre-intervention trend ( 1 ) associated with the rollout phase 3 the change from the pre-intervention trend ( 1 ) associated with the post-rollout phase 4 the level change associated with the rollout period 5 the level change associated with the post intervention stage 6 the change in the linear predictor that was associated with the periodic winter seasonality 7 the change in the linear predictor that was associated with the periodic spring seasonality 8 the change in the linear predictor that was associated with the periodic summer seasonality log( ) the usual offset as used in count models with variable denominators The exponentiated parameter estimates give the commonly known rate-ratios, which can be readily translated into a percentage change from the reference level holding all other terms constant. For the series where outbreaks occurred during 2010, we additionally introduce an indicator variable for the outbreak and fitted a term for that covariate in the model. eMaterials 2. A narrative description of fitting the timeseries for a specific age-class

Example analysis for age class < 12 months
The <12 months year old age group data is used as an example. We applied an analogous approach for the other age-classes.
We fit the specification detailed in Equation 1 to the all-cause acute gastroenteritis (ACG) and rotavirus (RV) coded hospitalisation timeseries for the <12 months cohort and refer to this as Model 1. While the specification is aligned with the standard ITS model suggested in the literature, e.g. [1][3], visual inspection (figure 1) of the ACG and RV-coded hospitalisation series for the <12 months age cohort suggests potential over-specification as no trends are apparent in the pre, nor post-rollout periods. Additionally, the periodicity that appears in the pre-intervention stage all but vanishes in the post-rollout stage. In terms of model specification, this might warrant consideration of an interaction term between stage and seasonality. However, while periodicity is clearly evident in the pre-intervention stage, the peaks occur at different times of the year, albeit typically Winter and Spring. Notwithstanding these possible limitations, the parameter estimates for the ACG and RVspecific series are shown below.  Figure 1 Observed mean monthly hospitalisation rates (and seasonality) per 1000 population for age <12 months for ACG (LHS) and RV (RHS) data Interpretation of residual diagnostics for the exponential family with non-identity link functions can be unreliable for reasons such as the variance of the data is often expected to change with fitted values, normality is not necessarily going to be apparent and patterns are the norm rather than the exception for plots of predicted versus residuals. Model diagnostics (quantile-quantile plot, residual vs fitted, residual vs leverage and autocorrelation function, figure 2) for Model 1 for the ACG series suggest some deviation from distributional assumptions but certainly not extreme. However, serial dependence is readily detected which would lead to biased standard errors if not accounted for via the NW covariance estimator in the case of a GLM or explicitly modelled as we do here. The pattern in the ACF (and PACF, not shown) suggests a mild autoregressive process of order 1, which was also suggested by a Durbin Watson test (significant at the 0.001 level). Structure was also apparent in the residual plots when plotted against the month and month since vaccine variables (not shown) suggesting another area of potential misspecification. Given the serial dependence in both series, we refit the models retaining the original specification but adding an AR(1) component into the state vector. The parameter estimates from both models are shown below. For both series, Akaike Information Criterion (AIC) and likelihood ratio tests (LRT) suggest that Model 2 has better goodness-of-fit for both the ACG and RV series with LRTs significant at the < 0.01 and < 0.001, respectively.   Based on the parameter estimates from Model 2, there is negligible evidence to support a pre-intervention trend, nor changes from the pre-intervention trend in the rollout or post-rollout period in either series. However, Model 2 (ACG) does suggest a -0.783 (95CI -1.224 to -0.342) change in linear predictor when moving from the pre to the post-rollout period, holding all other parameters constant. This corresponds to a 0.46 (95CI 0.29 to 0.71) rate ratio or approximately 54% reduction (on average) in the hospitalisation rate across the whole postrollout period relative to the pre-intervention period. Model 2 (ACG) has little support for seasonality with an LRT comparing nested models for the ACG series with being non-significant (p-value = 0.7). We think this could relate to what is clear by visual inspection, namely that the periodicity that we see in the pre-intervention period vanishes in the post-rollout period.
For the RV series, Model 2 suggests some support for seasonality, which was confirmed by a significant LRT at the 0.001 level comparing nested models for the RV series with and without a seasonality term. However, we note that AIC was only fractionally improved in the more complex model.
We can get some additional insight into the magnitude of change by comparing the modelled pre-intervention and post-rollout hospitalisation rates by averaging across the fitted values and bootstrapping to estimate the uncertainty. Doing so suggests that the median hospitalisation rate reduced, on average, by 11.78 (95CI 8.89to 14.9) hospitalisations per 1000 population for the ACG series and 2.69 (95CI 1.32 to 5.25) in the RV series. Moreover, given that there are no substantive trends in either series, this difference remains approximately constant over the full post-rollout period.
While it would be possible to remove redundant terms from the models, we retained the full specification as we have assumed that the direct effects associated with the vaccine will follow the pre-specified impact model. Importantly, we note that a-priori specification does not protect us from the possibility of mis-specification -we do not know the true model for the data generation process. Our primary goal is to examine the strength of evidence for the original model.  The series are structurally similar to the first age class with level, possible seasonality, and noise: • Possible trends in post-rollout stage • Change in level when considering pre versus post • Change in variability • Periodicity in the pre-intervention period, not as strongly apparent in the post-rollout Also: • Increasing number of zeros in the RV series • Lower rate in RV series

Age class -2 years (24-35 months)
The series are structurally similar to the first age class with level, possible seasonality, and noise: • Negligible differential stage trends apparent (clear overall trend) • Clear change in level when considering pre versus post • Clear change in variability • Periodicity in the pre-intervention period, not strongly apparent in the post-rollout Also: • Number of zeros in the RV series • Lower rate in RV series

Age class -3 Years (36-47 months)
The series are structurally similar to the first age class with level, possible seasonality, and noise: • Negligible differential stage trends apparent (slight overall trend) • Clear change in level • Clear change in variability • Periodicity in the pre-intervention period, not as apparent in the post-rollout New points of note: • Larger number of zeros in the RV series • Peak around 2010 in ACG and RV series • Lower rate in RV series • Insufficient data to model RV

Age class -4 years
The series are structurally similar to the first age class with level, possible seasonality, and noise: • Negligible differential stage trends apparent (clear overall trend) • Change in level not so apparent but still clear • Change in variability • Periodicity in the pre-intervention period, not so apparent in the post-rollout New points of note: • Sparse data in the RV series • Peak around 2010 in ACG and RV series • Insufficient data to model RV

Age class -10-19 years
The series have structural features of level, possible seasonality, trend, and noise: • ACG looks flat in pre-intervention.
• ACG post-rollout growth in hospitalisations • Large number of zeros in the RV series • Suggestion of peak around 2010 in ACG series but not apparent in RV • Insufficient data to model RV

Age class -45-64 years
The series have structural features of possible seasonality, trend, and noise: • ACG shows almost seamless trend in both periods.
• Large number of zeros in the RV series • RV data sparseinsufficient to model • ACG data look almost negatively autoregressive • Insufficient data to model RV

Age class ≥65 years
The series have structural features of possible seasonality, trend, and noise: • ACG seamless trend in both periods with occasional possible 'outbreaks' but could just be stochastic fluctuations • Large number of zeros in the RV series • Insufficient data to model RV