Impact of the Window Length of Four-Dimensional Local Ensemble Transform Kalman Filter: A Case of Convective Rain Event

This study investigates the tradeoff between computational time and forecast accuracy for different data assimilation (DA) window lengths of the four-dimensional local ensemble transform Kalman filter (4D-LETKF) in a single severe rainfall event. We perform a series of Observing System Simulation Experiments (OSSEs) with 1-, 3-, 5- and 15-minute DA windows for a severe rainstorm event in Kobe, Japan, on July 28, 2008, following the prior OSSEs by Maejima et al. (2019). Running 1-minute DA cycles gave the best forecast accuracy but at the highest computational cost. The computational cost could be reduced by taking a longer DA window, but the forecast became less accurate even though the same number of observations was used. A significant gap was found between the 3-minute and 5-minute windows. With the 1- and 3-minute windows, the forecasts captured the intense rainfall, while with the 5-minute window or longer, the rainfall intensity was drastically underestimated. This single-case study suggests that a DA window of 3 minutes or shorter is a promising choice for severe rainfall forecasts, although more case studies are necessary to draw general conclusions. (Citation: Maejima, Y., and T. Miyoshi, 2020: Impact of the window length of four-dimensional local ensemble transform Kalman filter: A case of convective rain event. SOLA, 16, 37−42, doi:10.2151/sola.2020-007.)


Introduction
have developed "Big Data Assimilation" (BDA) technology to advance numerical weather prediction (NWP) focusing on local severe weather. The BDA system implemented 30-second-update, 100-m-resolution NWP, which has a 120 times higher data assimilation (DA) frequency and 400 times more horizontal grid points than the hourly updated Japan Meteorological Agency (JMA) operational Local Forecast Model (LFM) at 2-km resolution. As a part of the BDA project, Maejima et al. (2019) performed a series of Observing System Simulation Experiments (OSSEs) using the Local Ensemble Transform Kalman Filter (LETKF; Hunt et al. 2007; Miyoshi and Yamane 2007) with the JMA Nonhydrostatic Model (NHM; Saito et al. 2006, 2007; Saito 2012), the so-called NHM-LETKF system (Miyoshi and Aranami 2006; Kunii 2014). The objective of OSSEs is to assess the potential impact of virtual observations on the performance of an NWP system (e.g., Lahoz et al. 2005). Maejima et al. (2019) investigated the potential impact of dense and frequent surface observations in 1-minute-update LETKF cycles, and showed that assimilating surface station data has the potential to improve the forecast accuracy of a sudden severe rainstorm event.
To run the BDA system, effective use of a high performance computing (HPC) system is essential. The BDA system assimilates over 8 million observations received every 30 seconds, so the computational cost is very large. We would therefore like to reduce the computational cost as much as possible while maintaining the forecast accuracy.
Here, the four-dimensional extension of the LETKF (4D-LETKF; Hunt et al. 2004; Miyoshi and Aranami 2006) is useful. The three-dimensional LETKF (3D-LETKF) uses the observation data at a single time. In contrast, the 4D-LETKF can use observation data at multiple times simultaneously within a time window (a.k.a. DA window). For example, instead of running 10 consecutive 30-second-update 3D-LETKF cycles over a 5-minute period, a single 4D-LETKF cycle with a 5-minute window can assimilate the same amount of data. This saves Input/Output (I/O) costs because the model runs are initialized less frequently. In general, I/O costs are significant in the LETKF cycles, and a longer DA window saves more of them.
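As a rough illustration of this bookkeeping, the following Python sketch groups a fixed set of observation times into DA cycles for two window lengths. The function name `group_into_cycles`, the use of integer seconds, and the convention that each window ends at its analysis time are assumptions made for illustration only; they are not taken from the NHM-LETKF code.

```python
def group_into_cycles(obs_times, start, window):
    """Group observation times (integer seconds) into consecutive DA cycles.

    Cycle k is assumed to cover (start + k*window, start + (k+1)*window],
    i.e. each window ends at its analysis time (an illustrative convention).
    """
    cycles = {}
    for t in obs_times:
        if t <= start:
            raise ValueError("observation times must be later than start")
        k = (t - start - 1) // window  # 0-based index of the cycle containing t
        cycles.setdefault(k, []).append(t)
    return [cycles[k] for k in sorted(cycles)]


# Example from the text: observations every 30 seconds over a 5-minute period.
obs_30s = list(range(30, 301, 30))               # 10 observation times
cycles_3d = group_into_cycles(obs_30s, 0, 30)    # 30-second cycles
cycles_4d = group_into_cycles(obs_30s, 0, 300)   # one 5-minute window

for name, cycles in (("30-s window", cycles_3d), ("5-min window", cycles_4d)):
    n_obs = sum(len(c) for c in cycles)
    print(f"{name}: {len(cycles)} cycle(s), {n_obs} observation times in total")
```

Both configurations cover the same 10 observation times, but the 5-minute window needs a single cycle, and hence a single model initialization, instead of ten.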
However, a longer DA window has a drawback: it may degrade the forecast accuracy due to the chaotic nature of the weather system dynamics. Generally, the weather system becomes less predictable over a longer time period, and the limit to predictability implies a limit to information transfer within the DA window. If the DA window exceeds the predictability limit, the observational information cannot be transferred effectively within the window, and the resulting forecast will be degraded. The relevant time scale depends on the scales of the phenomena of interest. In this study, we focus on convective scales, for which the time scale is limited to several minutes or even shorter.
In this study, we aim to examine the tradeoff between the computational time and forecast accuracy with different DA windows of 4D-LETKF. Following Maejima et al. (2019), we perform experiments with various lengths of the DA window in a specific rainstorm event, and evaluate computational costs and forecast accuracy. The results will suggest a reasonable length of the DA window for a severe weather forecast for this particular case, an important first step toward more general studies in the future.

Methods
This study performed a series of OSSEs following Maejima et al. (2019) for a disastrous case in Kobe, Japan, on July 28, 2008, with 5 fatalities due to a rapid water level rise in the Toga River. We used the Japan Meteorological Agency nonhydrostatic model (JMA-NHM; Saito et al. 2006, 2007; Saito 2012), which is a former version of the regional operational NWP model developed by JMA, but with a different domain setting (Fig. 1a, black square). For the cloud microphysical process, a 6-category single-moment scheme, with a double-moment treatment only for cloud ice (Ikawa and Saito 1991), was used. For sub-grid-scale turbulence, a large eddy simulation scheme developed by Deardorff (1973) was adopted. First, we performed a downscale simulation of the best ensemble member of Seko et al. (2011), and generated the nature run at 100-m resolution. To generate 40 initial ensemble members, a 5-km-mesh downscaling simulation from the JMA Global Spectral Model (JMA-GSM) forecasts initialized at 0000 UTC July 27 was performed, and we chose initial ensemble members from the simulation at different times up to 13 hours away (see Maejima et al. 2019 for details). Every-minute synthetic observations simulate the Phased Array Weather Radar (PAWR) at Osaka University and 167 surface weather stations at RIKEN Center for Computational Science and Kobe city elementary schools (Fig. 1b). The synthetic observations were generated by adding Gaussian noise to the observed quantities computed from the nature run. The assimilated observations and the magnitude of the observation errors are shown in Table 1. The error values are somewhat larger than the instrumental errors. Six OSSEs using NHM-LETKF at 1-km resolution with 40 ensemble members were initialized at 0230 UTC July 28, 2008 (Fig. 2a).
Here, the model resolution was degraded by a factor of 10 compared with the nature run, and accordingly, the turbulence scheme was changed from Deardorff (1973) to the improved Mellor and Yamada Level 3 scheme (Nakanishi and Niino 2006, 2009). This is the only difference in the model settings. The entire domain of Fig. 1a shows the 300-km-by-300-km model domain for the OSSEs, and Table 1 provides the model settings and observational data for the OSSEs. Refer to Maejima et al. (2019) for more details of the series of experiments.
In the series of six OSSEs, the 4D-LETKF technique is adopted. Hunt et al. (2004) described how to input observation data at multiple times in a single DA cycle. Figure 2 illustrates an example of having every-minute observations with (b) every-minute 3D-LETKF cycles and (c) every-3-minute 4D-LETKF cycles. Taking a longer DA window reduces the number of LETKF cycles while taking the same number of observations. An LETKF cycle contains I/O and MPI (Message Passing Interface) initialization, which are required only once per LETKF cycle. Therefore, a longer DA window reduces the number of these processes and saves computational time while taking the same number of observations. However, as mentioned in the Introduction, a long DA window would degrade the forecast accuracy if it is long enough for the limit to predictability to play a role. Therefore, it is necessary to evaluate how effectively the LETKF works with longer DA windows.
To investigate the tradeoff between the computational time and forecast performance, we considered six scenarios for the 15 minutes from 0245 UTC to 0300 UTC after the 1-minute-update DA cycles from 0230 UTC to 0245 UTC (see Fig. 2a): (A) 1-minute-update 3D-LETKF cycles as if there were no suspension, (B) five 4D-LETKF cycles with 3-minute windows, (C) three 4D-LETKF cycles with 5-minute windows, (D) a single 4D-LETKF with a 15-minute window, (E) a single step of 3D-LETKF at 0300 UTC rejecting all data during the suspension from 0246 UTC to 0259 UTC, and (F) no DA during the suspension. Experiments (A) to (D) use all available observations. After 0300 UTC, 30-minute extended forecasts initialized by the analysis ensemble means are performed.
The experiments were performed on the Fujitsu FX10 supercomputer of the University of Tokyo. We used 480 nodes (7680 cores) for the forecasts and 160 nodes (2560 cores) for the LETKF. Although the computational resources are not sufficient to run the DA computations in real time, we can address the main purpose of this study, that is, the tradeoff between the computational time and forecast performance.

Results
We first investigate the forecast accuracy and computational time for each experiment (Fig. 3). In the nature run, intense rainfall areas over 25 mm per 30 minutes spread from east to west. Experiment (A) used all of the PAWR and surface station data as if there were no suspension but shows generally weaker rainfall and different shapes, mainly because of the difference in model resolution. Namely, the nature run was performed at 100-m resolution, while the OSSEs were performed at 1-km resolution. Despite the resolution difference, the peak rainfall amount reached 25 mm in Experiment (A). This is considered reasonably good since we would expect systematic underestimation of rainfall intensity due to the resolution degradation. The 3D-LETKF with a 1-minute window rapidly updates the analyses before significant nonlinear dynamics appear, and shows the best result. Comparing the different OSSEs, we find a significant gap in peak rainfall intensity between Experiments (B) and (C). Experiment (B) maintained the rainfall amount of 25 mm. However, when the DA window was extended to 5 minutes or more, the peak intensity decreased drastically to less than 30 % of Experiment (B). Although over 25 mm precipitation was found in Experiments (A) and (B), Experiments (C), (D) and (E) showed less than 8 mm precipitation. Even though the number of input observations was the same, the DA window length clearly affects the forecast accuracy of this local severe rainfall event.
Next, we focus on the computational time (top panel bars of Fig. 3). We found tradeoffs between the computational cost and forecast accuracy. Experiment (A) achieved the best forecast accuracy at the price of the highest computational cost. Experiment (B) saved about half of the computational time of Experiment (A), with rainfall intensity similar to Experiment (A). The computational costs of Experiments (C), (D) and (E) were consistently lower as expected, but with significantly lower forecast skill. Experiment (E) did not take any observation data from 0246 UTC to 0259 UTC, as described in the Methods section, so that Experiment (E) was much cheaper than Experiment (D). However, Experiments (C), (D) and (E) did not predict the heavy rainfall well, as mentioned above.
To evaluate the forecast accuracy statistically, the root mean square error (RMSE) of the water vapor mixing ratio [g kg−1] at the 2-km level of the terrain-following vertical coordinate (z* = 2 km, Saito et al. 2006) was computed. This measure is chosen because water vapor in the lower troposphere is strongly related to precipitation (Maejima et al. 2019). The RMSE was computed from the difference between the ensemble mean fields of the analyses and the nature run, over the same domain as the nature run (Fig. 2 of Maejima et al. 2019). Figure 4 shows the time series of the domain-averaged RMSE. The RMSE dropped rapidly and stayed around 0.8 g kg−1 by repeating the 1-minute-update LETKF cycles (red solid line). The RMSE of Experiment (B) was slightly larger than that of Experiment (A) (red and orange solid lines), but the 3-minute-update 4D-LETKF cycles maintained a level of RMSE similar to the 1-minute-update 3D-LETKF. In Experiments (C), (D) and (E), the RMSE gradually increased, even when the 5- or 15-minute-update 4D-LETKF cycles were performed (green, light-blue and blue solid lines). We can see a clear gap between the orange and green dashed lines from 0310 to 0320 UTC, but the gap became smaller after about 0325 UTC. The results show that DA windows of 5 minutes or longer are too long to capture the intense rainfall in this event.
Figure 5 shows the liquid water path (LWP) [kg m−2], i.e., the vertically integrated liquid water amount, and the mean-sea-level temperature in the analysis fields at 0300 UTC, the initial time of the extended forecasts. A band-like rich LWP zone extended in the east-west direction (black circles in Fig. 5a). Moreover, two very rich LWP areas correspond to the well-developed convective cells located near the disaster site in the nature run (X, Y in black circles in Fig. 5a). The locations of the rich LWP areas X and Y in Experiments (A) and (B) were similar to the nature run. The corresponding peak rain intensities were also reproduced well (Fig. 3, bottom). The rich LWP areas X and Y became consistently weaker with longer DA windows. Experiment (C) still showed clear X and Y cells but with weaker peak rain intensity. Experiments (D) and (E) showed somewhat similar cells, but they were much weaker and displaced northwestward.
In the temperature field, a large temperature gradient exists along the rich LWP band (white circles in Fig. 5b). Maejima et al. (2019) discussed its formation mechanism following Kusabiraki et al. (2011). Namely, a significant cold pool and outflow near the surface existed in the northern part of the rainband, whereas from the southern side, warm and moist southwesterly winds flowed into the rainband. The southern side temperature reached 32 degrees Celsius, whereas the northern side was less than 29 degrees Celsius. The temperature gradient in Experiment (A) is located in a region similar to the nature run. Although the southern side temperature in Experiment (B) is 1 degree Celsius lower than that in Experiment (A), the location of the temperature gradient area is similar to the nature run. However, in Experiments (C), (D) and (E), the warm area on the southern side was shifted northward, and the temperature gradient was shifted similarly. According to Kusabiraki et al. (2011), the meso-scale frontal structure was a significant feature of this rainstorm event, and Experiments (A) and (B) reproduced it well (Figs. 5a and 5b). However, in the other experiments, both the LWP and temperature gradients in the circled area were underestimated. The LETKF cycles with a 3-minute window or shorter improved the analysis fields in this regard, leading to better forecasts in this rainstorm event.
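The domain-averaged RMSE of the ensemble mean used for the time series in Fig. 4 can be sketched in a few lines of Python. This is a minimal illustration only: the function names, the representation of each field as a flat list of grid-point values, the sample numbers, and the assumption that the analysis and nature-run fields are already on a common grid are all hypothetical and are not taken from the original processing.

```python
from math import sqrt


def ensemble_mean(members):
    """Grid-point-wise mean over ensemble members (each a flat list of values)."""
    n = len(members)
    return [sum(values) / n for values in zip(*members)]


def domain_rmse(field, truth):
    """Root mean square difference between two fields on the same grid."""
    if len(field) != len(truth):
        raise ValueError("fields must be defined on the same grid")
    squared = [(a - b) ** 2 for a, b in zip(field, truth)]
    return sqrt(sum(squared) / len(squared))


# Made-up water vapor mixing ratios [g/kg] at four grid points.
nature_run = [12.0, 11.5, 10.8, 10.1]
analysis_members = [
    [12.4, 11.0, 10.9, 10.6],
    [11.8, 11.6, 11.3, 9.8],
    [12.1, 11.9, 10.5, 10.2],
]

mean_field = ensemble_mean(analysis_members)
print(f"RMSE of ensemble mean: {domain_rmse(mean_field, nature_run):.3f} g/kg")
```

Evaluating `domain_rmse` at each analysis and forecast time would give one point of a time series like the one described above.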

Conclusion and recommendations
This study performed additional OSSEs following Maejima et al. (2019) using the 4D-LETKF with various DA window lengths (Hunt et al. 2004; Miyoshi and Aranami 2006) and investigated the tradeoff between the computational time and forecast accuracy.
We investigated six scenarios to evaluate the forecast accuracy and computational costs. Running full 1-minute cycles gave the best forecast accuracy but at the highest computational cost. The computational time could be shortened by taking a longer DA window, but the forecast accuracy became worse even though the same number of observations was used. The 3-minute DA window saved about half of the computational cost of the 1-minute DA window while still predicting the peak rainfall amount well. Therefore, the 3-minute DA window is more cost-effective. Although windows of 5 minutes or longer were cheaper, the heavy rainfall was not predicted well.
We found a significant gap between the 3-minute and 5-minute windows. The 3-minute window maintained a rainfall amount similar to the full 1-minute cycles. However, with the 5-minute window or longer, the rainfall amount was drastically underestimated. The time series of the RMSE showed a gap between the 3-minute and 5-minute windows up to about 25 minutes into the forecasts.
To summarize, a 3-minute or shorter DA window would be a good choice to maintain the forecast accuracy while reducing the computational cost based on the results from this particular single case.
To generalize this implication, further experiments with different cases are necessary. This study investigates only a single case of heavy convective rainfall, and other cases, including different types of events, need to be investigated. The BDA technology opened the door to very precise, 100-m-mesh, 30-second-update severe weather prediction, and this study provides the first step toward a failsafe workflow for real-time application of the BDA-based NWP system.