Toward flood risk prediction : a statistical approach using a 29-year river discharge simulation over Japan

A statistical approach that considers the bias and uncertainty of models is proposed for interpreting the simulated river discharge as a flood risk. A 29-year simulation was performed to estimate parameters of the Gumbel distribution for the probability of extreme discharge. The estimated discharge probability index (DPI) showed clear agreement with observed values. Even more strikingly, high DPI in the simulation corresponded to actual flood damage records. This indicates that the real-time simulation of the DPI could potentially provide flood warnings. This paper also suggests an application using the same statistical method for real-time flood risk prediction that overcomes the lack of sufficiently long simulation data through the use of a pre-existing long-term simulation to estimate statistical parameters. A preliminary flood risk prediction that used operational weather forecast data for 2003 and 2004 gave results similar to those of the 29-year simulation for the Typhoon Tokage (T0423) event on October 20th 2004, demonstrating the transferability of the technique to real-time prediction, which is differently biased.


INTRODUCTION
As a result of great improvements in major floodcontrol infrastructure, Japan has rarely experienced more than 100 casualties due to floods in any year since the late twentieth century.In 2004 however, a combination of torrential rainfall events and the largest number of typhoon landings in history killed more than 200 people.These incidents exposed the need to develop a new flood prediction method that could be applied for a large number of small basins.This is because the deaths of most of the victims, with little warning in advance were a result of flooding in small tributary basins.
Various prediction methods have been developed recently.For example, in the United States, the National Oceanic and Atmospheric Administration (NOAA) began the Advanced Hydrologic Prediction Service (McEnery et al., 2005), which incorporates statistical approaches with ensemble forcing generated from numerically predicted hydrologic fields (Perica et al., 2000).In the United Kingdom, Moore et al. (2005) shows that flood warning have particular emphasis on precise rainfall estimation.In Japan, the soil water index, a sta-tistical risk estimation of landslides (Okada et al., 2002), has been upgraded by the Japan Meteorological Agency (JMA) to apply to river discharge; discharge prediction in the Yodo River basin was implemented using a distributed rainfall-runoff model with an algorithm of dam control by Sayama et al. (2006).
In view of this background, this paper proposes a new and effective way of predicting flood risk in Japan, with a statistical method that uses a system of physically based models.The following section describes the algorithm, including the input forcing data, validation data and the models.The third section presents the results and validation of a 29-year simulation and introduces the concept of a Discharge Probability Index (DPI) based on statistical analyses and comparisons with actual flood records.This concept is applied to real-time prediction in the fourth section.The final section contains the summary and conclusions.

Data
Meteorological Forcing Data A surface meteorological forcing dataset was prepared using 29 years (1976 2004) of observed data that included precipitation, temperature, wind direction and speed from over 1,300 automated meteorological data-acquisition system (AMeDAS) sites.Air pressure, vapor pressure and cloud cover were obtained from 155 in situ weather stations.These observation sites are distributed across Japan at intervals of about 20 km on average.Other observational data, such as radar, were excluded because of poor long-term consistency in the characteristics of the data.The meteorological data were interpolated to 0.1 degree (about 10 km) using the inverse distance weighting method for the cloud cover, the zonal and meridional components of wind, and the amount of precipitation.Other variables were subjected to additional processes: surface temperature and pressure were corrected for altitude following the hydrostatic assumption; the surface specific humidity was obtained from the interpolated relative humidity, corrected temperature and pressure; and the hourly short-and long-wave radiation were estimated using the interpolated temperature, humidity and pressure following the quasiempirical formulae of Kondo and Miura (1985).
Observed River Discharge and Flood Records To validate the simulation, observed river discharge data were obtained from the Water Information System managed by the Ministry of Land, Infrastructure and Transport, Japan (MLIT) (data available at http://www1.river.go.jp).The data includes digitized hourly to daily discharge records for nearly 400 sites, with the earliest from the 1960s, although consecutive long-term daily data for more than 15 years exist for only about 32 sites.Records of flooding periods, the affected area, damage costs, the cause of damage, and many other factors are listed for each municipality in the Annual National Survey on Flood Damage published by the River Bureau of the MLIT.Although statistical surveys dating from the 1960s was available, older data before 1992 were excluded because of changes in the hazard criteria after 1992.The data on damaged municipalities was relocated to a 0.1-degree grid-coordinated database.

Models
An extended version of MATSIRO (Minimal Advanced Treatments of Surface Interaction and Runoff) land surface model (LSM) developed by Yoshimura et al. (2006), and a 0.1-degree version of Total Runoff Integrating Pathways (TRIP) by Oki et al. (1999) were used.MATSIRO was originally developed by Takata et al. (2003) as a land surface model attached to an atmospheric general circulation model; it solves the vertical one-dimensional energy and water budgets in each grid cell explicitly.The sum of the surface and subsurface runoff is inputted into TRIP to calculate river channel storage and flow.One of the unique characteristics of TRIP is its simplicity, for example the use of a constant effective velocity for river flow.A set of sensitivity experiments was performed to examine its applicability in the regional domain.

Evaluation of Sensitivity Experiments
In the sensitivity experiments, three different effective flow velocities were applied.The main differences between the experiments were the timing and the magnitudes of the peaks, although systematic biases of similar degree existed in all of the experiments.The experiments were evaluated for six major rivers for 1993 and 1994 using four different scores: root mean square error (RMSE), mean bias, correlation coefficient, and Spearman's rank correlation coefficient.The evaluations were not consistent at all sites; the slower velocities gave better results for the discharge data observed at Iwazu station in the Yoshino River basin and at Hirakata station in the Yodo River basin, whereas the faster velocities gave better results at Ojiya station in the Shinano River basin and at Senoshita station in the Chikugo River basin.One possible reason is that the flow speed is physically dependent on the geographical characteristics of each basin.This implies the need to improve the physical parameterization of TRIP, which can currently have only a single constant effective flow velocity.In this paper however, an effective velocity of 0.8 m/s was selected for subsequent analyses as it is consistent with Yoshimura et al. (2007) and as it gave the best scores overall.

Calculating the Discharge Probability Index
As stated in the previous section, there are significant systematic biases between the simulations and reality, regardless of the various values of the effective velocity.This is inevitable because of both the errors in the forcing data and the native uncertainty of the models (Oki et al., 1999).Nevertheless, the results are reasonable in terms of the timing and the relative magnitudes of the peaks in the simulations.Subsequently, the statistical probabilities of the simulated river discharge data were calculated and are referred to hereafter as the Discharge Probability Index (DPI).For the probability distributions of the intervals and the magnitudes of peak values that exceeded a specific threshold value, the Poisson distribution and the exponential distribution were assumed respectively; a Gumbel distribution was introduced for the probability distribution of the annual maximum discharge (Hoshi, 1998): where F is the cumulative distribution function (CDF) of annual maximum values, D is the discharge, G is the CDF of the values that exceed a specific threshold value, is a constant of annual occurrence frequency, and and are the scale and the location parameters of the Gumbel distribution.
Based on the theory, the first to M th maxima of the daily maximum discharge during the 29-year simulated results were collected.The daily maximum discharge was used to avoid collecting consecutive peaks within a day.The estimates of the scale and location parameters of the Gumbel distribution, ˆand ˆ, were given by the method of moments as follows: where D with subscript i indicates the i th maximum; M and N are the numbers of samples and years respectively, which give , a constant for annual occurrence frequency.DPI ( ) for all the daily maximum discharges was calculated as The unit for DPI is years, meaning that the probability to exceed the discharge D in a year is 1/ and the expected occurrence is once in years if the discharge is an annual maximum value.
To verify the assumed exponential distribution of the probability distributions of the magnitudes of peak values that exceeded a specific threshold value, several tests with different were performed.Figure 1 shows histograms of the extreme daily maximum discharge and fitted probability density functions (PDF) for the observations and the simulation to exponential distribution.The standardized discharge in the figure is defined as Ds = (D DM)/(D1 DM).As verified at 32 observation sites with sufficient data, = 3 was found to be the most appropriate value based on the SLSC (standard leastsquares criterion; Takara and Takasao, 1988), the RMSE and correlations.
The estimated DPI from the observations were compared with those of the simulation at the 32 sites.In -23 - terms of dichotomous flood detection, in which a DPI longer than two years is regarded as a "flood" in both the simulation and observations, the threat score of the simulation was 0.39 (267 correctly detected flood events divided by 505 total observed events and 176 false alarms), whereas the bias score was about 0.88 (443 total simulated flood events divided by 505 total observed flood events).For the correctly detected flood events, 60% of the events (162 events out of 267 total events) were estimated within a two-fold range (1/2 to 2) of the estimations from the observations, and there were 71 underestimated events and 34 overestimated events.Even though there is no comparable study, these numbers might be useful metrics for future development of similar studies.

Simulated DPI and Flood Damage
Figure 2 shows the observed daily precipitation, the simulated daily maximum discharge, DPI, and observed flood-damaged area on October 20th, 2004 when Typhoon Tokage (T0423) hit western Japan.The use of the DPI reveals more clearly where statistical extremes occurred compared to only the use of the absolute precipitation amount or the simulated discharge.Furthermore, the regions with intense precipitation did not necessarily correspond to damaged regions, particularly in the eastern part of Shikoku Island (the region about 34°N and 134°E) and the northwestern part of the Kinki region (about 35°N and 135°E).Not only the effects of the horizontal water transport by the river model, but also explicit consideration of the influence of soil wetness on the rainfall-runoff process in the land surface model presumably contributed to why the DPI estimation using this system gave the specific signs to the floods and actual flood-damaged regions.Even though we cannot separately evaluate the contributions of these impacts due to the lack of detailed observations on soil wetness and runoff to river channels etc., the similarity to the historical records demonstrate the accuracy of our rather simple system for simulating extreme events.
To investigate the relationships of the index and flood damage more quantitatively, the probability of detection (POD) and correct alarm rate (CAR) of the simulated DPI were calculated for the data from 1993 to 2004 according to the screening of the statistical flood damage data as described above.In this analysis, several variations in the flood risk detection criteria are proposed to allow some flexibility in space and time regarding the simulation error.Readers should be reminded that the grid-coordinated statistical data also includes some error, as a result of the conversion process that causes inconsistencies between the locations of municipality centers and the exact regions damaged and between multi-day damage events on the statistical records and the exact timings when floods occur.Detection criteria involving only the exact day and the exact grid are called D0S1, the same criteria for the day after the exact day are called D1S1, and criteria for the immediately adjacent surrounding grids are called D1S9; similarly, criteria that include the next nearest surrounding grids are called D1S25. Figure 3 shows the POD of the various detection methods.D1S25 has a POD of almost 70%, whereas D0S1 has at most a 30% POD.Similarly, the highest CAR was obtained for D1S25 detection (figure not shown).Overall, 10% of the alarms with DPI exceeding 200 years corresponded to actual flood events.This low percentage partly reflects the fact that although flood damages were not recorded for unpopulated areas such as mountainous regions, the DPI was estimated for these as well as populated areas.This was also due to the fact that flood-control infrastructures effectively prevented damages.Improvement of the models will be necessary for more practical and reliable alarms.Nevertheless, these initial results matching the simulated discharge statistics and actual flood damage are encouraging.

TOWARD REAL-TIME FLOOD RISK PREDICTION
In real life, the DPI needs to be estimated a certain time before an actual event.However, due to the limited period of the data from a real-time numerical weather forecast (usually several years), it is unrealistic to estimate the statistical parameters directly from the data.Therefore, as an alternative, we tried merging the -24 -  statistical information, which was already obtained from the 29-year data in the previous section, to the short-term data, with the ultimate purpose of being able to predict reliable DPI in real time.In this context, the operational mesoscale spectrum model (MSM) forecast grid point value (GPV) data from the JMA for 2003 2004 were used as meteorological forcings to the system.
Major meteorological variables of MSM 6-hourly forecast at a 0.1-degree horizontal resolution were stored in hourly time steps up to 18 hours ahead.The first 6-hours of data for each forecast were used in this trial.Which forecast period to use should be carefully evaluated because the MSM forecast during the first few hours may not be the most reliable due to a spin-up problem.However, this issue will be left for further studies.Similar to the 29-year simulation presented above, a 2-year river discharge simulation using the identical model was performed.Hereafter, the 29-year simulation with the AMeDAS data and the new 2-year simulation are referred as AMD and MSM, respectively.
The probability of exceeding the i th largest discharge Di in the N-year data in any 1/ year is given using the Gringorten plotting position for an exponential distribution function (Gringorten, 1963): where M = N as shown in Eq. ( 2).By substituting this equation and Eq. ( 2) into Eq.( 1), the following equation is obtained: where M is the threshold rank that was determined as M = 87, which was derived from = 3 and N = 29 in the previous section.Now, let us assume that the extreme events in the two-year period are similarly simulated so that the relative ranks over the long term are conserved as in the 29-year period, i.e., if the largest daily maximum discharge in 2003 is the i th in the 29-year simulation, it is assumed that the largest daily maximum discharge in the two-year simulation in 2003 is the i th in the 29-year period.This assumption is partly justified because there is sufficient agreement with the same simulation in Yoshimura et al. (2007).Furthermore, Figure 4(a) shows the distribution of Spearman's rank correlation between AMD and MSM daily maximum river discharge for 2003 to 2004.For this calculation, the first to 120 th maximum discharge from each of the two datasets was compared with an allowance of ±2-day time lag for detecting the same flood event.Though the correlations were poor particularly for large parts of Hokkaido (the northern island) and regions facing the Japan sea, perhaps indicating the difficulty of snowfall amount estimation by MSM, other regions (about 2/3 of all) certainly showed positive correlations.It was therefore concluded reasonable to proceed with this assumption, albeit with great caution.
Subsequently, by applying several sets of (i, DMSM, i) to Eq. ( 5) and estimating the linear regression line in the ln((M + 0.12)/(i 0.44)) versus DMSM, i field, the slope and intercept of the line give the estimated MSM and DMSM, M respectively, where the subscript MSM indicates the parameters of the MSM simulation.The largest candidate of i for this process was 120, so that the estimation of DMSM, M is not an extrapolation, but an interpolation, conferring additional reliability.By applying MSM and DMSM, M to Eq. ( 2) and Eq. ( 3), the DPI for a discharge or the discharge for a certain DPI (e.g.once-in-100-year discharge) can be calculated.

SUMMARY AND CONCLUSIONS
To improve real-time flood prediction and warnings, this paper proposes a new statistical approach for interpreting the simulated river discharge.A 29-year qualityconsistent simulation using observed surface meteorological forcings and a river discharge simulation system that consists of a rainfall-runoff model, MATSIRO, and the river routing model TRIP on a 0.1-degree grid was performed.
In addition to direct validation of the river discharge, we validated the model using a statistical method that assumes the Gumbel distribution to estimate the discharge probability index (DPI) of extreme daily maximum discharges.The threat score of the simulation was 0.39, whereas the bias score was about 0.88.For those of correctly detected flood events, 60% of the events placed within a two-fold range (1/2 to 2) and there were more underestimated events than overestimated events.The DPI was also compared with actual flood damage records and showed reasonable agreement; the possibility of detection increased to 70% and the correct alarm rate increased to 10% with the inclusion of some flexibility in time and space to the simulated flood detection.
This paper also suggests an applied technique that uses the same statistical method for real-time flood risk prediction, and which overcomes the lack of sufficiently long simulation data by using a priori long-term simulations to estimate statistical parameters.The data from -25 - the operational mesoscale model forecast were used for a preliminary simulation of 2003 2004.The results are similar to those of the 29-year simulation for the Typhoon Tokage event on October 20th 2004, indicating good transferability of the statistical flood risk estimation technique to a rather short-term prediction dataset.Currently, this prediction method for the DPI is not sufficiently accurate, but its accuracy can be improved by a longer simulation period for determining better statistical distributions, better-quality meteorological forcings, and advances in the models.

Figure 1 .
Figure 1.Fitted exponential PDFs and histograms for data sampled at Iwazu on the Yoshino River in the Shikoku region for (a) observations and (b) simulation.

Figure 3 .
Figure 3. Differences in the detection criteria and their probability of detection (POD).

Figure 4
Figure 4(b) shows predicted DPI for October 20th 2004 (the day Typhoon Tokage hit Japan), which was issued at 6 am using the MSM meteorological forcings.Regardless of the systematic bias between MSM-and AMeDAS-derived simulations, estimated DPI by MSM (Figure 4(b)) and AMD (Figure 2(c)) are very similar and comparable to Figure 2(d), indicating good transferability of the technique to a rather short-term and differently biased prediction dataset.However, false DPI was issued in some parts, particularly in the western part of Kii Peninsula and the southern part of Shikoku Island.This was partly due to less accuracy in the parameters of the probability distribution derived from the shortterm data, and lower accuracy in the precipitation and other meteorological forcings.

Figure 4 .
Figure 4. (a) Distribution of Spearman's rank correlation coefficients between AMD and MSM daily maximum river discharge for 2003 to 2004 (see main text for detail description), and (b) the daily maximum DPI from 18hour predicted discharge at 06:00 JST on October 20th 2004, using the MSM forecast data.