Monthly adjustment of Global Satellite Mapping of Precipitation (GSMaP) data over the VuGia–ThuBon River Basin in Central Vietnam using an artificial neural network

The performance of Global Satellite Mapping of Precipitation data (GSMaP_MVK, version 5.222.1) over the VuGia–ThuBon River basin and surrounding areas in central Vietnam was examined on a monthly basis in comparison with rainfall gauged at eight meteorological stations and a gridded rainfall product of the Asian Precipitation – Highly-Resolved Observational Data Integration Towards Evaluation of Water Resources project (APHRODITE, V1003R1). APHRODITE represented in situ observations well, whereas GSMaP had very low performance over the study area for the period 2001–2007. In particular, GSMaP exhibited large negative rainfall biases for the winter monsoon period from October to December, and the biases tended to increase as the elevation decreased. A correction method using an artificial neural network (ANN) was implemented for the GSMaP rainfall over the VuGia–ThuBon River basin. Validation showed that the ANN correction method significantly improved the GSMaP quality in terms of spatial correlation, rainfall amplitude, and Nash–Sutcliffe efficiency coefficient for both the dependent period 2001–2005 and the independent period 2006–2007.


INTRODUCTION
In the Asian monsoon regions, particularly in Vietnam, water-related disasters have taken many lives and devastated infrastructure such as roads, government buildings, and power and telecommunication lines (Adikari and Yoshitani, 2009). In order to predict such damaging events and mitigate their negative effects, accurate and timely rainfall information is required. While providing direct and precise information, a traditional rain-gauge network often fails to capture the spatiotemporal variability in rainfall. To overcome this deficiency, satellite products have recently become an important alternative source of rainfall data. Various data products are available, such as the Tropical Rainfall Measuring Mission (TRMM) precipitation data (Simpson et al., 1996), the Climate Prediction Center Morphing (CMORPH) product (Joyce et al., 2004), the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN) product (Sorooshian et al., 2000), the Naval Research Laboratory (NRL) blended product (Turk and Miller, 2005), and the Global Satellite Mapping of Precipitation (GSMaP) product (Ushio et al., 2009).
Despite their remarkable advantages in spatiotemporal coverage, satellite data still have many issues that need to be resolved. For example, satellite rainfall data still perform poorly, especially over mountainous and coastal areas (e.g., Kubota et al., 2007; Shige et al., 2013). Thus, more research on satellite rainfall correction processes and on rainfall characteristics over specific land areas is needed. Some correction approaches have already been discussed and made available. For example, Vila et al. (2009) applied a combination of additive and multiplicative bias correction schemes to satellite rainfall data in order to obtain the lowest bias in comparison with rain-gauge values. Tian et al. (2010) proposed an approach that establishes a statistical relationship between satellite estimates and historical gauge measurements by using Bayesian logic, which is then applied to real-time satellite estimates when gauge data are unavailable. Ozawa et al. (2011) introduced another correction method that estimates rainfall area movement from the satellite rainfall data itself. This method was successfully applied to Taiwan.
Facing the ocean in the east and having somewhat complicated topography in the west, central Vietnam has peculiar rainfall characteristics. While most Southeast Asian regions experience a simple summertime rainfall maximum during the Asian summer monsoon period, the maximum rainfall period in central Vietnam is from November to December (Matsumoto, 1997; Yokoi et al., 2007). Although some studies have been conducted to understand the mechanisms and characteristics of rainfall over Vietnam (e.g., Yokoi and Matsumoto, 2008; Takahashi and Yasunari, 2008; Chen et al., 2012; Nguyen-Thi et al., 2012; Takahashi, 2012), large knowledge gaps remain. One of the reasons is that the rain-gauge network in Vietnam is still not dense enough to meet the demands of practitioners and researchers. Therefore, satellite-derived rainfall data may become a useful alternative that helps to overcome the shortcomings of the current surface observations. This study analyzes the performance of the GSMaP data over the VuGia-ThuBon River basin in central Vietnam and subsequently proposes a method to correct the GSMaP data over the basin by using an artificial neural network. Note that the topography of the VuGia-ThuBon River basin is very steep across a width of approximately 100 km (Figure 1). Consequently, water-related disasters such as floods occur often in this region. Therefore, accurate information on rainfall distribution over wide areas within the region is required in order to mitigate the damage caused by such water-related disasters.

DATA
The GSMaP_MVK data, version 5.222.1, which have been available since August 2012, are used in this study. The data have a horizontal resolution of 0.1° and a temporal resolution of 1 h, covering the domain from 60°N to 60°S and the period from March 2000 to November 2010. Hereinafter, these data are simply called GSMaP. The technique used to generate GSMaP was described by Ushio et al. (2009). In this study, we consider only the precipitation on land for the VuGia-ThuBon River basin and surrounding area within the region 14.5°N to 16.5°N and 107.2°E to 109.4°E.
To compare with and validate GSMaP, two different precipitation datasets are used: (1) monthly rainfall observations from eight meteorological stations located within the study region for the period 2001-2007 (see Figure 1 and Table I for details); and (2) the daily gridded data from the Asian Precipitation - Highly-Resolved Observational Data Integration towards Evaluation of Water Resources (APHRODITE) project (Yatagai et al., 2009, 2012). Version V1003R1 of the precipitation data for the Asian monsoon region was used in this study. The data have a resolution of 0.25° and cover the period 1951-2007. Hereinafter, these data are simply called APHRODITE.

PERFORMANCE OF GSMAP OVER THE VUGIA-THUBON RIVER BASIN
The overlapping period of GSMaP, APHRODITE, and the observations at eight stations was used to compute the 2001-2007 monthly series for each dataset. For each station, the values of the nearest grid point of GSMaP and APHRODITE were compared directly with the gauge observations. Figure 2 shows a Taylor diagram (Taylor, 2001) of GSMaP (cyan circles) and APHRODITE (yellow circles) compared to the rain-gauge observations for monthly precipitation. In this Taylor diagram, the ratio of the standard deviation of GSMaP or APHRODITE to that of the surface-gauged rainfall is represented by radial distance, and the correlation with in situ observations is represented by polar angle. Hereinafter, the standard deviation ratio is denoted by STDR. Each circle corresponds to the quality of a monthly series of GSMaP or APHRODITE at a predefined station for the period 2001-2007. The rain-gauge observations are represented by a point on the horizontal axis (unit correlation) at unit distance from the origin (no error in the standard deviation). Hence, the linear distance between each circle and this point is proportional to the root mean square error (RMSE).
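The three quantities read off a Taylor diagram can be computed directly from a pair of monthly series. The following is a minimal sketch, not the authors' code: the function name `taylor_stats` and the sample rainfall values are hypothetical. It assumes population (divide-by-n) statistics and, following Taylor (2001), interprets the plotted distance as the centered RMSE (the RMSE after removing each series' mean), normalized by the gauge standard deviation.

```python
import math

def taylor_stats(gridded, gauge):
    """Return (correlation, STDR, normalized centered RMSE) for two equal-length series."""
    n = len(gauge)
    mean_g = sum(gridded) / n
    mean_o = sum(gauge) / n
    # Population variances and covariance of the two series
    var_g = sum((g - mean_g) ** 2 for g in gridded) / n
    var_o = sum((o - mean_o) ** 2 for o in gauge) / n
    cov = sum((g - mean_g) * (o - mean_o) for g, o in zip(gridded, gauge)) / n
    std_g, std_o = math.sqrt(var_g), math.sqrt(var_o)
    corr = cov / (std_g * std_o)   # polar angle is arccos(corr)
    stdr = std_g / std_o           # radial distance
    # Centered RMSE, normalized by the gauge standard deviation
    sq = sum(((g - mean_g) - (o - mean_o)) ** 2 for g, o in zip(gridded, gauge)) / n
    crmse = math.sqrt(sq) / std_o
    return corr, stdr, crmse

# Hypothetical monthly rainfall values (mm), for illustration only
gauge = [50.0, 80.0, 120.0, 300.0, 650.0, 400.0]
gridded = [60.0, 70.0, 100.0, 180.0, 250.0, 150.0]

corr, stdr, crmse = taylor_stats(gridded, gauge)
# Law of cosines: distance from the reference point (STDR = 1, correlation = 1)
distance = math.sqrt(1.0 + stdr ** 2 - 2.0 * stdr * corr)
```

Here `distance` and `crmse` agree to rounding error, which is the geometric property that makes the diagram readable at a glance.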
Figure 2 shows that GSMaP was negatively correlated with the rain-gauge observations for all eight stations. However, the correlation between APHRODITE and the observations was positive and high, commonly between 0.7 and 0.99. STDR was less than 0.6 at all eight stations for GSMaP but was small at only three stations for APHRODITE. GSMaP thus strongly underestimated the standard deviation; that is, the amplitudes of the rainfall variations in GSMaP at these stations were lower than those observed by the gauges. As seen in Figure 2, APHRODITE represented the in situ observations well, whereas GSMaP had very low performance over the study region. The high performance of APHRODITE arises because the dataset is based on the available rain-gauge observations (Yatagai et al., 2009, 2012). To take advantage of a regularly gridded dataset in the subsequent analysis, APHRODITE was used as a proxy for surface observations in order to adjust GSMaP. To overcome the different resolutions of the data grids, APHRODITE was bilinearly interpolated onto the 0.1° grid of GSMaP. Only the gridded rainfall data over the VuGia-ThuBon River basin were used. The basin boundary was derived from the hydrological data and maps based on shuttle elevation derivatives at multiple scales (HydroSHEDS data) (Lehner et al., 2008).
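Bilinear interpolation from a coarse grid to a finer grid can be sketched as follows. This is an illustrative sketch only: the helper names (`_cell_index`, `bilinear`), the grid coordinates, and the rainfall values are hypothetical, and the paper does not state which software was used for this step. The sketch assumes a regular grid with ascending coordinates and a target point inside it.

```python
def _cell_index(coords, x):
    """Index k of the grid interval [coords[k], coords[k+1]] that contains x."""
    if not coords[0] <= x <= coords[-1]:
        raise ValueError("target point lies outside the source grid")
    for k in range(len(coords) - 2, -1, -1):
        if coords[k] <= x:
            return k

def bilinear(lats, lons, field, lat, lon):
    """Bilinearly interpolate field[i][j], defined on (lats[i], lons[j]), to (lat, lon)."""
    i = _cell_index(lats, lat)
    j = _cell_index(lons, lon)
    # Fractional position of the target inside the enclosing cell
    ty = (lat - lats[i]) / (lats[i + 1] - lats[i])
    tx = (lon - lons[j]) / (lons[j + 1] - lons[j])
    # Interpolate along longitude on both bounding rows, then along latitude
    south = (1.0 - tx) * field[i][j] + tx * field[i][j + 1]
    north = (1.0 - tx) * field[i + 1][j] + tx * field[i + 1][j + 1]
    return (1.0 - ty) * south + ty * north

# Hypothetical 0.25-degree source grid and monthly rainfall (mm)
coarse_lats = [15.0, 15.25, 15.5]
coarse_lons = [108.0, 108.25, 108.5]
coarse_rain = [[100.0, 200.0, 300.0],
               [200.0, 300.0, 400.0],
               [300.0, 400.0, 500.0]]

# Value at one point of a hypothetical 0.1-degree target grid
value = bilinear(coarse_lats, coarse_lons, coarse_rain, 15.1, 108.1)
```

Because the sample field varies linearly in both directions, the interpolated value matches the underlying plane exactly (180 mm at this point), which is a convenient sanity check for any implementation.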
Figure 3 shows the 2001-2007 monthly rainfall averaged over the VuGia-ThuBon River basin for GSMaP, APHRODITE, and GSMaP minus APHRODITE (hereinafter denoted by ΔP). As seen in Figure 3a, GSMaP had large negative rainfall biases for the winter monsoon period (September to April), but reproduced the observed precipitation fairly well for other months. Particularly large negative biases were found during October, November, and December. As seen in Figure 3b, the performance of GSMaP over the basin was closely related (except for June, July, and August) to the local 925-hPa zonal wind speed from the National Centers for Environmental Prediction (NCEP) Reanalysis 2 (Kanamitsu et al., 2002). Note that the zonal wind speed is negative when the wind blows from east to west and positive when it blows from west to east.
Figure 4 indicates a definite linear relationship between ΔP and the elevation at each basin grid point from August to December. In August, when the summer monsoon is active and the VuGia-ThuBon River basin is on the leeward side of the TruongSon Mountain range (see Figure 3b), ΔP tended to increase as the elevation decreased. From September onward, when the winter monsoon is active, |ΔP| decreased as the elevation increased; i.e., GSMaP had larger biases in the downstream basin near the coast. The different slopes for different months seen in Figure 4 suggest that ΔP depends not only on elevation but also on monsoon activity. Note that the flooding season in the VuGia-ThuBon River basin occurs mainly from October to December; therefore, obtaining correct rainfall information over the basin is particularly important during the winter monsoon period.
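A linear relationship between ΔP and elevation of the kind described here can be quantified with an ordinary least-squares fit for each month. The sketch below is illustrative only: the function name `linear_fit` is hypothetical, and the elevation and ΔP values are invented and constructed to lie exactly on a line. They are not taken from Figure 4.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical grid-point elevations (m) and biases (mm) for one winter monsoon month
elevation_m = [10.0, 100.0, 300.0, 600.0, 1000.0]
delta_p_mm = [-497.0, -470.0, -410.0, -320.0, -200.0]

slope, intercept = linear_fit(elevation_m, delta_p_mm)
```

With negative ΔP, a positive slope means that |ΔP| shrinks with height, which is the winter monsoon behavior described above. Fitting each month separately would expose the month-to-month change in slope.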

Description of the methodology
To correct the large biases of GSMaP, we constructed an artificial neural network (ANN) model based on the feed-forward multilayer structure with back-propagation algorithm (Rivolta et al., 2006). The ANN model involves one input layer, one hidden layer, and one output layer. The neurons of the input layer feed an input signal to the neurons of the hidden layer, and then the signal proceeds to the output layer. The GSMaP input signal consisted of monthly rainfall at each basin grid point (83 grid points in total). The APHRODITE data were interpolated onto the GSMaP grid over the VuGia-ThuBon River basin and considered a proxy for the observations in the ANN training phase. The output $o_i^n$ of neuron $i$ in layer $n$ was estimated as

$$o_i^n = f\left(\sum_{j=1}^{N_{n-1}} w_{ij}^n \, o_j^{n-1}\right),$$

where $f$ is a log-sigmoid activation function, $w_{ij}^n$ are the synaptic weights connecting the input of the $i$th neuron in layer $n$ to the output of the $j$th neuron in the preceding layer $n-1$, and $N_{n-1}$ is the number of neurons in layer $n-1$.
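The forward pass through such a network (a weighted sum of the previous layer's outputs passed through a log-sigmoid function) can be sketched as follows. This is a minimal sketch, not the authors' implementation: the function names, layer sizes, and weights are hypothetical, no bias terms are included because the text does not mention them, and the inputs are assumed to have been scaled to the interval [0, 1].

```python
import math

def logsig(x):
    """Log-sigmoid activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_output(weights, prev_outputs):
    """Output of each neuron i: logsig of the sum over j of weights[i][j] * prev_outputs[j]."""
    return [logsig(sum(w * o for w, o in zip(row, prev_outputs))) for row in weights]

def forward(hidden_weights, output_weights, inputs):
    """Propagate the input signal through the hidden layer to the output layer."""
    hidden = layer_output(hidden_weights, inputs)
    return layer_output(output_weights, hidden)

# Hypothetical toy network: 3 inputs, 2 hidden neurons, 3 outputs
inputs = [0.2, 0.5, 0.9]                          # scaled rainfall (assumed)
hidden_weights = [[0.1, -0.4, 0.3],
                  [0.7, 0.2, -0.5]]
output_weights = [[0.6, -0.1],
                  [-0.3, 0.8],
                  [0.4, 0.4]]

outputs = forward(hidden_weights, output_weights, inputs)

# With all-zero weights every neuron outputs logsig(0) = 0.5
zero_outputs = forward([[0.0] * 3] * 2, [[0.0] * 2] * 3, inputs)
```

Because the log-sigmoid output lies strictly between 0 and 1, target rainfall values would need to be rescaled into that range before training and mapped back afterward. That rescaling is an assumption of this sketch; the paper does not describe it.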
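The ANN model was trained in order to minimize the error function by the steepest descent back-propagation algorithm. During the training phase, the synaptic weights $w_{ij}^n$ were updated as follows:

$$w_{ij}^n(t+1) = w_{ij}^n(t) - \alpha \frac{\partial E}{\partial w_{ij}^n}(t) + \beta \left[ w_{ij}^n(t) - w_{ij}^n(t-1) \right],$$

where $\alpha$ is the learning rate, $\beta$ is the momentum parameter, $E$ is the error function, and $t$ is the training step.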
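A steepest-descent update with a momentum term can be illustrated on a single weight. The sketch below is illustrative only: the function name `momentum_step`, the quadratic error function, and the values chosen for the learning rate and momentum parameter are hypothetical, since the paper does not report the settings used.

```python
def momentum_step(weight, grad, prev_delta, alpha, beta):
    """One update: step against the gradient plus a fraction of the previous step."""
    delta = -alpha * grad + beta * prev_delta
    return weight + delta, delta

# Toy error function E(w) = (w - 3)^2 with gradient dE/dw = 2 * (w - 3)
alpha, beta = 0.1, 0.5        # hypothetical learning rate and momentum parameter
w, prev_delta = 0.0, 0.0
for _ in range(200):
    grad = 2.0 * (w - 3.0)
    w, prev_delta = momentum_step(w, grad, prev_delta, alpha, beta)
```

After the loop, `w` has converged to the minimizer of the toy error function (w = 3). In a full back-propagation scheme the same rule is applied to every synaptic weight, with the gradients obtained by propagating the output error backward through the layers.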
In this study, the synaptic weights were estimated for each month for the entire basin area. The training period was 2001-2005, while the independent testing period was 2006-2007.
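Organizing the data by calendar month and by dependent or independent period can be sketched as follows. This is only an illustration of the bookkeeping, under the assumption that one set of weights is fitted per calendar month: the function name `split_by_month` and the synthetic sample values are hypothetical.

```python
def split_by_month(samples, train_years, test_years):
    """Group (year, month, value) samples by calendar month for training and testing."""
    train = {m: [] for m in range(1, 13)}
    test = {m: [] for m in range(1, 13)}
    for year, month, value in samples:
        if year in train_years:
            train[month].append(value)
        elif year in test_years:
            test[month].append(value)
    return train, test

# Synthetic basin-mean monthly values for 2001-2007, for illustration only
samples = [(y, m, float(10 * m + (y - 2001)))
           for y in range(2001, 2008) for m in range(1, 13)]

train, test = split_by_month(samples, range(2001, 2006), range(2006, 2008))
```

Each calendar month then has five training samples (2001-2005) and two independent samples (2006-2007) per grid point, which shows how little data is available for each monthly model.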

Results from the ANN correction
Figure 5 shows that the rainfall data corrected by the ANN method (hereinafter simply called ANN) represented an improvement over GSMaP in terms of the spatial distribution for both the dependent (2001-2005) and independent (2006-2007) periods. As Nguyen-Thi et al. (2012) showed, rainfall over Central Vietnam is concentrated during September, October, and November (SON). Figure 5 thus focuses on the SON months for reasons of brevity. Both APHRODITE and ANN revealed that the downstream region of the VuGia-ThuBon River basin had more rainfall than the upstream region, which was not clearly represented by GSMaP. For the dependent period 2001-2005, the ANN SON rainfall was generally less than its APHRODITE counterpart. The APHRODITE SON rainfall showed that 2006 was relatively dry compared to the 2001-2007 average, so the ANN method overestimated the APHRODITE values. In contrast, the ANN values were small compared to APHRODITE values for a wet period such as SON 2007.
Figure 6 displays the correlation of the monthly APHRODITE with the GSMaP and ANN products over the VuGia-ThuBon River basin. For the dependent period 2001-2005, the monthly ANN exhibited good correlations of 0.75-0.9. The improvement in the correlation of the adjusted data over that of the unadjusted data (GSMaP) was particularly marked in the downstream region. The independent test for 2006-2007 showed that the correlations between ANN and APHRODITE ranged from 0.7 to 0.9, which are higher than those between GSMaP and APHRODITE.
As mentioned above, APHRODITE is considered a proxy for surface observations. The ANN correction method applied to GSMaP has been shown to bring the satellite rainfall data closer to the gauge-based gridded observations. To consolidate the results, an additional validation of the ANN method was performed by comparing the gridded datasets with the rain-gauge observations at the TraMy and DaNang stations, which are located within the basin. The differences between the datasets and the rain-gauge observations can be quantitatively expressed using the Nash-Sutcliffe efficiency coefficient (Nash and Sutcliffe, 1970), which is defined as follows:

$$E = 1 - \frac{\sum_{t=1}^{T} \left( P_o^t - P_g^t \right)^2}{\sum_{t=1}^{T} \left( P_o^t - \overline{P_o} \right)^2},$$

where $P_o$ is the gauge rainfall, $P_g$ is the gridded rainfall (GSMaP, APHRODITE, or ANN), $P^t$ is the rainfall amount at time $t$, and $\overline{P_o}$ is the observed mean rainfall over the specified period $[1, T]$.
By definition, Nash-Sutcliffe efficiencies can range from −∞ to 1. The closer the efficiency is to unity, the more accurate the gridded rainfall at the station. An efficiency of zero indicates that the gridded data are as accurate as the mean of the gauge rainfall, whereas an efficiency less than zero indicates that the mean of the gauge rainfall is a better predictor than the gridded data. The efficiency coefficient of GSMaP was the lowest among those of the datasets for both stations (Table II). For the dependent period 2001-2005, the mean of the gauge rainfall was a better predictor than GSMaP (E = −0.13) at the TraMy station. The ANN correction method clearly improves the quality of rainfall data compared to the GSMaP product for both the dependent and independent periods, particularly at the downstream DaNang station. Table II also shows that the gridded datasets were more efficient (i.e., the Nash-Sutcliffe efficiency is closer to unity) at the coastal DaNang station than at the upstream TraMy station.
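The Nash-Sutcliffe efficiency and its three reference cases (a perfect match, a constant equal to the gauge mean, and a prediction worse than the mean) can be checked with a few lines of code. This is a minimal sketch: the function name `nash_sutcliffe` and the rainfall values are hypothetical and are not data from Table II.

```python
def nash_sutcliffe(gauge, gridded):
    """Nash-Sutcliffe efficiency of a gridded series against gauge observations."""
    mean_obs = sum(gauge) / len(gauge)
    # Squared error of the gridded series against the gauge
    err = sum((o - g) ** 2 for o, g in zip(gauge, gridded))
    # Squared deviation of the gauge from its own mean
    spread = sum((o - mean_obs) ** 2 for o in gauge)
    return 1.0 - err / spread

# Hypothetical monthly gauge rainfall (mm); its mean is 300 mm
gauge = [100.0, 200.0, 600.0, 300.0]

e_perfect = nash_sutcliffe(gauge, gauge)                        # identical series
e_mean = nash_sutcliffe(gauge, [300.0] * 4)                     # constant gauge mean
e_poor = nash_sutcliffe(gauge, [600.0, 300.0, 100.0, 200.0])    # anti-phased series
```

The three results are 1, 0, and a negative value, respectively, matching the interpretation given above.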

CONCLUSIONS
This study has shown that the rainfall data of GSMaP are negatively correlated with, and have much lower standard deviations than, those observed at the eight meteorological stations in central Vietnam. Over the VuGia-ThuBon River basin, GSMaP has large negative rainfall biases, particularly from October to December. There was also a definite linear relationship between the GSMaP rainfall biases and the elevation at each basin grid point from August to December. During the winter monsoon period, GSMaP biases decreased as the elevation increased; i.e., the data have larger biases in the downstream basin near the coast. The large differences suggest that corrections should be applied before the data are used for any local-scale applications.
The ANN model based on the feed-forward multilayer structure with back-propagation algorithm was successfully applied in this study to correct GSMaP over the VuGia-ThuBon River basin in central Vietnam. The ANN correction method significantly improved GSMaP in terms of spatial correlation, rainfall amplitude, and Nash-Sutcliffe efficiency coefficient for both the dependent period 2001-2005 and the independent period 2006-2007.
The current ANN method is based on the assumption that APHRODITE is as good as the real observations made at the surface, although large differences between APHRODITE and the gauge data remain (e.g., the entries for TraMy station in Table II). A denser rain-gauge network would thus be very helpful for making better corrections. A strategy for using the ANN method to adjust the data on an hourly time scale and applying the corrected GSMaP to water resource management will be our next challenge.

Figure 1. Topography of the VuGia-ThuBon River basin (red boundary) and surrounding areas. Locations of the eight meteorological stations used in the study are indicated by red circles.

Table I. List of the meteorological stations within the VuGia-ThuBon River basin and surrounding areas. The TraMy and DaNang stations are located within the basin.

Table II. The Nash-Sutcliffe efficiency coefficients between APHRODITE, GSMaP, or ANN and the rain-gauge observations at the TraMy and DaNang stations for the dependent period 2001-2005 and the independent period 2006-2007.