Performance evaluation of Global Satellite Mapping of Precipitation (GSMaP) products over the Chaophraya River basin, Thailand

GSMaP_NRT (Near Real Time) is a viable source of satellite-based precipitation data for further analysis. Its usefulness is most evident in areas where continuous precipitation data are vital, which is why GSMaP_NRT performance has been evaluated globally. In this study, we evaluate its performance over the Chaophraya River basin during 2009-2010 in terms of 1) rainfall detection capability, based on Probability of Detection (POD) and False Alarm Ratio (FAR), and 2) estimation capability, based on correlation coefficient and Root Mean Square Error (RMSE). The non-realtime GSMaP_MVK (Moving Vector with Kalman filter) product is also evaluated. Our results show that, at daily scale, both GSMaP_NRT and GSMaP_MVK perform well in the rainy season (POD and FAR can reach 0.75/0.94 and 0.45/0.49, respectively, for GSMaP_NRT/GSMaP_MVK) with an acceptable RMSE of 14.64/14.23 mm. GSMaP_NRT tends to underestimate whereas GSMaP_MVK slightly overestimates the rain rates, with correlation coefficients of 0.70 and 0.75, respectively. We conclude that GSMaP_NRT is good but not sufficient for near-realtime rainfall monitoring applications, whereas GSMaP_MVK is suitable for climate change studies.


INTRODUCTION
Global climate change effects have been observed in Thailand in recent years through more frequent and more severe flood and drought cycles. These events are considered rain/no-rain triggered disasters. With precipitation data from reliable rainfall monitoring systems, losses and damages could be mitigated. Several near-global precipitation products based on satellite observations have been developed worldwide, such as the Climate Prediction Center MORPHing technique, CMORPH (Joyce et al., 2004), Precipitation Estimation from Remote Sensing Information using Artificial Neural Network, PERSIANN (Hsu et al., 1997), and Global Satellite Mapping of Precipitation, GSMaP (Ushio et al., 2009). For monitoring purposes, satellite precipitation products with high spatial and temporal resolution are of more interest. Most of them combine data from passive microwave (PM) and thermal infrared (TIR) sensors. Among these products is GSMaP, which offers several versions of precipitation products for various applications. However, since rainfall estimation models usually vary by climatic region, a thorough accuracy assessment should be carried out before applying these products for regional use. Although GSMaP_NRT products have been evaluated globally (Dinku et al., 2010), the validation of GSMaP_NRT in Thailand has only just started (Suwanprasert et al., 2011); that study showed that GSMaP_NRT overestimates the number of rainfall occurrences at daily time scale. However, for GSMaP_NRT to be used effectively for monitoring purposes, validation of the hourly rain rate is also necessary.
In this paper, two GSMaP precipitation products, GSMaP_NRT and GSMaP_MVK, are validated against rain gauge data at various time scales (hourly, daily, and monthly). For GSMaP_MVK, dataset v5.222 was used (gsmap_mvk.20101101.0000.v5.222.dat.gz), which covers March 2000 to November 2010. The former is a near-realtime product for monitoring purposes, whereas the latter is a more complicated, non-realtime product for climate change studies, e.g., of drought and flood. The accuracy assessment was performed over a two-year period from 2009 to 2010.

STUDY AREA AND DATA SOURCES
Our study area is the Chaophraya River basin, which is the most fertile paddy field area of Thailand. It is located between 13.5°-16.1°N in latitude and 99.5°-101°E in longitude, with an area of 20,125 km². The basin is regarded as a principal waterway and is heavily affected by major floods and droughts. The rain gauge data used in this study were collected from 70 stations (locations shown in Figure 1) during 2009-2010 and were provided by the Thailand Meteorological Department (TMD).
Among the six GSMaP products, two versions of hourly GSMaP datasets were evaluated using rain gauge data. They use TIR-derived motion vectors to propagate PM estimates in time and space using the Kalman filter approach, resulting in 0.1 degree spatial resolution and 1-hour temporal resolution. GSMaP_NRT is the near-realtime version with 4-hour data latency; it contains only the propagation process forward in time. GSMaP_MVK, the standard version, contains both forward and backward propagation processes to provide more accuracy. Their differences are summarized in Table I. Both datasets were downloaded for the same period as the rain gauge data.
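Because the comparison is made per 0.1 degree pixel, each gauge has to be assigned to a grid cell. The following is a minimal sketch of that lookup, not the procedure used in this study. The function name `grid_index` and the grid origin (rows starting at 60°N and running south, columns starting at 0°E and running east) are assumptions made for illustration and should be checked against the product documentation.

```python
import math


def grid_index(lat, lon, lat_top=60.0, lon_left=0.0, cell=0.1):
    """Return (row, col) of the grid cell that contains a point.

    Assumed layout (not stated in the text): row 0 starts at lat_top and
    rows run north to south; column 0 starts at lon_left and columns run
    west to east. The cell size of 0.1 degree follows the text.
    """
    row = int(math.floor((lat_top - lat) / cell))
    # Wrap longitude into [0, 360) relative to the assumed western edge.
    col = int(math.floor(((lon - lon_left) % 360.0) / cell))
    return row, col
```

Two gauges that return the same `(row, col)` pair fall in the same pixel and would be compared with the same satellite estimate.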

METHODOLOGY
This study evaluates GSMaP performance in two aspects: rainfall detection capability, based on rain/no-rain events, and rainfall estimation capability, based on the correlation coefficient and root mean square error (RMSE) of rain rates. Rainfall detection evaluation starts by defining four scenarios that express the relationship between rain/no-rain events detected by GSMaP_NRT and GSMaP_MVK and those observed by rain gauges, as shown in Table II. From these scenarios, three performance indices are derived: accuracy (ACC), probability of detection (POD), and false alarm ratio (FAR). Their expressions and descriptions are shown in Table III.
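The detection indices can be sketched in a few lines of code. The example below uses the standard contingency-table definitions, which match the descriptions given for Table III: ACC is the share of correct detections among all events, POD is the share of observed rain events that were detected, and FAR is the share of detected rain events that were wrong. The function names and the rain/no-rain threshold of 0 mm are assumptions for illustration; the threshold used in this study is not stated in the text.

```python
def contingency_counts(satellite_mm, gauge_mm, threshold=0.0):
    """Count hits, misses, false alarms, and correct negatives.

    A value above `threshold` (mm) is treated as rain. The threshold
    is an assumed parameter, not taken from the paper.
    """
    hits = misses = false_alarms = correct_negatives = 0
    for sat, obs in zip(satellite_mm, gauge_mm):
        sat_rain = sat > threshold
        obs_rain = obs > threshold
        if sat_rain and obs_rain:
            hits += 1
        elif obs_rain:
            misses += 1
        elif sat_rain:
            false_alarms += 1
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives


def detection_indices(hits, misses, false_alarms, correct_negatives):
    """Return ACC, POD, and FAR; an undefined ratio is reported as NaN."""
    nan = float("nan")
    total = hits + misses + false_alarms + correct_negatives
    observed_rain = hits + misses
    detected_rain = hits + false_alarms
    return {
        "ACC": (hits + correct_negatives) / total if total else nan,
        "POD": hits / observed_rain if observed_rain else nan,
        "FAR": false_alarms / detected_rain if detected_rain else nan,
    }
```

For example, `detection_indices(*contingency_counts(sat, gauge))` would give the three indices for one gauge and its matching pixel over any period of paired values.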
Rainfall estimate evaluation is based on a correlation and RMSE between rain rates estimated by GSMaP_NRT and GSMaP_MVK and observed by rain gauges located in

RESULTS AND DISCUSSION
Figure 2 shows the monthly rain scatter plots of rain gauge data versus GSMaP_NRT and GSMaP_MVK, whose correlation coefficients are 0.70 and 0.75, respectively. From the zero-crossing trend line, GSMaP_MVK is more likely to detect rainfall than GSMaP_NRT. This is due to the higher POD and FAR of GSMaP_MVK, as will be discussed later.
Figure 3 (a)-(c) elaborates on the daily rainfall detection performance. Both GSMaP_NRT and GSMaP_MVK share similar trends. In the rainy season, more rain events are observed and detected; hence, hits increase significantly whereas false alarms increase at lower rates. Table IV illustrates that GSMaP_MVK detects more rain events, which leads to more hits and more false alarms compared to GSMaP_NRT. This results in higher POD (>0.7 for GSMaP_NRT and >0.9 for GSMaP_MVK) and lower ACC (<0.55 for GSMaP_NRT and <0.5 for GSMaP_MVK). FAR also drops to nearly 0.3 for both due to the increased hits. In general, GSMaP_MVK yields better rainfall detection performance, confirming its effectiveness.
Figure 3 (d) shows the daily average RMSE, which increases to more than 10 mm in the rainy season for both GSMaP_NRT and GSMaP_MVK. This is because correct negatives turn into hits, false alarms, and misses, as depicted in Table IV. Note that GSMaP_MVK achieves noticeably lower RMSE in most cases except during Sep-Oct 2010, possibly due to the heavy rainfall that caused flooding in October 2010.
Table V shows the daily statistics derived from the occurrences in Table IV. The more complex algorithm of GSMaP_MVK results in more hits and more false alarms, which becomes more evident during the rainy season. If we consider each pair of scenarios, namely Hit/Miss and False alarm/Correct neg., we can deduce that GSMaP_MVK increases one and reduces the other by almost the same percentage. The first pair improves rainfall detection capability and therefore yields better POD. Nevertheless, ACC slightly decreases as the second pair accounts for more [...] VI. GSMaP_MVK still behaves similarly to the daily statistics, especially its POD, which is almost double that of GSMaP_NRT. For applications requiring near-realtime monitoring, such as floods and landslides, GSMaP_MVK is considered a good candidate if its data can be made available in realtime, whereas GSMaP_NRT alone is not sufficient for this purpose.

Table III. Description of ACC, POD, and FAR
Index    Implication
ACC    The fraction of correct detections to the number of overall events
POD    The ratio of correct rain detections to the number of rain events observed
FAR    The fraction of rain detections that turn out to be wrong
Table VIII compares our results with those of Suwanprasert et al. (2011). Evidently, using a different number of rain gauges can produce quite different results. Using Inverse Distance Weighting (IDW) interpolation with only 17 gauge stations may be insufficient to represent rainfall observation over the basin. On one hand, rain events that could have been observed with rain gauges were incorrectly interpolated as no-rain events, so hits were turned into false alarms. This explains the lower ACC and POD, and the much higher FAR (even in the rainy season). The extremely low correlation coefficient implies that GSMaP_NRT is not suitable for hourly rainfall estimation (i.e., it is insufficient for near-realtime applications).
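To make the role of IDW concrete, the following is a minimal sketch of the interpolation in its common textbook form. It is not the configuration used by Suwanprasert et al. (2011), whose settings are not given here. The function name `idw_estimate`, the power of 2, and the use of plain Euclidean distance in degrees are all assumptions for illustration.

```python
import math


def idw_estimate(target, stations, power=2.0):
    """Inverse Distance Weighting estimate at `target` = (lat, lon).

    `stations` is a list of ((lat, lon), rain_mm) pairs. Distances are
    Euclidean in degrees and `power` defaults to 2; both are assumed
    simplifications, not settings taken from the cited study.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (lat, lon), rain_mm in stations:
        distance = math.hypot(lat - target[0], lon - target[1])
        if distance == 0.0:
            # The target coincides with a station: use its value directly.
            return rain_mm
        weight = 1.0 / distance ** power
        weighted_sum += weight * rain_mm
        weight_total += weight
    return weighted_sum / weight_total
```

With few stations, the estimate at a pixel is driven by gauges that may be far away, so a localized shower that falls between stations does not appear in the interpolated field.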

CONCLUSIONS
This work evaluates the performance of GSMaP_NRT and GSMaP_MVK using rain gauge data over the Chaophraya River basin at various temporal scales. From the monthly rain scatter plots, GSMaP_NRT underestimates whereas GSMaP_MVK slightly overestimates rainfall, with correlation coefficients of 0.70 and 0.75, respectively. From the daily rain statistics, both GSMaP_NRT and GSMaP_MVK perform better in terms of POD and FAR during the rainy season. RMSE is higher as a result of fewer correct negatives. However, GSMaP_MVK outperforms GSMaP_NRT in terms of POD and RMSE because of its more complex algorithm. This advantage becomes more evident in the hourly rain statistics, where POD almost doubles (up to 0.78). Although GSMaP_MVK performance is suitable for near-realtime applications such as flood monitoring, its computation latency is prohibitive. Alternatively, GSMaP_NRT can be used but is insufficient for this purpose.

Figure 2. Monthly rain scatter plots of rain gauge versus GSMaP_NRT (left) and GSMaP_MVK (right), indicating that GSMaP_NRT underestimates whereas GSMaP_MVK overestimates rainfall

Table II. Scenarios description

Table I. Differences between GSMaP_NRT and GSMaP_MVK

Figure 1. Thailand Meteorological Department rain gauge network consisting of 70 stations in the study area

the same 0.1 deg grid interval (or pixel). A monthly rain scatter plot (1 point/gauge/month) is used to find the correlation coefficient between rain gauge data and the GSMaP_NRT and GSMaP_MVK products. At daily scale, the analysis has two parts. First, performance indices grouped by month are plotted to show how they behave over the years. Second, performance indices are divided into four cases: 2009, 2010, 2009_rain, and 2010_rain (the last two cases focus on the rainy season from May to October) to show how they behave during the rainy season compared to the whole year. At hourly scale, performance indices based on the same four cases as at daily scale are shown. These finer-grained indices are important indicators for near-realtime applications.
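The estimation measures described above can be sketched as follows. This is an illustration only: it assumes the correlation coefficient is the Pearson coefficient, which the text does not state explicitly, and the function names are hypothetical. The rainy-season filter follows the May to October definition given above.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rmse(estimated, observed):
    """Root mean square error of estimates against observations."""
    n = len(observed)
    squared = sum((e - o) ** 2 for e, o in zip(estimated, observed))
    return math.sqrt(squared / n)


def is_rainy_season(month):
    """True for May (5) through October (10), as defined in the text."""
    return 5 <= month <= 10
```

Applying `rmse` to daily pairs filtered with `is_rainy_season` would correspond to the 2009_rain and 2010_rain cases, while `pearson_r` on monthly totals (one point per gauge per month) would correspond to the scatter plot analysis.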

Table IV. Percentage of scenarios based on daily rain

Table V. Performance indices based on daily rain