Tropical cyclone size identification over the Western North Pacific using support vector machine and general regression neural network

Knowledge about tropical cyclone (TC) size is essential for disaster prevention and mitigation strategies, but due to the limitations of observations, TC size data from the open ocean are scarce. In this paper, several models are developed to identify TC size parameters, including the radius of maximum wind (RMW) and the radii of 34 (R34), 50 (R50), and 64 (R64) knot winds, using various machine learning algorithms based on infrared channel imagery of geostationary meteorological satellites over the Western North Pacific (WNP). Through evaluation and verification, the trained and optimized support vector machine models are proposed for RMW and R34, while the general regression neural network models are set up for R50 and R64. According to the independent-sample evaluations aircraft Joint Center best mean absolute errors of R34, R50, R64, and RMW are / 34 / 38, N/A / 21, 25 / 25 km, are 39 / 46, 34 / 31, N/A / 17, and 17 / 19 km, is an overall slight underestimation of the parameters, which needs to be analyzed and improved in future study. aircraft observations of TCs in the WNP ceased new dataset of TC sizes a thorough estimation of wind

methodology are unclear. Various approaches have been employed to investigate TC size, including using synoptic charts (Brand 1972; Merrill 1984 (radius of the TC eye and RMW) using satellite cloud images, radar, and aircraft observations. Compared with aircraft observations, the MAE of the RMW was 2.8 km, which is better than that of Kossin et al. (2007). The sample size in the above studies was relatively small, and the estimation method involved utilizing multi-platform observations (Kossin et al., 2007), including satellite IR imagery, radar, and aircraft observations. Therefore, the method is not easily applicable in operational use, especially for some agencies that find it difficult to obtain multi-platform observations in real time.

Knaff et al. (2011, 2014, 2016) successively developed a TC surface wind

This paper establishes the nonlinear models between observations obtained from geostationary meteorological satellites and TC size using ML. We carry out an objective TC size estimation and construct a TC size climate dataset with fine structural characteristics in the WNP. Section 2 introduces the data, whilst the ML methods and TC size estimation tests are discussed in Section 3. The construction and validation of the TC size dataset are illustrated in Section 4. A summary and conclusions will be given in Section 5.

In this study, the R34, R50, and R64 in the northeast, southeast, southwest,

The five machine learning algorithms are given in Table 1.

Previous studies have shown that TC intensity, wind structure, and TC  is also regarded as an additional input to the R50 and R64 estimation models.

The test results are shown in Fig. 2 and Table 2. There is little difference between the estimation errors of different methods as the input BTP radius moves from the inner core (10 grid points from the TC center) to the outer edge (80 grid points from the TC center). The estimation errors decrease and then increase with R for both the R50 (Fig. 2a) and R64 (Fig. 2b). The mean in the best estimation of the R50 and R64. Therefore, 20 grid points is chosen as the optimal model input. Table 2 shows that the GRNN algorithm performs

the MAE levels out and then begins to increase. We find that there is a minimum estimated MAE for both the R50 and R64 when the bandwidth is set to 9.8 and 23.8 in the GRNN models, respectively. We note that all of the above models

(Fig. 4), with a correlation coefficient of 0.39, which is statistically significant at the 95% confidence level (T-test was used for all tests of statistical significance).
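The bandwidth search described for the GRNN models can be illustrated with a minimal sketch. It assumes the standard GRNN formulation, in which a prediction is a Gaussian-kernel weighted mean of the training targets and the bandwidth is the kernel spread. The function names, the toy data, and the candidate bandwidths are all hypothetical and are not taken from the paper; the actual inputs in this study are satellite-derived predictors whose scaling is not reproduced here.

```python
import math


def grnn_predict(train_x, train_y, query, bandwidth):
    """Predict one target as a Gaussian-kernel weighted mean of training targets."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y in zip(train_x, train_y):
        # Squared Euclidean distance between the query and one training sample.
        dist_sq = sum((a - b) ** 2 for a, b in zip(x, query))
        weight = math.exp(-dist_sq / (2.0 * bandwidth ** 2))
        weighted_sum += weight * y
        weight_total += weight
    if weight_total == 0.0:
        # All kernel weights underflowed; fall back to the plain training mean.
        return sum(train_y) / len(train_y)
    return weighted_sum / weight_total


def validation_mae(train_x, train_y, val_x, val_y, bandwidth):
    """Mean absolute error of GRNN predictions on a held-out validation set."""
    errors = [
        abs(grnn_predict(train_x, train_y, q, bandwidth) - target)
        for q, target in zip(val_x, val_y)
    ]
    return sum(errors) / len(errors)


def select_bandwidth(train_x, train_y, val_x, val_y, candidates):
    """Return the candidate bandwidth with the smallest validation MAE."""
    return min(
        candidates,
        key=lambda s: validation_mae(train_x, train_y, val_x, val_y, s),
    )


# Hypothetical toy data (one predictor, linear target), for illustration only.
toy_train_x = [[0.0], [1.0], [2.0], [3.0]]
toy_train_y = [0.0, 1.0, 2.0, 3.0]
toy_val_x = [[1.0], [2.0]]
toy_val_y = [1.0, 2.0]
best = select_bandwidth(
    toy_train_x, toy_train_y, toy_val_x, toy_val_y, [0.1, 1.0, 10.0]
)
```

Scanning a grid of candidate bandwidths and keeping the one with the lowest held-out MAE mirrors the kind of search that would produce a curve with a single minimum, but the grid and validation split used in the study may differ from this sketch.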

The blue ellipse in Fig. 4, which is the 95% confidence interval based on a normal distribution, contains most of the samples. There are few outliers (red crosses). The figure shows that the estimated R34 is consistent with that from the JTWC best track data. However, the centroid of the data is slightly lower than the fitting line, indicating that the overall estimated values of R34 are slightly smaller than the best track data; i.e., R34 is slightly underestimated.

The error bars of R34 estimation for different months (Fig. 7) are variable:

The spatial distribution of estimation bias of R34 (Fig. 8)

Nevertheless, the estimation errors of this study are still smaller than those from operational wind radii estimates, which can be as large as 25%-40% of

The estimated median error was 40 km, which is slightly larger than the value in this study (39 km, compared with aircraft observations). However, in this study, more TC size parameters are estimated and much more detailed information about the TC wind structure is provided, including the R34, R50, and R64 in four quadrants, as well as the RMW. Moreover, the ML algorithm used in this study may be able to reveal the nonlinear relationship between

The datasets generated and/or analyzed in this study are available from the corresponding author on reasonable request.

The data supporting the findings of this study are available from National

The figure illustrations are the same as in Fig. 1.