Global high-resolution estimation of cropland suitability and its comparative analysis to actual cropland distribution

: Global 500-m (18-arcsec) resolution cropland suitability was estimated using recently developed high-resolution global cropland data and a digital elevation model. The high-resolution estimation more precisely represented topo‐ graphical constraints to agriculture that were not adequately reflected in previous low-resolution estimates. It also suc‐ cessfully suppressed the overestimation of cropland suit‐ ability on areas with steep slopes. Furthermore, the distinc‐ tion of rainfed and irrigated cropland removed suitability overestimation induced by human agricultural intervention and enabled more natural and realistic estimation of crop‐ land suitability. The comparative analysis between the esti‐ mated land suitability and actual cropland distribution revealed that, if only natural condition is considered, it is possible to expand cropland area by 9.25 million km 2 , which is more than needed in the future while socio-economic factors controlling cropland suitability should be considered for more practical assessment. The newly developed highest-resolution cropland suitability map is expected to contribute to solving upcoming water, energy, and food issues, by integrating with water resource models and biomass studies.


INTRODUCTION
Today, we are aware of the importance of the waterenergy-food security nexus (Taniguchi et al., 2017). To predict the future water use demand, it is required to estimate the growing demand for food and the subsequent increase of cropland expansion. In particular, the agricultural sector accounts for a large part (70%) of annual 3600 km 3 human water use (Food and Agricultural Organization, 2003), and in the past 40 years from 1960 to 2000, the water demand increased by 800 km 3 in accordance with the 40 million km 2 expansion of cropland (Shiklomanov, 2000;FAO, 2020). Moreover, compared with the cropland areas in 2005, an additional 10~25% of cropland is predicted to be needed by 2050 in accordance with different climate change and socio-economic scenario (Schmitz et al., 2014). Therefore, the increase of crop production is inevitable in the future. While future yield growth will play a substantial role, it is of great importance to understand the possibility of future cropland expansion when considering sustainable water resource management.
Natural conditions required for agriculture comprise climate, soil, and topography (Baker and Capel, 2011). There have been several studies about global cropland suitability based on these factors. The concept of cropland suitability was introduced by FAO (1976), and it is defined as "the fitness of a given type of land for an agricultural use." One of the common methods to estimate cropland suitability is the Agro-Ecological Zoning method introduced by Fischer et al. (2002). Climatic suitability is first estimated based on crop thermal and water requirements, and then the overall suitability is comprehensively evaluated by considering soil and topographic constraints. However, as the method is a comprehensive assessment, it is difficult to identify which factor constrains agriculture, and the analysis is limited in qualitative evaluation.
Another type of the method was proposed by Ramankutty et al. (2002), that quantitatively evaluated cropland suitability on a global scale as the product of two normalized functions (ranging from 0 to 1) of climate and soil conditions. This quantification made it easy to analyze the suitability from the perspective of each individual condition. Moreover, each evaluation function was estimated through the comparison between climate and soil data and actual cropland distribution, and this enabled a more realistic representation of how much limitation each condition imposed on crops. Thus, the method of Ramankutty et al. (2002) is favorable for the comparative analysis to the actual cropland, which is crucial but has hardly been discussed even in recent studies (Fischer et al., 2012;Zabel et al., 2014). The method by Ramankutty et al. (2002), however, had the following problems: topography was not considered as a natural constraint to agriculture; and the spatial resolution of their estimate was 0.5 degrees, which is not fine enough to be utilized for regional analysis.
Satellite observation and remote sensing capabilities have recently increased remarkably, and higher resolution satellite data, such as Landsat-8 and Sentinel-2, are now available on a global scale. A considerable improvement in computational capacity has enabled faster processing of these massive high-resolution data in recent years (Gorelick et al., 2017). Given this technological background, highresolution topographic (Yamazaki et al., 2017) and cropland (Thenkabail et al., 2016;Teluguntla et al., 2017) data are highly developed nowadays, which has made it possible to estimate global cropland suitability with comparison to actual cropland distribution.
In this study, we revised the method of Ramankutty et al. (2002) and estimated cropland suitability on a global scale at the highest ever resolution of 500-m (18-arcsec). With reference to the newly calculated cropland suitability map, the effect of topographic condition and the superiority of high-resolution estimates were discussed. Cropland suitability with the distinction of rainfed and irrigation cropland was further calculated and compared to cropland suitability assessment without distinction. In addition, the future cropland expansion potential was investigated through comparison between the estimated cropland suitability and actual cropland distribution.

Dataset
Recently developed high-resolution global cropland data GFSAD30 (Teluguntla et al., 2017) and GFSAD1KCM (Thenkabail et al., 2016) were used to obtain actual cropland distribution. GFSAD30 is the latest global cropland data at 30-m resolution for the year 2015, and consists of 12 major crops. Two machine learning algorithms were applied to two remote sensing data (Sentinel-2, Landsat-8) and a digital elevation model (SRTM30), resulting in a much higher-resolution cropland map than previously developed (Monfreda et al., 2008). GFSAD1KCM is 1-km resolution global cropland data with the distinction of rainfed and irrigation. This data was used for the estimation of cropland suitability with the distinction.
Climate data was collected from FLDAS (McNally and NASA/GSFC/HSL, 2019), NASA Earth Observation (NEO) (2019), and Global Wind Atlas (GWA) (Badger et al., 2019). Recently developed LandGIS data on soil organic carbon (Hengl and Wheeler, 2018) and soil pH (Hengl, 2018) were used as indicators of soil constraints. MERIT DEM (Yamazaki et al., 2017), which has high accuracy due to the removal of multiple errors, was used as the topographical data.
The spatial resolution of each dataset is 1 arcsec (30 m) for GFSAD30, 32.4 arcsec (1 km) for GFSAD1KCM, 0.1 degrees for FLDAS, 0.25 degrees for NEO, 8.1 arcsec (250 m) for GWA and LandGIS, and 3 arcsec for MERIT DEM (90 m), but in this study all data were converted to a resolution of 18 arcsec (500 m). Cropland distribution (GFSAD30) was upscaled by taking the mode values. Slope (MERIT DEM), wind speed (GWA), and soil organic carbon and pH (LandGIS) were upscaled by taking the average values. Other data were downscaled by nearest neighbor interpolation.

Cropland suitability and indices for evaluation
The cropland suitability evaluated by Ramankutty et al. (2002) only considered climate and soil as limiting conditions. This study added topographical constraints (i.e. slope) to the method of Ramankutty et al. Cropland suitability (LS) was defined as in Equation (1), assuming that it can be expressed as the product of functions representing each natural condition.
S clim , S soil , and S slp represent climate, soil, and topographical suitability respectively with values ranging from 0 to 1. Each is calculated as per Equation (2), where f 1 (GDD), f 2 (α), g 1 (C soil ), g 2 (pH soil ), and h(slp) represent the evaluation functions of each of the indices GDD, α, C soil , pH soil , and slp. GDD (Growing Degree-Days) is the annual sum of average daily temperature above the base temperature required for a crop to grow. α is evapotranspiration ratio, that is, the ratio of actual evapotranspiration to reference evapotranspiration (Text S1). Reference evapotranspiration was used in this study instead of potential evapotranspiration in Ramankutty et al. (2002) because it has now gained significant acceptance among scientists. C soil and pH soil are soil organic carbon content (kg m -2 ) and pH of soil moisture based on LandGIS data from 0 to 30 cm below the surface. slp is the slope of the cropland (%; m/100 m), and was calculated from MERIT DEM. Each evaluation function denotes how much the growth of crop is restricted by each index, and its deviation is briefly described in the next section.

Estimation of evaluation function
Because cropland suitability is a function of multiple conditions, an analysis that eliminates the effects of other variables is necessary to properly evaluate the constraints imposed by one variable. Therefore, we first determine the range of values for variables that are not considered to constrain the crop growth (sufficient condition). The pixels where the other variables satisfy the sufficient condition are analyzed to identify the evaluation of the constraint by one variable.
In this study, the sufficient conditions for each variable were determined as shown in Equation (3).
The relation between actual cropland distribution and each index was examined using these sufficient conditions. Taking slp as an example, pixels (500-m resolution) satisfying the sufficient conditions for the other four variables were extracted, and the ratio of cropland pixels to those target pixels for varying slp value were plotted (Figure 1(a)). The single dot indicates the average proportion of the cropland distribution for a given slope value. The evaluation function of slp, that is h(slp), was defined as the fitting curve of this plot with normalization ( Figure 1(b)). Because it is generally believed that the gentler the slope is, the more likely cropland is to exist, we assumed a suitability of 1 for a 0% slope and excluded slopes below 1% (red dots in Figure 1(b)) from the fitting. f 1 (GDD), f 2 (α), g 1 (C soil ), and g 2 (pH soil ) were also obtained in the same way ( Figure S2). Global cropland suitability at 500-m resolution was estimated on the basis of these evaluation functions. Each evaluation function was normalized so that it took a value from 0 (unsuitable) to 1 (suitable). Thus, the value of LS was also given from 0 to 1, which was set to take a low value when an indicator was unsuitable for cropland (Equation (1), (2)). In the present study, cropland with 0 ≤ LS < 0.33 was evaluated as unsuitable and 0.33 ≤ LS < 1 as suitable in reference to FAO (1976) and Zabel et al. (2014).

Effect of topographical constraint and superiority of high-resolution estimation
The global map of cropland suitability estimated in this study is shown in Figure 2(a). The granaries of the world (the Corn Belt and Prairies of the United States, the Pampa of Argentina, the Eastern European Plains, the Eastern Indian Plains, and the North China Plains) were evaluated to be suitable for agricultural land. On the other hand, arctic and subarctic zones above 50 degrees north latitude, arid regions (the Sahara Desert, the Arabian Peninsula, West Asia, the Mongolian-Tibetan Plateau, and the Australian subcentral region), and rainforest areas (the Selva in Brazil and the jungles of Southeast Asia) had low values of cropland suitability and were not suitable for agriculture. This was due to low temperature for high latitude areas, lack of moisture for arid regions, and slightly alkaline soil for rainforest areas. The distribution of cropland suitability was close to that estimated in previous studies (Fischer et al., 2012;Zabel et al., 2014). Figure 2(b) shows the cropland suitability without the constraint by slope. In addition to the suitable areas raised above, regions such as the Rocky Mountains, the Mexican Plateau, southern Europe, Turkey, the southern part of the Himalayas, and the Tailing Mountains in China were also estimated to be suitable. However, these areas are steep, making it difficult to cultivate land (Figure 2(d)(e)). By considering the slope, the overestimation of cropland suitability in such mountainous regions could be avoided (Figure 2(c)).
Estimation at high resolution is of importance to represent the topographical constraint to agriculture. The relation between slope and cropland distribution ratio at 0.5-degrees and 0.005-degrees (18-arcsec) resolution is shown in Figure  3. The variation of cropland distribution ratio at 0.5 degrees was obtained by calculating the cropland rate within the 0.5 × 0.5 degrees grid for cropland data and by taking the average over the same size of grid for slope data. The comparison of the two curves shows, at the low resolution of 0.5 degrees, that the cropland distribution ratio is smaller on gentle slopes, whereas the ratio is larger on steep slopes, which contradicts the fact that most cropland exists on relatively flat land (Baker and Capel, 2011). When upscaling the slope data, there is a mix of gentle and steep slopes in one grid, but if the cropland is on a gentle slope only, the averaged value could indicate that there exists a certain area of cropland on medium slopes. This leads to an over-(under-) estimation of cropland suitability on steep (gentle) slopes at 0.5-degrees resolution. Therefore, low-resolution data cannot adequately reflect the topographical constraint, demonstrating the effectiveness of using high-resolution data in estimating cropland suitability.

Comparison to actual cropland distribution
Actual cropland distribution was compared with the esti- Figure 3. Relation between slope and cropland area ratio at horizontal resolution of 0.5 and 0.005 degrees mated cropland suitability (Figure 4). The table in Figure 4 classifies the relationship between the value of LS and the existence of cropland into four groups, and corresponds with the colored areas in Figure 4(a)(b)(c). In this study, #1 is defined as suitable cropland (where it is suitable for cropland and it actually exists), #2 is defined as unsuitable cropland (unsuitable; actually exists), #3 is defined as suitable space (suitable; does not actually exist), #4 is defined as unsuitable space (unsuitable; does not actually exist).
The area of suitable space significantly decreased from 1.81 × 10 7 km 2 without topographical constraint to 1.04 × 10 7 km 2 with it (Figure 4(c)). This decrease of suitable space was observed mostly in mountainous regions including the Pacific Rim and the Alpine Himalayan orogenic belt (Figure 4(a)(b)). As stated in the previous section, by taking slope into account, the cropland suitability in mountainous areas, on which crops are hard to grow, decreased, thereby allowing a more realistic assessment of suitability when expanding cropland.
The area of unsuitable cropland increased in accordance with the decrease of suitable cropland. This could be because crop types were not taken into consideration in the estimation of cropland suitability. The GFSAD30 cropland data does not classify the multiple crop types, and only represents the existence of cropland. Thus, each evaluation function and cropland suitability were estimated assuming that crops grown on the plains and those grown in mountainous areas were identical. Consequently, the estimation with the consideration of slope could classify cropland on steep slopes as unsuitable cropland. As not only topographical condition but climate and soil condition vary by crop types, it would be crucial to calculate cropland suitability for each crop separately.

Cropland suitability with the distinction of rainfed and irrigation
The cropland suitability evaluated in the previous section did not distinguish rainfed cropland (water from local precipitation) from irrigated cropland (water artificially withdrawn). Therefore, we divided GFSAD30 cropland into rainfed and irrigated based on GFSAD1KCM, and recalculated cropland suitability by estimating the evaluation function of each index targeting rainfed cropland only ( Figure 5(a)). The newly estimated cropland suitability was compared with the one without cropland classification ( Figure 5(b)).
The Corn Belt, the Eastern European Plains, and the Northeast Plains of China were estimated to still be suitable for cropland. In Prairie, Pampa, northern India, and the North China Plain, however, where the LS without distinction had high values (Figure 2(a)), the value of cropland suitability decreased in the estimation with classification, as shown in Figure 5(b). The reduction in these areas had a bearing on the lack of soil carbon content, and the areas corresponded to regions with relatively large irrigated lands. It is known that soil organic matters rapidly decompose in arid and semiarid regions and irrigation helps increase organic carbon in cropland soil (Trost et al., 2013). This means the soil fertility cannot be well maintained without irrigation. Therefore, when irrigated cropland was excluded from the target cropland pixels, that is, under pure natural condition, the land was no longer fertile enough for  Figure S4 for reference Figure 5. (a) Cropland suitability (LS) with distinction of rainfed and irrigation and (b) their difference by subtracting "with distinction" from "without distinction." (c) Comparison of cropland suitability with distinction and actual cropland distribution and (d) its areas of each evaluation group in which bars represent "with distinction" and lines represent "without distinction." #1, #2, and #3 represent suitable cropland, unsuitable cropland, and suitable space respectively agriculture and was estimated to have low suitability.
The comparison between actual cropland distribution and the cropland suitability with the distinction showed that suitable cropland decreased while unsuitable cropland increased instead (Figure 5(d)). This change was mostly observed in regions that are heavily irrigated, such as China, India, and Central America (Figures 4(b), 5(c)). As stated above, irrigated areas are considered to be difficult to cultivate without human intervention. Thus, it is reasonable that cropland suitability would decrease and there would be less suitable cropland and more unsuitable cropland in these areas.
From the above discussion, cropland distinction could lead to a better assessment of cropland suitability on lands that would be difficult to farm without human intervention. The area of suitable space calculated based on the cropland suitability with the distinction was 9.25 million km 2 , which is far larger than the cropland area of 1.2 million km 2 additionally required in the future (Bruinsma, 2009). However, it should be noted that many of the areas evaluated as suitable space for cropland expansion support rich and valuable ecosystems (e.g. Yucatan Peninsula, Cerrado, Congo Basin, Savannas). Thus, ecological preservation and trade-offs need to be considered in the practical expansion (Zabel et al., 2019).

SUMMARY AND CONCLUSION
Global cropland suitability was estimated at 500-m resolution using recently developed high-resolution cropland and terrain data. Compared to previous studies, the consideration of topographical constraint led to lower cropland suitability for land with steep slopes. The use of highresolution data enabled the appropriate representation of the slope constraint. Furthermore, the distinction of rainfed and irrigated cropland removed the bias due to human intervention, and cropland suitability under pure natural conditions was estimated. The comparison between the newly estimated cropland suitability and actual cropland distribution implied that cropland could be expanded over larger areas than those expected to be needed in the upcoming decades.
The cropland suitability map estimated in this study is expected to be useful in planning and implementing future agricultural policies. However, there are some points to be improved to provide more precise information. The first is the limitation in cropland data. Although GFSAD30 has an unprecedented high resolution, it does not distinguish between crop types. In practice, the appropriate climate, soil, and topography vary by crop types, thus cropland suitability should be evaluated considering these variations. The second is the temporal analysis of cropland suitability. This study utilized only the data for the year 2015 due to data availability, but multiple years of data should be utilized for evaluation in order to make more robust estimates. Variations in climate within one year would also need to be taken into account since climate condition, such as temperature and precipitation, vary seasonally. The third is that the evaluation functions for each constraint could be further improved. For example, following the simple approach from a previous study (Zabel et al., 2014), we assumed a monotonically decreasing function for slope in Figure 1(b). However, considering the difficulty of drainage on flat areas, it may be more realistic that cropland suitability on nearly 0% slope land is smaller than that on a 1~2% gentle slope. The fourth is the implementation of social constraint. Crop production varies not only with natural conditions, but also with socio-economic conditions, such as distance from cities, market size, and difference in crop demand and prices. Therefore, cropland suitability should be evaluated with consideration of socio-economic factors at the same time.
Although there is still plenty of room for improvement, the cropland suitability estimated in this study has the highest ever resolution and its validity is also high compared to previous studies. Future development and application research (e.g. the estimation of future cropland suitability based on climate models, consideration of irrigation water availability through the integration with water resource models, and yield evaluation of biomass crops using cropland suitability) are expected to lead to an integrated assessment of the water-energy-food nexus (Taniguchi et al., 2017).

SUPPLEMENTS
Text S1. Calculation of GDD and α Text S2. Reasoning for sufficient condition Text S3. Assumptions to derive fitting curves Figure S1. Relation between slope and actual cropland Figure S2. Evaluation functions for GDD, α, C soil , and pH soil Figure S3. Evaluation functions with cropland distinction Figure S4. Global map of slope distribution