How Does Land Use/Land Cover Map's Accuracy Depend on Number of Classification Classes?

A land use/land cover map is an important input for many applications. However, the accuracy of land cover maps remains highly uncertain, and mapping accuracy assessment is not well documented. The objective of this paper is to examine the relationship between overall accuracy and the number of classification classes by conducting a literature review of land cover/land use studies. The results revealed a weak negative correlation between map accuracy and the number of classes. The paper suggests a decrease of 0.77% in overall accuracy for each additional land cover class. The average overall accuracy differs little among the five sensor types. In addition, a high-spatial-resolution sensor such as an airborne sensor might not always be advantageous for producing a map with high overall accuracy, since its accuracy depends on several factors, including the number of land cover classes. (Citation: Thinh, T. V., P. C. Duong, K. N. Nasahara, and T. Tadono, 2019: How does land use/land cover map's accuracy depend on number of classification classes? SOLA, 15, 28–31, doi:10.2151/sola.2019-006)


Introduction
Land cover/land use is an important contributing factor to the climate system. For example, agriculture, forestry, and other land use contributed 24.5% of total global greenhouse gas emissions (IPCC 2014). Land cover also affects precipitation (Sugimoto et al. 2015), the generation of dust in the atmosphere (Kimura 2012), water resources (Sawaya et al. 2003), the distribution of wildfires (Keramitsoglou et al. 2008), and ecosystems and biodiversity (Delalieux et al. 2012). Therefore, producing accurate land cover/land use maps is essential to many fields of study and management processes.
However, the accuracy of land cover products varies from one product to another. For example, the accuracies of global land cover products such as MODIS 1 km, GLC2000 1 km, and IGBP DISCover 1.1 km range from 67% to 78% (Herold et al. 2008). Likewise, overall accuracy ranges widely, from 42% to 98%, in studies using images from different sensors such as Landsat, Hyperion, IKONOS, and QuickBird (Laba et al. 2002; Sawaya et al. 2003; Arroyo et al. 2010; El-Zeiny and Effat 2017).
The inconsistency of map accuracy might depend on factors such as spatial resolution, the homogeneity or heterogeneity of the land surface (Ma et al. 2017), or classification algorithms (Gómez et al. 2016). Among these factors, the number of land cover classes is crucial. Generally speaking, map accuracy can decrease when the number of classes (e.g., water, urban, forest) increases, because there are more chances of misclassification among classes. In fact, a few studies have shown a negative correlation between overall accuracy and the number of land cover classes (Dronova 2015; Ma et al. 2017) for specific methods or targets (such as object-based classification or wetland studies). This paper aims to build a benchmark of the relation between overall accuracy and the number of land cover classes across a wide range of studies, without restriction on classification method, target, sensor type, or spatial scale. The paper first describes how the data were collected and analyzed; the results on the relationship between map accuracy and the number of classes are then presented and discussed.

Publications collection
The papers in this study were collected from the ScienceDirect website of the publisher Elsevier. The website provides access to a large database of scientific research, including many papers in well-known remote sensing journals such as "ISPRS Journal of Photogrammetry and Remote Sensing", "Remote Sensing of Environment", and "International Journal of Applied Earth Observation and Geoinformation". The relevant literature was obtained using the advanced search function of the website. The search keywords were "land cover mapping", "land use mapping", and "accuracy assessment". This search returned 163 publications. Their titles and abstracts were quickly scanned to eliminate irrelevant papers, after which 99 papers were selected for further reading and data collection. In addition, several rules were applied to manually retain only the papers that provided the data necessary for overall accuracy assessment. The rules are as follows:
- Exclusion of studies focusing on an individual land cover class, which would significantly affect the analysis of the correlation between land cover accuracy and the number of land cover classes, for example, Zhang et al. (2015).
- Removal of articles that failed to provide the data used for quantitative analysis in this paper, such as overall accuracy and the number of land cover classes.
- Exclusion of review papers related to land cover and land use accuracy assessment, for example, Costa et al. (2018).
After careful reading based on the above rules, 64 papers were selected for further analysis.

Data analysis
A data file containing 10 fields (Table 1) was established to collect information from the 64 selected papers. Specifically, the file includes general information such as title, author, year of publication, and journal. In addition, it contains other fields including sensor type, number of classes, and overall accuracy. Among the 10 fields, this paper concentrated on synthesizing and analyzing the relationship between overall accuracy and the number of land cover classes. The sensor type information was then added as a complementary factor in the overall accuracy assessment. Other factors, such as map resolution and classification algorithm, are not described in detail. With the hypothesis that the number of classes was among the most important factors for the overall accuracy of land cover maps, the study conducted an analysis of variance (ANOVA) separately for four factors: number of classes, map resolution, classification algorithm, and sensor type. The p-values produced by the ANOVA showed that the number of classes is the most influential factor (p-value = 0.005), followed by classification algorithm (p-value = 0.029), map resolution (p-value = 0.296), and sensor type (p-value = 0.723). Therefore, the "Result and discussion" section mainly presents the dependence of overall accuracy on the number of classes. Additionally, the analysis of sensor types and overall accuracy aims to emphasize the importance of the number of classes to map accuracy.

Result and discussion

Land cover map's accuracy in individual studies
Within a single study, increasing the number of classification classes is highly likely to decrease the overall accuracy. For example, Gessner et al. (2015) showed that the accuracy of a multi-sensor land cover map was 80% with 9 classes but decreased to 73% with 14 classes. Similar results were shown in other studies such as Colditz et al. (2011), Van Lier et al. (2011), and Parent et al. (2015). To demonstrate the relationship between overall accuracy and the number of classes, two studies were selected: Herold et al. (2008) (using MODIS images) and McCombs et al. (2016) (using Landsat images). From the error matrices in these studies, the number of land cover classes was reduced one class at a time and the overall accuracy was recalculated accordingly. The results in Fig. 1 and Fig. 2 show a strong negative correlation between overall accuracy and the number of defined classes, with R² equal to 0.93 and 0.95 for the MODIS study and the Landsat study, respectively. Although the number of classification classes is 23 and 13 for the two studies, respectively, an overall accuracy higher than 95% can only be achieved when the number of classes is less than 6. Interestingly, combining some land cover classes did not significantly improve the overall accuracy in either study. For example, the overall accuracy is approximately equal for 10 to 13 classes in Fig. 1, and similarly for 8 to 12 classes in Fig. 2. In particular, based on the error matrix of Herold et al. (2008), the overall accuracy remains at 91.5% when the classes "Palustrine Emergent Wetland" and "Estuarine Forested Wetland" are combined, because there is no misclassification between these two classes. In the study of McCombs et al. (2016), we obtained a 0.3% increase in overall accuracy when the "Deciduous Forest" and "Evergreen Forest" classes were combined. This might be explained by the 1.4% of "Evergreen Forest" misclassified as "Deciduous Forest" and the 7.5% of "Deciduous Forest" misclassified as "Evergreen Forest". Overall, combining classes that have a low chance of being misclassified as each other does not considerably increase overall accuracy.

Land cover map's accuracy synthesized from multiple studies
a. The dependence of overall accuracy on the number of classification classes
This paper used data derived from a variety of land cover studies to examine the dependence of overall accuracy on the number of classification classes. The selected publications differ in method, classification algorithm, sensor type, and study area. Overall, the mean accuracy of all 64 studies is 83.7%, with 10 classes on average. The highest accuracy is 98.7% with 4 classes (Wardlow and Egbert 2008), while the lowest is 42% with 29 classes (Laba et al. 2002). Figure 3 presents the correlation between overall accuracy and the number of classes for three cases: "all studies", "MODIS studies", and "Landsat studies". The "MODIS studies" and the "Landsat studies" are also included in the "all studies" case. The result shows a negative correlation between overall accuracy and the number of classes for all three cases. Furthermore, a weak correlation is seen for both "all studies" and "Landsat studies", with R² equal to 0.25 and 0.23, respectively. This weak negative correlation is consistent with the conclusions of Ma et al. (2017) and Dronova (2015). In contrast, the MODIS studies showed a moderate negative correlation, with R² equal to 0.54. However, this moderate correlation might be affected by the small number of samples (9 studies). By definition, the overall accuracy is 100% when there is only 1 class. Therefore, this paper initially considered a restricted regression line forced through the point x₀ = 1, y₀ = 100. However, this regression line fit the data poorly (R² = −0.047). Alternatively, the regression line without the restricted intercept point provided a better fit to the sample data (R² = 0.25, p < 0.001). The slope of 0.77 in the equation in Fig. 3 means that overall accuracy decreases by 0.77% for each additional land cover class.
Two outlier values in Fig. 3 belong to Landsat studies. One is Laba et al. (2002), in which the authors mapped 29 land cover classes over an area of 12 million hectares with an overall accuracy of 42%. The other is Sesnie et al. (2008), in which an overall accuracy of 93.3% was achieved for mapping 32 land cover classes over an area of 800 thousand hectares. There is a big gap in overall accuracy between these two studies despite the approximately equal number of classes. Considering the methods described in the studies, this gap might be explained by the intensive use of reference data (62,154 pixels) to map land cover for a small area of 8,000 km² in Sesnie et al. (2008). In that study, overall accuracy is likely inflated by the spatial correlation between training pixels and validation pixels.

b. The average overall accuracy by sensor type
Figure 4 shows the average overall accuracy achieved by 5 different sensor categories. Studies using images from more than 2 sensors were categorized as "Multiple sensors". In addition, the "Other" category consists of studies using a single sensor other than "Landsat", "MODIS", and "Airborne". As can be seen in Fig. 4, Landsat studies accounted for 40.6% (26 studies), followed by "Multiple sensors", "Other", "MODIS", and "Airborne" with proportions of 20.3%, 17.2%, 14.1%, and 7.8%, respectively. The predominance of Landsat images might be because they are free to download and their 30 m spatial resolution suits many studies at the national and local scale.
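The sensor shares reported for Fig. 4 can be checked against the total of 64 studies. The short sketch below converts each reported percentage into the nearest whole number of studies. Only the counts for Landsat (26), multiple sensors (13), and MODIS (9) are stated in the text; the counts for "Other" and "Airborne" are inferred from the percentages and should be read as an assumption.

```python
# Shares reported for Fig. 4, in percent of the 64 reviewed studies.
TOTAL_STUDIES = 64
reported_share = {
    "Landsat": 40.6,
    "Multiple sensors": 20.3,
    "Other": 17.2,
    "MODIS": 14.1,
    "Airborne": 7.8,
}


def implied_counts(shares, total):
    """Convert percentage shares into the nearest whole number of studies."""
    return {name: round(total * pct / 100) for name, pct in shares.items()}


# Counts for "Other" and "Airborne" are inferred here, not quoted from the paper.
counts = implied_counts(reported_share, TOTAL_STUDIES)
for name, n in counts.items():
    print(f"{name:17s} {n:2d} studies ({100 * n / TOTAL_STUDIES:.1f}%)")
print("total:", sum(counts.values()))
```

The implied counts add up to 64, so the five reported percentages are mutually consistent.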
Most of the sensor types achieved an overall accuracy higher than 80%, except Airborne sensors (79.3%). Surprisingly, the mean classification accuracy of studies using Airborne images is the lowest, even though their spatial resolution is much higher than that of Landsat and MODIS images. Specifically, the resolution of [...]
There is a high percentage of multiple-sensor studies (20.3%), which account for 13 of the 64 studies. The emergence of the multiple-sensor approach in land cover classification might be encouraged by the development of many satellite missions such as ALOS, QuickBird, IKONOS, and so forth. To some extent, the overall accuracy of studies using images from different sensors has improved accordingly. For example, by combining optical and SAR images, Zhang et al. (2015) achieved an overall accuracy of 98.4% for the land cover map.
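The two regression lines compared for Fig. 3, one with a free intercept and one forced through the point (1, 100), can be sketched as follows. This is a minimal illustration that assumes ordinary least squares and defines R² as 1 − SS_res/SS_tot. The data pairs are hypothetical and are not the 64 studies reviewed here, so the output will not reproduce the slope of 0.77 or the R² values quoted above. The function names are likewise only illustrative.

```python
def fit_free(xs, ys):
    """Ordinary least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_through_point(xs, ys, x0=1.0, y0=100.0):
    """Least-squares line constrained to pass through (x0, y0)."""
    num = sum((x - x0) * (y - y0) for x, y in zip(xs, ys))
    den = sum((x - x0) ** 2 for x in xs)
    slope = num / den
    return slope, y0 - slope * x0


def r_squared(xs, ys, slope, intercept):
    """R^2 = 1 - SS_res / SS_tot; negative when the line fits worse than the mean."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# HYPOTHETICAL (number of classes, overall accuracy in %) pairs, for illustration only.
n_classes = [3, 5, 8, 10, 12, 15, 20]
accuracy = [90.0, 86.0, 88.0, 82.0, 85.0, 80.0, 78.0]

slope_free, icpt_free = fit_free(n_classes, accuracy)
slope_fixed, icpt_fixed = fit_through_point(n_classes, accuracy)
r2_free = r_squared(n_classes, accuracy, slope_free, icpt_free)
r2_fixed = r_squared(n_classes, accuracy, slope_fixed, icpt_fixed)

print(f"free intercept:     slope = {slope_free:.2f}, R^2 = {r2_free:.2f}")
print(f"through (1, 100):   slope = {slope_fixed:.2f}, R^2 = {r2_fixed:.2f}")
```

In this made-up sample the constrained line has a negative R², because forcing it through (1, 100) makes it steeper than the data support and it then predicts worse than the sample mean. This is the same qualitative behavior as the restricted fit described for Fig. 3.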

Conclusion
By assessing the relationship between land cover overall accuracy and the number of classification classes, the following conclusions are drawn: (1) A negative correlation exists between overall accuracy and the number of classification classes. The negative correlation is strong within a single study but weak across multiple studies. This is a trade-off between the accuracy and the level of detail of a land cover map, and it remains an inherent uncertainty of land cover mapping. As suggested in this paper, land cover mappers should expect a decrease of about 0.77% in overall accuracy for each additional land cover class.
(2) There is only a small gap in average overall accuracy among the different sensor types. In addition, a high-spatial-resolution sensor such as an airborne sensor might not always be advantageous for producing a map with high overall accuracy, since its accuracy depends on several factors, among which the number of land cover classes plays an important role.
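The single-study effect in conclusion (1) comes from recomputing overall accuracy after merging classes in an error matrix. The sketch below shows that step on a small hypothetical 4-class matrix. The class names and counts are invented for illustration and are not taken from Herold et al. (2008) or McCombs et al. (2016). Overall accuracy is the sum of the diagonal divided by the total number of samples, so merging classes a and b raises it by exactly the share of samples that were confused between a and b.

```python
def overall_accuracy(matrix):
    """Overall accuracy = correctly classified samples / all samples."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def merge_classes(matrix, a, b):
    """Return a new error matrix in which class b is folded into class a."""
    n = len(matrix)
    rows = [list(row) for row in matrix]
    # Fold row b into row a.
    rows[a] = [x + y for x, y in zip(rows[a], rows[b])]
    # Fold column b into column a for every row.
    for row in rows:
        row[a] += row[b]
    # Drop the now-redundant row and column b.
    keep = [i for i in range(n) if i != b]
    return [[rows[i][j] for j in keep] for i in keep]


# HYPOTHETICAL error matrix; class order: water, urban, deciduous, evergreen.
matrix = [
    [50, 0, 0, 0],
    [0, 40, 5, 5],
    [0, 3, 45, 2],
    [0, 2, 8, 40],
]

print(overall_accuracy(matrix))                       # all four classes
print(overall_accuracy(merge_classes(matrix, 0, 1)))  # water + urban: never confused
print(overall_accuracy(merge_classes(matrix, 2, 3)))  # deciduous + evergreen: often confused
```

Merging the two classes that are never confused leaves overall accuracy unchanged, while merging the two forest classes raises it. This mirrors the observation that combining classes with little mutual misclassification does not considerably increase overall accuracy.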
Because land cover map accuracy is still incompletely understood, we highly recommend more practical and review studies on accuracy assessment of land cover and land use mapping.

Acknowledgments
We thank the Project for Human Resource Development Scholarship (JDS) by Japanese Grant Aid for providing the scholarship to the authors. This paper was supported by the ecosystem group of the Earth Observation Research Center (EORC) and the ALOS-3 project of JAXA.
Edited by: M. Huang