Global trends in land-atmosphere CO 2 exchange fluxes: an analysis of a flux measurement dataset and comparison with terrestrial model simulations

The increasing amount of measurement data on land - atmosphere flux has made it possible to assess the interannual and longer processes that are driven by environmental change and disturbance of terrestrial ecosystems. In this study, I used a global dataset of carbon dioxide ( CO 2 ) fluxes at eddy-covariance tower sites ( FLUXNET2015 ) to investigate long-term trends of net ecosystem CO 2 exchange ( NEE ) , gross primary production ( GPP ) , ecosystem respiration ( RE ) , and related variables. From 118 sites with records of at least 5-years duration, I extracted 1198 site-years of data for use in my analyses. Applying moderate screening by data quality, I found that 58 % of the sites showed increasing trends as net CO 2 sinks, in which median slopes of annual NEE of - 1.4 and - 4.1 g C m - 2 y - 2 were obtained by linear regression and Sen’s slope estimator. Both GPP and RE showed increasing trends at different slopes; their slopes were positively correlated among sites. Across-site variation of NEE trends was analyzed by generalized linear mixed modeling; the best statistical model used temperature, stand age, and biome type as explanatory variables. The trend of increasing CO 2 sinks differed among biome types, from almost none in grassland and savanna to steep slopes in deciduous broad-leaved forest sites. The flux trends derived from terrestrial model simulations showed that the increasing sink trend also prevails over the land. The global model simulations implied that the increasing land sink is primarily attributable to elevated CO 2 concentration. These results demonstrate the usefulness of flux measurement datasets, especially in conjunction with models, to deepen our understanding of long-term terrestrial ecosystem processes.


Introduction
Terrestrial ecosystems play important roles in the global biogeochemical and climate systems, which are under increasing pressure from human activities Bonan and Doney, 2018;IPCC, 2019 . The exchange of carbon dioxide CO 2 between terrestrial ecosystems and the atmosphere has attracted particular attention because it strongly affects the spatial and temporal distributions of atmospheric CO 2 , the most important anthropogenic greenhouse gas Running et al., 1999;Schimel et al., 2015;Tian et al., 2016 . Micrometeorological techniques such as the eddy-covariance method e.g., Baldocchi et al., 1988;Goulden et al., 1996;Aubinet et al., 2000 have been developed for conducting continuous measurement of the exchange fluxes of heat, vapor, CO 2 , and trace gases such as methane. At present, more than 500 eddy-covariance tower measurement sites form an international network, FLUXNET Baldocchi et al., 2001 , and its regional sub-networks; these provide indispensable climatic and biological data to the research community Yu et al., 2006;Mizoguchi et al., 2009;Saigusa et al., 2013;Novick et al., 2018;Kang and Cho, 2021 . The dataset is also valuable for terrestrial modeling studies for the purposes of model development and validation Friend et al., 2007;Huang et al., 2021;Ito and Ichii, 2021 . Studies using data from multiple sites is required to extract broad-scale information from the flux measurements, because the typical footprint of a single site measurement is within a few square kilometers, rendering it difficult to fully cover environmental and biological heterogeneity. A number of studies have conducted analyses across FLUXNET sites to capture the broad-scale properties of ecosystem CO 2 exchange, such as differences among biome types and along environmental gradients e.g., Valentini et al., 2000;Falge et al., 2002;Hirata et al., 2008;Mahecha et al., 2010;Yu et al., 2013;Gerken et al., 2019 . These analyses provided novel insights into the land surface processes and attributes Chang et al., 2021 . Moreover, empirical modeling of ecosystem CO 2 fluxes, including recent machine learning techniques, made it possible to scale up point measurement data to globally continuous fields e.g., Tramontana et al., 2016;Ichii et al., 2017;Zscheischler et al., 2017;Jung et al., 2020 . In conjunction with satellite remote sensing, which allows global high-resolution monitoring of the Earth's surface, it is now possible to capture the spatial variability of the terrestrial CO 2 budget.
However, capturing temporal variability from in situ measurements is still difficult, especially at interannual and longer time periods Chu et al., 2017;Baldocchi, 2020 . At sub-daily to seasonal spans, atmosphereecosystem CO 2 exchange is mainly determined by physiological and phenological responses driven by meteorological conditions such as solar radiation, temperature, and humidity e.g., Law et al., 2002;Stoy et al., 2009 . However, at interannual or longer time scales, many non-periodic and slowly evolving factors exert crucial influences on terrestrial CO 2 exchange. For example, terrestrial ecosystems, especially forests, show age-dependent trends from active growth and CO 2 uptake at a young stage to a gradual decline when old e.g., Ryan et al., 1997;Volkova et al., 2018 . Historical human-induced environmental changes, such as atmospheric CO 2 rise and nitrogen deposition, are likely to have slow but significant impacts on biogeochemical cycles e.g., Magnani et al., 2007;Ueyama et al., 2020 . Meteorological perturbations such as the El Niño -Southern Oscillation, which are associated with anomalous temperature and precipitation, cause interannual variability in photosynthetic productivity of a vast area of tropical to temperate vegetation e.g., Fang et al., 2017 and subsequently affect the atmospheric CO 2 growth rate e.g., Keeling et al., 1995;Piao et al., 2020 . Furthermore, local but extreme weather and disturbances such as drought and wildfire exert unexpectable impacts on ecosystem dynamics and processes, leading to spatially heterogeneous ecosystem carbon budgets e.g., Desai et al., 2007;Diao et al., 2020;Zhang and Yuan, 2020 . To understand the consequences of these effects, long-term observational data from a variety of ecosystems is necessary.
Thanks to careful maintenance of measurement systems and data quality management Aubinet et al., 2000;Moffat et al., 2007 , interannual or longer records are available for many FLUXNET sites. Indeed, operation of many sites has been interrupted due to discontinuity of research resources including funding, personnel, and equipment; several sites were damaged by devastating disturbances such as wildfire and tropical cyclones e.g., Dore et al., 2008;Sano et al., 2010 . The longest record was obtained at the Harvard Forest in Massachusetts, starting in the autumn of 1990 Wofsy et al., 1993 , and now many sites have records longer than 10 years. The accumulation of FLUXNET data has provided increasing opportunities for analysis of interannual to decadal patterns of land processes. For example, Dunn et al. 2007 analyzed a 10-year data record 1994 -2004 from a Manitoba spruce forest and found a source-to-sink transition of ecosystem CO 2 exchange; similar trend analyses have been conducted at diverse sites e.g., Grünwald and Bernhofer, 2007;Urbanski et al., 2007;Aguilos et al., 2014;Ueyama et al., 2014;Froelich et al., 2015;Moore et al., 2018 . Furthermore, multi-site analyses of such long-term data have allowed inference of general tendencies from the data. For example, Amiro et al. 2010 investigated age-dependent changes in ecosystem CO 2 exchange after disturbance, using 180 site-years of data from North American sites. Keenan et al. 2013 assessed temporal trends of ecosystem water-use efficiency the uptake of carbon by vegetation per unit of water loss by using data for 1992 to 2010 from 14 sites. Xu et al. 2019 andWang et al. 2019 analyzed the long-term trends in gross primary production at 41 and 56 sites, respectively, during recent decades, focusing on the phenological responses of vegetation to meteorological conditions. A global dataset of FLUXNET measurement has been made available to the public, with the aim of stimulating atmosphereecosystem interaction studies. Compared with previous datasets e.g., the Marconi and La Thuile data , the new dataset contains a larger amount of data compiled with a standardized protocol. The improved data quality and quantity are expected to allow researchers to address the overarching questions of whether terrestrial ecosystems are changing, what is the prevailing driver of the change, and whether we can predict the change. Specifically, in the present work I examined 1 whether net and gross terrestrial CO 2 exchange fluxes showed significant trends at sites for which long-term measurement data are available, 2 which environmental and biological factors are responsible for the long-term trends, and 3 whether the dataset can be used to improve terrestrial ecosystem models in terms of simulating the long-term trends. The third point is important, because terrestrial ecosystem models are used for future projections but are still immature in terms of capturing the spatial and temporal patterns in the CO 2 budget.

Data
The data used in this study came from a global dataset of flux measurements originally released from FLUXNET in December 2015 and updated in February 2020 FLUXNET2015 v4; https://fluxnet.org/data/fluxnet2015-dataset/ . The dataset was compiled by flux-data experts using a consistent and uniform processing protocol ONEFlux processing pipeline; Pastorello et al., 2020 , in collaboration with researchers from regional projects and networks Papale et al., 2006;Vuichard and Papale, 2015 . From the full dataset, which contains numerous site variables at time steps ranging from half-hourly to annual, I selected the following data records for this study, based on the indicated criteria. 1 Tier 1 dataset: Tier 1 data files are open to the public and available at no charge for scientific purposes. As of February 2020, data from 206 sites were included. This number was clearly increased from the 166 sites in the v3 dataset; in the future, the site coverage is expected to further increase as data collection and processing proceed. 2 Monthly data files: Monthly temporal resolution is most suitable for the present purpose of long-term analysis of trends and seasonality unit of CO 2 flux: g [ =10 -3 kg ] C m -2 month -1 . Moreover, monthly aggregated data are likely to be less affected by remaining short-term noise. Annual fluxes g C m -2 y -1 were obtained by summing from January to December of each year. 3 Sites with data records of at least 5 years: Detection of a statistically significant trend requires a sufficiently long time series Ito, 2012 , and the threshold length was selected by both data quality and quantity. The robustness of the present analysis was confirmed by analyses conducted with time series of different lengths. 4 Data records of different variables representative of CO 2 exchange between the atmosphere and terrestrial biosphere: these included a Net ecosystem CO 2 exchange NEE using variable u-star thresholds for each year variable name in FLUXNET2015; NEE_VUT_REF , b gross primary production GPP from nighttime partitioning variable GPP_NT_VUT_REF , and c ecosystem respiration RE from nighttime partitioning variable RECO_NT_VUT_REF Reichstein et al., 2005;Papale et al., 2006;Barr et al., 2013 . The gross fluxes GPP and RE are shown by absolute positive value; NEE is defined as RE -GPP and then net sinks take negative values. Note that because this dataset contains flux values derived by using different u-star thresholds and different flux partitioning methods, it was possible to examine the influence of data processing. The data quality flag for NEE NEE_VUT_REF_QC , which takes values between 0 and 1 bad to good quality and represents the fraction of good quality gap-filled data, was used for data selection. The steps of quality assurance and quality control of the FLUXNET data were fully described by Pastorello et al. 2020 . 5 Meteorological data consolidated from gap-filled, site-measured data and reanalysis climate data: These data include air temperature TA_F , incident shortwave radiation SW_IN_F , precipitation P_F , and ambient CO 2 concentrations CO2_F_MDS . For these variables, the marginal distribution sampling method MDS; Reichstein et al., 2005 was used to fill gaps in the data. For additional analyses, data for latent heat flux LE; LE_MDS , sensible heat flux SH; H_MDS , and net radiation RN; NETRAD were used. In total, 1198 site-years of data from 118 sites Table S1 were extracted from the original FLUXNET2015 v4 dataset. Note that for all variables, data flagged as missing data value of -9999 were rejected in the analyses presented here. Note that when applying data screening according to the NEE_VUT_REF_QC value, data from several sites were excluded from the analyses because of insufficient length of useful data. The 118 sites covered a wide spectrum of climate conditions and land-cover types, from tropical forests to arctic tundra and cropland Fig. 1 .
For ancillary analyses, several metrics were derived from the flux variables in the dataset. Water-use efficiency WUE, g C kg -1 H 2 O is defined for ecosystem-scale fluxes as WUE = GPP / γ· LE , where γ is a conversion factor for LE to evapotranspiration converting from W m -2 to kg H 2 O m -2 month -1 or y -1 . Radiation-use efficiency RUE, g C MJ -1 is defined as GPP per unit incident solar radiation. Biome types and ages of forest stands were obtained from 1 site descriptions provided by the FLUXNET web page or 2 site descriptions in previously published research that used data from individual sites, and 3 a meta-analysis paper on the relationship between forest age and carbon balance Besnard et al., 2018 .

Trends in time-series
Using data at each site, I calculated the linear trends of annual CO 2 fluxes g C m -2 y -1 y -1 and relevant variables over the observational period by both parametric and non-parametric methods. Most anomalous data were removed by data screening see below , but several sites showed complicated temporal patterns see Fig. S1 for examples , leading to low confidence of the estimated slopes. I then obtained the frequency distributions of the linear trends across the FLUXNET sites.
The least-squares linear regression method, in which deviations are assumed to follow a Gaussian distribution, was used to obtain the parametric linear slopes LS; Sokal and Rohlf, 1995 . Sen's slope estimator Sen, 1968 was used to obtain the non-parametric linear trends, because this method does not assume a Gaussian error distribution and is believed to be less prone to biases caused by outlier values. Sen's slopes SS were defined as follows: where y i and y j are data e.g., GPP, RE, or NEE at times i and j, respectively. The statistical significance of the slopes was examined by the Student's t-test for LS and the Mann -Kendall rank statistic test Kendall, 1938;Mann, 1945 for SS.
Before conducting in-depth analyses, I examined the influence of data quality and selection on the results. In particular, it is necessary to examine the effect of outlier data on the slopes obtained by regression, with the aim of obtaining robust results. The LS and SS of NEE with different severities of data selection by NEE_VUT_REF_QC QC, hereafter values are plotted in Fig. 2. Clearly, by using a more severe criterion of data selection e.g., QC > 0.9 , a smaller number of data about 12 of all data are available for the trend analysis, meaning that 88 of data, including outliers, are rejected. Remarkably, the difference in data selection affected the data spread but did not greatly affect the across-site median values of LS and SS, implying the robustness of the present trend analysis. Then, I selected the threshold QC value of 0.6, so that the extracted dataset represents all biomes covering a wide range of climatic zones Fig. 1 and rejects anomalous trend values Fig. 2 . Approximately 52 of the data were rejected; thus, data from 68 sites were used to calculate the trends.

Factor analysis
First, I calculated the correlation among the trends in climatic variables mean annual temperature and precipitation , fluxes GPP, RE, NEE, LE, SH, and RN , and their efficiencies RUE and WUE . The correlation matrix can clarify the inter-relationships of these parameters. Then, to specify the influential factors for the slopes obtained from the flux dataset, I conducted a generalized linear mixed model GLMM analysis; explanatory variables are site latitude LAT , mean annual temperature MAT , mean annual precipitation MAP , stand age AGE , and biome type Table S1 . Site ID number was considered as a random effect in the GLMM. The model fit was calculated by the maximum-likelihood method, and the best statistical model was selected on the basis of Akaike's information criterion AIC . These statistical analyses were conducted by using the "lme4" package in R software version 4.0.3 R Core Team, 2020; https://www.r-project.org/ .

Comparison with model-simulated CO 2 flux trends
I examined the trends obtained from terrestrial ecosystem model simulations for two purposes: 1 validation of model estimation and 2 further factorial analysis. For the first purpose, I compared the frequency distributions of the trends in GPP, RE, and NEE between the observations and the simulations with a process-based model, Vegetation Integrative SImulator for Trace gases VISIT: Ito, 2019 . For the second purpose, I used the outputs by multiple models, which differ in environmental responsiveness, to obtain robust results. Global long-term simulations were conducted by 15 terrestrial ecosystem models with a protocol of the Multi-scale Terrestrial Model Intercomparison Project MsTMIP: Huntzinger et al., 2017 . Three sensitivity simulations, which consider historical changes in 1 climate only; 2 climate and land-use change, and 3 climate, land-use change, and atmospheric CO 2 rise, were conducted to separate the factorial effects. By comparing experiments 2 and 1 , the effect of land-use change could be evaluated. Similarly, the CO 2 fertilization effect could be evaluated by comparing experiments 3 and 2 . For each model, I calculated slopes of annual GPP, RE, and NEE at each grid; then, I used the mean results of terrestrial ecosystem models to reduce model-specific biases.

Trends in NEE
At the 118 Tier-1 sites, annual NEE ranged from -2623.1 g C m -2 y , respectively. When applying the threshold QC > 0.6, mean and median annual NEE values of -246.7 and -174.9 g C m -2 y -1 , respectively, were obtained from 68 sites.
The small impact of data selection implies the robustness of the present multi-site analyses. Furthermore, the prevailing net terrestrial CO 2 sink is qualitatively consistent with the global carbon budget synthesis Friedlingstein et al., 2020 , which used up-to-date ground and atmospheric data, inventories, and models. There is a wide scattering of annual NEE over the 118 sites  After applying data screening QC > 0.6 and at least 4 years, 68 sites , the frequency distributions of LS and SS for annual NEE were similar Fig. 4a , and their median values were thus comparable between the two slope estimation methods: LS, -1.37 g C m -2 y -2 and SS, -4.08 g C m -2 y -2 . Note that these median values were much smaller than the mean values and those obtained without data selection, calling attention to the selection of a typical trend value. Indeed, several sites showed extremely large positive increasing CO 2 source trends and biases that affected the mean value e.g., LS of 330 g C m -2 y -2 in SD-Dem, savanna . The median value of the trend of SS was significantly non-zero p = 0.0102, Wilcoxon test , whereas that of LS was not p = 0.126 . Slopes for NEE were statistically significant p < 0.1 at 19 28 of 68 sites for LS and 10 15 of 68 sites for SS. Mean and median slopes of the data from these statistically significant sites were consistent: mean, LS, -20. 7 g C m -2 y -2 and SS, -20.1 g C m -2 y -2 : median, LS, -21. 7 g C m -2 y -2 and SS, -18.5 g C m -2 y -2 . When using the threshold data lengths of 10 and 15 years, effective data were available at only 25 and 4 both evergreen needle-leaved forest sites, respectively. They would allow us to obtain more consistent results on long-term trends but make it difficult to cover the diversity of biomes and climatic zones. Mean and median slopes with 10-year thresholds were -6.09 and -0.63 g C m -2 y -2 for LS and -7.56 and -3.64 g C m -2 y -2 for SS. The results of 5-year and 10-year thresholds e.g., -4.08 vs. -3.64 g C m -2 y -2 for median SS values are comparable and so encouraging for the credibility of the present analysis.
These results obtained from the FLUXNET2015 data indicate that net CO 2 uptake and its temporal trend differed widely among sites Fig. 4 . Although mean and median trends implied increasing CO 2 sinks by terrestrial ecosystems, we need more long-term data and in-depth analyses to confirm their weak to moderate slopes.

Trends in GPP and RE
Annual GPP and RE differed widely among the sites and varied considerably through time Fig. 3b and 3c ; their means and standard deviations were 1318 746 and 1065 652 g C m -2 y -1 , respectively. The increasing trends of these gross fluxes, especially GPP, were clearer than those of NEE Fig. 4b and 4c . When applying the data screening of QC > 0.6, most of the 68 sites showed positive incremental trends for both LS and SS 46 sites for GPP and 39 sites for RE . The increasing trend in observed GPP during the last few decades is consistent with other independent estimates such as satellite observations e.g., Stocker et al., 2019 , atmospheric composition data e.g., Campbell et al., 2017 , and vegetation model simulations e.g., Huntzinger et al., 2017 . Although the flux dataset was more or less biased, such increasing trends in GPP imply enhancement of land ecosystem productivity. Negative GPP and RE trends occurred in, for example, the CA-Gro mixed forest , US-PFa mixed forest , and AU-ASM savanna sites. In terrestrial ecosystems, decremental trends of GPP and RE for interannual and longer periods can take place for several reasons such as senescence, illness, and pollution, and then be associated with ecosystem degradation. However, it is difficult to specify the reason for the decremental trends of gross CO 2 fluxes on the basis of the dataset alone. The median slopes of GPP 11.9 g C m -2 y -2 for LS and 11.4 g C m -2 y -2 for SS were steeper than those of RE 3.2 g C m -2 y -2 for LS and 6.3 g C m -2 y -2 for SS , consistent with the trend of increasing net sink in NEE.

Fig. 3.
Annual CO 2 fluxes 1990 -2015 at 118 FLUXNET sites. a Net ecosystem CO 2 exchange NEE , b gross primary production GPP , and c ecosystem respiration RE , and. Trend lines were obtained by applying Sen's slope estimator to data with QC > 0.6: forested sites in green, grassland and shrubland sites in yellow, and cropland sites in red. Thin, medium, and thick lines correspond to p > 0.1, p = 0.05 -0.01, and p < 0.01 from a Mann-Kendall test, respectively.
In contrast, more sites showed statistically significant trends for RE than for GPP, probably because RE has smaller ranges of interannual year-by-year variability that can obscure linear trends. Interestingly, the global data synthesis of soil respiration, which accounts for a large part of RE, showed a clear increasing trend at an acceleration rate of about +0.1 y -1 or ~1 g C m -2 y -2 e.g., Bond-Lamberty and Thomson, 2010;Hashimoto et al., 2015 . Because these observations were made by independent chamber methods, their trends seem to support the incremental trends in RE.

Trends in energy fluxes and metrics
The mean annual RN of 73 sites, which provided 4-yr data with QC > 0.6, was 2579 790 MJ m -2 y -1 81.8 25.1 W m -2 .
More than 70 of sites showed positive trends at median slopes of 20.3 MJ m -2 y -2 LS and 20.4 MJ m -2 y -2 SS . Because RN is determined by longwave and shortwave radiation budgets, further assessment is required to specify the underlying mechanisms of the increasing RN trend e.g., changes in downward radiation and surface albedo . Compared with RN, trends in heat fluxes were less clear. Trends in LE differ widely among the sites Fig. 4d , with median slopes of -5.0 MJ m -2 y -2 for LS and -6.0 MJ m -2 y -2 for SS approximately -2 mm y -2 of evapotranspiration , which are not significantly different from zero. Also, trends in SH are distributed around zero Fig. 4e , with median slopes of -2.0 MJ m -2 y -2 for LS and -2.2 MJ m -2 y -2 for SS. Note that energy balance closure was not constrained in the dataset and few sites provided data of another heat flux, ground heat flux. The tendency for decreasing evapotranspiration or LE was also found in the up-scaled flux data Jung et al., 2010 and attributed primarily to limited moisture supply. It was observed in pan evaporations e.g., Roderick et al., 2007 and was attributed largely to changes in solar radiation and wind speed. In contrast, satellite and model studies imply an increasing trend of land evapotranspiration e.g., Zeng et al., 2012;Pan et al., 2020 . These studies, including the present one, might indicate the difficulty in detecting a consistent trend. The site-average value of annual WUE was obtained as 2.96 1.41 g C kg H 2 O -1 ; this value is comparable with those obtained by previous studies e.g., Beer et al., 2009 . However, trends in WUE were not consistent among the sites, showing both positive and negative trends Fig. 4g . The median values of the WUE trend, -0.0032 g C kg H 2 O -1 y -1 for LS and -0.0015 g C kg H 2 O -1 y -1 for SS, were not significantly different from zero p = 0.608 and 0.615, respectively, from a Wilcoxon test . The WUE trends of the present study were less clear than those obtained in previous flux-based studies e.g., Keenan et al., 2013;Frank et al., 2015;Guerrieri et al., 2019;Mathias and Thomas, 2021 ; note that these studies used data from smaller numbers of sites and differed in their definitions of water-use efficiency e.g., based on vegetation transpiration and evaporative demand . However, the decline in the temporal trend of WUE is, at least qualitatively, consistent with the result of a long-term tree-ring analysis by Adams et al. 2020 ; the high spatial variability in the WUE trends is also consistent with the present study.

Statistical analyses of the trends 3.2.1 Inter-site correlation among the trends
Inter-site relationships among the long-term trends of fluxes and related variables are shown with a correlation matrix Fig. 5 . The trends in site temperature TMP were negatively correlated with trends in precipitation PRC and positively with trends in vapor pressure deficit VPD . Note that these environmental variables differ in typical spatial scales of variability Jung et al., 2017 : i.e., local for precipitation and regional for temperature. The strongest correlation was seen between GPP and RUE trends r = 0.92 , but it could be apparent by the definition of RUE, i.e. GPP per incident radiation. The trends of RE were well correlated with those of GPP r = 0.68 , implying an interplay of biogeochemical processes within ecosystems; specifically, a site with an increasing GPP trend also has a tendency to show an increasing trend of RE, probably because both were driven by common drivers e.g., temperature and precipitation and by biological interactions e.g., growth respiration driven by photosynthetic production . The correlation between GPP and RE trends resulted in a strong correlation between RE and RUE trends. The trends of NEE were poorly correlated with those of GPP r = -0.13 but well correlated with those of RE r = 0.53 , implying the importance of respiratory processes in determining long-term ecosystem carbon budgets. Such an influence of RE was found in previous studies; for example, Valentini et al. 2000 concluded that the latitudinal gradient in ecosystem carbon budget is largely attributable to the variation in RE. The trends of WUE were moderately correlated with those of GPP, whereas they were not correlated with those of LE. This result might imply that primary physiological factors and processes determine the WUE trend, e.g., unclosed energy balance and aerodynamic conductance, as suggested by Knauer et al. 2017 .

Statistical model analysis of NEE trends
To examine the influential factors of the inter-site differences in NEE trends, GLMM analysis was conducted Table S2 . The model with the lowest AIC score Table 1 has independent variables fixed effects of mean annual temperature, stand age forested ecosystems , and biome types, with a site-specific intercept random effect . The coefficient for MAT, -2.59 1.67 mean standard error , indicates that the slope of the NEE trend became more negative under warmer conditions. In addition, the coefficient for stand age, -0.13 0.11, indicates that the NEE trend became more negative in older stands. Adding latitude and precipitation did not improve the AIC score, whereas inclusion of biome-specific coefficients improved model performance.
The difference in biome-specific coefficients is consistent with the difference of annual fluxes and their trends among the biomes. Figure 6 compares the NEE trend among biome types, implying that deciduous broad-leaved forests had steeper NEE   slopes median, -10.8 g C m -2 y -2 for LS and -17.5 g C m -2 y -2 for SS see Table S3 for mean fluxes and their slopes for each biome . It is noteworthy that grassland and savanna sites showed less negative mean NEE and smaller trends, which could be attributable to large interannual variability in semiarid regions e.g., Ahlström et al., 2015 . Due to the lack of information, it is difficult for the present study to relate the flux trends to vegetation composition such as dominance of C 4 plants, which have lower sensitivity to elevated CO 2 concentration. The dependence of NEE and its trend on stand age has been proposed by regional syntheses Magnani et al., 2007;Amiro et al., 2010;Besnard et al., 2018;Curtis and Gough, 2018 , but there is still a lack of consensus on its generality and mechanism. For example, Luyssaert et al. 2008 found that old-growth forests still act as net carbon sink, implying a weaker age-dependence of NEE. The relationship between NEE and stand age derived from the FLUXNET2015 dataset is illustrated in Fig. 7. Annual mean NEE Fig. 7a showed a positive correlation with stand age older stands are less of a sink , indicating a quasi-neutral carbon budget at a stand age of approximately 300 years. The relationship was clear in evergreen broad-leaved forests and weak in deciduous broad-leaved forests. The trend in NEE did not show a consistent relationship with stand age Fig. 7b , but mature forests > 100 years appear to increase as sinks through time. The inconsistent results between the age dependence of NEE decreasing sink with age and its trend increasing sink irrespective of age imply that they were caused by different mechanisms. The age-dependent decline of NEE could be attributable to biological mechanisms such as senescence and resource limitation. In contrast, the age-independent increasing trend of NEE, which has been observed at most for 20 years, could be caused by environmental drivers such as elevated atmospheric CO 2 concentration and climate change. Further long-term studies are required to scrutinize the relationship between stand age and ecosystem carbon budget.

Comparison with terrestrial ecosystem models 3.3.1 Comparison with the VISIT simulation
The trends of land CO 2 fluxes derived from the FLUXNET2015 were compared with those obtained from   , is slightly smaller than that obtained from the selected set of FLUXNET2015 data LS, 8.7 g C m -2 y -2 . The global mean RE trend, 3.8 g C m -2 y -2 , is slightly larger than that obtained from the flux dataset LS, 3.1 g C m -2 y -2 . The global mean NEE of the simulation, -52.4 g C m -2 y -1 , is much smaller than that derived from the flux dataset -247 g C m -2 y -1 , but the simulated trend, -1.5 g C m -2 y -2 , is close to that of the observations mean, -6.2 g C m -2 y -2 ; median, -1.4 g C m -2 y -2 .
Note that such large net sinks derived from the flux dataset are still arguable e.g., Jung et al., 2011;Bodesheim et al., 2018;Zeng et al., 2020 , and are probably caused by several reasons. In addition to biases in the eddy covariance flux measurement such as advection effect, the lack of non-CO 2 carbon emissions and lateral carbon flows could result in the underestimation of simulated land sinks Ito, 2019;Kondo et al., 2020 . Remarkably, the sensitivity simulation using a fixed CO 2 concentration i.e., driven by climate and land-use only showed much smaller trends in GPP, RE, and NEE, implying that elevated CO 2 was the primary driver of the increasing net terrestrial sink. For example, during the last 20 years, global mean GPP and NEE showed the sensitivity to elevated CO 2 of 0.15 g C m -2 y -1 ppmv -1 and -0.01 g C m -2 y -1 ppmv -1 , respectively; the CO 2 effect accounted for 54 of the total GPP slope. These results demonstrate a comparison between flux dataset and model simulations would allow us model validation and provide insights into the carbon budget, although they differ in spatial representativeness.

Comparison with MsTMIP simulations
In addition to the single-model examination in the previous section, the observational CO 2 flux trends were compared with those derived from simulations by 15 MsTMIP models Fig. S3 . Because these models cover a wide range of estimation uncertainty, I could specify which of the global change factors climate, atmospheric CO 2 , and land-use made the primary impacts on the flux trends with higher confidence. The simulation results which differ in structure, parameterization, and environmental responsiveness, allowed extraction of more confident results for the global trends. The model-mean NEE shows consistent trends of increasing net CO 2 sinks, mainly in tropical, temperate, and boreal forests Fig. 9g . By removing the effect of elevated CO 2 , a wide area of terrestrial ecosystems showed decreasing sink trends, e.g., tropical forests in the Amazon, subtropical forests in South and Southeast Asia, and boreal forests in eastern Europe Fig. 9a and 9d ; namely, historical climatic change and land-use change suppressed the net land sinks in these regions. In the simulation forced by climate, land-use, and atmospheric CO 2 changes, both GPP and RE showed increasing trends in a wide area of forested ecosystems. The spatial distributions of GPP and RE trends Figs. 9h and 9i look similar; this result is consistent with the good correlation between GPP and RE trends through the flux measurement sites Fig. 5 . The comparison of trends between the observational dataset and terrestrial ecosystem models clarifies the efficiency of flux measurements for validating simulated long-term variations. Previous studies have used flux data mainly for validating annual CO 2 budget, seasonal cycle, and interannual variability e.g., Bonan et al., 1997;Ichii et al., 2013;Restrepo-Coupe et al., 2017 , and in this study I demonstrate a further usage of the flux dataset. The simulation analysis with process-based models was effective for interpretation of flux measurement results, by separating the effects of different driving forces in this study, climate, land-use, and elevated CO 2 . Note that large uncertainties remain in the simulated landatmosphere CO 2 exchange, as shown by the difference among spatial patterns of NEE trends in the MsTMIP models Fig. S3 . This result indicates that utilizing the FLUXNET2015 dataset not only for comparison, but also for improvement of model performance is necessary for future studies.

Concluding Remarks
The global flux measurement dataset, FLUXNET2015, allows investigation of long-term trends over the global terrestrial biosphere. The data contained in the dataset result from the great efforts of the researchers working at each site to maintain continuous, high-quality observations. This study demonstrated that the growing number of long-term CO 2 flux measurements is effective for deriving insights into the carbon cycle at ecological time scales. The flux data differ from other ecological data such as field-survey and tree-ring records, because they have a high temporal frequency and a close correspondence with micrometeorological factors. This characteristic of flux data is particularly effective for validation and improvement of terrestrial ecosystem models, which need to be constrained at different time scales Ito and Ichii, 2021 . The models allow decadal or longer simulations to be performed e.g., Fig. 8 , and their weakness in validation could be solved by the observational dataset. In this study, I focused on long-term trends, and interannual and seasonal i.e., phenological variability would allow extraction of further insights from the dataset.
Here I focused on linear trends that allow approximate capture of ecological processes; however, many biogeochemical processes have non-linear trends and complicated environmental responsiveness e.g., Wu et al., 2018;Duffy et al., 2021 . The global model simulations Fig. 9 implied that the linear GPP and NEE trends were attributable largely to historical CO 2 concentrations, which showed a roughly linear increase in the last several decades. Recent studies e.g., Wang et al., 2020 implied that the CO 2 -associated trends are declining due mainly to physiological limitations; detection of such nonlinearity from flux data would be an interesting challenge. Nonlinear patterns could be produced by local to regional drivers such as disturbances and extreme meteorological events e.g., drought, flood, and frost damage . These phenomena may have short-term but strong influences on ecosystem structure and functions, sometimes altering the direction of the linear trend of the flux. Continuous measurements obtained from a pre-disturbance intact state to a post-disturbance recovering state are still limited.
The effectiveness of flux data is expected to be enhanced with further data accumulation through time and expansion over space. In this regard, the present dataset coverage has substantial gaps e.g., Chu et al., 2017;Haughton et al., 2018;van der Horst et al., 2019 , particularly in the tropics that have strong influences on the global carbon budget. Only 12 of 118 sites in the dataset used are from latitudes lower than 25 , and tropical rain forest is represented by only four sites GF-Guy, BR-Sa1, BR-Sa3, and MY-PSO . Enhancement of flux networking in such underrepresented areas is obviously important for the research community. Accordingly, future improvements in data standardization and sharing Bond-Lamberty, 2018; Papale, 2020 will surely make essential contributions to a wide area of research on landatmosphere interactions and biogeochemical cycles.