Vegetation plays an important role in regulating global carbon cycles and is a key component of the Earth system models (ESMs) that aim to project Earth's future climate. In the last decade, the vegetation component within ESMs has witnessed great progress from simple “big-leaf” approaches to demographically structured approaches, which have a better representation of plant size, canopy structure, and disturbances. These demographically structured vegetation models typically have a large number of input parameters, and sensitivity analysis is needed to quantify the impact of each parameter on the model outputs for a better understanding of model behavior. In this study, we conducted a comprehensive sensitivity analysis to diagnose the Community Land Model coupled to the Functionally Assembled Terrestrial Simulator, or CLM4.5(FATES). Specifically, we quantified the first- and second-order sensitivities of the model parameters to outputs that represent simulated growth and mortality as well as carbon fluxes and stocks for a tropical site with an extent of

Earth system models (ESMs) are abstract representations of nature used to simulate physical, chemical, and biological processes across the interacting domains of the Earth system to estimate past, present, and future climate

LSMs typically contain a suite of different parameters to resolve the carbon, water, and energy fluxes and pools at the land–atmosphere interface

There are two types of uncertainty and sensitivity analysis studies. One type aims to understand model behavior by exploring the baseline sensitivity of model outputs to parameter changes, where each parameter is normally perturbed by an equal amount of deviation from its default mean value. This is commonly referred to as model sensitivity or elasticity analysis (e.g.,

Today, many uncertainty and sensitivity analysis techniques are available

The goal of this study is to conduct a comprehensive sensitivity analysis for a land surface model (Community Land Model) coupled to a demographic vegetation model (Functionally Assembled Terrestrial Simulator), or CLM4.5(FATES), at a tropical site with an extent of

CLM4.5(FATES) is an open-source land surface model coupled with a demographically structured dynamic vegetation model designed to predict climate–vegetation interactions. The land surface Community Land Model (CLM) represents surface heterogeneity and simulates land biogeophysics, the hydrologic cycle, biogeochemistry, human dimensions, and ecosystem dynamics

In this original version of CLM4.5(FATES), there are two challenges for the model to simulate tropical forests. First, it is difficult for the model to represent the coexistence of PFTs due to the dominance of growth and reproductive feedbacks and potentially the absence of additional stabilizing mechanisms

The CLM4.5(FATES) tracks different size classes of plants (generally

Global sensitivity analysis aims at quantifying the contributions of input variables to the variability of the outputs of a physical model by simultaneously sampling values of parameters from their corresponding statistical distributions. There are many methods for global sensitivity analysis. Two popular variance-based approaches are the Sobol method

In this study, we quantify both the first- and second-order sensitivities of the model parameters using FAST. It is possible to identify higher-order interactions with FAST; however, because of the sample size limitations for a larger trivariate parameter space, the FAST-based estimation of third-order sensitivity indices would be less reliable
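To make the first- and second-order indices concrete, the following minimal sketch estimates them on a toy function. It is not the FAST algorithm used in this study: it swaps in a simple binning estimator of the conditional variances, S_i = Var(E[Y | X_i]) / Var(Y) and S_ij = Var(E[Y | X_i, X_j]) / Var(Y) - S_i - S_j. The function names, bin counts, and the toy model y = x1 * x2 are illustrative assumptions, not part of CLM4.5(FATES).

```python
import random
import statistics


def rank_bins(x, n_bins):
    """Assign each sample to one of n_bins (nearly) equal-count bins by rank."""
    order = sorted(range(len(x)), key=lambda k: x[k])
    bins = [0] * len(x)
    for rank, k in enumerate(order):
        bins[k] = rank * n_bins // len(x)
    return bins


def explained_variance(labels, y):
    """Count-weighted variance of the group means of y, i.e. Var(E[Y | group])."""
    groups = {}
    for label, value in zip(labels, y):
        groups.setdefault(label, []).append(value)
    mean_y = statistics.fmean(y)
    return sum(len(v) * (statistics.fmean(v) - mean_y) ** 2
               for v in groups.values()) / len(y)


def first_order_index(x, y, n_bins=20):
    """Binned estimate of S_i = Var(E[Y | X_i]) / Var(Y)."""
    return explained_variance(rank_bins(x, n_bins), y) / statistics.pvariance(y)


def second_order_index(xi, xj, y, n_bins=10):
    """Binned estimate of S_ij = Var(E[Y | X_i, X_j]) / Var(Y) - S_i - S_j."""
    cells = list(zip(rank_bins(xi, n_bins), rank_bins(xj, n_bins)))
    joint = explained_variance(cells, y) / statistics.pvariance(y)
    return (joint
            - first_order_index(xi, y, n_bins)
            - first_order_index(xj, y, n_bins))


# Toy model (hypothetical, not CLM4.5(FATES)): y = x1 * x2 with independent
# uniform inputs on [0, 1]; analytically S_1 = S_2 = 3/7 and S_12 = 1/7.
rng = random.Random(0)
x1 = [rng.random() for _ in range(20000)]
x2 = [rng.random() for _ in range(20000)]
y = [a * b for a, b in zip(x1, x2)]
print(first_order_index(x1, y), second_order_index(x1, x2, y))
```

The joint estimate divides the same sample over many more cells than the univariate one. This illustrates why higher-order indices become less reliable at a fixed sample size.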

To better understand how parameters affect specific CLM4.5(FATES) output variables (i.e., the relationship between model parameters and outputs), we also fitted cubic splines to the scatterplots between samples of parameters identified as important by FAST and the corresponding output variable of interest using the R SemiPar package

Recycled climate drivers for the study area, including annual mean precipitation, relative humidity, and air temperature, for the years 1948–1972. The annual radiation and air pressure are not plotted as they are quite stable across years.

In total, there are more than 200 parameters for all land surface processes including surface energy exchange, hydrology, biogeochemistry, plant physiology, and demographic processes within CLM4.5(FATES). In this study, we focus solely on vegetation components and select 87 parameters that are relevant to vegetation processes, including parameters for photosynthetic processes, temperature response, allometry description, radiative transfer, recruitment, turnover, and mortality. See Tables D1–D4 in the Appendix for a complete list of the parameters used in this study, with corresponding description, units, default values, and applied ranges. Refer to Appendix A for the allometry equations, Appendix B for the temperature response curve (photosynthesis) equations, and Appendix C for the carbon storage equations used in CLM4.5(FATES).

To conduct the sensitivity analysis, we have extracted many parameters in the model that were “hard-wired” in CLM4.5(FATES). The FAST algorithm requires valid ranges to be chosen for each parameter, which creates the possible parameter space to sample from. In theory, each parameter has a corresponding observational distribution that produces the ideal space for sampling
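As an illustration of how chosen ranges define the space to sample from, the sketch below draws independent uniform samples within per-parameter bounds. It uses plain random sampling rather than the FAST search curve, and the parameter names and ranges are hypothetical placeholders, not values from Tables D1–D4.

```python
import random


def sample_uniform(ranges, n_samples, seed=0):
    """Draw n_samples independent uniform parameter sets within the given ranges.

    ranges maps a parameter name to its (low, high) bounds.
    """
    rng = random.Random(seed)
    return [
        {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}
        for _ in range(n_samples)
    ]


# Hypothetical parameter names and bounds, for illustration only.
ranges = {"param_a": (0.5, 1.5), "param_b": (10.0, 30.0)}
samples = sample_uniform(ranges, 5000)
print(samples[0])
```

Each dictionary in `samples` would correspond to one model run in an ensemble of this kind.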

Simulated temporal dynamics in diameter at breast height (dDBH; cm yr

Using FAST, 5000 parameter combinations are sampled from the parameter space. The sample size was determined using the heuristic method of

In this analysis, we assume the majority of CLM4.5(FATES) parameters to be non-correlated with uniform probability because our study is focused on the model parametric sensitivity for model behaviors and there is a limitation of data for estimating covariance among the

Simulated temporal dynamics in tree density (NPLANT; N ha

In this study, the CLM4.5(FATES) model simulations are set up for a

In this section, we highlight the outputs of CLM4.5(FATES) from the 5000 simulations obtained for the FAST analysis and then show the important parameters that control variance in the outputs. We first investigate the forest demographic dynamics, diagnosing the growth and mortality processes simulated in CLM4.5(FATES), i.e., outputs representing the change in diameter at breast height (dDBH), the mortality rate, and the resulting basal area (BA). Then, we analyze the carbon fluxes and stocks in the model simulations including gross primary production (GPP), net primary production (NPP), LAI, and total forest biomass.

One of the key properties of CLM4.5(FATES) is that vegetation is represented as cohorts of varying sizes for more realistic simulation of light competition in the canopy. To illustrate how different parameters impact different size classes of trees, we group various cohorts of trees into three size classes for analysis purposes: small, medium, and large trees. Since the model runs are initialized from a near-bare-ground state, all simulated plants are considered “small” with an initial density of half-centimeter diameter saplings.

For the stem growth (dDBH averaged per tree; Fig.

Simulated temporal dynamics in tree mortality rates (fraction yr

In this analysis, carbon starvation emerged as the main driver for tree mortality (Fig.

The simulated basal area (BA) of the forest, which is the total stem cross-sectional area per ground surface area, results from the combination of both DBH growth and mortality. The BA reaches equilibrium for different sizes of trees around year 70 (Fig.
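The definition of basal area as stem cross-sectional area per unit ground area can be sketched as follows. The sketch assumes each cohort is described by a DBH in centimetres and a stem density in plants per hectare, and that stems are circular in cross-section. The function name and the cohort tuple layout are illustrative, not the model's internal data structure.

```python
import math


def basal_area(cohorts):
    """Basal area (m2 per ha) from (dbh_cm, plants_per_ha) cohort pairs.

    dbh_cm / 200 converts a diameter in cm to a radius in m.
    """
    return sum(n * math.pi * (dbh_cm / 200.0) ** 2 for dbh_cm, n in cohorts)


# Hypothetical stand: many small stems and a few large ones.
stand = [(5.0, 800.0), (30.0, 100.0), (80.0, 10.0)]
print(basal_area(stand))
```

Because the stem area scales with the square of DBH while the stem count enters linearly, both growth and mortality shape the result.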

For the second-order sensitivity analysis, parametric interactions between stem allometry coefficient

To investigate the key parametric control on carbon fluxes and stocks, we specifically investigate parameter sensitivities for GPP, NPP, LAI, and total forest biomass. Our results show that GPP and NPP increased consistently for the first 10 years of the simulations, which is expected for a forest growing from bare ground (Fig.

The first-order sensitivity analysis based on FAST shows that, for carbon fluxes of GPP and NPP, the photosynthetic capacity parameter (

To understand how climate affects the sensitivity results, we also calculated the Spearman rank correlation coefficients between the first-order sensitivity indices and the corresponding climate drivers. Our results show that the sensitivity indices of the target carbon storage and maintenance respiration rate parameters are negatively correlated with annual mean precipitation and relative humidity but positively correlated with annual mean air temperature. This suggests that these parameters are more important during periods of stress characterized by low precipitation, low humidity, and high temperature (Fig.
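A Spearman rank correlation of this kind can be sketched as the Pearson correlation of the ranks, with tied values receiving their average rank. The function names are illustrative, and the yearly values in the example are invented to show the call; they are not results from this study.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    ra, rb = average_ranks(a), average_ranks(b)
    mean_a, mean_b = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(ra, rb))
    var_a = sum((p - mean_a) ** 2 for p in ra)
    var_b = sum((q - mean_b) ** 2 for q in rb)
    return cov / math.sqrt(var_a * var_b)


# Invented yearly values: a sensitivity index that drops in wetter years.
sensitivity = [0.30, 0.25, 0.20, 0.10]
precipitation = [1000.0, 1500.0, 2000.0, 2500.0]
print(spearman(sensitivity, precipitation))
```

Working on ranks makes the measure insensitive to the units of the climate driver and to any monotonic transformation of either series.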

Our bivariate spline analysis

Simulated temporal dynamics in basal area (BA, m

While second-generation vegetation demographic models such as CLM4.5(FATES) provide new opportunities to predict the global carbon cycle, the larger number of parameters also creates challenges for identifying key processes for further investigation. In this study, we apply a global sensitivity analysis to determine the influential parameters over a specified region of the parameter space. So far, several uncertainty and sensitivity analyses have been conducted for size-structured land surface models

In our analysis, we observed a number of key similarities in model response to parameter variations in photosynthetic capacity, mortality, and respiration parameters

Simulated temporal dynamics in carbon fluxes and stocks, and the corresponding first-order parametric sensitivity indices. The left panels show the simulated ranges for

The goal of our study is not to reproduce the observations but instead to identify important parameters that can be better estimated for the model to fit observations. Thus, we lay out potential parameter estimation improvements to achieve this goal. We do want to highlight three caveats. First, improved estimation of the most sensitive parameters may not be the most efficient strategy if they have relatively small uncertainty or variability across different species and locations. Second, even if the estimates for the most sensitive parameters are perfect, we may still not be able to fit model predictions to observations if key processes are deficiently represented in the model. Third, the recycled climate drivers from 1948 to 1972 may not match the observational periods. Given observation data limitations for our site, we conduct a qualitative comparison of our model simulations to ranges reported in the literature for the tropics. Not surprisingly, our model results show varying degrees of model–data mismatch for key vegetation states. For LAI (Fig.

Spearman's correlation coefficients between climate drivers and the six most important parameters identified for

We compare our mortality simulations with an extensive dataset of observed mortality of 1781 species from 14 pan-tropical large-area ForestGEO forest dynamics plots

Another reason for the data–model discrepancy could be the limited representation of diverse tropical species or traits when simulating a single PFT. This is a limitation of many LSMs, which typically have only two to three PFTs for tropical forests (e.g., only evergreen and deciduous for tropical trees within CLM). CLM4.5(FATES) has the potential to better represent the trait diversity through trait filtering under different environmental conditions

In addition to directly comparing the model outputs to observations, we want to highlight that the sensitivity analysis will also allow us to explore the functional relationships between model parameters and outputs. Future synthesis studies that show these functional relationships using data across different sites could be very useful to evaluate the fidelity of model structure to represent the key processes that control these relationships.

Relations between outputs of CLM4.5(FATES), including GPP, NPP, LAI, and biomass (units shown in Fig.

Our study is the first global sensitivity analysis for CLM4.5(FATES); however, it is subject to several limitations that could be addressed in future studies. First, our study uses an arbitrary choice of parameter ranges (

Second, we only consider the correlation in pairs of parameters that determine temperature responses for deactivation energy and entropy in photosynthesis. We do want to point out that the potential correlation among other parameters

Third, it is possible that the parameter sensitivity could be different if we use different model inputs, different sites, and different structures of subcomponents within the model. For example, using site-level climate drivers, instead of the reanalysis meteorological drivers used in this study

Finally, although the FAST method presented in this study can provide a comprehensive analysis of the parameters that control vegetation demography, it is mostly built on statistical relationships (e.g., Fig.

LSMs have many parameters that could potentially affect the outcome of their simulations. In this study, we use the FAST analysis to conduct a high-dimensional global sensitivity analysis on CLM4.5(FATES). We use an intermediate complexity of simulation: runs are sufficiently long to permit short-term physiological variance to propagate into the long-term forest demographic structure. Even though we do not explore competitive dynamics between different PFTs, our sensitivity analysis will guide us on the selection of key plant traits for the consideration of trait trade-off and coordination in order to improve PFT coexistence within CLM4.5(FATES).

Our analyses show that the target carbon storage and stem allometry parameters are important for the simulation of DBH growth for individual trees and tree mortality. The photosynthetic parameter,

The results of the sensitivity analysis presented here can be utilized to construct the parameter-output response surface for the CLM4.5(FATES) model, which can assist future efforts for model calibration or diagnosis. These findings may help us better understand the overall model structure and guide the estimation of key model parameters with significant control over vegetative processes in these models for better model fitting to data. The FAST analysis provides a promising means of analyzing complex LSM components and can be a powerful tool in understanding the necessarily high-dimensional representation of living systems within Earth system models.

To access the FATES source code, visit

The following equations are cohort-based calculations for allometry in CLM4.5(FATES). Interested readers are referred to

The height (m) is calculated based on DBH (cm) as follows:

The parameters used for the temperature response curve equations include the equation to calculate the maximum carboxylation rate,

The target carbon storage is the cushion parameter shown in Table D3. Specifically, a higher value of this parameter will lead to a higher allocation of carbon to storage and thus a lower allocation to growth at the specific time step. Also, carbon storage plays an important role for the simulated mortality through the parameter that controls the mortality rate under stress, stress_mort in Table D3. The tree will be under stress when it has low carbon storage (

Carbon storage,

The fraction of the carbon balance for each cohort allocated to the carbon storage pool (

Carbon storage also plays an important role for the mortality. Specifically, carbon starvation mortality (

Comparison of first-order parametric sensitivity for medium (

Simulated total change in diameter at breast height (dDBH) from CLM4.5(FATES) for all trees and its fractional distribution for small (diameter

Impacts of stem allometry on the change in diameter at breast height (dDBH) averaged over the simulation years 100–130 for trees of different sizes. The shaded area shows the 95 % confidence interval of these relations.

Mortality outputs from CLM4.5(FATES), including the mechanisms of M1 – background mortality, M2 – hydraulic failure, M3 – carbon starvation, and M4 – impact mortality. The bottom panel shows the total mortality, which is the sum of M1–M4. Shown is the 95 % (light grey) spread of the simulation ensemble, along with the mean simulation (black lines).

Impacts of minimum crown spread on the basal area (BA) averaged over the simulation years 100–130 for trees of different sizes. The shaded area shows the 95 % confidence interval of these relations.

First-order parametric sensitivity indices for tree density of all trees

Second-order sensitivity index of the model parameters for the basal area (BA) outputs from CLM4.5(FATES) for

Second-order sensitivity index of the model parameters for the change in diameter at breast height (dDBH) outputs from CLM4.5(FATES) for

Second-order sensitivity index of the model parameters for the mortality outputs from CLM4.5(FATES) for

Comparisons of parametric sensitivities and the corresponding 95 % confidence intervals for different model outputs at year 130. Shown are the four most important parameters identified for

Fraction of total biomass for trees of different sizes, including small (diameter

Fraction of total GPP for trees of different sizes, including small (diameter

Second-order sensitivity index of the model parameters for the GPP, NPP, LAI, and biomass outputs from CLM4.5(FATES). Shown are the top eight most important parameter interactions in order of importance based on the mean parametric sensitivity across years (red is the most important and blue is the least important).

Mortality outputs from CLM4.5(FATES) for trees with DBH smaller than 5 cm

Parameter sets used in this study – part 1.

Parameter sets used in this study – part 2.

Parameter sets used in this study – part 3.

Parameter sets used in this study – part 4.

All authors contributed to the manuscript writing. CX designed the numerical experiments, developed scripts for sensitivity analysis, and analyzed model results; ECM implemented the model runs, extracted the model outputs, and analyzed model results; RAF, RGK, CDK, CX, and BOC contributed to the model development and simulations; JAH, DMR, SPS, and APW provided suggestions on sensitivity analysis; LW provided support for model simulations; DJJ provided data for model evaluation; NGM, LMK, JQC, and JAV provided support and guidance on the experiment and manuscript.

The authors declare that they have no conflict of interest.

Model simulations were made possible thanks to the Conejo supercomputing system at the Los Alamos National Laboratory (LANL). We thank the four reviewers for their very helpful comments that substantially improved our manuscript.

This work was supported by the United States Department of Energy (US DOE) Office of Science Next Generation Ecosystem Experiment at Tropics (NGEE-T) project, the DOE Graduate Student Researcher (SCGSR) Fellowship, and the UC-Lab Fees Research Program (grant nos. 237285 and LFR-18-542511). Shawn P. Serbin was also partially supported by the United States Department of Energy contract no. DE-SC0012704 to Brookhaven National Laboratory. A portion of Elias C. Massoud's contribution to this research was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration, Copyright 2019.

This paper was edited by Christoph Müller and reviewed by Xiangtao Xu, Nancy Kiang, and Sebastian Lienert.