Crop Physiology Calibration in CLM

Farming is using more of the land surface as population increases and agriculture is increasingly applied to non-nutritional purposes such as biofuel production. This agricultural expansion exerts an increasing impact on the terrestrial carbon cycle. In order to understand the impact of such processes, the Community Land Model (CLM) has been augmented with a CLM-Crop extension that simulates the development of three crop types: maize, soybean, and spring wheat. The CLM-Crop model is a complex system that relies on a suite of parametric inputs that govern plant growth under a given atmospheric forcing and available resources. CLM-Crop development used measurements of gross primary productivity and net ecosystem exchange from AmeriFlux sites to choose parameter values that optimize crop productivity in the model. In this paper we calibrate these parameters for one crop type, soybean, in order to provide a faithful projection in terms of both plant development and net carbon exchange. Calibration is performed in a Bayesian framework by developing a scalable and adaptive scheme based on sequential Monte Carlo (SMC). The model showed significant improvement of crop productivity with the new calibrated parameters. We demonstrate that the calibrated parameters are applicable across alternative years and different sites.

1 Introduction

Development of Earth system models (ESMs) is a challenging process, involving complex models, large input datasets, and significant computational requirements. As models evolve through the introduction of new processes and through improvement of algorithms, the ability of the models to accurately simulate feedbacks between coupled systems improves, although results may not have the desired impact on all areas. For example, Lawrence et al. (2012) estimate that changes to the hydrology parameterization may be responsible for the warm bias in high-latitude soils in the Community Land Model (CLM) version 3.5 switching to a cold bias in CLM4.0. Although testing of ESMs is extensive, ensuring after new developments are merged that the model can still perform with limited (if any) degradation, on rare occasions model behavior can be negatively affected.

The strong nonlinearity of such models also makes parameter fitting a difficult task; and as global models are developed by several different user groups simultaneously, combinations of multiple alterations make identifying the specific cause that leads to a new model output challenging. The CLM has been augmented with a CLM-Crop extension that simulates the development of three crop types: maize, soybean, and spring wheat (Drewniak et al., 2013). The CLM-Crop model is a complex system that relies on a suite of parametric inputs that govern plant growth under a given atmospheric forcing and available resources. CLM-Crop development used measurements of gross primary productivity (GPP) and net ecosystem exchange (NEE) from AmeriFlux sites to choose parameter values that optimize crop productivity in the model.

Global climate models have historically been tuned or calibrated to meet certain requirements, such as balancing the top-of-the-atmosphere radiation budget (Bender, 2008; Hourdin et al., 2012; Mauritsen et al., 2012). Various techniques have been applied to models to adjust parameters; most of these methodologies can be traced to a Bayesian approach that in most cases is simplified (e.g., MVFSA) or augmented with assumptions that make the problem tractable (e.g., EnKF). The tuning of parameters that are not directly observed may be stated as an inverse problem (Tarantola, 2004). Inverse problems are, in general, very challenging, especially when the data are sparse, the models are complex, and the state space is large. This is the case for the CLM-Crop model, as well as for ESMs.
Our goal is to calibrate some of the CLM-Crop parameters in order to improve model projection of plant development and carbon fluxes. To this end, we follow a Bayesian approach (Tarantola, 2004; Kaipio and Somersalo, 2004). We start by summarizing our initial state of knowledge in a prior probability distribution over the parameters we wish to calibrate. After making some observations, our updated state of knowledge is captured by the posterior distribution. Since the posterior is not analytically available, we attempt to approximate it using an ensemble of particles (samples) from it. To construct this particle approximation, we employ ideas from sequential Monte Carlo (SMC) (Doucet et al., 2001). Basically, we define a one-parameter family of distributions of increasing complexity that starts at the prior and ends at the posterior. Starting from a particle approximation of the prior, we gradually move it toward the posterior by sequentially applying importance sampling. The scheme is highly parallelizable, since each particle of the approximation can be computed independently. The way we move the particle approximation towards the posterior is adjusted on the fly using the ideas developed by Bilionis and Koutsourelakis (2012) and Bilionis and Zabaras (2014). Each intermediate step of our scheme requires Markov chain Monte Carlo (MCMC) (Metropolis et al., 1953) sampling of the intermediate distributions. One of the novelties of this work is the automatic construction of MCMC proposals for those intermediate steps using Gaussian mixtures (Blei and Jordan, 2005).

The result is an algorithmic framework that can adjust itself to the intricacies of the posterior.As demonstrated by the numerical examples, our scheme can perform model calibration using very few evaluations and, by exploiting parallelism, at a fraction of the time required by plain vanilla MCMC.

We present results from a twin experiment (self-validation) as well as calibration and validation results using real observations from two AmeriFlux tower sites in the midwestern United States, for the soybean crop type. The improved model will help researchers understand how climate affects crop production and resulting carbon fluxes and, additionally, how cultivation impacts climate.
2 The CLM-Crop model

CLM-Crop was designed and tested in the CLM3.5 model version (Drewniak et al., 2013) and in CLM4 (Levis et al., 2012). The crop model was created to represent crop vegetation similarly to natural vegetation for three crop types: maize, soybean, and spring wheat. The model simulates GPP and yield driven by climate, in order to evaluate the impact of climate on cultivation and the impact of agriculture on climate. Crops are modeled within a grid cell sharing natural vegetation; however, they are independent (i.e., they do not share the same soil column). This approach allows management practices, such as fertilizer, to be administered without disturbing the life cycle of natural vegetation. For a full description of the crop model see the study by Oleson et al. (2013); the harvest scheme is described by Drewniak et al. (2013).
Crops are modeled similarly to natural vegetation, with the main exception of how allocation is defined via a different growing scheme, which is separated into four phases: planting, emergence, grain fill, and harvest. Each phase of growth changes how carbon and nitrogen are allocated to the various plant parts: leaves, stems, fine roots, and grain. During planting, carbon and nitrogen are allocated to the leaf, representative of seed. This establishes a leaf area index (LAI) for photosynthesis, which begins during the emergence phase. The emergence phase allocates carbon and nitrogen to leaves, stems, and roots using functions from the Agro-IBIS model (Kucharik and Brye, 2003). During the grain fill stage, decreased carbon is allocated to leaves, stems, and roots in order to fulfill grain requirements. When maturity is reached, harvest occurs: all grain is harvested, while leaves, stems, and roots are turned over into the litter pool. Residue harvest is not active in the model.

The allocation of carbon to each plant part is driven largely by the carbon-nitrogen (CN) ratio parameter assigned to each plant segment. CLM first calculates the potential photosynthesis for each crop type based on the incoming solar radiation and the LAI. The total nitrogen needed to maintain the CN ratio of each plant part is calculated as plant demand. If soil nitrogen is sufficient to meet plant demand, potential photosynthesis is met; however, if soil nitrogen is inadequate, the total amount of carbon that can be assimilated is downscaled.
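The down-scaling step can be sketched as follows. This is an illustrative reconstruction, not CLM source code; the function and argument names (nitrogen_limited_assimilation, allocation_fractions, etc.) are hypothetical.

```python
def nitrogen_limited_assimilation(potential_carbon, cn_ratios,
                                  allocation_fractions, soil_nitrogen):
    """Carbon actually assimilated after nitrogen limitation.

    potential_carbon     -- potential photosynthesis (gC/m^2/day)
    cn_ratios            -- CN ratio of each plant part (gC/gN)
    allocation_fractions -- fraction of new carbon sent to each part (sums to 1)
    soil_nitrogen        -- plant-available soil nitrogen (gN/m^2/day)
    """
    # Nitrogen demand: N needed to sustain the potential carbon gain
    # while holding each plant part at its fixed CN ratio.
    demand = sum(potential_carbon * frac / cn
                 for frac, cn in zip(allocation_fractions, cn_ratios))
    if demand <= soil_nitrogen:
        return potential_carbon                       # potential photosynthesis met
    return potential_carbon * soil_nitrogen / demand  # downscaled proportionally
```

With ample soil nitrogen the potential photosynthesis is returned unchanged; when demand exceeds supply, assimilation is scaled by the ratio of available to demanded nitrogen.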

During the grain fill stage, a nitrogen retranslocation scheme is used to fulfill nitrogen demands by mobilizing nitrogen in the leaves and stems for use in grain development. This scheme uses alternate CN ratios for the leaf and stem to determine how much nitrogen is transferred from the leaves and stems into a retranslocation storage pool. The total nitrogen transferred at the beginning of the grain fill stage from the leaf and stem is represented by

N_trans = C_leaf (1/leafcn − 1/fleafcn) + C_stem (1/livewdcn − 1/fstemcn),    (1)

where C_leaf and C_stem are the total carbon in the leaf and stem, respectively; leafcn and livewdcn are the pregrain fill CN ratios for the leaf and stem; and fleafcn and fstemcn are the postgrain fill CN ratios for the leaf and stem. All of the CN ratios are fixed parameters, which vary with crop type; default values are reported in Table 1.
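Numerically, the retranslocation computation amounts to the following minimal sketch; the carbon pools and CN ratios in the example are made up for illustration, and this is not CLM source code.

```python
def retranslocated_nitrogen(c_leaf, c_stem, leafcn, livewdcn, fleafcn, fstemcn):
    """Nitrogen (gN/m^2) released when the leaf and stem CN ratios switch
    from their pregrain fill values (leafcn, livewdcn) to their larger
    postgrain fill values (fleafcn, fstemcn)."""
    n_leaf = c_leaf * (1.0 / leafcn - 1.0 / fleafcn)
    n_stem = c_stem * (1.0 / livewdcn - 1.0 / fstemcn)
    return n_leaf + n_stem

# Example: 100 gC/m^2 of leaf and 50 gC/m^2 of stem (illustrative values)
n = retranslocated_nitrogen(100.0, 50.0, leafcn=25.0, livewdcn=50.0,
                            fleafcn=50.0, fstemcn=100.0)
```

Note that the transferred nitrogen is non-negative only when the postgrain fill CN ratios exceed their pregrain fill counterparts.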
In addition to the above, CLM-Crop includes fertilizer application and soybean nitrogen fixation, described by Drewniak et al. (2013). Planting date and time to maturity are based on temperature threshold requirements (Levis et al., 2012). For the calibration procedure, we used the actual planting date reported for the Bondville site for the year 2004. Crops are not irrigated in the model, nor do we consider crop rotation. Although rotation will have an impact on the carbon cycle both above and below ground, CLM does not support crop rotation at this time.
The version of CLM-Crop detailed by Drewniak et al. (2013) was calibrated against AmeriFlux plant carbon measurements at both the Mead, NE, and Bondville, IL, sites, for both maize and soybean, using optimization techniques to fit parameters. When available, parameter values were taken from the literature or other models. Remaining parameters were derived through a series of sensitivity simulations designed to match modeled carbon output with AmeriFlux observations of leaf, stem, and grain carbon at the Bondville, IL, site and total plant carbon at the Mead, NE, (rainfed) site.
When CLM-Crop was ported into the CLM4.5 framework, the parameter values were no longer optimized as a result of various changes in model processes that affected how crops fit into the model framework. In addition, a new below-ground subroutine of carbon and nitrogen cycling is included in CLM4.5 (Koven et al., 2013), which has a strong influence on crop productivity. Therefore, we needed to retune the model parameters that represent crops with a more sophisticated approach, described later in this paper.

2.1 Parameters affecting the crops
Over 100 parameters are defined in CLM4.5 to represent crops. Many of these parameters are similar to those that govern natural vegetation, but some are specific to crops. These parameters define a variety of processes, including photosynthesis, vegetation structure, respiration, soil structure, carbon-nitrogen dynamics, litter, mortality, and phenology. To add further complication, parameters are assigned in various parts of the model: some parameters are defined in an external physiology file, some are defined in surface datasets, and others are hardcoded in the various subroutines of CLM4.5.
Performing a full model calibration for all parameters would be a monumental task, so we began our calibration process by narrowing down the parameters that are used only in crop functions or might have a large influence on crop behavior. Of this list, parameter values can be fixed across all vegetation types (or crop types), vary with crop type, or vary spatially and by crop type. We chose to limit the parameters to those that are either constant or vary with crop type.
Crop parameters are taken from the literature (when available) and used to determine a range of values appropriate for each crop type. When parameters are not available, optimization techniques are used to estimate parameter values based on CLM performance. Determining a full range of acceptable values was difficult for several parameters, and in some cases not possible. Of the full list of parameters in need of calibration, we began our approach with the ten parameters listed in Table 1 that may have a large influence on crop productivity and have the greatest uncertainty because the values are based on optimization from a previous model version. Six of the parameters are the carbon-nitrogen (CN) ratios for the various plant parts (leaf, stem, root, and grain). Since the leaf and stem account for nitrogen relocation during grain fill, they are represented by two separate CN ratios, to separate pre- and postgrain fill stages of plant development. They influence how carbon and nitrogen are allocated, thereby affecting growth, nutrient demand, photosynthesis, and so on, and are included as part of the physiology data file. Four additional parameters are included in the calibration process. The leaf/stem orientation is used to calculate the direct and diffuse radiation absorbed by the canopy, the specific leaf area at the top of the canopy is used with the leaf CN ratio to calculate the LAI, and the growth respiration factors determine the timing and quantity of carbon allocated toward respiration of new growth.

2.2 Description of the observational dataset
We used observations from the Bondville, IL, AmeriFlux tower located in the midwestern United States (40.01° N, 88.29° W) with an annual no-till corn-soybean rotation; a full site description is given by Meyers and Hollinger (2004). The site has been collecting measurements since 1996 of wind, temperature, humidity, pressure, radiation, heat flux, soil temperature, CO2 flux, and soil moisture. Soybeans were planted in 2002 and 2004, and corn was planted in 2001, 2003, and 2005. We used daily averaged eddy covariance measurements of NEE and derived GPP in our model calibration procedure, which are categorized as Level 4 data published on the AmeriFlux site, gap filled by using the Marginal Distribution Sampling procedure outlined by Reichstein et al. (2005). GPP is derived as the difference between ecosystem respiration and NEE, where ecosystem respiration is estimated by using the method of Reichstein et al. (2005). In addition, biomass information (which we convert to carbon assuming half of the dry biomass is carbon) and LAI have been collected for years 2001-2005 for the various plant segments, including leaf (LEAFC), stem (STEMC), and grain (GRAINC), which are reported on the AmeriFlux website (http://public.ornl.gov/ameriflux). The frequency of biomass measurements is generally every seven days, beginning a few weeks after planting and continuing through the harvest. We chose to calibrate against the Bondville AmeriFlux site because of the availability of the unique biomass data collected. By performing the calibration against site data that includes crop rotation, we hope to indirectly include the effects of crop rotation on GPP and NEE in the model. Finally, in order to assess the transferability of the calibrated parameters across sites, we perform one more validation experiment using observations from the Mead, NE, AmeriFlux site, which, like the Bondville, IL, site, grows a corn-soybean rotation under no-till conditions. Although there are three fields, only one is under rainfed conditions. The site was initialized in 2001; a description can be found in Verma et al. (2005). The data collected are the same as at the Bondville, IL, AmeriFlux site. Soybeans were planted in 2002 and 2004.
The time-dependent observations are denoted by z(t) = {z_1(t), ..., z_6(t)}, where the indices correspond to GPP, NEE, GRAINC, LEAFC, STEMC, and TLAI. Because of uncertainties in fertilization use and measured data, we focused on the peak observed values, as well as the growth slope for GPP, NEE, LEAFC, and STEMC. To remove the atmospherically induced noise in the NEE and GPP measurements, we filtered the time series by applying a moving average operator with a width of 30 days. These operations are denoted by the map

y = (max_t |z̄_1(t)|, ..., max_t |z̄_6(t)|, slope(z̄_1), slope(z̄_2), slope(z̄_4), slope(z̄_5)),    (3)

where z̄ represents the filtered z and the slope is calculated in the beginning of the plant emergence phase, resulting in one maximum and one slope per variable per year. The observed GPP and NEE slopes were computed as the slope between the 208th day and 188th day for 2002 and between the 180th day and 160th day for 2004. The observed LEAFC and STEMC slopes were computed based on observed values on 7/16-8/13 and 7/23-9/10 for 2002, and on 6/8-7/27 and 6/8-8/10 for 2004, respectively.
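The operations on a single daily series (30-day moving average, annual peak, and a 20-day early-season slope) can be sketched as follows; the day indices here are placeholders for illustration, not the exact dates used in the paper.

```python
import numpy as np

def observables(z, window=30, t0=160, t1=180):
    """Return (peak, slope) of a daily time series z after smoothing."""
    kernel = np.ones(window) / window
    z_bar = np.convolve(z, kernel, mode="same")   # 30-day moving average
    peak = np.max(np.abs(z_bar))                  # annual maximum
    slope = (z_bar[t1] - z_bar[t0]) / (t1 - t0)   # early-season growth slope
    return peak, slope

# Synthetic GPP-like curve: ramps up to day 200, then decays to the year's end
z = np.concatenate([np.linspace(0.0, 10.0, 200), np.linspace(10.0, 0.0, 165)])
peak, slope = observables(z)
```

Applying this to the six filtered series (peaks) and the four selected series (slopes) yields the q = 10 summary statistics used in the calibration.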

2.3 Initial conditions and spin-up
CLM requires a spin-up to obtain balanced soil carbon and nitrogen pools, which are responsible for driving decomposition and turnover. A global spin-up is provided with the model, using the below-ground biogeochemistry and spin-up method provided by Koven et al. (2013). Crops are then interpolated to a higher resolution over the Bondville, IL, site.
The meteorological forcing data used for the calibration procedure (post spin-up) are from the Bondville, IL, flux tower site. The atmospheric data cover the years 1996-2007, but we focus on 2002 and 2004 for this experiment. The model is run in point mode, meaning only one grid cell is simulated, at a resolution of roughly 0.1° × 0.1°.

3 Calibration strategy
We represent the CLM-Crop model output relevant to Eq. (3) by f(θ) = (f_1(θ), ..., f_q(θ)), where θ = (θ_1, ..., θ_d) are the d time-independent parameters that we wish to calibrate and q = 10 is the number of outputs. The slopes estimated from numerical simulations were computed as the variable slopes between the date when the fraction of growing degree days to maturity reaches 0.3 and 20 days prior to this point, where growing degree days are accumulated each day by subtracting the minimum temperature for growth (10 °C for soybean) from the average daily temperature; see Oleson et al. (2013).
We consider a set of ten calibration parameters that were indicated by the model as being highly uncertain. This set consists of xl, slatop, leafcn, frootcn, livewdcn, grperc, grpnow, graincn, fleafcn, and fstemcn. See Table 1 and Sect. 2.1 for details.
The model calibration strategy aims to merge model predictions that depend on parameters θ with observational datasets. We assume that the relationship between observation data and the true process follows a relationship of the type

y = f(θ*) + ε,    (4)

where θ* are the perfectly calibrated parameters and ε represents the observational errors. This holds under the assumption that the model is a perfect representation of reality (Kennedy and O'Hagan, 2001). The problem statement can be extended to account for imperfect models, but then the statistical description of ε tends to become much more complicated. Therefore, for this study we start by considering a perfect model assumption.
Following a Bayesian approach, we assume a prior distribution on the calibration parameters:

p(θ) ∝ C(θ) ∏_{i=1}^{d} 1_{[θ_{i,min}, θ_{i,max}]}(θ_i),    (5)

where θ_{i,min} and θ_{i,max} are the minimum and maximum allowed values for the parameter θ_i, respectively; 1_A(x) is the indicator function of a set A (i.e., 1_A(x) is one if x ∈ A and zero otherwise); and C(θ) models any physical constraints that are known a priori. For the parameters we are considering, the constraints require the postgrain fill CN ratios to be larger than the corresponding pregrain fill ratios (fleafcn > leafcn and fstemcn > livewdcn), so that the nitrogen retranslocated during grain fill is non-negative. We define the likelihood as

p(y|θ) = N(y|f(θ), Σ_obs),    (7)

where N(x|µ, Σ) is the Gaussian probability density with mean µ and covariance matrix Σ. The covariance matrix Σ_obs is taken to be diagonal, namely, Σ_obs = diag(σ²_obs,1, ..., σ²_obs,q), with each diagonal component σ²_obs,i being the square of 10 % of the corresponding observed value. This choice of Σ_obs is equivalent to a priori assuming 10 % observational noise.
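These two ingredients can be sketched as follows. The bounds below are placeholders for illustration (the actual ranges are in Table 1), and the constraint function is reduced to a single CN-ratio inequality.

```python
import numpy as np

# Hypothetical bounds for two parameters (stand-ins for Table 1 values).
theta_min = np.array([10.0, 20.0])    # e.g., leafcn, fleafcn
theta_max = np.array([40.0, 80.0])

def log_prior(theta):
    """Log of the box prior of Eq. (5); -inf outside the bounds or when the
    physical constraint C(theta) (here: fleafcn > leafcn) is violated."""
    inside = np.all(theta >= theta_min) and np.all(theta <= theta_max)
    if inside and theta[1] > theta[0]:
        return 0.0
    return -np.inf

def log_likelihood(theta, y, forward_model):
    """Log of the Gaussian likelihood with 10 % observational noise."""
    sigma = 0.1 * np.abs(y)                  # sqrt of the diagonal of Sigma_obs
    resid = (y - forward_model(theta)) / sigma
    return (-0.5 * np.sum(resid**2) - np.sum(np.log(sigma))
            - 0.5 * y.size * np.log(2.0 * np.pi))
```

In the actual calibration, forward_model(theta) would be a CLM-Crop run reduced to the ten summary outputs; here it is left abstract.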
Our state of knowledge about the parameters θ after observing y (see Sect. 2.2) is captured by the posterior distribution

p(θ|y) ∝ p(y|θ) p(θ).    (8)

4 Approximating the posterior

We are going to construct a particle approximation of Eq. (8),

p(θ|y) ≈ ∑_{i=1}^{N} w^{(i)} δ(θ − θ^{(i)}),    (9)

where ∑_{i=1}^{N} w^{(i)} = 1 and δ(·) is Dirac's delta function. This is achieved by using a combination of MCMC (Metropolis et al., 1953; Hastings, 1970) and SMC methodologies (Doucet et al., 2001). For more details on the methodological aspects, we refer the reader to the work of Del Moral et al. (2006), Koutsourelakis (2009), and Bilionis and Koutsourelakis (2012). Here we present the material briefly, focusing only on the novel aspect of our approach that concerns automatically tuning the MCMC proposals.
Let us define a sequence of bridging distributions

π_t(θ) = p(θ|y, γ_t) ∝ p(y|θ)^{γ_t} p(θ),    (10)

where 0 = γ_1 < γ_2 < ... < γ_T = 1. Notice that for γ_t = 0 we obtain the prior and for γ_t = 1 the posterior. The key idea of SMC is to start from a particle representation of the prior (γ_t = 0), which is easy to obtain, and gradually increase γ_t until it reaches 1, adjusting the weights along the way. We will show later how this sequence can be determined on the fly by taking into account the degeneracy of the particle representations.
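In code, the bridging family is simply a γ-weighted combination of log densities; a minimal sketch, where log_prior and log_likelihood stand for the logarithms of the prior and the Gaussian likelihood defined above:

```python
import numpy as np

def log_bridge(theta, gamma, log_prior, log_likelihood):
    """log pi_t(theta) up to a constant: gamma * log p(y|theta) + log p(theta).
    gamma = 0 recovers the prior; gamma = 1 the posterior."""
    lp = log_prior(theta)
    if not np.isfinite(lp):          # outside the prior support
        return -np.inf
    return gamma * log_likelihood(theta) + lp
```

Guarding on the prior support avoids evaluating the (possibly expensive) likelihood for particles that have zero prior probability.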

4.1 Sequential importance sampling
Let {θ_t^{(i)}, w_t^{(i)}}_{i=1}^{N} be a particle representation of p(θ_t|y, γ_t), with the weights being normalized (i.e., ∑_{i=1}^{N} w_t^{(i)} = 1):

p(θ_t|y, γ_t) ≈ ∑_{i=1}^{N} w_t^{(i)} δ(θ_t − θ_t^{(i)}).    (11)

We now examine how this particle representation can be updated to a particle representation corresponding to γ_{t+1} > γ_t. Toward this goal, we introduce a fictitious probability density on the joint space of θ_t and θ_{t+1},

q_t(θ_t, θ_{t+1}) = π_{t+1}(θ_{t+1}) L_t(θ_t|θ_{t+1}),    (12)

where L_t is a backward transition density (i.e., L_t(θ_t|θ_{t+1}) is the probability of θ_t given θ_{t+1}) properly normalized, that is, ∫ L_t(θ_t|θ_{t+1}) dθ_t = 1. In addition, we introduce an importance sampling density,

η_t(θ_t, θ_{t+1}) = π_t(θ_t) K_t(θ_{t+1}|θ_t),    (13)

where K_t is a forward transition density (i.e., K_t(θ_{t+1}|θ_t) is the probability of θ_{t+1} given θ_t) properly normalized, that is, ∫ K_t(θ_{t+1}|θ_t) dθ_{t+1} = 1. Marginalizing q_t over θ_t recovers π_{t+1}. This observation immediately suggests that to move the γ_t particle representation of Eq. (11) to a γ_{t+1} representation, we sample

θ_{t+1}^{(i)} ∼ K_t(·|θ_t^{(i)}),    (14)

compute the incremental weights,

ŵ_t^{(i)} = [π_{t+1}(θ_{t+1}^{(i)}) L_t(θ_t^{(i)}|θ_{t+1}^{(i)})] / [π_t(θ_t^{(i)}) K_t(θ_{t+1}^{(i)}|θ_t^{(i)})],    (15)

get the unnormalized γ_{t+1}-weights,

W_{t+1}^{(i)} = w_t^{(i)} ŵ_t^{(i)},    (16)

and get the normalized γ_{t+1}-weights:

w_{t+1}^{(i)} = W_{t+1}^{(i)} / ∑_{j=1}^{N} W_{t+1}^{(j)}.    (17)

4.2 Convenient choices for L_t and K_t

The preceding remarks hold for any backward and forward transition densities L_t and K_t, respectively. We now seek a convenient choice that will simplify the form of the incremental weights given in Eq. (15). Suppose for the moment that K_t is given and let us look for the optimal choice of L_t. Since q_t is the target distribution and η_t is the importance sampling density, the best choice of L_t is the one that minimizes the variance of the incremental weights. In other words, the optimal choice is

L_t^{opt}(θ_t|θ_{t+1}) = π_t(θ_t) K_t(θ_{t+1}|θ_t) / ∫ π_t(θ) K_t(θ_{t+1}|θ) dθ.    (18)

From a computational point of view, however, it is more convenient to work with the suboptimal choice

L_t(θ_t|θ_{t+1}) = π_{t+1}(θ_t) K_t(θ_{t+1}|θ_t) / ∫ π_{t+1}(θ) K_t(θ_{t+1}|θ) dθ,    (19)

which is motivated by the expectation that consecutive densities are similar (i.e., π_t ≈ π_{t+1}). For this choice the incremental weights of Eq. (15) become

ŵ_t^{(i)} = [π_{t+1}(θ_{t+1}^{(i)}) π_{t+1}(θ_t^{(i)})] / [π_t(θ_t^{(i)}) ∫ π_{t+1}(θ) K_t(θ_{t+1}^{(i)}|θ) dθ].    (20)

To get rid of the integral in the denominator, we pick K_t to be invariant with respect to π_{t+1}:

∫ π_{t+1}(θ) K_t(θ_{t+1}|θ) dθ = π_{t+1}(θ_{t+1}).    (21)

This can always be achieved with a suitable choice of a Metropolis-Hastings transition kernel (see below). For this case, the incremental weights simplify to

ŵ_t^{(i)} = π_{t+1}(θ_t^{(i)}) / π_t(θ_t^{(i)}) ∝ p(y|θ_t^{(i)})^{γ_{t+1} − γ_t}.    (22)

4.3 Metropolis-Hastings-based K_t
As shown in the previous paragraph, it is convenient to select K_t to be invariant with respect to π_{t+1}. The easiest way to achieve this is to associate K_t with one or more steps of the Metropolis-Hastings algorithm. Let h_t(θ'|θ) be any proposal density (e.g., a simple random walk proposal). The single-step Metropolis-Hastings forward transition density is

K_t(θ'|θ) = α_t(θ'|θ) h_t(θ'|θ) + (1 − ∫ α_t(u|θ) h_t(u|θ) du) δ(θ' − θ),    (23)

where

α_t(θ'|θ) = min{1, [π_{t+1}(θ') h_t(θ|θ')] / [π_{t+1}(θ) h_t(θ'|θ)]}    (24)

is the usual acceptance probability. Samples from Eq. (14) may be obtained by performing one step of the well-known Metropolis-Hastings algorithm. The forward kernel corresponding to M > 1 Metropolis-Hastings steps is given recursively by

K_t^M(θ'|θ) = ∫ K_t(θ'|u) K_t^{M−1}(u|θ) du.    (25)

The number of Metropolis-Hastings steps, M, at each γ_t is a parameter of SMC. This is the forward kernel we use in all numerical examples. Theoretically, M = 1 is enough as the number of particles N → ∞. Therefore, we will use M = 1 in our numerical examples.
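A single Metropolis-Hastings step leaving a given target invariant can be sketched as follows. This is generic MCMC, not the pysmc implementation; the Gaussian random-walk proposal is symmetric, so its density cancels in the acceptance ratio.

```python
import numpy as np

rng = np.random.default_rng(0)

def mh_step(theta, log_target, step=1.0):
    """One random-walk Metropolis-Hastings step targeting exp(log_target)."""
    theta_new = theta + step * rng.normal(size=np.shape(theta))
    log_alpha = log_target(theta_new) - log_target(theta)  # symmetric proposal
    if np.log(rng.uniform()) < log_alpha:
        return theta_new   # accept
    return theta           # reject: keep the current state

# Usage: a short chain targeting a standard normal
log_target = lambda th: -0.5 * float(th) ** 2
chain = [0.0]
for _ in range(5000):
    chain.append(mh_step(chain[-1], log_target))
```

Within SMC, log_target would be the log of the bridging density π_{t+1}, and one such step is applied to every particle.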

4.4 Resampling
As SMC moves to higher values of γ_t, some of the particles might find themselves in low probability regions. Consequently, their corresponding weights will be small. This degeneracy of the weights can be characterized by the effective sample size (ESS) metric, defined by

ESS(γ_t) = 1 / ∑_{i=1}^{N} (w_t^{(i)})².    (26)

Notice that ESS is equal to N when the particles are equally important (i.e., w_t^{(i)} = 1/N), equal to 1 when only one particle is important (e.g., w_t^{(1)} = 1 and w_t^{(i)} = 0 for i ≠ 1), and in general takes values between 1 and N for arbitrary weights. Resampling is triggered when the ESS falls below a prespecified threshold (N/2 in our numerical examples). The idea is to kill particles that have very small weights and let the particles with large weights replicate. This process must happen in a way that the resulting particle ensemble remains a valid representation of the current target probability density. This can be achieved in various ways. Perhaps the most straightforward way is to use multinomial sampling.

Let the resulting particles be denoted by θ̂_t^{(i)}, i = 1, ..., N. In multinomial resampling, each resampled particle is drawn independently from the current particle representation,

θ̂_t^{(i)} ∼ ∑_{j=1}^{N} w_t^{(j)} δ(θ − θ_t^{(j)}),    (27)

and the final weights are all equal:

ŵ_t^{(i)} = 1/N, i = 1, ..., N.    (28)
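The ESS computation and the multinomial resampling step can be sketched as follows (illustration only, not the pysmc code):

```python
import numpy as np

def ess(w):
    """Effective sample size of normalized weights w, Eq. (26)."""
    return 1.0 / np.sum(np.asarray(w) ** 2)

def multinomial_resample(particles, w, rng):
    """Draw N particles with replacement with probabilities w;
    the resampled weights are uniform (1/N)."""
    n = len(w)
    idx = rng.choice(n, size=n, p=w)
    return particles[idx], np.full(n, 1.0 / n)

rng = np.random.default_rng(0)
particles = np.arange(10.0)
w = np.full(10, 0.1)
new_particles, new_w = multinomial_resample(particles, w, rng)
```

The resampled ensemble is an unweighted (equal-weight) representation of the same target density, at the cost of some duplicated particles.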

4.5 Choosing γ_{t+1} on the fly
We note that the incremental weights of Eq. (22) do not depend on the γ_{t+1}-samples obtained in Eq. (14). They depend only on the likelihood of the γ_t-samples. In this part, we exploit this observation in order to devise an effective way of selecting γ_{t+1} based on the ESS. The idea is to pick the new γ_{t+1} so that the resulting particles do not become too degenerate. Their degeneracy is characterized by ESS(γ_{t+1}) given in Eq. (26). From Eqs. (22), (16), and (17), evaluation of ESS(γ_{t+1}) does not require any new likelihood evaluations. We select the new γ_{t+1} by requiring that

ESS(γ_{t+1}) = ζ ESS(γ_t),    (29)

where ζ is the percentage of the degeneracy we are willing to accept (ζ = 0.99 in our numerical examples). It is fairly easy to show that ESS(γ_{t+1}) is a strictly decreasing function for γ_{t+1} ∈ (γ_t, 1]. Therefore, Eq. (29) has a unique solution that can easily be found by using a bisection algorithm.
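Since reweighting to a candidate γ_{t+1} needs only the already-computed log-likelihood values, the bisection can be sketched as follows (an illustration, not the pysmc implementation):

```python
import numpy as np

def next_gamma(log_like, log_w, gamma_t, zeta=0.99, tol=1e-8):
    """Find gamma_{t+1} in (gamma_t, 1] with ESS(gamma_{t+1}) = zeta * ESS(gamma_t),
    by bisection; no new likelihood evaluations are needed."""
    def ess_at(gamma):
        lw = log_w + (gamma - gamma_t) * log_like   # reweight in log space
        lw -= np.logaddexp.reduce(lw)               # normalize
        return 1.0 / np.sum(np.exp(2.0 * lw))
    target = zeta * ess_at(gamma_t)
    if ess_at(1.0) >= target:
        return 1.0            # we can jump straight to the posterior
    lo, hi = gamma_t, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ess_at(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Working entirely in log space avoids the numerical underflow that raw likelihood powers would cause for strongly peaked likelihoods.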

4.6 Adapting K_t^M on the fly
Based on the discussion above, we expect that p(θ_t|y, γ_t) should be similar to p(θ_{t+1}|y, γ_{t+1}). To exploit this fact, we pick the proposal h_t(θ'|θ) required by the Metropolis-Hastings kernel K_t^M given in Eq. (25) to be a mixture of Gaussians that approximates p(θ_t|y, γ_t). In particular, we pick

h_t(θ'|θ) = ∑_{i=1}^{L} c_i N(θ'|µ_{t,i}, Σ_{t,i}),    (30)

where the non-negative coefficients c_i sum to one, µ_{t,i} ∈ R^d are mean vectors, and Σ_{t,i} ∈ R^{d×d} are covariance matrices. The number of components L, as well as all the parameters of the mixture, are fitted to a resampled version of the particle approximation of p(θ_t|y, γ_t) (see Eq. (28)) using the procedure of Blei and Jordan (2005) as implemented by Pedregosa et al. (2011).
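A sketch of the proposal construction using scikit-learn's variational Bayesian Gaussian mixture (the Blei and Jordan (2005) procedure as implemented in the library of Pedregosa et al. (2011)); the particle set below is synthetic, standing in for a resampled particle approximation:

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(1)
# Synthetic stand-in for resampled particles approximating p(theta_t | y, gamma_t):
particles = np.vstack([rng.normal(-2.0, 0.3, size=(500, 2)),
                       rng.normal(2.0, 0.3, size=(500, 2))])

# Fit the mixture; superfluous components receive negligible weights.
gmm = BayesianGaussianMixture(n_components=10, max_iter=500,
                              random_state=0).fit(particles)

def propose(n=1):
    """Independence proposal h_t: draw candidate parameters from the mixture."""
    samples, _ = gmm.sample(n)
    return samples
```

Because the fitted mixture tracks the current bridging distribution, the resulting independence proposal stays well matched to the target as γ_t grows.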

SMC is embarrassingly parallelizable. Basically, each CPU can store and work with a single particle. Communication is required only for normalizing the weights (see Eq. (17)), finding γ_{t+1} (see Eq. (29)), and resampling. The first two have a negligible communication overhead and can be implemented easily. Implementation of the resampling step is more involved and requires more resources. However, the cost of resampling is negligible compared with the evaluation of the forward model.

The final algorithm
We now collect all the details of SMC discussed above in a single algorithm for convenience (Algorithm 1). Our implementation is in Python and is provided at https://github.com/ebilionis/pysmc.

5 Results
In this section we present our calibration results for the parameters described in Sect. 2.1 by using the observations detailed in Sect. 2.2. In this study we focus only on the parameters affecting the soy crop and restrict our calibration to year 2004. With these calibrated parameters we perform a validation experiment by using the data from year 2002. In addition, we forecast 2004 through a 2002-2003-2004 simulation. We recognize that the Bondville observations include crop rotation during 2003 that will influence the sequence of output; but since the model does not support crop rotation, we plant soybean during 2003. The role of the latter experiment is to demonstrate the robustness of the proposed calibration scheme. Moreover, we perform a twin experiment that consists of generating artificial data by using some control parameter values, then applying the calibration strategy to recover the control parameters.
In all our numerical examples we fix the planting and harvest days. This approach is essential in order to avoid overfitting the physiological parameters due to offsets in the growing seasons. The planting dates for 2002 and 2004 are 06/02 and 05/07, respectively. The harvest day is controlled via the input variable "hybgdd" (growing degree days for maturity), where growing degree days are defined in Sect. 3. The values of "hybgdd" that give the right harvest days for 2002 and 2004 are 1474 and 1293, respectively.

The number of particles we use is N = 1280. Each particle is assigned to a different computational core; i.e., we use 1280 computational cores. A simulated year takes about 2 min to complete if data localization is used. Calibration requires approximately 100,000 simulations and completes in about 6 hours.

5.1 Validation of the method
We begin with the twin experiment, with the aim of validating the proposed calibration strategy. We generate artificial observations by randomly sampling θ from its prior, Eq. (5). We apply the calibration strategy to the artificial observations to see whether the method can recover the ground truth. The adaptively selected γ_t sequence is shown in Fig. 1(a). In Fig. 2 we compare the posterior of each parameter with the prior. The true parameters are indicated by red dots. The fit to the artificial outputs is shown in Fig. 3. The parameters that are not specified precisely are parameters that have a small (if any) effect on the observed outputs.

5.2 Calibration using real data
In our next experiment we calibrate the parameters listed in Table 1. The observational operator (Eq. 3) is defined by taking the annual maximum of the absolute value of LEAFC, LAI, GRAINC, STEMC, GPP, and NEE, and the slope of LEAFC, STEMC, GPP, and NEE, as described in Sect. 2.2. In Fig. 4, we compare the posterior we obtain with the prior. The default parameters are indicated by red dots. The fit to the observations is shown in Fig. 5. The adaptively selected γ_t sequence is shown in Fig. 1(b). In Table 2 we summarize our findings by showing the median and the p = 0.05 and p = 0.95 quantiles of each calibrated parameter.

5.3 Validation of real data results
To validate the generalization potential of our calibration we perform a one-way validation We use the calibrated parameters to predict the observables in 2002.In Fig. 6 we plot the median and 95% error bars of the calibrated time series 615 and we compare the results with observations and the default parameter output of 2002.We observe a notable improvement in the ability of the model to explain the observations.One of the most important improvements is related to LAI calculations, which comes from improvements to the leaf 620 CN ratio and the specific leaf area.The timing of maximum LAI is important for the carbon allocation; when the crops in CLM4.5 reach peak LAI, carbon allocation shifts from above and below-ground to strictly below-ground (roots).With the default parameter values, peak LAI occurred early in the growing season, resulting in large and unrealistic allocation of carbon to roots and insufficient carbon to leaf, stem, and ultimately grain.The large increase in stem carbon and the slower rate of growth and peak of GPP are clear indications that the shift in allocation to roots no longer occurs 630 with the new calibrated parameter values.The grain carbon is still low, however, a result of the low leaf carbon and the overestimation of stem carbon, which increases the amount of carbon allocated to maintenance respiration at the expense of new growth for this year.The increase in uncertainty is likely a result of a limitation in nitrogen availability in some scenarios.When CN ratios are low, a higher demand of nitrogen from plants contributes to an increase in competition for resources with below-ground decomposition processes.When the nitrogen demand from the two sources exceeds 640 availability, the amount of carbon that can be assimilated is downscaled, resulting in a lower GPP, increase in NEE, and so on.We continue the simulation through 2003 to 2004 and compare the calibrated time series with the observations and the default parameter output in 
Fig. 7. The differences are minor: since we performed the calibration over one year and did not consider variability in previous years' carbon and nitrogen pools, the demand for nitrogen is likely different when the model is run for multiple years. Since the change in the pools is minor, the resulting change in the model output is also small. This is also likely responsible for the increased uncertainty in GPP and NEE, which arises from the competition for resources discussed above. Finally, Fig. 8 shows the results of the same validation experiment at the Mead, NE, site.
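The nitrogen down-regulation described above can be made concrete with a minimal sketch. This is a schematic toy, not CLM code: the function name `downregulate_gpp` and the proportional-scaling rule are illustrative assumptions, standing in for CLM's actual competition scheme between plant uptake and below-ground decomposition.

```python
# Schematic sketch (not CLM code) of nitrogen down-regulation: when the
# combined nitrogen demand of the plant and of below-ground decomposition
# exceeds the available mineral nitrogen, the carbon that can be
# assimilated is scaled down proportionally, lowering GPP and raising NEE.

def downregulate_gpp(potential_gpp, plant_n_demand, decomp_n_demand, available_n):
    """Return the nitrogen-limited GPP (all fluxes in arbitrary units)."""
    total_demand = plant_n_demand + decomp_n_demand
    if total_demand <= available_n:
        return potential_gpp              # nitrogen is not limiting
    # Only the fraction of demand that can be met is assimilated.
    fraction_met = available_n / total_demand
    return potential_gpp * fraction_met

# Example: demand (4 + 3) exceeds supply (5), so GPP is scaled by 5/7.
print(downregulate_gpp(14.0, 4.0, 3.0, 5.0))  # -> 10.0
```

Lower CN ratios raise `plant_n_demand` for the same carbon gain, which is why the calibrated runs show increased competition for nitrogen in some scenarios.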

Conclusions
In this paper we sought to improve CLM-Crop model performance by calibrating a subset of model parameters governing, for the most part, the carbon and nitrogen allocation to the plant components. By using a Bayesian approach, we were able to improve the model-simulated GPP, NEE, and carbon biomass allocated to leaf, stem, and grain with the new parameter values. In addition, we demonstrated that the calibrated parameters are applicable across alternative years and are not solely representative of a single year.
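The sequential-Monte-Carlo (SMC) machinery behind the calibration can be illustrated with a minimal toy sketch. This is not the actual CLM-Crop setup: the one-dimensional Gaussian likelihood, the fixed tempering schedule, and the simple jitter move (a stand-in for a proper MCMC rejuvenation step) are all illustrative assumptions.

```python
import numpy as np

# Toy sketch of tempered SMC: particles drawn from the prior are
# re-weighted by a gradually "tempered" likelihood and resampled, so the
# ensemble moves smoothly from the prior to the posterior.

rng = np.random.default_rng(1)

def log_like(theta):
    # Toy Gaussian likelihood centered at the "true" parameter 2.0.
    return -0.5 * (theta - 2.0) ** 2 / 0.1 ** 2

n = 5000
particles = rng.uniform(0.0, 4.0, n)      # flat prior on [0, 4]
betas = np.linspace(0.0, 1.0, 11)         # tempering schedule: 0 -> 1

for b0, b1 in zip(betas[:-1], betas[1:]):
    logw = (b1 - b0) * log_like(particles)   # incremental weights
    w = np.exp(logw - logw.max())
    w /= w.sum()
    idx = rng.choice(n, size=n, p=w)         # multinomial resampling
    particles = particles[idx]
    particles += rng.normal(0.0, 0.05, n)    # jitter to fight degeneracy

print(particles.mean(), particles.std())     # ensemble concentrates near 2.0
```

The adaptive scheme used in the paper chooses the tempering increments on the fly instead of fixing them in advance, which is what makes the approach scalable to expensive models.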
This study does have a few limitations stemming from a lack of observation data. Currently our results are suitable at one site across multiple years; testing at multiple sites would give a better indication of how well the model can perform globally or even across a region. However, the limited data over agricultural sites constrains our ability to determine parameter values that are relevant at a global scale. In addition, our use of actual planting dates is not a typical approach with CLM4.5, which generally uses temperature thresholds to trigger planting. Thus, the model may plant earlier or later compared with observations, which, if significant, could influence the simulated crop development. Finally, we did not calibrate the initial carbon and nitrogen pools, since doing so would have been prohibitively expensive. We will address this issue in a future study by including these pools in the calibration procedure.
Our approach has focused on one crop type, soybean, with the intent of determining the effectiveness of the proposed calibration method. We consider the results promising and, as part of future work, hope to expand this research to additional years, crop types, and other parameters. Many other variables are of interest, including the fertilization rate, the timing of the growth stages, and a few other parameters related to photosynthesis. As the model continues to evolve with the addition of new or improved processes, we may also need to revisit the parameter choices and evaluate their appropriateness. Moreover, a calibration procedure carried out for such complex models with relatively little data and a few calibration parameters has the potential to lead to overfitting. To assess this effect, we performed a validation experiment, which provides good confidence, albeit not proof, of a robust calibration of the parameters. Richer datasets will likely sharpen the results and tighten the confidence intervals.
The introduction of new datasets documenting agricultural productivity or carbon mass will also allow us to determine the applicability of our new parameter values across regions. In general, the calibration results depend on an accurate specification of the observational errors. In this study we did not have access to any information regarding the measurement process and, therefore, assumed a certain observational noise. These calibration results can be sharpened by annotating the observational data with levels of confidence. The calibration strategy presented in this study has the potential to improve model performance by helping modelers define appropriate parameter values.
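The role of the assumed observational noise can be seen directly in the Bayesian posterior p(θ|y) ∝ p(y|θ)p(θ). The toy sketch below is a minimal illustration under stated assumptions (a hypothetical linear "model", a flat prior, and a Gaussian likelihood), not the CLM-Crop likelihood: a larger assumed noise flattens the likelihood and widens the posterior, which is why annotating observations with confidence levels would sharpen the calibration.

```python
import numpy as np

# Minimal sketch of how the assumed observational noise sigma enters the
# posterior p(theta | y) ∝ p(y | theta) p(theta): larger sigma, wider
# posterior. The linear "model" is a hypothetical stand-in for CLM-Crop.

rng = np.random.default_rng(0)

def model(theta, t):
    return theta * t                                  # toy model, linear in theta

t = np.linspace(0.0, 1.0, 20)
y = model(2.0, t) + rng.normal(0.0, 0.1, t.size)      # synthetic observations

def posterior(thetas, sigma):
    """Unnormalized posterior on a grid: flat prior, Gaussian likelihood."""
    logp = np.array([-0.5 * np.sum((y - model(th, t)) ** 2) / sigma ** 2
                     for th in thetas])
    p = np.exp(logp - logp.max())
    return p / p.sum()

thetas = np.linspace(1.0, 3.0, 401)
for sigma in (0.1, 0.5):
    p = posterior(thetas, sigma)
    mean = np.sum(thetas * p)
    std = np.sqrt(np.sum((thetas - mean) ** 2 * p))
    print(f"sigma={sigma}: posterior mean={mean:.2f}, std={std:.2f}")
```

Running the sketch shows the posterior mean staying near the true value while the posterior standard deviation grows with the assumed noise level.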

Fig. 2: Twin experiment: Comparison of the posterior with the prior. The red dot indicates the true parameter value.

Fig. 3: Twin experiment: Comparison of the true model output with samples from the posterior of the calibration.

Fig. 4: Calibration experiment (Bondville, IL): Comparison of the posterior with the prior. The red dot indicates the default parameter value.

Fig. 5: Calibration experiment (Bondville, IL): Comparison of the observed data for 2004 with calibrated outputs. The grey areas correspond to 95% confidence error bars.

Fig. 6: Validation experiment (Bondville, IL): Comparison of the observed data for 2002 with the model. The grey areas correspond to 95% confidence error bars.

Fig. 7: Validation experiment (Bondville, IL): Comparison of the observed data for 2004 with the model started in 2002. The grey areas correspond to 95% confidence error bars.

Fig. 8: Validation experiment (Mead, NE): Comparison of the observed data for 2002 with the model (top). Comparison of the observed data for 2004 with the model (bottom).

Table 1: Prior information on the parameters.

Table 2: Posterior information on the parameters.