CH 4 parameter estimation in CLM 4 . 5 bgc using surrogate global optimization

Over the anthropocene methane has increased dramatically. Wetlands are one of the major sources of methane to the atmosphere, but the role of changes in wetland emissions is not well understood. The Community Land Model (CLM) of the Community Earth System Models contains a module to estimate methane emissions from natural wetlands and rice paddies. Our comparison of CH4 emission observations at 16 sites around the planet reveals, however, that there are large discrepancies between the CLM predictions and the observations. The goal of our study is to adjust the model parameters in order to minimize the root mean squared error (RMSE) between model predictions and observations. These parameters have been selected based on a sensitivity analysis. Because of the cost associated with running the CLM simulation (15 to 30 min on the Yellowstone Supercomputing Facility), only relatively few simulations can be allowed in order to find a near-optimal solution within an acceptable time. Our results indicate that the parameter estimation problem has multiple local minima. Hence, we use a computationally efficient global optimization algorithm that uses a radial basis function (RBF) surrogate model to approximate the objective function. We use the information from the RBF to select parameter values that are most promising with respect to improving the objective function value. We show with pseudo data that our optimization algorithm is able to make excellent progress with respect to decreasing the RMSE. Using the true CH4 emission observations for optimizing the parameters, we are able to significantly reduce the overall RMSE between observations and model predictions by about 50 %. The methane emission predictions of the CLM using the optimized parameters agree better with the observed methane emission data in northern and tropical latitudes. With the optimized parameters, the methane emission predictions are higher in northern latitudes than when the default parameters are used. For the tropics, the optimized parameters lead to lower emission predictions than the default parameters.


Introduction and motivation
Methane is the second most important greenhouse gas in terms of radiative forcing (Myhre et al., 2013) and thus a major concern regarding climate change.Natural wetlands as well as human activities such as agriculture (for example, rice cultivation) contribute to the methane emissions (Ciais et al., 2013).The role of wetlands in the total budget of methane, as well as in driving inter-annual variability and changes in the methane growth rate is not well understood (e.g.Bloom et al., 2010;Dlugokencky et al., 2011).The Community Land Model (CLM), which is the land component of the Community Earth System Model (CESM), is equipped with a methane module that models methane emissions (Meng et al., 2012;Riley et al., 2011).There are several parameters in CLM related to the methane emission computations.The methane emissions estimated by the model are sensitive to the exact parameter values although these parameters are not well known (e.g.Meng et al., 2012;Riley et al., 2011;Wania et al., 2010).Riley et al. (2011) and Meng et al. (2012) reported significant differences in model simulations and observations in both site-level methane emissions and Published by Copernicus Publications on behalf of the European Geosciences Union.the global budget.One important source of uncertainty is associated with the parametrization since the methane module has numerous parameters and they are yet to be identified empirically due to the lack of field data (Riley et al., 2011).In this study our goal is to use surrogate model optimization techniques in order to adjust the methane-related parameters of the CLM such that the differences between the simulated and observed methane emissions at 16 sites around the globe are minimized.
For computing an objective function value, we have to do a computationally expensive simulation with CLM4.5bgc in order to obtain the methane emission predictions at each observation site.CLM4.5bgc and related codes are deterministic models, i.e., the simulated CH 4 emissions for a given parameter set will always be the same whenever we run the model for the same parameter set.In an optimization framework where the goal is to find the best set of parameters to minimize the objective function, one obstacle is the computation time that is needed to obtain a single objective function value.Only a few hundred function evaluations can be allowed in order to obtain a solution within a reasonable time.Moreover, the objective function value must be computed by running a simulation model, and thus an analytic description of the objective function is not available (black-box).Therefore, gradient information, which is important for many optimization algorithms, is not available.Due to the black-box nature of the objective function, it is also not known whether or not the objective function is convex and has only one local minimum (which corresponds to the global minimum) or if there are several local and global minima in the objective function landscape.
These characteristics of the objective function (computationally expensive, black-box, possibly multi-modal) do not allow the application of a gradient-based optimization algorithm because, on the one hand, the derivatives would have to be computed numerically (which may be inaccurate and requires many expensive function evaluations), and, on the other hand, gradient-based algorithms generally stop at a local minimum if the initial guess is not close to the global minimum.
For calibrating the parameters of other CLM modules, Markov Chain Monte Carlo (MCMC) methods and Kalman filters have been used in the literature (Lo et al., 2010;Prihodko et al., 2008;Schuh et al., 2010;Solonen et al., 2012;Sun et al., 2013;Tian et al., 2008;Turner et al., 2009;Zeng et al., 2013).MCMC, however, requires generally thousands of function evaluations (Ray and Swiler, 2014) and is thus not applicable for obtaining solutions in an acceptable time for computationally expensive problems.When using Ensemble Kalman Filters, assumptions about the underlying parameter distributions must be made and generally a large number of observations is necessary for the method to be effective.Furthermore, evolutionary strategies such as simulated annealing, particle swarm, and differential evolution methods have been used for parameter tuning in the climate area (Yang et al., 2012(Yang et al., , 2013)).However, these methods generally require many function evaluations in order to obtain good solutions.
Other methods that have recently gained interest for parameter tuning are based on data assimilation (see, for example, Han et al., 2014;Moore et al., 2008).In order to produce good parameter estimates, these methods require in general many observations.In our optimization problem, however, the number of observations at each site is very low (between 10 and 79 observations distributed over 1 to 3 years), and thus data assimilation techniques are not suitable because of the low number of observations.Ray and Swiler (2014) use a computationally cheap surrogate for CLM on which MCMC is used to reduce the number of costly simulations required during the optimization.In contrast to Ray and Swiler (2014), we apply an adaptive surrogate model during the optimization.Instead of relying on a surrogate that is based only on a limited number of initial sample points, we iteratively improve our surrogate by incorporating new data (new objective function values) that become available during the optimization.
We use surrogate model based global optimization algorithms because they have been shown to find near-optimal solutions within a few hundred function evaluations for computationally expensive multimodal black-box problems (Aleman et al., 2009;Giunta et al., 1997;Regis, 2011;Simpson et al., 2001).Surrogate models are used as computationally cheap approximations of the objective function.During the optimization, information from the surrogate model is used to carefully select a new promising point in the variable domain at which the computationally expensive objective function will be evaluated.The surrogate model is updated throughout the optimization whenever new data are obtained.
Several surrogate model algorithms have been developed in the literature that use different surrogate model types.The efficient global optimization algorithm by Jones et al. (1998), for example, uses a kriging surrogate model and selects a new sample point by maximizing an expected improvement function.Gutmann (2001) uses radial basis function (RBF) surrogate models to approximate the expensive objective function and a new sample point is selected by minimizing a so-called bumpiness measure.Regis andShoemaker (2007, 2013) also use RBF models and new function evaluation points are selected by a stochastic method.Müller and Piché (2011) developed a framework for automatically computing ensembles of various surrogate model types and Müller and Shoemaker (2014) extended the study to investigate the influence of different sampling strategies on the solution quality.Here for the first time we apply a state-of-the-art RBF surrogate optimization algorithm to the problem of land surface emissions of methane and describe the results.As far as we know, no other groups have applied optimization techniques to find better parameters for methane emission models, and thus our work represents an innovative approach to an important land-atmosphere interaction.
The remainder of this paper is organized as follows.In Sect. 2 we briefly describe the CLM and the configuration we used for predicting the methane emissions and we give information about the individual observation sites.We also provide the mathematical description of the optimization problem.In Sect. 3 we summarize the methane-related parameters in CLM4.5bgc and show the results of a sensitivity analysis with which we determined the parameters that are most important for the optimization.We describe the surrogate optimization approach for solving the problem in Sect. 4. Section 5 contains information about the setup of our numerical experiments and we discuss the results of the optimization.We draw conclusions in Sect.6.The Appendix contains additional information about the methane equations and the observation sites.
2 Model description, configuration, and mathematical problem description

Model description
We used the Community Land Model Version 4.5 (CLM4.5), a land component of the Community Earth System Model (CESM) (Hurrell et al., 2013) which contains a detailed biophysics, hydrology, and biogeochemistry representation (Koven et al., 2013;Oleson et al., 2013).CLM4.5 is fully prognostic with respect to the carbon and nitrogen state variables in the vegetation, litter, and soil organic matter, as well as methane emissions (Koven et al., 2013;Thornton et al., 2007Thornton et al., , 2009) ) and it is the most updated version of the model available.
We selected the latest version of CLM with improved biogeochemistry (CLM4.5bgc)over CLM4.0-CN.The major improvements in CLM4.5bgc include the incorporation of vertically resolved soil carbon dynamics, an alternate decomposition cascade from the Century soil model, and a more detailed representation of nitrification and denitrification based on the Century nitrogen model (Koven et al., 2013).The hydrology of CLM4.5 has been improved to better represent the hydraulic properties of frozen soils, perched water tables, snow cover fraction, and lakes (Subin et al., 2012;Swenson and Lawrence, 2012;Swenson et al., 2012).
In previous versions, simulation of ecosystem productivity was too low in high latitudes and perhaps too high in low latitudes (Thornton et al., 2007(Thornton et al., , 2009)).However, CLM4.5bgc has substantially increased the productivity in high latitudes, which may be overpredicted (Koven et al., 2013).
We used a mechanistic methane emission model, which is a module integrated in CLM4.5bgc (Meng et al., 2012;Riley et al., 2011).The model simulates the physical and biogeochemical processes regulating terrestrial methane fluxes such as methane production, methane oxidation, methane and oxygen transport through aerenchyma of wetland plants, ebullition, and methane and oxygen diffusion through soil (Riley et al., 2011).Meng et al. (2012) added constraints on methane emissions such as the effects of redox potential and soil pH to improve the predictions of methane emissions as well as the ability to simulate satellite-derived inundation fraction (Prigent et al., 2007;Ringeval et al., 2010).
The model has been compared to the limited site-level observations of methane emissions (many of the sites have very sparse spatial and temporal data coverage, and directly measured climate forcing was unavailable at any of the sites) (Meng et al., 2012;Riley et al., 2011).Additionally, the model was compared with the results from three recent global atmospheric inversion estimates of methane emissions (Riley et al., 2011).In these comparisons, simulated emissions agreed relatively well with the observed emissions at some of the sites.However, there are considerable differences in seasonality and magnitude at other sites.The simulated patterns and magnitudes of annual-average methane emissions are consistent with the results from atmospheric inversion across most latitude bands.The limitations are discussed in Riley et al. (2011).

Model configuration and data
Although the land model can be used interactively within CESM, we use it at specific points driven by appropriate meteorology (Oleson et al., 2013).At each site, we forced the model with NCEP/NCAR's reanalysis atmospheric forcing data sets (Qian et al., 2006).These data sets include precipitation, temperature, wind speeds, and solar radiation.We also forced the model with transient atmospheric carbon dioxide concentrations, aerosol deposition data, and nitrogen deposition data that are available in CLM4.5.Please note that this model is a deterministic model, and thus will give the same answer every time it is simulated when driven by observationally based data sets as done here.
In this study we used a total of six natural wetland sites and ten rice paddy sites (see Tables B1 and B2 in Appendix B).We chose the wetland sites from varying geographical regions such as the tropics, mid-latitudes, and high-latitudes to account for the zonal variability.We selected the rice paddy sites such as to cover the major rice-growing regions with a focus on Asia.
The water table depth is one of the critical factors for methane emissions from natural wetlands because it determines the extent of anoxic and oxic soil zones where methane is produced and oxidized, respectively (Bloom et al., 2010;Grunfeld and Brix, 1999).Methane is produced in the wetlands from litter and dead vegetation remnants in anoxic conditions.The changes in the water table position also influence the moisture conditions of the soil and therefore affect the methane emissions.Here, we prescribed the measured water table position at each wetland site (except Panama) based on previous studies.Since the measured water table depths at Panama were not available, we used modeled water table positions (similar to Walter and Heimann, 2000).For the point simulations, the methane emissions were calculated only from the saturated portion of the soil (i.e.below the water table) when the water table is below the surface.The prescribed water table depth is used in the methane model for calculating anaerobic conditions, production, and oxidation.
Most of these wetland sites usually have peat soils with varying depths underlain by mineral soil.We also forced each wetland site with measured pH and a specific plant functional type (PFT).The PFT reflects the phenological and physiological characteristics of the vegetation (Oleson et al., 2013).Since the wetland PFT was not available in CLM4.5, we choose PFTs that are available in CLM4.5 and that closely match the specific vegetation types of the individual sites.We use C 3 arctic grass for Salmisuo, C 3 non-arctic grass for Alberta, Michigan, and Minnesota, and C 4 grass for Florida and Panama.Other surface data required to perform the point simulation include soil color and soil texture which we extracted from the global grid data sets available in CLM4.5.
For the point simulations at the rice paddy sites we only considered the rice growing season.The flooding and drainage dates are shown in Table C1 in Appendix C. We assumed that the fields were submerged during the simulation period between initial flooding and final drainage.A common feature of these sites during the growing season is that the water was not drained until harvest.We prescribed the C 3 crop PFT for all rice paddy sites, and assumed an optimal pH for the methane production whenever the pH value was not available.The dominant soil types at these sites are loam and clay.Other soil-related information such as soil color and texture are derived from the global grid data sets.
To bring the terrestrial carbon and nitrogen cycles close to steady-state conditions, we spun up both wetland and rice paddy sites for 1850 conditions (atmospheric CO 2 concentrations, nitrogen deposition, aerosol deposition, and land use) driven by a repeating 25-year subset (1948-1972) of the meteorological forcing data for more than 2000 years.Then, we performed transient simulations from 1850 to the simulation starting year of each site to generate the initial conditions file.
Additionally, we conducted global simulations of methane emissions from natural wetlands for 1993-2004.For these simulations, the grid cell averaged methane emissions were considered which accounts for methane emissions from both the inundated and non-inundated portion of the grid cell.Since the CLM4.5 simulated saturated fraction (an index of inundation) was substantially greater than the estimates from satellite observations and did not match the spatio-temporal pattern of variability (Riley et al., 2011), we prescribed the model with inundation fraction derived from multi-satellite observations for 1993-2004(Prigent et al., 2007)).Similar to point simulations, the global simulations were forced with NCEP/NCAR reanalysis atmospheric forcing data from 1948 to 2004 (Qian et al., 2006).The simulations were also spun up to steady-state conditions driven by atmospheric CO 2 , nitrogen deposition, aerosol deposition, and land use in the year 1850 and a repeated 25-year (1948-1972) subset of the meteorological forcing.

Mathematical problem formulation
The goal of our study is to improve the methane emission predictions of CLM4.5bgc by tuning the methane-related parameters such that the model better fits the observations.We use the CH 4 emission observation data for the locations and observation periods shown in Tables B1 and B2.Given the observation data at the M = 16 locations, the goal is to minimize the root mean squared errors (RMSEs) between the CLM4.5bgcmethane emission predictions and the observations at each site simultaneously.In order to tackle the problem, we formulate it such that we minimize the weighted sum of the RMSEs as follows: where d denotes the problem dimension (the number of optimization parameters), and x l k and x u k are the lower and upper bounds of variable x k , respectively.The RMSE is computed for each location i. N i is the number of observations available at location i, O i,j denotes the j th methane emission observation at location i, and S i,j denotes the corresponding methane emission predicted by CLM4.5bgc.The weights w i are computed based on the means of the CH 4 emissions at the observation locations as follows.Denote the mean CH 4 emission at location i, i = 1, . .., M. The weight w i for the ith location is then defined by where where it is assumed that a i > 0 for all i.The goal is to give each location approximately equal influence in the weighted sum of RMSEs, i.e., we assign locations with large mean CH 4 values small weights such that these locations have approximately the same influence on the weighted sum as locations with low emissions.Otherwise, locations with large  2  7   VMAX_CH4_OXID   1.25 × 10 −6 1.25 × 10 −4 1.25 × 10 − emissions would dominate the sum (1a) because their RM-SEs would accordingly be larger.In that case the optimization would be driven by minimizing the RMSE of the site(s) with the largest emissions.There are also other methods of how w i could be determined.In the numerical experiments, we will investigate also the possibilities of assigning equal weights to each observation site and assigning weights derived from grouping the observation sites into zones.Another possibility would be to apply clustering algorithms in order to determine groups of observation sites with similar characteristics.For this possibility, however, different clustering methods and different numbers of desired clusters will lead to different groups and different weight adjustments.Lastly, the problem could be formulated as multi-objective optimization problem, for example, with 16 objectives and the goal of minimizing each observation site's RMSE individually, or as bi-objective optimization problem by minimizing the sum of the weighted RMSE values of northern and southern locations at the same time.However, each objective function evaluation is very expensive, and thus the number of evaluations that can be done to obtain the Pareto front in a multiobjective setting is limited.Our focus is on demonstrating that single objective global optimization analysis is useful in identifying reasonable parameter values.1).
Optimization problems become increasingly more complex and difficult to solve as the number of parameters increases (curse of dimensionality).Thus, we determine first which of these 21 parameters are the most sensitive and thus the most important for the optimization.By sensitive we refer to parameters that when changed slightly lead to a significant change in emission predictions.Insensitive parameters, on the other hand, can be changed and do not (or comparatively only very mildly) change the emission predictions and can thus be excluded from the optimization, which decreases the problem dimension.
We conducted analyses for each observation site in which we investigated to which of these 21 parameters the methane emission predictions of CLM4.5bgc are the most sensitive.We altered the value of each parameter k = 1, . .., d by, respectively, adding and subtracting 20 % of the variable range and we recorded the absolute change in emission predictions, i.e. we ran CLM4.5bgc with perturbed parameter values for each parameter separately.
There are several parameters that are relatively important to the sensitivity test for all 16 observation sites, but there are also parameters that are important for some locations and less important for others.Tables 2 and 3 show the sensitive and insensitive parameters together with the number of locations (out of 16) for which these parameters are important and unimportant, respectively.Thus, in the optimization we consider only the parameters in Table 2 since these parameters are the most important at most locations.Please note that, due to (nonlinear) relationships between the parameters, for many parameters the effects of individual parameters will be opposite but act in a similar manner, indicating that some parameters may be difficult to optimize for.In order to limit the number of parameters we consider, while allowing for the largest range in behavior, we combine information from the sensitivity study with information about the methane flux equations themselves (described in more detail in Appendix A).The most important parameters from the sensitivity study come from the dominant three terms in the methane flux equation, which are production (parameters 1, 2, and 21), oxidation (parameters 7, 8, 9, and 10), and aerenchyma transport (parameters 13, 15, 16, and 17).The first four parameters chosen are also the most important parameters at all 16 sites (see Table 2).Because production is the most important term, there are two parameters from production that the sensitivity studies indicate are the most important, namely one that controls globally the methane production flux (F_CH4, parameter 2), and one term that controls the temperature dependency of the methane production (Q10CH4, parameter 1).Another parameter that influences methane at all the sites comes from the oxidation equation (VMAX_CH4_OXID, parameter 7), and the final parameter that is important at all 16 sites is the parameter controlling the aerenchyma transport (SCALE_FACTOR_AERE, parameter 13).The above four parameters are the most sensitive parameters, and thus are easy to choose, as well as cover most of the important processes we want to investigate.For the last parameter, we include one parameter that controls how inundation affects methane production (MINO2LIM, parameter 21).Inundation is an important process for controlling methane flux, since there is an order of magnitude more methane coming from wet areas than dry, and thus having one parameter which changes the model's sensitivity to inundation is appropriate.

Surrogate models
Surrogate models are used in optimization algorithms that aim to solve computationally expensive black-box problems.Surrogate models serve as computationally cheap approximations of the expensive simulation model (Booker et al., 1999), i.e., f (x) = s(x) + e(x), where f (•) denotes the true expensive objective function, s(•) denotes the computationally inexpensive surrogate model, and e(•) denotes the difference between both.Surrogate models are used throughout the optimization to guide the search for promising solutions.The computationally expensive objective function is evaluated only at few selected points, and thus it is possible to find near-optimal solutions with only very few expensive function evaluations.

Surrogate global optimization algorithm
Surrogate global optimization algorithms follow in general the steps shown in Algorithm 1.
We use the DYCORS algorithm by Regis and Shoemaker (2013) for the optimization of the methane-related parameters of CLM4.5bgc.The reader is referred to this publication for the details of the algorithm.Since the parameters have significantly different ranges (see Table 1), we scale all parameters to the interval [0, 1] when selecting new sample sites.When doing the computationally expensive CLM4.5bgc simulations, we scale the parameters back to their original ranges.Thus, the perturbation radius used in DYCORS is the same for each variable.
We create a symmetric Latin hypercube initial experimental design with 2(d + 1) points and run CLM4.5bgc at the selected parameter vectors in order to compute the objective function values.We then fit the cubic RBF model to Algorithm 1 General surrogate global optimization algorithm 1: Select points from the variable domain to create an initial experimental design.2: Do the expensive objective function evaluations (here the CLM4.5bgcsimulations) at the points selected in Step .3: Fit the surrogate model (here the RBF model) to the data from Steps and .4: Use the information from the surrogate model to select the new evaluation point x new .5: Do the expensive evaluation at x new : f new = f (x new ) (here, run CLM 4.5bgc for the parameter input vector x new ).6: if Stopping criterion is not met (the maximum number of allowed function evaluations has not been reached) then 7: Update the surrogate model and go to Step .8: else 9: Return the best solution found during the optimization.10: end if the data and generate two sets of candidate points for the next expensive function evaluation (the next CLM4.5bgcrun at the 16 sites).The first set of candidate points is generated as described by Regis and Shoemaker (2013) by randomly perturbing the best point found so far.The second set of candidate points is generated by uniformly selecting random points from the whole variable domain.Thus, we create twice as many candidate points as DYCORS.The goal of using uniformly random points from the whole variable domain is to obtain candidates that are far away from the best point found so far, and hence if selected as a new evaluation point, the search is more global (exploration by function evaluation at points that are far away from already sampled points).
We use the same criteria as in DYCORS for determining the best candidate point (using the RBF approximation to predict the objective function values at the candidate points, compute the distance of the candidate points to the set of already sampled points, and compute a weighted score of these two measures where the weights cycle through a predefined pattern).In order to guarantee that the matrix in Eq. ( 7) is well-conditioned, we ensure (as done in Regis and Shoemaker, 2013) that the sample points are sufficiently far away from previously evaluated points by discarding candidate points that are closer than a given threshold distance to previously evaluated points.We run CLM4.5bgc at each of the 16 observation sites using the one newly selected sample point as input parameter vector to obtain the corresponding objective function value.We update the RBF model with the new data and iterate until we have reached the maximum number of allowed function evaluations.

Numerical experiments
In this section we discuss the setup and results of the numerical experiments.In a first set of experiments (pseudo data case), we generate synthetic (pseudo) data and treat it as if it were the real measurement data in order to assess how well our optimization approach performs.For these experiments we know the optimal solution.In the second set of experiments (real data case), we use the measured methane emission data and apply the optimization algorithm.The goal in the second set of experiments is to find a parameter set that reduces the objective function value (the weighted RMSE in Eq. 1a) from its default value (the RMSE when using CLM4.5bgc'sdefault parameter settings, see also Table 1, column v k ).Finally, we run CLM4.5bgcglobally with the best set of parameters found during the optimization of the real data case and investigate how much the default model predictions and the model predictions with the optimized parameter values differ from each other.
For each set of experiments we ran the optimization algorithm three times in order to examine the influence of the random component in the algorithm (random initial experimental design and random generation of candidate points).We allowed 800 function evaluations for the five-dimensional problem and 1000 evaluations for the 11-dimensional problem.The question of how many function evaluations need to be performed in order to obtain a fixed level of solution accuracy is problem dependent.For computationally expensive optimization problems, such as the problem we consider here, the time for evaluating the objective function and the totally available time for obtaining a solution usually defines how many evaluations can be done with any algorithm.Results for many difficult computationally expensive optimization problems (for example, problems with multiple local minima) indicate that surrogate global optimization methods can usually obtain more accurate results compared to nonsurrogate methods with the same limited number of evaluations (see, for example, Mugunthan et al., 2005).It is a very difficult problem to find the best values of the parameters for climate models, and the more evaluations one does, in general the better the answer.
The weights w i in Eq. (1a) were for the pseudo data case computed based on the pseudo observations (see Sect. 5.1) at each of the 16 sites at the same dates for which we also have real measurements.For the real data case, the weights were computed based on the actual measurements.The weights are given in Table D1 in Appendix D.
Solving problem (1a) requires running CLM4.5bgc for each input vector x of parameter values and for each of the 16 observation sites.We run CLM4.5bgc on the Yellowstone Supercomputing Facility (Computational and Information Systems Laboratory, 2012).Each simulation at a single location takes between 15 and 30 min.We do the simulations for the 16 sites in parallel in order to speed up the objective function evaluation time.

Pseudo data case
We assessed the performance of the optimization algorithm by investigating how well the algorithm could find the model parameters that were used for creating the pseudo data.For this purpose, we ran CLM4.5bgc with default parameter values v k , k = 1, . .., d, at all 16 sites for the same time span for which we also have observation data (see Tables B1 and B2 in Appendix B) and we record the model's predictions for the same dates at which the methane emissions were measured.We use this as our pseudo observation data that we want to match in the optimization, i.e., the goal of the optimization is to start from a set of parameter vectors that is different from the default parameter values and to recover the default parameter values by optimization.For the default parameter values, the objective function value will be zero, which is the global minimum of the pseudo data case.

Results for d = 5
Figure 1 shows the progress plots of the three optimization trials T1, T2, and T3.Illustrated is the development of the best objective function value found within the given number of function evaluations (horizontal axis).The fewer evaluations needed for reducing the objective function value, the better.The plot shows that the objective function value is reduced significantly in each of the three trials from a value of over 30 to about 5 within less than 150 function evaluations and close to zero towards the end of the optimization.Table 4 shows the best parameter values found during each of the three optimization trials together with the default parameter values.The table shows that the RMSE after 800 function evaluations is not exactly zero (which can be expected from  an approximation method), but the default parameter values are matched closely.

Results for d = 11
Figure 2 shows the objective function value development as the number of function evaluations increases for the 11dimensional case for the three trials T1, T2, and T3.The figure shows a rapid decrease of the objective function value from over 50 to less than 10 within 100 evaluations, which shows that the surrogate model algorithm is very efficient at finding improved solutions.Although the objective function value improvement over the following function evaluations is lower, we can see that the algorithm still makes progress and if we allowed more than 1000 evaluations, the objective function value would be further improved (which also follows from the global convergence property of the DYCORS algorithm).
Table 5 shows the parameter values of the best of the three trials (T3) together with the default parameter values and the variable vector CP that was evaluated during the optimization and that has a worse objective function value than the best solution, but that is closer to the default parameter values.This point has the same parameter values as T3 for all but two parameters, namely, parameters 10 (Q10_CH4OXID) and 21 (MINO2LIM), which we indicate by bold numbers.For these two parameters, the point CP is closer to the global optimum, but it has a worse objective function value.This indicates a multimodality of the objective function (getting closer to the true global minimum requires an increase in the objective function value, i.e., the algorithm has to escape from a local basin of attraction).This multimodality makes the search for the global optimum significantly more difficult.
In order to examine the impact of the differences between default and optimized parameter values on the model prediction, we use the best parameter vector of each trial and plot the corresponding CH 4 emission predictions against the predictions when using the default parameter values in Figure 3.We can see that although we do not exactly match the default parameter values, the model's predictions when using the op-  timized parameters are very close to the predictions when using the default parameter values (all points in the scatter plot lie close to or on the dashed line which represents agreement of default and optimized predictions).As also reflected in the best RMSE value reported in the legend, T3 matches the default data best and T2 has the largest differences.This result indicates that the calibration problem is not "identifiable" for all parameter sets, indicating that more than one parameter set can give a very similar result in terms of the objective function value.For example, for the model y some constant κ.With only five parameters as described in the previous section, the parameter values obtained from the optimization did match very closely with those of the default case used to create the pseudo data, and thus with this small set of parameters the problem was identifiable.However, for 11 parameters, we did encounter the identifiability problem.In some disciplines such parameters are called "hidden".For example, estimating α and γ in the previous example with y = α β x +γ when β is given would be identifiable.However, estimating α, β, and γ is no longer identifiable.
It would be desirable to have an identifiable model, but the CLM (and probably other climate modules) have a number of interacting parameters and multiplicative nonlinearities, and thus there is no guarantee that all parameters are identifiable.This is reinforced by the data in Table 5, which indicates that the surface over which the optimization algorithm searches in the 11 parameter case is multi-modal, i.e., there are multiple local minima and it is possible for two (or more) parameter sets to yield the same objective function value (here RMSE).Hence the inability of the optimization to find the exact set of parameters that was used for generating the pseudo data is a problem caused by the complexity and multiplicative nonlinearities of the CLM model, not by the choice of the optimization method.However, the optimization analysis for both pseudo data cases (with 5 and 11 parameters, respectively) shows that the chosen optimization method is able to find a set of parameter values that has a low prediction error.The multi-modality in Table 5 does indicate the need for a global (not a local) optimization method.

Real data case
In the real data case, we use the actual methane emission measurements at each of the 16 observation sites for com- puting the objective function value.Since we only have very few observations for each site and no information about measurement errors, we did not exclude any of the measurements from the optimization although there might be outliers.Also for the real data case we examine the case for d = 5 and d = 11 variables.

Results for d = 5
The progress of the development of the objective function value for the three trials T1, T2, and T3, respectively, is illustrated in Fig. 4 which also shows in the legend the lowest RMSE value found in each of the three trials.The RMSE was efficiently reduced from over 155 to below 115 within the first 150 function evaluations.Thereafter the objective function value improvement was at a significantly lower rate.All three trials return a solution with approximately the same objective function value.
The parameter values of the best solutions found in the three trials are shown in Table 6 where also the default parameter values are given for comparison.We can see that the three optimized solutions are approximately the same and significantly different from the default case.We can also see that three of the five optimized parameter values are on or very close to the boundary of the variable domain (shown in bold), indicating that improvements of the objective function value may be possible by increasing the parameter range.However, it is not possible due to physical constraints and at this point, we do not have information about possible wider parameter ranges than the ones we used in this study.
Figures 5 and 6 show the CH 4 emission predictions of CLM4.5bgc when using the default and the optimized parameter values for two selected observation sites (one wet-  land and one rice paddy site) together with the actual observation data.The legends show the associated RMSE value before applying the weights for computing (Eq.1a).We can see that the optimized solution actually worsens the predictions for Alberta (the RMSE value with default parameters is about 209 and with optimized parameters, the value is about 221, which is about 6 % worse).For Central Java, on the other hand, the RMSE values of the optimized solutions are significantly better than for the default values (the default RMSE is about 221 and the optimized RMSE values are about 48, which is an improvement of over 350 %).In both figures we can also see that despite the large differences between optimized and default parameter values, the trend in the predictions of CLM4.5bgc is the same, i.e., when the predicted CH 4 emissions with default parameters increase so do the predicted emissions when using the optimized parameters and vice versa.

Results for d = 11
Figure 7 shows the progress plots for each of the three trials together with the best objective function values found (legend) for the 11-dimensional case.The best objective function value found is about equal for each of the three trials.The figure shows that in each trial the algorithm is able to efficiently reduce the objective function value within the first 200 function evaluations.The improvement after 200 function evaluations is significantly slower.
Table 7 shows the parameter values of the best solution found in each of the three trials and the default parameter values.The table shows that for some parameters, for exam- ple, parameters 1, 7, and 8, all trials lead to approximately the same values (which are different from the default parameter values).For the remaining parameters, the values corresponding to the best solution found differ significantly for each trial and differ also from the default parameter values.Also for the 11-dimensional problem, some parameter values corresponding to the best solution found are on the upper or lower boundary of the parameter range (for example, parameters 1, 8, 13, 15, indicated in bold).Since all three solutions have approximately the same objective function values, but the points differ greatly, it is an indicator that we either have a multi-modal surface in which some minima assume approximately the same objective function values, or we have a very flat valley in which many points assume similar objective function values.Both possibilities make it very difficult for gradient-based optimization algorithms to find the global optimum.In the first case, the optimization algorithm will get trapped in a local optimum if it is not started close to the global minimum.In the second case, the gradient-based algorithm would require many function evaluations because many steps and gradient computations are necessary due to a very small step size.The surrogate optimization algorithm overcomes this problem.
Table 8 shows the unweighted RMSE values (before applying the weights in Eq. (1a) for computing the objective function value) between observations and simulations using the default parameters (column 5), the best parameters of optimization trial T1 of the 11-dimensional case (column 4), and the best parameters of trial T2 of the 5-dimensional case, respectively.The table shows that with our optimization we were able to decrease the default RMSE for four sites in the 5-dimensional case and for six sites in the 11dimensional case.The RMSE is lower at seven sites for the 11-dimensional case than for the 5-dimensional case.Since we minimized a weighted sum of all RMSE values, it can be expected that the RMSE at some locations may be worse for the optimized case than for the default case.We can see that for two of the improved sites (Java and Cuttack), the improvement is very large, and thus the overall RMSE of the optimized solution is lower than for the default parameters.Figures 8 and 9 show the observed CH 4 emissions, the predictions with the default parameter values, and the predictions using the optimized parameter values for Alberta (Canada) and Central Java (Indonesia).For both sites we can see that the predictions with the optimized parameters have lower RMSEs than when using the default parameter values (note that the reported RMSEs in the legend are not weighted as done in Eq. 1a).For Central Java, for example, the opti-  3792.66 3991.73 4345.56 1993.4 1993.5 1993.6 1993.7 1993.8 1993.9 1993.4 1993.5 1993.6 1993.7   mized parameters greatly improved the model's predictions, but we can also see that the temporal variability in the predictions stays the same although not as pronounced.We noticed this "temporal variability preserving" behavior for several sites such as Beijing, California, Cuttack, New Delhi, Florida, Japan, Michigan, Minnesota, Salmisuo, Texas, and Vercelli.Compared to the case where we optimized only five parameters, the solution for Alberta has improved and the RMSE values for all three trials are for the d = 11 case better than the default RMSE value.On the other hand, the solution for Central Java is worse for T1 in the d = 11 case than in the d = 5 case.The temporal variability in the model's predictions does not necessarily follow the temporal variability in the observation data (see, for example, Fig. 10).Note that in Fig. 10 the temporal variability is the same for each of the three tri-  als although the best solutions found in the three trials were very different (see Table 7).Thus, it seems that the improvement of the model's predictions is restricted by an underlying model component that enforces the temporal variability.This is likely to be associated with structural errors either in the methane or in the carbon model.Notice that the methane emission is dependent on the temporal variability predicted in the carbon and land model, especially on the heterotrophic respiration rate, which could have the wrong magnitude or temporal evolution.
Figure 11 shows a scatter plot of the mean values of the CH 4 predictions using default and optimized parameter values vs. the mean values of the observed CH 4 emissions.Ideally, if the simulated emissions agreed with the observations, all points would lie on the dashed line.Thus, the closer a point to the dashed line, the more simulation and observation are in agreement.The figure shows that with the optimized parameters, we obtain better or similar results for Beijing, Cuttack, Minnesota, Central Java, Nanjing, Japan, Salmisuo, Alberta, and Michigan.Although not all sites have been strictly improved by the optimization, the overall RMSE has been improved (indicated in the legend).
Figure 11 also shows that with default parameters, CLM4.5bgc predicts less CH 4 emissions than observed for both observation sites in the northern latitudes (Alberta, ID = 1, and Salmisuo, ID = 16), which is corrected by the optimization such that the mean emissions at these sites are closer to the dashed line.Thus, based on the observation data, CLM4.5bgc with default parameters does not predict enough emissions in the northern latitudes.On the other hand, CLM4.5bgc over-predicts the emissions for four lo-cations, namely Cuttack (ID = 14), Central Java (ID = 12), Nanjing (ID = 5), and Japan (ID = 8), which are located in the tropical and/or subtropical zone.For those four locations, the predictions with the optimized parameters are closer in agreement with the observations.Hence, the observation data force the model predictions to increase in the northern latitudes and to decrease in the tropics.This can also be seen in Figs. 12 and 13 in the following section where we simulated the model globally and compared default and optimized model predictions for the individual zones (discussed below).

Gobal CH 4 emission simulations
We simulated CLM4.5bgc to obtain predictions for the CH 4 emissions on a global scale and compared the predictions when using the default parameter values and the optimized parameter values from the 11-dimensional cases.Figure 12 shows spatial plots of the average methane emissions (mg CH 4 m −2 d −1 ) and the zonal means (right hand side of the plots) when using the default parameters (panel a), and the difference between the predictions when using the default and the optimized parameters for trial T1 (panel b).The figure shows that with the optimized parameters, the CH 4 emission predictions in the northern regions are larger than for the default parameters.For the tropics, the predictions with the optimized parameters are lower than when using the default values.
Figure 13 shows a comparison of the CH 4 emission predictions from several different models (models 1-10).We can see that globally the predictions with the optimized parameters (model 12) were only slightly higher than with the default parameters (model 11).However, the predictions of CH 4 emissions in the tropics are significantly lower than for the default model and the predictions are also lower in comparison to all other models (1-10).On the other hand, for the northern latitudes, CLM4.5bgc with optimized parameters predicts significantly more CH 4 emissions than the default model and models 1-10 in the comparison.Hence, even though the global average of predicted emissions did not change much, the distribution of the predicted emissions between the tropical and the northern latitudes changed significantly.
As indicated in the previous section, the observation data drive the model to predict more CH 4 emissions in northern latitudes and fewer emissions in the tropics.We investigated whether our weighting scheme in Eq. (1a) may give too much influence to individual observation sites or zones.Thus, we did an additional optimization trial of the parameters in Table 2 where we give each observation site the same weight w i = 1, i = 1, . .., 16 ("unweighted").We also did a second additional optimization trial of the parameters in Table 2 where we give each zone the same influence on the total RMSE in order to account for the location of the various observation sites ("zonally weighted").Thus, each location in the temperate zone (12 sites totally) has w i = 1/36, and each location in the northern (2 sites) and tropical (2 sites) zone, respectively, has the weight w i = 1/6.The spatial plots of the differences between the average methane emissions when using default and optimized parameters for the unweighted trial are shown in panel c of Fig. 12, and the spatial plots of the differences when using the zonally weighted objective function is shown in panel d of Fig. 12.The figures show that for both additional trials, the CH 4 emissions in the northern latitudes are even further increased.Moreover, the bars for models 13 and 14 in Fig. 13 show the total methane emissions of the unweighted and the zonally weighted trials, respectively.The zonally weighted trial increases the global emissions, which is caused by larger emission predictions in the temperate zone and the northern latitudes.In comparison to the default CLM4.5bgc predictions, the unweighted trial shows a decrease in the predicted emissions in the tropics and an increase in the predicted emissions in the northern latitudes.Thus, even though it is suggested that CLM4.5bgc with default parameter settings overpredicts the CH 4 emissions in high latitudes (Koven et al., 2013), the observation data argue that the predictions should even be increased.

Conclusions
In this paper we used a surrogate optimization approach for calibrating the parameters of the methane module of the Community Land Model (CLM4.5bgc).Given only relatively few measurements at 16 observation sites (wetlands and rice paddies) our goal was to explore the use of a surrogate optimization method to improve the model prediction capability in a computationally efficient way by minimizing the root mean squared error between the measurements and the model's predictions.We identified important methanerelated parameters in CLM4.5bgc by doing a sensitivity analysis and we were thus able to reduce the problem dimension from 21 to 11.We then used a surrogate optimization approach for tuning the most important parameters in order to solve the problem.We investigated two cases, namely a problem with five of the most important parameters and a problem with all 11 parameters, respectively.
We first used pseudo data in order to asses how well the surrogate optimization performs and showed that we are able to closely match the pseudo observations.We were able to reduce the RMSE to less than a fifth within the first 150 function evaluations for both pseudo data cases.The objective function was shown to have multiple local minima, which indicates that the problem is probably not identifiable when 11 parameters were optimized.Although the RMSE was greatly reduced by the optimization for the 11 parameter pseudo data case, the optimization results did not generate the same values of the parameters in some cases as were used to generate the pseudo data.This is a problem with the model, not with the optimization method used.The multiple local minima detected in Table 5 indicate that a global optimization method was needed.We used a surrogate global optimization method because the objective function was expensive to evaluate and has multiple local minima.The surrogate has been shown to reduce the number of objective function evaluations (e.g.climate model simulations) required to obtain accurate approximations of the global minimum and so it is designed for computationally expensive models like climate modules.
By conducting the simulations globally and comparing the average predicted emissions with default and optimized parameters, we could show that the total global CH 4 emissions did not change significantly.
However, the distribution of the predicted emissions between latitudes changed significantly.The observation data force the optimized model's CH 4 emission predictions in the northern latitudes to increase and the predicted emissions in the tropics to decrease.In comparison to other models, CLM4.5bgc with both default and optimized parameters predicts significantly more emissions in the northern latitudes and less emissions in the tropics.
where φ(τ ) = τ 3 denotes the cubic radial basis function whose corresponding polynomial tail is linear (p(x) = b 0 + b T x), and x ι , ι = 1, . .., n, denotes the points at which the objective function has already been evaluated.The parameters λ ι ∈ R, ι = 1, . .., n, and the parameters b 0 ∈ R and b = [b 1 , . .., b d ] ∈ R d are determined by solving the following linear system of equations:

Figure 1 .
Figure 1.Progress plot that shows the development of the best objective function value found vs. the number of function evaluations for the pseudo data case with d = 5 parameters for optimization trials T1, T2, and T3.The legend shows the lowest RMSE value found in each trial.

Figure 2 .
Figure 2. Progress plot that shows the development of the best objective function value found vs. the number of function evaluations for the pseudo data case with d = 11 parameters for optimization trials T1, T2, and T3.The legend shows the lowest RMSE value found in each trial.

Figure 3 .
Figure 3. CLM4.5bgcCH 4 predictions when using the default parameter values vs. the predictions when using the best solution found in each of the three optimization trials T1, T2, and T3, respectively, for the pseudo data case with d = 11 parameters.The legend shows the lowest RMSE value found in each trial.

Figure 4 .
Figure 4. Progress plot that shows the development of the best objective function value found vs. the number of function evaluations for the real data case with d = 5 parameters for optimization trials T1, T2, and T3.The legend shows the lowest RMSE value found in each trial.The first function evaluation (left side of the graphs) corresponds to the RMSE when using the default parameters.

Figure 5 .
Figure5.CH 4 emission observations and predictions when using the optimized parameters of optimization trials T1, T2, and T3, respectively, and when using the default parameters for the wetland site Alberta, Canada, for the real data case with d = 5 parameters.The legend shows the lowest RMSE value found in each trial.

Figure 6 .Figure 7 .
Figure6.CH 4 emission observations and predictions when using the optimized parameters of optimization trials T1, T2, and T3, respectively, and when using the default parameters for the rice paddy site Central Java, Indonesia, for the real data case with d = 5 parameters.The legend shows the lowest RMSE value found in each trial.

Figure 8 .Figure 9 .
Figure8.CH 4 emission observations and predictions when using the optimized parameters of optimization trials T1, T2, and T3, respectively, and when using the default parameters for the wetland site Alberta, Canada, for the real data case with d = 11 parameters.The legend shows the lowest RMSE value found in each trial.

Figure 10 .
Figure10.CH 4 emission observations and predictions when using the optimized parameters of optimization trials T1, T2, and T3, respectively, and when using the default parameters for the wetland site Salmisuo, Finland, for the real data case with d = 11 parameters.The legend shows the lowest RMSE value found in each trial.

Figure 12 .
Figure 12.Average methane emissions (mg CH 4 m −2 d −1 ) simulated by CLM4.5bgc for (a) default parameters, (b) differences between default parameters and 11-dimensional optimization trial T1, (c) differences between default parameters and optimization trial with unweighted sum of RMSE, and (d) differences between default parameters and optimization trial with zonally weighted sum of RMSE.Zonal means are shown on the right side of each spatial plot.

Table 1 .
CH 4 related parameters in CLM4.5bgc and their upper and lower bounds x u k and x l k , respectively, and the default parameter values v k .

Table 2 .
Parameters that are sensitive for most observation sites (out of 16).

Table 3 .
Parameters that are least sensitive for observation sites (out of 16).

Table 4 .
Default and optimized parameter values of optimization trials T1, T2, and T3 for the five-dimensional pseudo data case.We report four decimal places because the model output is sensitive to very small changes for some variables.Note that we scaled the numbers to the interval [0,1].

Table 5 .
Default and optimized parameter values of optimization trial T3 and parameter values for the point CP that was sampled during the same optimization trial and that is closer to the default point, but that has a worse objective function value (11-dimensional pseudo data case).Bold numbers indicate the parameters for which CP is closer to the default value than T3 (but CP has a worse objective function value).

Table 6 .
Default and optimized parameter values of optimization trials T1, T2, and T3 for the five-dimensional real data case.Bold indicates optimized parameters that are on (or close to) the variable boundary (all variables are scaled to [0,1]).

Table 7 .
Default and optimized parameter values of optimization trials T1, T2, and T3 for the 11-dimensional real data case.Bold indicates optimized parameters that are on the variable bound (all variables are scaled to [0,1]).

Table 8 .
Unweighted RMSE values for each site using the best parameters found during optimization trial T1 of the d = 11 real data case and trial T2 of the d = 5 real data case and with default parameter values.

Table B2 .
Rice paddy site data.