Development of an ensemble-adjoint optimization approach to derive uncertainties in net carbon fluxes

Introduction Conclusions References


Introduction
The terrestrial biosphere plays an important role in the global carbon cycle and has a great impact on the accumulation of carbon dioxide (CO 2 ) in the atmosphere.Feedbacks between the carbon cycle and climate change, generally known as carbon-climate feedbacks, have the potential to accelerate the Correspondence to: T. Ziehn (tilo.ziehn@bristol.ac.uk) rise in atmospheric CO 2 , which causes further global warming (Matthews et al., 2007).The quantification of the carbon cycle-climate feedback is therefore important for determining the magnitude of future climate change.However, there is much uncertainty about the size of natural sinks of the terrestrial carbon cycle, which in turn has a major impact on the uncertainty of climate predictions (Zaehle et al., 2005;Denman et al., 2007).The large variations in the prediction of the future atmospheric CO 2 load result from differences between models (Cramer et al., 1999;Friedlingstein et al., 2006), but also from uncertainties of the process parameters of the terrestrial ecosystem models (TEMs) (Knorr and Heimann, 2001).
The increase in the complexity of TEMs over recent years has also led to an increase in the number of parameters.Prior parameter values are usually based on "expert knowledge", which in some cases is little more than an informed guess.Furthermore, even those parameters that have clear analogues iin the observed system are often not directly related to the values derived from laboratory experiments or site-scale experiments.Parameter optimization methods are very useful in this context: they provide a way of constraining the model parameters against observations and in this way reduce parameter uncertainties.
The Bayesian approach has been shown to provide a powerful and convenient framework for combining prior knowledge about parameters with additional information, in particular observations (Rayner et al., 2005).The resulting inverse problem expressed by Bayes' theorem can be solved in different ways, for example through Monte Carlo inversion (Sambridge and Mosegaard, 2002) or variational data assimilation.Monte Carlo inversion methods such as the Markov Chain Monte Carlo (MCMC) method are able to find an optimal solution by sampling the posterior probability density function (PDF) of the parameters directly.They are easy to implement and require no assumptions about the model (i.e.continuity).However, they may require a very large sample T. Ziehn et al.: Ensemble-adjoint optimization size which is not always feasible due to computational limitations (i.e.long computing time of TEMs).Variational data assimilation, such as the four-dimensional variational scheme (4-D-Var), allows one to combine observational data with a model.It uses derivative code (i.e. the adjoint of the model) for the optimization of the parameters and therefore requires the model to be differentiable with respect to all parameters.Although the 4-D-Var approach is very efficient in most cases, the optimization might not always converge in time or might only identify a local minimum.These issues arise due to the complexity and non-linearity of state-ofthe-art TEMs and the potentially high-dimensional parameter space.
In this contribution we address the convergence issue of the optimization scheme in the 4-D-Var approach as used in the Carbon Cycle Data Assimilation System (CCDAS) (Rayner et al., 2005) and propose a new concept for deriving the posterior PDF for parameters and target quantities in a global TEM.Although we focus only on one TEM, the Biosphere Energy Transfer and Hydrology scheme (BETHY) (Knorr, 2000), the approach is universal and can be applied to any other model.The main idea is that we only optimize a subset of parameters, i.e. those controlling the heterotrophic respiration (in the following called soil carbon parameters), because they are best constrained by atmospheric CO 2 concentration observations as demonstrated by Rayner et al. (2005) and Scholze et al. (2007).The prior uncertainties of the remaining parameters, i.e. those related to net primary productivity (NPP), are included through ensemble runs and are therefore not constrained by the observations.This new concept allows us to treat all parameter uncertainties in a consistent way.

Materials and methods
The 4-D-Var data assimilation scheme has been successfully applied within CCDAS to constrain process parameters in a TEM.CCDAS can be used in various modes.For example, in calibration mode it serves as an estimator algorithm for a set of photosynthesis, autotrophic and heterotrophic respiration process parameters by using automatically generated adjoint code (first derivative) for parameter optimization.In Hessian mode, the Hessian model code (second order derivative) is used for estimating posterior parameter uncertainties.As its ecosystem model, CCDAS uses the BETHY model, which simulates carbon assimilation and soil respiration within a full energy and water balance and phenology scheme.Calculated fluxes are then mapped to atmospheric concentrations using the atmospheric transport model TM2 (Heimann, 1995).
The CCDAS framework has been previously described in detail by Scholze (2003) and Rayner et al. (2005).Therefore, we provide only a brief summary.The data assimilation in CCDAS is performed in two steps: In the first step, the full BETHY model is used to assimilate global monthly fields of the fraction of Absorbed Photosynthetically Active Radiation (fAPAR) for optimizing parameters controlling soil moisture and phenology (Knorr and Schulz, 2001).In the second step, soil moisture and leaf area index (LAI) fields are provided as inputs for a reduced version of BETHY, in the following referred to as CarbonBETHY.This version is used to assimilate atmospheric CO 2 concentration observations from a large number of observation stations for optimizing photosynthesis and soil carbon parameters and to derive their posterior uncertainties (Rayner et al., 2005;Scholze et al., 2007).

Data assimilation
The Bayesian approach (Tarantola, 1987(Tarantola, , 2005) ) provides a consistent framework for constraining model parameters x against observations c.This framework enables us to combine the prior probability distribution of the parameters P (x) with the probability distribution of the observations given the parameters P (c|x) in order to determine the inverse (posterior) probability distribution of the parameters given the observations P (x|c): The factor 1/A is a normalisation constant and independent of the parameters x.In many cases a normal distribution is assumed for the prior parameter values and the observations.This Gaussian assumption has the advantage that only the mean and covariance have to be provided in order to describe the prior probability distribution of each variable.Applying a normal distribution to Eq. (1) leads to the following expression: where c M = M(x) are the modelled observations.The covariance matrices C c and C x 0 express the uncertainty of the observations c and the model parameter priors x 0 , respectively.We are usually interested in the maximum of P (x|c), which will give us the most likely set of parameter values.This can be found in two ways: we can either maximize Eq. ( 2) using Monte Carlo inversion, or we can minimize the negative exponent of Eq. ( 2) using variational data assimilation.
The cost function J (x) describes the mismatch between the observations and their modelled equivalents and the mismatch between the parameters and their priors.The data assimilation in CCDAS is based on the 4-D-Var scheme.In our case, the vector of observations, c, represents monthly atmospheric CO 2 concentrations measured at 41 remote monitoring stations (GLOBALVIEW-CO 2 , 2004).CarbonBETHY is used to calculate surface fluxes which are then mapped via the atmospheric transport model TM2 to atmospheric concentrations c M .Since BETHY calculates only the natural land-atmosphere fluxes, we have to add land use change as an external flux as described in Rayner et al. (2005).Background fluxes for fossil fuel emissions are based on the flux magnitudes from Boden et al. (2009) as described in Scholze et al. (2007).The spatial flux pattern and the magnitude of ocean CO 2 exchange is taken from Takahashi et al. (1999) with estimates of inter annual variability taken from Le Quéré et al. (2007).
The parameter vector x contains the photosynthesis and soil carbon parameters in CarbonBETHY with their prior values represented by x 0 (see Tables S1 and S2).A quasi-Newton method, the Broyden-Fletcher-Goldfarb-Shanno (BFGS) variant of the Davidon-Fletcher-Powell (DFP) formula (Fletcher and Powell, 1963;Press et al., 1996), is used for the minimization of the cost function, which requires the calculation of the gradient of J with respect to the control parameters x in each iteration.All derivative code is generated from the model's source code using the tool Transformation of Algorithms in Fortran (TAF) (Giering and Kaminski, 1998;Kaminski et al., 2003).

Challenges
The simultaneous optimization of the photosynthesis and soil carbon parameters in CCDAS as described in the previous section can be challenging due to slow convergence or failure of convergence.Even if a convergence requirement is fulfilled (i.e.test for convergence on x), the gradient of the cost function may not be sufficiently close to zero at the final convergence point in parameter space.As a consequence the Hessian is not positive definite (i.e.contains negative eigenvalues), which indicates that an exact minimum has not been found.This has been noted by Rayner et al. (2005) where, in order to derive the posterior parameter uncertainties, the Hessian had to be modified manually.Another concern is that due to the large input space dimension and the fact that the BETHY model is highly non-linear, it is likely that we only identify a local minimum with CCDAS.
The study by Ziehn et al. (2011a) has revealed that the performance of the optimization in CCDAS can be significantly improved if only the soil carbon parameters are constrained with atmospheric CO 2 concentration data, while all parameters controlling NPP were kept fixed.Earlier studies with CCDAS (Rayner et al., 2005;Scholze et al., 2007) confirmed that NPP-related parameters (i.e.photosynthesis parameters) are constrained relatively little by the assimilation of CO 2 concentration observations.Most importantly it could be shown in the study by Ziehn et al. (2011a) that an ensemble run of optimizations, starting each of them in a different point, identified only one minimum in the physical parameter space, which gives us additional confidence that the global minimum has been found.Additionally, the gradient in the cost function minimum was very close to zero and the Hessian was positive definite so that no manual modification of the Hessian was required.Although the overall performance of the optimization has improved significantly, there is one drawback of the technique presented in their work: the uncertainties in the NPP-related parameters have not been included, which means that estimated uncertainties of the soil carbon parameters and diagnostics were only a lower bound.Therefore, we propose a new concept, that treats the uncertainties in the photosynthesis parameters via ensemble runs and optimizes the soil carbon parameters using the adjoint optimization approach within CCDAS.

Concept and test case
A flow chart of the concept developed in this contribution is presented in Fig. 1.In a first stage, ensemble runs are performed using CarbonBETHY by varying the NPP-related parameters randomly according to a normal distribution defined by their prior mean and standard deviation (see Table S1).All other parameters (i.e.soil carbon parameters) are kept fixed.Here, we use a sample size of N = 200.We also perform one additional forward run, referred to as base case, where all NPP controlling parameters are set to their prior mean.
CarbonBETHY is driven by observed climate data over 25 yr for the period 1979 to 2003.Global vegetation is mapped onto 13 different plant functional types (PFTs) and each grid cell can contain sub-areas (sub-grid cells) with up to three different PFTs with their amount specified by each PFT's fractional cover.CarbonBETHY is run on a 2 • × 2 • grid with 3462 land grid cells (excl.Antarctica).
Each ensemble run (including the base case) provides a monthly field of NPP, which is used as an input field in the second stage.Here, we apply CCDAS to optimize only the soil carbon parameters, using atmospheric CO 2 concentration as observations.Most of the soil carbon parameters are globally valid (i.e. they have the same value in each of the grid cells), only the carbon balance parameter β is differentiated by PFT and region (Ziehn et al., 2011a).In addition to the 13 PFTs (Fig. S1 and Table S3) we also consider 6 different regions (Fig. S2 and Table S4), which results in a set of 73 parameters (67 βs + 5 global parameters + 1 offset).For each NPP input field we obtain a different set of optimal soil carbon parameters including their uncertainties.We can then propagate the posterior uncertainties for those parameters to any output target quantity of interest.This is done by making use of the Jacobian (first order derivative) of the BETHY  model within CCDAS (Scholze et al., 2007).Thus we obtain uncertainty estimates and covariances for output target quantities, such as the net ecosystem productivity (NEP) calculated as where R S,s and R S,f are the respiration fluxes from the slowly and rapidly decomposing soil carbon pools, respectively.In a third stage, we superimpose the posterior PDFs for the soil carbon parameters and the output target quantity in order to obtain their final PDF, which also accounts for the prior uncertainties in the NPP-related parameters.The calculation of the final PDF p(y) for the output target quantity y is given by the following equations: where N is the ensemble size and µ i and σ i are the mean and standard deviation for each individual output y i .Individual PDFs described by µ i and σ i have a normal distribution.In practice, we discretize those PDFs using a step length of 1 × 10 −4 PgC and then calculate the sum over all discrete points divided by the total number N of PDFs (ensemble size).In this way we obtain the final PDF as described by Eqs. ( 6) and ( 7), which can be non-Gaussian.The calculation of the final (superimposed) soil carbon parameter PDF is performed in a similar way.

Results and discussion
The optimization within CCDAS (data assimilation in stage 2, see Fig. 1) reached convergence for 198 out of the 200 ensemble members and required about 1700 iterations on average.However, of the 198 successful optimization runs we had to exclude further 28 runs, where either the gradient in the cost function minimum was not sufficiently small enough (i.e.greater than 1 × 10 −3 ) or the optimal (posterior) parameter set contained non-physical parameter values.For physically meaningful results we require here that all parameters are positive, and some parameters that respresent fractions have to fall between 0 and 1.However, the optimal set of parameters derived by CCDAS may contain values outside those defined ranges and we therefore have to exclude the corresponding runs.This leaves us with 170 sets of optimal soil carbon parameters, which were obtained by using 170 different NPP input fields (ensemble runs in stage 1, see Fig. 1).A time series of global mean NPP including error bars is shown in Fig. 2a.A list of the posterior parameter values for the five global parameters including their uncertainties is presented in Table 1, the values for the parameter β for each PFT and region and the offset (global atmospheric CO 2 concentration at the beginning of the optimization period) are presented in Table S2.Note that we distinguish between model parameters (physical domain) and parameters as used by the optimization in CCDAS (normalized domain).For most of the parameters we assume a log-normal distribution to ensure positive values as discussed above, which results in the asymmetry shown in Table 1 and Table S2.However, after suitable transformation, all parameters follow a Gaussian distribution in the normalized domain.Results are only discussed in the physical domain and in the following, we focus only on the five global parameters.The optimal values for the temperature sensitivity of respiration for the fast and slow carbon pools (Q 10,f and Q 10,s ) for the base case are close to their prior values and within the prior uncertainty range.The soil moisture dependence parameter κ is reduced from its initial value, but is also within its prior uncertainty range.The Geosci.Model Dev., 4, 1011Dev., 4, -1018Dev., 4, , 2011 www.geosci-model-dev.net/4/1011/2011/T. Ziehn et al.: Ensemble-adjoint optimization 101519801981198219831984198519861987198819891990199119921993199419951996199719981999 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989   optimized parameter values for the fast pool turnover time, τ f , and the fraction f s of the decomposition flux going from the fast to the long-lived soil carbon pool are much larger than their priors and both outside the prior uncertainty range.
All five global parameters are well constrained by the CO 2 data, shown by the small posterior uncertainty in the base case.The posterior mean values for all soil carbon parameters are very similar in both cases (base case and superimposed case), showing that the mean values are not heavily effected by changes in the NPP-related parameters.
Our target output quantity is global mean NEP, for which a time series is shown in Fig. 2b.In the following we focus on global mean NEP for the 1980s and 1990s.The PDFs for those quantities are presented in Fig. 3.We obtain the final PDF by superimposing the 170 individual PDFs (Eq.7) from each optimization run to account for both, the posterior soil carbon parameter uncertainties and the prior uncertainties in the NPP-related parameters.The superimposed PDF is not necessarily Gaussian.However, skewness and kurtosis of the distribution for the case of the 1990s (Fig. 3b) indicate that the assumption of a normal distribution is a good approximation.The mean values for our target quantities are nearly identical for the base case and the superimposed case, showing again that the NPP-related parameters have little effect on the mean values.The uncertainties for the target quantities, however, increase by more than 50 % for the 1980s and by more than 100 % for the 1990s using the ensemble-adjoint method.
According to Denman et al. (2007) the terrestrial carbon sink removed −1.7 Pg C yr −1 (range: −3.4 to +0.2 Pg C yr −1 ) during the 1980s and −2.6 Pg C yr −1 (range: −4.3 to −0.9 Pg C yr −1 ) during the 1990s from the atmosphere.The results from our study match the mean values well, with a carbon flux of −1.83 Pg C yr −1 (range: −1.84 to −1.82 Pg C yr −1 ) for the decade of the 1980s and −2.55 Pg C yr −1 (range: −2.57 to −2.54 Pg C yr −1 ) for the decade of the 1990s.However, the uncertainties of our results are small in comparison to those from Denman et al. (2007).One reason for this is the large number of negative entries for individual years in the error covariance matrix of global mean NEP for the 1980s and 1990s.
The covariance between the flux uncertainties can be expressed via the uncertainty correlation matrix of diagnostics, R d , which is defined as follows: where C i,j d is element i,j of the uncertainty covariance matrix of the diagnostics (global NEP per year), and σ i the posterior uncertainty of parameter i derived from the diagonal elements C i,i d of the matrix C d .Figure 4 shows the correlation matrix for global mean NEP for the 1980s and the 1990s.Due to the large number of negative correlations the overall uncertainty for global mean NEP over the 10 yr period (1980s and 1990s) is rather small.However, the uncertainty for global mean NEP for a single year, for example the year 1990, is by at least a factor of two larger then global mean NEP for the 1990s in the base case and increases by the ensemble-adjoint method by more than a factor of four.

Conclusions
The ensemble-adjoint optimization approach presented here allows us to treat all parameter uncertainties in a TEM in a consistent way.Some parameters are constrained against data using the 4-D-Var data assimilation scheme, whereas the uncertainties of the remaining parameters are included via ensemble runs.In this way we optimize only those parameters which are constrained best by the observations used in the 4-D-Var step, but retain full error propagation from parameters to diagnostics.This has the advantage that fewer parameters and processes are involved within the optimization process, which, in turn, speeds up the convergence of the optimization.We are also more confident that we find the global minimum in the reduced parameter space.
In this study we have illustrated the usefulness of the ensemble-adjoint optimization approach by including prior uncertainties of model parameters (here the NPP-related parameters) that have not been constrained by the atmospheric CO 2 data, to derive a full probability density function on the model's target output quantities.For future applications, the proposed concept also allows the inclusion of posterior uncertainties for the remaining, yet unconstrained parameters.Ziehn et al. (2011b) have demonstrated how to constrain the parameters of the Farquhar et al. (1980) photosynthesis model using an extensive set of plant traits and therefore provide a way on how to derive the posterior PDF for the NPPrelated parameters.Those results could potentially been used within the same ensemble-adjoint optimization framework.We would only need to replace the prior mean and uncertainties for the NPP-related parameters with the derived posterior mean and uncertainties for the same parameters.In this case all parameters, NPP-related parameters and soil carbon parameters, would be constrained by observational data.

Fig. 2 .Fig. 2 .
Fig. 2. Time series of (a) the global mean net primary productivity (NPP) and (b) the global mean net ecosystem productivity (NEP).For (a) the median and error bars are calculated from the 170 NPP fields for each year, which are then used as inputs for CCDAS.For (b) the median and error bars are based on the final NEP PDF for each year.Error bars represent the lower and upper percentiles equivalent to one standard deviation (i.e.15.9th percentile and 84.1th percentile respectively).

Fig. 3 .
Fig. 3. PDFs for global NEP for (a) the 1980s and (b) the 1990s.Blue: 170 individual PDFs, red: base case PDF (which used prior photosynthesis parameters and was not part of the ensemle), green: superimposed PDF from the 170 individual PDFs.

Fig. 3 .
Fig. 3. PDFs for global NEP for (a) the 1980s and (b) the 1990s.Blue: 170 individual PDFs, red: base case PDF (which used prior photosynthesis parameters and was not part of the ensemle), green: superimposed PDF from the 170 individual PDFs.

Table 1 .
Prior and posterior parameter values including their uncertainties for five global parameters.Upper and lower percentiles equivalent to one standard deviation are given (i.e.µ − σ is equivalent to the 15.9th percentile and µ + σ is equivalent to the 84.1th percentile).The relative reduction in uncertainty (Red.)related to +σ is also shown.Prior and posterior parameter values for the β parameters are provided in TableS2.Units: τ f , years; all others unitless.