Optimization of a prognostic biosphere model for terrestrial biomass and atmospheric CO 2 variability

This study investigates the capacity of a prognostic biosphere model to simulate global variability in atmospheric CO2 concentrations and vegetation carbon dynamics under current environmental conditions. Global data sets of atmospheric CO2 concentrations, above-ground biomass (AGB), and net primary productivity (NPP) in terrestrial vegetation were assimilated into the biosphere model using an inverse modeling method combined with an atmospheric transport model. In this process, the optimal physiological parameters of the biosphere model were estimated by minimizing the misfit between observed and modeled values, and parameters were generated to characterize various biome types. Results obtained using the model with the optimized parameters correspond to the observed seasonal variations in CO 2 concentration and their annual amplitudes in both the Northern and Southern Hemispheres. In simulating the mean annual AGB and NPP, the model shows improvements in estimating the mean magnitudes and probability distributions for each biome, as compared with results obtained using prior simulation parameters. However, the model is less efficient in its simulation of AGB for forest type biomes. This misfit suggests that more accurate values of input parameters, specifically, grid mean AGB values and seasonal variabilities in physiological parameters, are required to improve the performance of the simulation model.


Introduction
The terrestrial biosphere generally absorbs CO 2 from the atmosphere, and its annual global carbon uptake rate is considered to be similar to that of the ocean (Tans et al., 1990).Many studies have attempted to accurately quantify the total carbon exchange rate between the terrestrial biosphere and the atmosphere and to determine the role of the terrestrial biosphere in the global carbon cycle (e.g., Schimel, 1995;Field et al., 1998).Modeling of the terrestrial biosphere is one of the key strategies used in these studies (e.g., Potter et al., 1993;Running and Hunt, 1993;Ito and Oikawa, 2002).
Biosphere models use a variety of approaches and representations to predict the processes of ecosystem carbon dynamics.Therefore, quantities in the carbon budgets derived from these models differ.Cramer et al. (1999) compared 17 biosphere models and found that global terrestrial net primary productivity (NPP) calculated by the models ranged from approximately 40 to 65 Pg C yr −1 and that the models yielded different NPP spatial distributions.Even at regional scales, Ichii et al. (2010) reported large differences in the magnitudes of annual gross primary productivity (GPP) and ecosystem respiration generated by nine simulation models.The discrepancies revealed by the systematic comparisons between different biosphere models indicate that the current biosphere models are still in need of improvement (Friedlingstein et al., 2006;Jung et al., 2007;Sitch et al., 2008).
Model-data synthesis is one approach to addressing the problem of discrepancies between models because such a Published by Copernicus Publications on behalf of the European Geosciences Union.
synthesis reduces uncertainties and optimizes controlling processes in a model to improve its fit to observed data.This method is widely applied in terrestrial biosphere modeling, as well as in atmospheric and oceanic modeling, based on various a priori information.Examples include modeling approaches to the estimation of residence times of carbon in soils and plants in regional areas (Barrett, 2002;Zhou and Luo, 2008), the fit of eddy covariance flux variables at point scales (Braswell et al., 2005;Sacks et al., 2006), the impact of seasonal water stress on ecosystem gas exchange (Reichstein et al., 2003), and atmospheric inversion schemes (Ciais et al., 1995;Enting et al., 1995;Bousquet et al., 2000;Peylin et al., 2005).An overview of model-data synthesis methods in terrestrial carbon studies and the improvements required in such methods have been discussed in several previous studies (Raupach et al., 2005;Wang et al., 2009;Luo et al., 2011).
Using a model-data synthesis approach, Kaminski et al. (2002) introduced a method to systematically infer optimal model parameters and applied the method to the assimilation of observed atmospheric CO 2 concentrations into an integrated terrestrial biosphere-atmospheric transport model.Atmospheric CO 2 concentrations reflect the distributions of CO 2 sources and sinks at various spatial and temporal scales, rather than those distributions restricted to small-scale footprints.In this way, these inputs are more likely to yield reliable results when applied to an optimized global biosphere model.In these studies, two parameters, light use efficiency and the temperature sensitivity of respiration (Q 10 ) were individually assessed for 12 biomes, which resulted in good simulations of the phases and amplitudes of seasonal atmospheric CO 2 at the observation sites.Rayner et al. (2005) further developed the method of Kaminski et al. (2002) by replacing the biosphere model with a more mechanistic approach and optimizing more of the model parameters, in what is known as the "Carbon Cycle Data Assimilation System" (CCDAS).
To date, studies have focused predominantly on CO 2 fluxes associated with atmospheric CO 2 variability, and less effort has been directed toward the effects of parameter optimization in simulations of ecosystem variability in carbon pools or in plant materials.However, the importance of accurate estimates of the forest carbon pools, as well as atmospheric CO 2 variability, is increasingly recognized in scientific and political circles (e.g., Carvalhais et al., 2010;Saatchi et al., 2011), as such estimates are required to quantify the CO 2 emissions attributable to deforestation and forest degradation, which account for about 12 % or more of total anthropogenic CO 2 emissions (van der Werf et al., 2009).Improvements in the estimates of forest carbon pools and atmospheric CO 2 variability are necessary for assessing the global carbon balance and the role of the terrestrial biosphere in the state of the current climate.
Here, we develop an optimization scheme for a global biosphere model for atmospheric CO 2 variability, above-ground biomass (AGB), and NPP data, combined with an atmospheric transport model and a Bayesian inversion method.Our primary goal is to improve current assessments of the terrestrial carbon cycle under current climatic conditions and atmosphere-biosphere interactions.For this purpose, we focus on constructing a model that is capable of comprehensively simulating the carbon dynamics of natural vegetation, vegetation structure, and atmospheric CO 2 variability on a global scale.

Terrestrial biosphere model
A prognostic biosphere model, the Vegetation Integrative SImulator for Trace gases (VISIT; Ito, 2010) is used in this study to simulate the global terrestrial carbon cycle.All variables in VISIT are calculated at a 2.5 • × 2.5 • spatial resolution and with a time step of one day.The global vegetation types in the model are classified into 14 biomes (Fig. 1), and a map of their distributions is produced using the synergetic land cover product (SYNMAP) (Jung et al., 2006).The reanalysis/assimilation data sets released by the Japan Meteorological Agency (JMA) and the Japan 25-year reanalysis (JRA-25)/JMA Climate Data Assimilation System (JC-DAS) (Onogi et al., 2007) are used to force the VISIT simulation.The JRA-25 reanalysis covers the period from 1979 to 2004, and JCDAS covers the period from 2005 to 2009.The meteorological data used to operate VISIT are downward shortwave radiation, total cloudiness, air temperature, ground surface temperature, soil temperatures, specific humidity, precipitation, and wind velocity.The precipitation bias in JRA-25/JCDAS is corrected based on the study of Saito et al. (2011).We use the climate data for the 31-year period from 1979 to 2009 for both spin-up and model simulations.In the spin-up simulation, the 31-year climate data are repeated over 4000 years until carbon pools reach an equilibrium state.
In the model, plant physiology and the carbon budget are simulated in each grid cell.Net ecosystem productivity (NEP) is defined as follows: where HR is the heterotrophic respiration, AR is the autotrophic respiration, and NEE is net ecosystem CO 2 exchange rate.The daily GPP rate is expressed using the dry-matter production theory of Monsi and Saeki (1953): where (= 4.32 × 10 −4 ) is a unit converter from µmol CO 2 m −2 s −1 to Mg C ha −1 day −1 ; DL is day length (h), LAI is a leaf-area index estimated as a function of the given specific leaf area, SLA (cm 2 g −1 ); P c is the single-leaf photosynthetic rate (µmol CO 2 m −2 ); P sat is the single-leaf photosynthetic rate under light saturation (µmol CO 2 m −2 ); K a is a dimensionless function of solar height; Q e is light use efficiency (mol CO 2 mol −1 photon); and PPFD mid is the photosynthetic photon flux density (PPFD) at the canopy top at midday (µmol photon m −2 s −1 ).The parameter PPFD mid is estimated from the equation of Kuroiwa (1966) with inclusion of the cloudy decline effect.Seasonal variations in P sat are assumed to be dependent on the ground surface temperature (T g ; • C), the intercellular CO 2 concentration (C i ; ppmv) (which is estimated from the ambient CO 2 concentration), and soil moisture ( ; mm), according to where P max is the potential maximum value of P sat ; F is the coefficient function used to calculate seasonal variations in P sat ; F tmp is formulated using the maximum, minimum, and optimum temperatures for photosynthesis (T max , T min , and T opt , respectively); and F stl and F nstl are formulated using the photosynthesis-limiting factors for intercellular CO 2 concentration (kmci) and soil moisture (km_nstl), according to where CD cmp is the CO 2 compensation point (Brooks and Farquhar, 1985), FP sw is the soil water holding capacity (mm) (given by soil property data), and C stl and C nstl are constant coefficients.
Parameter AR represents both the growth and maintenance respiration from foliage, stem and branch, and root components.The growth respiration (AR G ) of each component is the cost to produce new biomass, given as where rg X is the specific growth respiration rate, TP X is the carbon translocation rate, and the subscript X represents the different components: foliage (FL), stem and branch (SB), and roots (RT).In contrast, the maintenance respiration (AR M ) is represented as a function of T g : where rm X is the specific maintenance respiration rate for each component and M X is the carbon mass for each component.
The parameter HR represents respiration rates in two layers: the litter (HR L ) and humus (HR H ) layers.Both HR L and HR H are calculated as functions of the soil temperature at two depths, namely 10 cm (T s10 ) and 200 cm (T s200 ), and in the upper and lower soil layers: where shl and shm are the specific heterotrophic respiration rates and M L and M H are the carbon masses of the organic matter in the litter and humus layers, respectively.The annual mean value of T s10 is used as a substitute for T s200 because JRA-25/JCDAS provides the soil temperature at depth.In VISIT, the amount of litter fall (LF) is assumed to be proportional to the carbon mass of each component (foliage, stems and branches, and roots), according to where lf X is the specific litter fall rate.
Parameter AGB is the sum of the carbon mass in foliage M FL and stems and branches M SB .Changes in M X for a given period are given by The value of AGB is estimated by accumulating M FL and M SB on a given initial AGB for both spin-up and model simulations.
In this study, the prior values of the parameters are the same for all biome types (Table 1).The posterior parameters are estimated by Bayesian inversion.An average sensitivity of each parameter to three variables (NEE, AGB, and NPP of whole biomes) was estimated using 10 % change in the prior parameters and a 2 • C variation in temperature parameters, using first-order sensitivity analysis.From this analysis the 12 parameters with the highest sensitivities (T min , SLA, lf FL , P max , shl, T opt , Q e , TP FL , km_nstl, shm, Q 10 , and lf SB ) were selected to optimize the fit of the model simulations to the observations (Table 1).These parameters explain 90 % of the variations in the three variables for whole biomes.Uncertainties associated with these parameters are unknown; thus, they are defined using the results of the sensitivity analysis.The sensitivity of lf SB , which shows the lowest sensitivity (1.9 %) of the selected 12 parameters to changes in the three variables, is defined as a reference value for parameter uncertainty; i.e., it is assumed that all parameters potentially have uniform sensitivity to the three variables.A scale factor is calculated for each parameter using the reference sensitivity of 1.9 % divided by the sensitivity of the parameter; then, an uncertainty of the parameter is defined by multiplying by 10 % of the prior parameter, or 2 • C, using a scale factor (Table 1).Here we assume a linear relationship between parameter sensitivity and changes in the parameters.

Atmospheric tracer transport model
Seasonal variability in atmospheric CO 2 concentrations at the observation sites is simulated using the National Institute for Environmental Studies, Frontier Research Center for Global Change (NIES/-FRCGC) off-line global atmospheric tracer transport model (NIES-TM; Maksyutov et al., 2008).The NIES-TM is one of the transport models evaluated in the Atmospheric Transport Model Intercomparison Project (TransCom; Law et al., 1996;Gurney et al., 2002).The vertical turbulent diffusion in the boundary layer of NIES-TM is parameterized using the monthly mean planetary boundary layer (PBL) heights derived from a 3-hourly PBL height data set (Schubert et al., 1993).
We operate the NIES-TM with a 2.5 • × 2.5 • grid resolution and nine vertical levels, with forcing data from JRA-25/JCDAS.The oceanic CO 2 flux, fossil fuel emission inventory, and NEE in the terrestrial biosphere are required a priori information to run the NIES-TM.In this study, the ocean flux is derived from the Offline ocean Tracer Transport Model (OTTM; Valsala and Maksyutov, 2010), and the fossil fuel emission data are from the Open-source Data Inventory of Anthropogenic CO 2 emissions (ODIAC; Oda and Maksyutov, 2011).We run the NIES-TM for a 6-year period (2000)(2001)(2002)(2003)(2004)(2005).The data for the first year are used for the spin-up simulation, and the data for 2001-2005 are used in the analysis.Trends in simulated atmospheric CO 2 concentrations with periods longer than around 1.5 years and shorter than about 15 days are removed using an orthogonal wavelet decomposition (Howell and Mahrt, 1994).Then, the mean monthly atmospheric CO 2 is calculated.

Bayesian inversion
We optimize a set of biophysical parameters in VISIT against mean monthly atmospheric CO 2 and mean annual AGB and NPP.To derive an optimal parameter set m, the deviations of the model estimates from the observed data are minimized using a Bayesian inversion scheme.When the probability distributions of all observed and a priori biophysical parameters are assumed to be Gaussian, and their means are d obs and m p , respectively, the misfit between d obs and the modeled values G(m) for atmospheric CO 2 , AGB, and NPP is defined by a cost function (Tarantola, 2005): where the superscript T denotes the transpose, and C D and C M are the covariance matrices defining the uncertainties in d obs and m p , respectively.The values of C D are described in the following section.The model result G consists of a linear atmospheric transport model A and a nonlinear terrestrial biosphere model B. In the minimization of S(m), in the case of atmospheric CO 2 , G is determined by the first derivatives of B(m p ), which, using finite differences m, is where a linear relationship is assumed between the first derivative of B and m p for m.In the cases of NPP and AGB, the operator G is determined by Eq. ( 15) without multiplication by A. The center of the posterior Gaussian distribution, which shows the optimized values for the model parameter m, can be expressed following Tarantola (2005) as: The expression in Eq. ( 16) is inverted using single-value decomposition.
The optimal parameter set m is given by minimizing S(m) iteratively.In this process, the values of C D and C M are fixed, while the value of m p for all parameters evolved at each iteration by taking a posterior m as a prior m p .The minimization of S is assessed by calculating χ 2 , which is the mean square mismatch between d obs and G(m), expressed as The value of χ 2 is calculated for the entire data set of atmospheric CO 2 , AGB and NPP.The iteration for the optimization is aborted when the value of χ 2 between the current and previous iterations is less than 0.1 and the difference is continuously maintained or decreases in further iterations.

Data
The atmospheric CO 2 data from GLOBALVIEW-CO 2 (2010) are used as the observation data for CO 2 concentrations.The GLOBALVIEW-CO 2 product is based on atmospheric measurements and provides information about CO 2 variability at over 100 sampling locations around the globe.
The CO 2 variability is stored in a file named "extended records" (ext), which records weekly smoothed atmospheric concentrations, and which is used in the following analysis.Mean monthly CO 2 variability is calculated from the ext data in the same manner as that for simulated atmospheric CO 2 (Sect.2.2).The standard error of CO 2 variability, recorded in a file named "statistical summary of the average seasonal pattern", is used as a measure of the uncertainty in the observed CO 2 in C D in Eqs. ( 14) and ( 16).We analyze sampling locations at 102 sites, including at 19 sites with multiple vertical CO 2 observations sampled by aircraft (Fig. 1).For those sites with vertical profiles, partial-column CO 2 concentrations are estimated from the vertical profiles by assuming that the CO 2 concentration at each observation height represents the concentration at altitudes centered around the observation height.We analyze this partial-column concentration instead of individual CO 2 concentrations at multiple heights.AGB data from the International Institute for Applied Systems Analysis (IIASA; Kindermann et al., 2008) are used in this study to optimize AGB in the biosphere model.The IIASA data provides half-degree global biomass and carbon stock data estimated from a downscaling model on the basis of a map provided by the Global Forest Resources Assessment.The data for AGB per hectare are calculated from the gridded AGB data divided by the grid area data.To reduce discrepancies attributable to differences in the dominant biome in each grid between the model and IIASA, the grid resolution for AGB is expanded to 7.5 • × 7.5 • .Therefore, the IIASA data with a grid resolution of 0.5 • × 0.5 • and the VISIT data with a resolution of 2.5 • × 2.5 • are aggregated to the desired resolution.Here, the mean AGB values in VISIT for the 10-year period of 2000-2009 are used for the analysis.We compare AGB in 344 grids over the global terrestrial area.
The third data set contains the NPP data from the Global Primary Production Data Initiative (GPPDI; Scurlock et al., 1999;Olson et al., 2001).The GPPDI data set consists of three classes of NPP data: NPP measurements at intensively studied sites with site-specific information, NPP measurements at extensive sites with less site-specific information, and gridded NPP data with a grid resolution of 0.5 • compiled from collections of inventory, modeling, and remote-sensing data.All of the NPP data from the three classes are aggregated into a 7.5 • grid cells, resulting in 166 grid cells being analyzed in this study.As for AGB, the NPP data in VISIT are averaged over the period 2000-2009 and are aggregated into 7.5 • grid cells.
Global maps of AGB and NPP include various estimating errors, including scale mismatches between measurements and models, measurement errors, and representation errors, leading to large uncertainties in C D (Raupach et al., 2005).This study defines the uncertainties in the observation data of AGB and NPP as 50 % of the 7.5 • grid mean values, based on the errors in relative estimates of annual NPP given in Raupach et al. (2005), with minimum uncertainties of 1.0 Mg C ha −1 and 0.5 Mg C ha −1 yr −1 for AGB and NPP, respectively.

Atmospheric CO 2 simulation
The prior value of χ 2 that is weighted by number of data points for atmospheric CO 2 , AGB, and NPP is 7.80.This value decreases with each iterations, and the iteration is aborted after 10 repetitions because no large changes in χ 2 are subsequently observed.The final value of weighted mean χ 2 after 10 iterations for the whole data set is 3.93; χ 2 = 4.83 for atmospheric CO 2 , 1.46 for AGB, and 1.90 for NPP.The large value of χ 2 for atmospheric CO 2 can be partly attributed to the combined effect of minor uncertainties in the observations.
Atmospheric CO 2 variability is estimated globally, and mean monthly concentrations are compared with observed data from GLOBALVIEW-CO 2 at 102 stations.The result estimated from the posterior NEE is in good agreement with observations for the large domain and seasonality (r 2 = 0.86, P < 0.001; Fig. 2).The slope of the linear regression indicates that the modeled CO 2 variability is slightly underestimated, with an offset of 0.2 ppm.The misfit of mean monthly atmospheric CO 2 between observations and model values is defined at each station as , where N mnt is the number of months, N mnt = 12.The value of χ 2 N at each station varies from 0.16 to 52.75, with a mean χ 2 N of 4.90, and the annual mean value of the root mean squared error (RMSE) varies from 0.11 to 4.71 ppm, with a mean RMSE of 1.46 ppm.The values of χ 2 N show large misfits (χ 2 N > 10) at 11 stations; misfits of 2-3 ppm in atmospheric CO 2 in months with minor observation uncertainties result in the large χ 2 N value.Such observation uncertainties occur at stations located on islands, peninsulas, bare ground, complex terrains, and oceans, where the biospheric CO 2 flux exerts less influence on atmospheric CO 2 variability, or where local mesoscale motions control the variability.Improvements in the atmospheric CO 2 simulation at these stations is necessary to improve not only the biospheric model, but also the atmospheric transport and ocean models; thus, the 11 stations at which χ 2 N > 10 are excluded from the following analysis.Consequently, the mean values of χ 2 N and RMSE for the whole data set become 3.02 and 1.36 ppm, respectively.
Figure 3 shows examples of the mean seasonal variations in atmospheric CO 2 at the University of Maine, Argyle, Maine, USA (AMT; 45.03 • N, 68.68 • W) and Tierra Del Fuego, Ushuaia, Argentina (TDF; 54.85 • S, 68.31 • W).The biome types at the AMT and TDF stations are classified as mixed forest and closed shrubland, respectively, in the model.The annual mean χ 2 N and RMSE values for the observed and posterior data are 3.59 and 2.01 ppm, respectively, at AMT and 0.17 and 0.11 ppm, respectively, at TDF.The peak of photosynthetic uptake in August at AMT is slightly underestimated in the model, resulting in a discrepancy in the position of the peak between the observations and the model (Fig. 3a).An improvement can be seen in the results for the posterior parameter as compared with the prior parameter, although the CO 2 variability for some months is still outside the domain of the uncertainties in the observed data.At the TDF station, the prior concentration shows overestimates of both the photosynthetic uptake in summer and the respiration release in winter (Fig. 3b).The posterior concentration reduces these to a difference of within a few tenths of a ppm, and the discrepancy of the CO 2 release peak in the prior data is moderated in the posterior data.
Figure 4 shows a comparison of the mean seasonal amplitudes of atmospheric CO 2 based on the observed and posterior parameters at all stations along a single latitude.The seasonal amplitude at each station is defined as the difference between the maximum and minimum mean monthly concentrations.The posterior simulation shows a representation of the seasonal amplitude with an RMSE of 1.65 ppm for all stations, although the differences in the seasonal amplitude between the observation and the model are significant (P < 0.001).Erroneous overestimates (> 5 ppm) in the mean amplitude occurs at two stations in the Northern Hemisphere: Sary Taukum, Kazakhstan (KZD; 44.08 • N, 76.87 • W), and Schauinsland, Germany (SCH; 48.00 • N, 8.00 • E).The seasonal variability in the posterior parameters at both stations shows large overestimates in the winter concentration and slight overestimates of the maximum photosynthetic uptake, which occur in early summer.Unfortunately, in this study we failed to improve these mismatches and produce a good agreement with observations at other stations.This is partly because of the inability of the global biosphere model to represent NEE variability in the areas around observation points and also because unsuitable biophysical parameters are selected for adjustment in the optimization process.(n = 11), 11.3 ± 3.1 (n = 51), 6.7 ± 1.4 (n = 13), 2.2 ± 1.1 (n = 9), 1.3±0.3(n = 3), and 1.2±0.2ppm (n = 4) for the GLOBALVIEW-CO 2 data and 15.3 ± 2.9, 11.2 ± 3.6, 6.1 ± 1.6, 1.9 ± 0.9, 2.1 ± 0.4, and 1.9 ± 0.1 ppm for the posterior data.

Biomass simulations
The χ 2 N values of mean annual AGB and NPP for the observed and prior data range from 0.00 to 4781 among the grid cells, with a mean of 71.50.This large misfit of AGB and NPP is improved by using posterior values, which give χ 2 N values in the range of 0.00 to 58, with a mean of 1.74.Six of the total of 510 grid cells (at a 7.5 • × 7.5 • resolution) show χ 2 N values much greater than 10 for the posterior data.In these grid cells, the posterior AGB or NPP are overestimated as compared with observations, suggesting that discrepancies exist between observations and the model in some biome types.
In the simulation using the posterior parameters, the mean annual AGB and NPP values are estimated at a spatial resolution of 2.5 • × 2.5 • before being compared with IIASA and GPPDI data for each biome type that had been aggregated to a comparable resolution (Figs. 5 and 6).The biome type for each grid cell is classified using the SYNMAP product (see above).The IIASA data show biome-specific patterns in the median AGB: the biomes showing AGB values of > 30 Mg C ha −1 , and with relatively large variances, include evergreen broad-leaf forest (EBF), deciduous broad-leaf forest (DBF), and mixed forest (MF).Those with AGB values of approximately 20-30 Mg C ha −1 include evergreen needleleaf forest (ENF), deciduous needle-leaf forest (DNF), and woodland (WL); other biome types show AGB values of < 20 Mg C ha −1 (Fig. 5).The simulation using the prior parameters shows greater variability in AGB values for most biome types but does not overestimate AGB in any of the forest biome types relative to the IIASA data.These patterns in the AGB distributions are improved in the simulations using the posterior values.The variations in AGB with respect to biome and the small variations in AGB for each type are reproduced in the model.However, the model using the posterior values underestimates the magnitudes of AGB for forest and woodland biomes (except for DNF), and a significant difference is noted between observations and model results in median AGB for the whole biomes (P < 0.001).The largest difference between model results in terms of AGB is present in EBF; the lower and upper 10 % AGB in the IIASA are equal to 19.5 and 105.5 Mg C ha −1 , respectively, with a median value of 52.4 Mg C ha −1 , whereas those of the model range from 1.2 to 38.5 Mg C ha −1 , respectively, with a median value of 22.3 Mg C ha −1 .
The median NPP for each biome type in the GPPDI ranges from 2 to 6 Mg C ha −1 yr −1 , except for the much higher NPP of EBF and lower NPP of bare ground (BG) (Fig. 6).As in the AGB distributions, the simulation with the prior values shows greater variability in NPP values across biomes relative to observations.The NPP distributions using the posterior values show biome-specific patterns similar to those of AGB.As compared with AGB distributions, no large underestimations in the NPP values of forest type biomes are present; however, there is a significant difference in the median NPP values between observations and the model for whole biomes (P < 0.05).No forest type biomes show NPPs comparable with those of forest type biomes in the GPPDI, suggesting that there may be discrepancies in the dominant biome type in a grid cell between the GPPDI data and the model.

Model parameterization
The differences between posterior and prior parameters, divided by the uncertainties of the prior parameters (Table 1), are shown in Fig. 7.The posterior parameters show various changes relative to the prior parameters.Here, we examine some of the parameters that show clear changes.The largest increases and reductions in the magnitudes of the posterior parameters are observed for the optimum temperature for plant activities T opt .This parameter specifies the optimum temperature for photosynthetic uptake, which directly controls the rate of photosynthesis and its seasonal variability in response to changes in air temperature.The temperature dependence of photosynthesis is strongly influenced by other environmental factors, such as light intensity, intercellular CO 2 pressure, and water and nutrient regimes; however, T opt , which is generally between 20 and 30 • C, represents the adaptation of a plant to the temperature regimes of its habitat (Berry and Björkman, 1980;Monson et al., 1982).The posterior parameters of T opt are approximately observed within 20 and 30 • C, which range from 19.8 to 28.7 • C, with high T opt values for EBF, closed shrubland (CSL), and cropland, and low T opt values for DNF, MF, and tundra biomes.The variability in posterior T opt values among biomes is roughly consistent with the latitudinal distribution that the biomes occupy (Fig. 1): high T opt value with high mean temperatures at low latitudes, and low T opt values with low mean temperatures at high latitudes.The minimum temperature for plant activities, T min , which mainly controls the phases of leafing and senescence and the duration of photosynthetic activity, also shows similar patterns of change in the posterior parameters to those of T opt .These results suggest that variability in biome-specific parameters associated with temperature dependence can be reproduced by our optimization scheme.
The SLA describes the allocation of leaf biomass per unit of leaf area.Values of SLA generally decrease with increasing leaf life span, and the mean SLA values of forbs, broadleaf shrubs, broad-leaf trees, and needle-leaf trees are 188, 112, 112, and 48 cm 2 g −1 , respectively (Reich et al., 1997(Reich et al., , 1998)).Regardless of biome, the SLA is well predicted by the posterior parameters for the corresponding biomes, of 175, 131, 131, and 195 cm 2 g −1 for grassland, CSL, EBF, and ENF, respectively, although the SLA for ENF is obviously a misfit relative to values obtained in previous studies.In our model, SLA directly regulates the variability in GPP, along with changes in the LAI, which is similar to the function of P max , which defines the potential maximum photosynthetic rate (Eq.3).However, SLA is an ecological parameter that is closely related to canopy leaf nitrogen content, photosynthetic capacity, irradiance, and nutrient availability, and the relationship between P max and SLA is non-linear (Pierce et al., 1994;Meziane and Shipley, 1999).Nevertheless, our optimization scheme does not account for the interactions among parameters or between parameters and environmental factors, and parameters that work similarly in the model are not explicitly excluded, resulting in similar changes in posterior parameters between P max and SLA.Consequently, posterior SLA may be partly constrained by changes in P max , leading to a misfit for ENF.This shortcoming in our system may extend to other parameters; therefore, further development of our system is required to include the interactions among parameters and to identify criteria for selecting optimized parameters.
The posterior parameters of temperature dependence of the respiration rate Q 10 , light use efficiency Q e , specific litter fall of stems and branches lf SB , and the photosynthesis limitation factor in terms of soil moisture km_nstl show large reductions in uncertainty relative to those of prior values (Fig. 8).This large reduction in uncertainty indicates that more information is integrated into the model parameters based on observational data.The causes of larger reductions in uncertainty in the four parameters are the relatively large uncertainties in the prior parameters (see Sect. 2.1) and the relatively high sensitivity of atmospheric CO 2 , AGB, and NPP to changes in the parameters.However, very small reductions in uncertainty are observed in all parameters.This is attributed to indirect contributions of each parameter to observational variables and small uncertainties given in the prior parameters, resulting in a difficulty of integrating new information into the parameter (Rayner et al., 2005).
The biosphere model VISIT is designed to be used with biome-specific parameters that have constant values for each biome, many of which are independent of environmental influences.The parameters optimized in this study, such as the carbon translocation rate to foliage (TP FL ) and litter fall (lf X ) rate, have constant values for each biome, indicating that these parameters have been adjusted to represent the central parts of the distributions of the corresponding physiological processes on long-term and regional scales.However, for example, vegetation alters its biomass allocation patterns in response to differences in climate to allow its survival under unfavorable conditions (Callaway et al., 1994;Maherali and DeLucia, 2001), and the amount of litter fall also varies considerably under different environmental conditions (Berg and Laskowski, 2006), but none of these factors are included in the model.Therefore, the model far from adequately represents the physiological processes under all conditions, which might result in large discrepancies in model results from the observed data at some points.If it is possible to describe the response of each parameter to the environmental conditions, a model that includes those variable parameters could improve the simulation of global carbon sequestration and its distribution, although adopting empirical rules also entails the possibility that the uncertainties in the modeling will increase (Saito et al., 2009).
In this study, we selected 12 key parameters in a prognostic biosphere model for model-data synthesis.Caution is required because it might be possible to obtain similar model results with different combinations of model parameters (Wang et al., 2009).Additional observations would be useful in optimizing many more model parameters and enhancing the performance of the model-data synthesis.

Model limitations
Our optimization scheme was tested by running it with only observational data on atmospheric CO 2 concentrations, in the same manner as performed using the three observational variables.The χ 2 value of 7.5 obtained using the prior values decreases to 4.6 after 10 iterations.Relative changes in prior and posterior parameters and the mean annual distributions of AGB and NPP for each biome show very similar patterns to those displayed in Figs.5-7, despite the fact that any information on observational data for AGB and NPP is not integrated into the posterior parameters (not shown).The test results, and the discrepancies between observations and the results of posterior simulations for mean annual AGB and NPP (shown in Figs. 5 and 6), suggest that observational data of AGB and NPP might not contribute much to the minimization of the misfit between the observations and the model.A possible explanation for this observation is that the sizes of the observational uncertainties for AGB and NPP are too large to solve the minimization problem, resulting in an incomplete indication of variability.The large observational uncertainty reflects a lower degree of confidence in the accuracy of the AGB and NPP observations (Raupach et al., 2005), and improving the representation of the AGB and NPP distributions on large spatial scales would, thus, improve the results of the model simulations.
The model does not account for the effects of natural and anthropogenic disturbances, such as those due to fire, land use changes, and grazing and harvesting of forest products and croplands; thus, the mean global budget of NEP estimated from the posterior parameters is approximately 0 Pg C yr −1 on account of the equilibrium state of carbon pools.This result clearly underestimates the current carbon uptake rate of about 2.6 Pg C yr −1 (IPCC, 2007).Including the effects of disturbances and subsequent regrowth of vegetation would improve the discrepancies between observations and model simulations of the global carbon budgets.

Conclusions
This study has described a framework for optimizing a prognostic biosphere model and gives some typical results estimated from posterior parameters.We used a Bayesian inversion method to improve the model estimates against observed variabilities in atmospheric CO 2 concentrations, mean annual AGB, and NPP by adjusting the physiological model parameters.The simulation based on posterior parameters produces moderate seasonal variations and annual amplitudes in atmospheric CO 2 ; however, it is still not able to solve explicitly the distributions of the mean amounts of AGB and NPP.Nevertheless, by using the optimization scheme, it is possible to reproduce some features of the physiological parameters that characterize different biomes, suggesting that the optimization scheme could be useful in understanding the dynamics of the global carbon cycle.However, it should be noted that the following advances will be required for further improvement of carbon cycle models to be made: a greater number of observations and improvements in the model regarding physiological processes and global biome distributions; reductions in the uncertainties in the atmospheric transport model and the a priori information; and models with greater spatial and temporal resolution.
Edited by: M. Kawamiya The publication of this article is financed by CNRS-INSU.

Figure 2 .
Figure 2. Comparison of the mean monthly CO 2 variability calculated with GLOBALVIEW and with the model presented here.

Figure 3 .
Figure 3. Seasonal variations in the observed and modeled mean monthly CO 2 concentrations (ppm) at (a) University of Maine, USA, and (b) Tierra Del Fuego, Argentina.Open circles with the dashed lines are GLOBALVIEW-CO 2 data, the dotted lines represent the prior parameters, and the solid line represents the posterior parameters.The error bars indicate the uncertainties in the observations.

Figure 5 .
Figure 5.Comparison of AGB (Mg C ha −1 ) values for each biome according to IIASA (light grey box) and VISIT with prior (dark grey box) and posterior (open box) parameters.The box-and-whisker plots show the median, upper, and lower quartiles, and the upper and lower 10 % of the data.

Figure 6 .
Figure 6.As for Fig. 5, except showing a comparison of NPP values (Mg C ha −1 yr −1 ) for GPPDI and VISIT with prior and posterior parameters.The GPPDI does not contain NPP data in the grids corresponding to tundra (TND).

Figure 7 .
Figure 7. Differences between the posterior parameters (PRM pst ) and the prior parameters (PRM pri ), with uncertainties of prior parameters (UNC pri ) as (PRM pst − PRM pri ) / UNC pri .

Figure 8 .
Figure8.As Fig.7, but as the uncertainty reduction (%) from UNC pri to uncertainties of posterior parameters (UNC pst ) for each parameter as (UNC pri − UNC pst ) / UNC pri × 100.

Table 1 .
List of variables in prior parameters (Prior) and their uncertainties (Unc).The values given in parentheses are for grassland.