Model – data fusion across ecosystems : from multisite optimizations to global simulations

This study uses a variational data assimilation framework to simultaneously constrain a global ecosystem model with eddy covariance measurements of daily net ecosystem exchange (NEE) and latent heat (LE) fluxes from a large number of sites grouped in seven plant functional types (PFTs). It is an attempt to bridge the gap between the numerous site-specific parameter optimization works found in the literature and the generic parameterization used by most land surface models within each PFT. The present multisite approach allows deriving PFT-generic sets of optimized parameters enhancing the agreement between measured and simulated fluxes at most of the sites considered, with performances often comparable to those of the corresponding site-specific optimizations. Besides reducing the PFTaveraged model–data root-mean-square difference (RMSD) and the associated daily output uncertainty, the optimization improves the simulated CO2 balance at tropical and temperate forests sites. The major site-level NEE adjustments at the seasonal scale are reduced amplitude in C3 grasslands and boreal forests, increased seasonality in temperate evergreen forests, and better model–data phasing in temperate deciduous broadleaf forests. Conversely, the poorer performances in tropical evergreen broadleaf forests points to deficiencies regarding the modelling of phenology and soil water stress for this PFT. An evaluation with data-oriented estimates of photosynthesis (GPP – gross primary productivity) and ecosystem respiration (Reco) rates indicates distinctively improved simulations of both gross fluxes. The multisite parameter sets are then tested against CO2 concentrations measured at 53 locations around the globe, showing significant adjustments of the modelled seasonality of atmospheric CO2 concentration, whose relevance seems PFT-dependent, along with an improved interannual variability. Lastly, a globalscale evaluation with remote sensing NDVI (normalized difference vegetation index) measurements indicates an improvement of the simulated seasonal variations of the foliar cover for all considered PFTs.


Introduction
Land surface models (LSMs) have been tools of growing importance in the continuous effort to develop comprehensive earth system models which help to understand the effects of changes in land surface processes and land-use practices upon biogeochemical (carbon, water, nutrients) and energy cycles, and more generally upon earth's climate (Cramer et al., 2001;Friedlingstein et al., 2006;Sitch et al., 2008).With the goal of improving accuracy and realism, the increasing amount and range of scale of the processes included in mechanistic LSMs result in a growing number of parameters associated with the corresponding model equations (Pitman, 2003).Some parameters are easily identified with a given physical process (and can sometimes be measured); others are purely empirical and account for a variety of processes embodied in a few equations, yet to be refined.In both cases, obvious computational and complexity limits have traditionally led model developers to use broad classes of soil and vegetation types, for which typical, generic parameter values are assigned (e.g.Sellers et al., 1996).
One difficulty is scaling up the leaf-and plant-level measurements of physical parameters for ecosystem-scale simulations (Jarvis, 1995;Bonan et al., 2012).Moreover, the variety of species within each of the 10-20 plant functional types (PFTs) typically used by most models makes the choice of a representative parameter value critical, thus adding significant uncertainty to the model outputs.In this context, parameter optimization methods have been increasingly used to calibrate model parameters and reduce the associated uncertainty.The criterion is to minimize the misfit between simulation outputs and observed data (Raupach et al., 2005).As for ecosystem models, eddy covariance measurements provide direct, near-continuous, in situ observations of carbon dioxide, water and energy exchanges between the canopy and the atmosphere (Baldocchi et al., 2001;Baldocchi, 2008).This measurement method has been applied across an extensive global network (683 sites as of April 2014), spanning a wide range of ecosystems and climates (http://fluxnet.ornl.gov/).
Over the last decade, numerous studies with various LSMs have used this available information to derive sets of parameters that significantly improve the model-data fit, with optimization approaches ranging from simple parameter adjustments to rigorous data assimilation frameworks (e.g.Wang et al., 2001Wang et al., , 2007;;Reichstein et al., 2003;Braswell et al., 2005;Knorr and Kattge, 2005;Santaren et al., 2007;Thum et al., 2008;Williams et al., 2009;Carvalhais et al., 2010;Keenan et al., 2012).However, most of these efforts have focused on model calibration at individual sites, which often results in model parameters overly tuned to the specifics of a particular site given the small spatial footprint of each flux tower (typically a few hectares).Only recently, some studies started to assess through optimization the generic nature of model parameters within PFTs.The benefit of a set of parameters derived at one site was evaluated for simulations at a similar site (Medvigy et al., 2009;Verbeeck et al., 2011) and over the surrounding region (Medvigy and Moorcroft, 2012), with encouraging results.In parallel, two independent efforts simultaneously used data constraints from several sites to assess the degree of improvement of the simulated fluxes depending on the "generic criterion" used for the optimized parameters (Groenendijk et al., 2011;Kuppel et al., 2012).The study of Groenendijk et al. (2011), conducted at over a hundred locations across several PFTs, found that the cross-site parameter variability after optimization explained the poorer performances of grouping sites by PFT, while no such discrepancy appeared in Kuppel et al. (2012), whose study was however limited to temperate deciduous broadleaf forests.
Building on the optimization procedure developed by Kuppel et al. (2012), the present work assesses the potential of the multisite assimilation of carbon net ecosystem exchange (NEE) and latent heat (LE) flux measurements in a process-based terrestrial ecosystem model (ORCHIDEE -Organizing Carbon and Hydrology in Dynamic Ecosystems).The objective is to improve site-scale simulations of carbon and water fluxes at a large number of flux-tower sites, as well as global-scale simulations of vegetation phenology and terrestrial carbon balance.Specifically, we address the following questions: (1) for each of the seven PFTs considered (out of 12 in ORCHIDEE, 5 being not covered by the dataset used here), can we find a generic set of optimized parameters that enhance the model-data fit at all sites?(2) How well does the multisite approach compare to site-specific optimizations?(3) What are the main improvements introduced by the optimization procedure from seasonal to annual timescales: daily error, model-data bias, seasonal cycle amplitude and/or phase?(4) Which processes remain poorly captured by the model after optimization?(5) Have the eddycovariance-constrained sets of multisite parameters a notable impact on global-scale simulations?
Section 2 presents the ecosystem model, the data assimilation system, and the eddy covariance measurements used in this study, as well the supplementary data sets and models.The results are presented and discussed in Sect.3, successively dealing with the model-data fit at the site level (Sect.3.1), the comparison between multi-and single-site results (Sect.3.2), and the uncertainties of modelled NEE and LE (Sect. 3.3).Then we evaluate the impact of the derived multisite parameterization upon the site-scale simulation of photosynthesis and ecosystem respiration rates (Sect.3.4), and, at the global scale, upon the simulated seasonality and interannual variability of atmospheric CO 2 concentration (Sect.3.5.1)and finally upon the seasonality of vegetation activity (Sect.3.5.2).

Vegetation model and optimized parameters
The biogeochemical vegetation model used in this study is ORCHIDEE (Krinner et al., 2005).It calculates the water, energy and carbon fluxes between the land surface and the atmosphere at a half-hourly time step.The exchange of carbon and water during photosynthesis and the energy balance are treated every 30 min, while carbon allocation, autotrophic respiration, foliar onset and senescence, mortality and soil organic matter decomposition are computed on a daily time step.The soil hydrology follows a double-bucket scheme (Ducoudré et al., 1993) and its impact on stomatal conductance is described in Krinner et al. (2005).The reader is referred to previous publications for the standard equations of ORCHIDEE (e.g.Kuppel et al., 2012).As in most biogeochemical models, the vegetation is grouped into several PFTs, 12 in the case of ORCHIDEE, excluding bare soil.Except for the modelled phenology (Botta et al., 2000), the equations governing the different processes are generic across PFTs, but with specific parameter values for each vegetation class.When used in "grid-point mode" at a given site, we force the model with the corresponding half-hourly gapfilled meteorological data measured at the flux towers.At the global scale, the ERA-Interim meteorology (Dee et al., 2011) is used as forcing and the model outputs are calculated at a 0.72 • × 0.72 • resolution.In this case the global PFT map is computed at the spatial resolution of the forcing fields, from an original vegetation map available at 5 km, which is derived from a high-resolution IGBP AVHRR (International Geosphere-Biosphere Programme -Advanced Very High Resolution Radiometer) global land data set (Eidenshink and Faundeen, 1994) and uses 94 ecosystem classes (Olson, 1994).Importantly, the modelled carbon pools are initially brought to equilibrium before both site-and global-scale simulations by cycling the available meteorological forcing over several centuries (spin-up procedure), with the prior parameterization of the model.This procedure ensures a net carbon flux close to zero over annual-to-decadal timescales.
Table 1 presents the PFT-generic parameters used in this study.As our emphasis is on adjusting the carbon cycle, there are significantly more optimized parameters leveraging on photosynthesis and respiration processes than, for instance, on the energy balance.We also included two additional parameters to optimize the initial state of the model provided by the spin-up procedure: (1) a common multiplier of the initial carbon pool content, by default equal to one, and (2) the initial leaf area index (LAI) of non-deciduous PFTs, by default taken from the spin-up outputs.Both parameters are considered as site-specific, since the soil organic carbon content is closely related to the local land-use history, while the foliar cover of evergreen and herbaceous species directly relates to vegetation history at the site level.One consequence is that they cannot be spatially extrapolated, thus the global simulations performed for evaluation (see Sect. 2.4) use the default value of these last two parameters, i.e. using the initial carbon pool content and foliar cover provided by the spin-up procedure.

Data assimilation system
The model parameters are optimized using the variational data assimilation method described in Kuppel et al. (2012).Assuming a Gaussian distribution for errors on the parameters, the model outputs and the measured data, the optimized set of parameters corresponds to the minimization of the following Bayesian cost function J (x) (Tarantola, 2005): which quantifies both the misfit between modelled and observed fluxes, and the misfit between a priori and optimized parameters.
x is the vector of unknown parameters, x b the vector of background (i.e.here, prior) parameter values, H (x) the model output, y the vector of observed fluxes, and P b and R are the prior covariance matrices of parameter errors and observation errors, respectively.The cost function is iteratively minimized with the gradient-based algorithm L-BFGS-B (limited-memory Broyden-Fletcher-Goldfarb-Shanno, with bound constraints), which allows prescribing boundaries for each variable to optimize (Byrd et al., 1995).At each iteration, the gradient of the cost function J (x) is computed with respect to all parameters, mostly using the tangent linear (TL) model of ORCHIDEE generated with the automatic differentiator tool TAF (Transformation of Algorithms in Fortran; see Giering et al., 2005).Exceptions concern two phenological parameters, K pheno,crit and c Tsenes (see Table 1), where the threshold functions prevent the use of a linear approximation.In these cases we use a finite-difference approach with prescribed perturbation steps respectively equal to 4 and 2 % of the allowed variation range.The recent work of Santaren et al. (2013) with the same ecosystem model highlighted the risk of converging towards a local minimum within a site-specific variational optimization.In our case, preliminary tests within three PFTs (tropical and temperate evergreen broadleaf forests, and temperate deciduous broadleaf forests) allowed us to verify that the convergence of our multisite approach barely depends on the choice of the first-guess values assigned to the optimized parameters.However, such robustness is not guaranteed with the site-specific approach, and potential convergence issues are discussed in the results section.
Once the cost function reaches the minimum, the posterior parameter error variance/covariance matrix P a is explicitly calculated from the prior error covariance matrices (P b and R) and the Jacobian of the model H at the minimum of the cost function (H ∞ ), using the linearity assumption (Tarantola, 2005): The prior parameter error covariance matrix P b is diagonal as prior uncertainties are supposed to be uncorrelated between parameters.The prior standard deviation for each parameter is equal to one-sixth of the range between the lower and higher boundaries.The latter have been carefully specified following the physical and empirical expertise of the OR-CHIDEE modelers, based on literature reviews or databases (such as TRY, Kattge et al., 2011).
In the prior observation error covariance matrix R, we include both the random error on the measurements and the model error, the latter stemming from missing/inadequate process representation in the structural equations of the ecosystem model.Although the measurement error is known not to be constant (e.g.Richardson et al., 2008), a previous study using the ORCHIDEE model suggested that the model component dominates the observation error budget (Kuppel et al., 2013).The variances in R are chosen as constant at a given site for each type of data (NEE and LE), equal to the mean square difference between the prior model and the observations.We also choose for simplicity to keep R diagonal, based on the rapid decline of the model error autocorrelation beyond the first lag day (Kuppel et al., 2013).

Assimilated eddy covariance flux data
We use the eddy covariance data provided by 78 flux towers of the FluxNet global network (Baldocchi, 2008), representative of 7 of the 12 vegetated PFTs defined in the OR-CHIDEE model (Table S2 in the Supplement).All the sites of a given dominant ecosystem are located in the same geographical hemisphere, which makes seasonal analyses easier.These observations derive from standard flux data processing methodologies (correction, gap-filling and partitioning) of the La Thuile data set (Papale, 2006).From the large amount of available site-years in this data set, our selection was driven by several requirements, the first of these being a minimum vegetation cover of 70 % by the dominant PFT within each tower footprint, based on site-level information.
Then we discarded the sites where measurements show a significant disagreement with the prior simulation outputs, as it suggests strong model structural deficiencies that make the parameter optimization pointless.Lastly, we selected at each site the longest data segment of consecutive years without gaps larger than a few weeks.Where measurements of the ground heat flux (G) were available, the monthly energy balance was closed with a correction factor, then half-hourly interpolated and applied to the LE and sensible heat (usually called H ) fluxes, according to the Bowen ratio technique (Twine et al., 2000).The half-hourly, gap-filled measured fluxes of NEE and LE are then used to compute daily means.We chose to assimilate daily averaged observations and not half-hourly measurements so as to focus the optimization on timescales ranging from seasonal to annual variations, and to take advantage of the rapidly decreasing autocorrelation of gap-filled half-hourly fluxes (Lasslop et al., 2008).In order not to give too much weight to data estimated from gapfilling as compared to measured data, each daily observation error is inflated by a factor 1 + 0.5k, where k is the daily fraction of half-hourly data estimated from gap-filling.We also checked that the gaps still remaining after the gap-filling were distributed evenly over the course of the day.The individual days with more than 20 % of these "ultimate" gaps were not included in the assimilation.
The eddy covariance data are compared to the simulated fluxes in terms of RMSD (root-mean-square difference) and bias.In addition, for the six non-tropical PFTs we use a curve-fitting procedure (composed of a polynome of degree two and four harmonics) to decompose the fluxes into their trends and mean seasonal cycles following Thoning et al. (1989).The detrended smooth seasonal cycle is used to estimate the ratio between the average annual amplitude of the simulated and observed fluxes, as well as a model phase coefficient defined as Here, b i and e i are respectively defined as the days when the detrended smooth curve crosses the zero line downwards and upwards.In tropical evergreen broadleaf forests, the phase and amplitude diagnostics presented above are not applied, due to the lack of a marked seasonal cycle.Instead, the predictive power of the simulations is evaluated using the Nash-Sutcliffe model efficiency coefficient (NSE; Nash and Sutcliffe, 1970): where F t i is the value of the simulated or observed flux at the time step t, and F obs the mean observed flux.

Evaluation tools
The model is evaluated at the sites using the two components of NEE: the gross primary productivity (GPP) and the ecosystem respiration rate (R eco ), both estimated via the fluxpartitioning method described in Reichstein et al. (2005).This method extrapolates night-time measurements of NEE, representing night-time R eco , into daytime R eco using a short-term-calibrated temperature response function.GPP is then derived as the difference between R eco and NEE.We acknowledge that GPP and R eco are not fully independent data (with respect to the assimilated NEE) and are essentially model-derived estimates somewhat conditional on our underlying assumptions, which will be kept in mind during the analysis.
Additionally, measurements of the normalized difference vegetation index (NDVI) made by the MODIS (Moderate Resolution Imaging Spectroradiometer) instrument are used to evaluate the simulated phenology at the global scale.From 2000 to 2010, the calculated reflectances (from measured irradiances) have been corrected for atmospheric absorption, scattering (Vermote et al., 2002) and directional effects (Vermote et al., 2009) in order to obtain a daily NDVI product with a 5 km spatial resolution.Observations contaminated with snow cover were removed from the analysis using MODIS' quality filter, and we discarded NDVI observations below 0.2 in order to minimize the impact of baresoil reflectance.Spatial averaging is used to match the ERA-Interim resolution (0.72 • × 0.72 • ) used for the global-scale simulations.Because it is directly derived from surface reflectances, we preferred NDVI to other satellite products such as FAPAR (Fraction of Absorbed Photosynthetically Active Radiation) or LAI, the latter requiring intermediate processing steps usually involving radiative transfer models, and thus possibly adding uncertainty to the retrieved data (Garrigues et al., 2008).Following Maignan et al. (2011), we then calculate the Pearson correlation factor between the times series of measured NDVI and the FAPAR modelled by ORCHIDEE, at the weekly timescale during the period 2000-2010.FAPAR has been estimated from modelled LAI with a simple Beer law: (5) The link between simulations and measurements is made by spatially averaging the latter to reach the resolution of the vegetation model (i.e. that of the ERA-Interim forcing).For each of the seven PFTs considered, we restrict our correlation computation to the model boxes where the dominant PFT cover fraction exceeds 50 % and where both the NDVI and FAPAR time series exhibit a visible seasonal cycle (i.e. with a standard deviation larger than 0.04).Lastly, the simulated global NEE fluxes are output at the daily timescale and spatially averaged from the ERA-Interim grid (0.72 • × 0.72 • ) to a 2.5 • × 3.75 • resolution (latitude, longitude).The LMDZ (Laboratoire de Météorologie Dynamique zoom) atmospheric transport model (Hourdin et al., 2006) is used at this resolution to convert these terrestrial fluxes into monthly atmospheric CO 2 concentrations, during the period 1989-2009.In order to complete the carbon balance at the planetary scale, we also transport the global oceanic and fossil net carbon fluxes respectively taken from a climatology (Takahashi et al., 2009) and from the EDGAR database (Emissions Database for Global Atmospheric Research; http://edgar.jrc.ec.europa.eu).The contribution of biomass burning is neglected, because regrowth of burnt vegetation is not accounted for in this version of OR-CHIDEE, and so are the evasion of CO 2 from aquatic bodies and emissions from harvested wood and agricultural products.The transported fluxes are evaluated using 53 smoothed records of atmospheric CO 2 concentrations (C CO 2 ) over the globe (Table S3 in the Supplement) (GLOBALVIEW-CO2, 2013).As the optimization of the initial soil carbon content cannot be spatially extrapolated for global simulations (see Sect. 2.1), the modelled trend of C CO 2 is not evaluated.Rather, we focus on the seasonal analysis and use the curvefitting procedures of Thoning et al. (1989) to extract the detrended seasonal signal of C CO 2 .In addition, we identify the contributions of 11 subcontinental regions to the simulated atmospheric CO 2 concentration at each station by independently transporting the fluxes from each of the following areas: boreal North America, temperate North America, tropical America, South America, Europe, northern Africa, southern Africa, boreal Asia, temperate Asia, tropical Asia, and Australia (e.g.Fig. 1 in Gurney et al., 2003).The simulated interannual variability of the C CO 2 is evaluated using the model-data RMSD of monthly anomaly from the detrended smooth seasonal signal calculated above: where C CO 2 ,month allyears is the all-time average, for each month of the year.

Site-level simulation of carbon and water fluxes
Figure 1 shows the average corrections brought by the optimization to the modelled NEE fluxes (with negative values meaning carbon uptake), grouped by dominant PFT (see acronyms in Table 1), in terms of RMSD and bias between simulations and measured data, also showing the PFT-averaged mean seasonal cycles.The largest reductions of model-data RMSDs are found in temperate and boreal broadleaf forests (TempEBF, TempDBF and BorDBF), where the two optimization scenarios (single-and multisite) decrease the misfit by more than 25 % compared with the prior (unoptomized) model.In temperate needleleaf forests (TempENF) and C3 grasslands (C3grass), the RMSD reduction exceeds 30 % for single-site optimizations, but the corresponding multisite sets of parameters reduce this value to less than 20 %.The improvements are less significant in tropical evergreen broadleaf forests (TropEBF) and boreal evergreen needleleaf forests (BorENF), where the reductions in misfit are between 9 and 15 %. Figure 1b shows that the NEE is on average overestimated by the prior model for all PFTs.This feature is even more striking in ecosystems which are marked sinks of carbon (according to the average measured carbon balance, not shown), here tropical and temperate forests.This positive bias is an artifact from the model initialization procedure, which brings each simulated site to a near equilibrium (see Sect. 2.1).It is significantly corrected by the optimization, notably via the scaling of the initial carbon pool content at each site (parameter K soilC in Table 1); one consequence is a clear reduction of the respiration during winter in temperate and boreal ecosystem sites and grassland sites in agreement with the measured data (Fig. 1c).
Figure 2 shows that the simulation of the LE flux is overall less improved by the optimizations than that of the net carbon flux, keeping however in mind the problem of energy balance closure discussed in Sect.2.3.The reduction of RMSD is the highest on average at TempDBF sites with values 24 % below the prior value, while decreases of 15-19 % are found at TempEBF and BorENF sites.The effect of the optimization is the weakest on average at sites located in TempENF and C3grass ecosystems.These weaker performances regarding LE flux indicate that the energy and water cycles in the OR-CHIDEE model involve other relevant parameters not optimized here, and possibly that the structural equations bear a significant error.Notably, we include in the optimization only one parameter that directly controls the soil evaporation  (Z0 overheight , see Table 1), and there is for example no constraint on the calculation of the surface temperature, a key component of the energy balance.
At the seasonal scale, Fig. 3a shows that large reductions (in relative value) of the simulated mean seasonal NEE amplitude are found in boreal evergreen needleleaf and deciduous broadleaf forests and C3 grasslands.The average correction is somewhat exaggerated in the two former cases and relatively accurate in the latter case.Conversely, the seasonal NEE variations are consistently amplified by the optimization in temperate evergreen needleleaf and broadleaf forests.However, the averaged model-data phasing is only weakly modified for the five aforementioned PFTs, with the exception of the site-specific improvements at TempENF and C3grass sites.Furthermore, considering the mild correction of the model-data biases in BorENF, BorDBF and C3grass (Fig. 1b), one can deduce that most of the RMSD reduction discussed earlier is for these three PFTs due to an improvement of the simulated NEE amplitude after the optimization.
In temperate deciduous broadleaf forests, the simulated pattern of NEE is chiefly improved via a better phased seasonal cycle, as shown by the increased phase score, which was already close to one before optimization.An earlier study at a similar set of sites of the same PFT showed that the optimization scheme tends to correct the overall prior model overestimation of the growing season length (Kuppel et al., 2012).However, the simulated seasonal amplitude of NEE is barely changed after optimization, as the corrected flux overestimations in winter and summer tend to cancel out, with a PFT-averaged seasonal amplitude remaining smaller than that of the observed data (Figs.1c, 3).
Regarding the latent heat flux, Fig. 3b shows that the optimization has generally a weaker effect on the simulated LE average phase and amplitude than in the case of NEE.In most cases the correction brought by the optimization barely affects the modelled phase, but improves the seasonal amplitude.We notice that the LE seasonal cycle is most often flattened as compared to the prior model, in agreement with the observations, except for the inconsistent amplification at TempEBF sites and the overreduction after the site-specific optimization in C3 grasslands.The weak phase correction might be related to the soil evaporation component of the latent heat flux, on which the optimization has a limited leverage, as mentioned earlier in this section, while the transpiration rate is tightly linked to GPP.It would also explain the generally lower phase coefficient in deciduous ecosystems (Fig. 3b), where soil evaporation is a potentially significant component of LE during leaf onset and senescence.Applying the Nash-Sutcliffe model efficiency coefficient (see Eq. 4) to all sites shows that TropEBF is the only PFT studied here, where the PFT-averaged value of this metric remains below zero after optimization for both fluxes, with NSE NEE = −2.77,−1.99, and −1.83, and NSE LE = −0.64,−0.35, and −0.52 in prior, multisite and site-specific cases, respectively (other PFT values not shown).It means that after optimization the model-data mean square error is still larger than the variance of the observations, or, in other words, that the observed mean is on average a better predictor than the model outputs.Figure 1 shows that for TropEBF the prior model simulates an unobserved increase of NEE from sink to source around July, and the simultaneous decrease of LE (Fig. 2) points towards an unrealistic simulated drought stress during this period of the year, the driest at most sites of this PFT.The optimization barely corrects the NEE variations during the dry season, although a more realistic LE flux is simulated after the multisite optimization.An earlier optimization study at a site of the same PFT highlighted the need for a much deeper soil water reservoir, along with a more linear root profile than that parameterized in the prior model, in order to account for the ability of tropical evergreen forests to maintain high photosynthesis and transpiration during the dry season (Verbeeck et al., 2011).Our multisite parameterization of the processes dealing with soil water availability goes in that direction, with values of soil water depth, root profile and water stress coefficient respectively adjusted from 2 to 2.38 ± 0.065 m, from 0.8 to 0.72 ± 0.095 m −1 and from 6 to 6.5 ± 1.06 (Table 1).These corrections from the prior parameterization remain however insufficient, as shown by the poorly realistic optimized seasonal cycle of NEE in Fig. 1.However, Verbeeck et al. (2011) also pointed at the structural inconsistency in the standard ORCHIDEE model for tropical evergreen forests: the phenological scheme notably neglects the leaf renewal at the transition between the wet and dry seasons (Chave et al., 2010) and the hydric stress calculation ignores the role of groundwater (Miguez-Macho and Fan, 2012), while these mechanisms possibly explain the high subsequent photosynthesis and transpiration rates often observed (De Weirdt et al., 2012).Concerning the LE flux, the optimization brings somewhat limited, yet consistent changes, while the reduction of daily uncertainty is modest, indicating a poor level of constraint by the observations used.It suggests either significant errors in the model equations, or that relevant, poorly known parameters, have not been considered.

Single site versus multisite
It can be noticed in Figs. 1, 2 and 3 that there is a general consistency across PFTs between RMSD reductions introduced by multisite and site-specific optimizations, with some exceptions in TempENF and most notably C3grass where the average site-specific RMSD reduction is twice as large for NEE, while there is almost no average multisite RMSD decrease for LE.Although the large number of sites selected for this last PFT and the associated intersite variability calls for prudence when considering average seasonal flux variations, it is worth noting that C3 grasslands are here the only PFT generically spanning such a diversity of climates.The reported discrepancy might thus indicate a need for additional classes of C3 grasslands in the model, at least with a climatic regionalization and ideally taking also into account pedologic conditions and management practices.
More generally, one would expect better efficiency from a site-specific scheme than with a multisite approach, given that grouping sites with different characteristics introduces conflicting constraints on the model equations, along with the fact that the RMSD is the criterion used in the optimization procedure (as the prior covariance matrix in the cost function of Eq. ( 1) is chosen diagonal; see Sect.2.2).This is true most of the time, except notably for NEE in boreal deciduous broadleaf forests and LE at TropEBF, TempEBF and BorENF sites where the multisite optimization results on average in larger RMSD decreases than for the site-specific approach.In these cases, Figs. 1 and 2 show that the difference in efficiency stems from unchanged local RMSDs after the site-specific optimization at a few sites of these particular PFTs.As found by Santaren et al. (2013), it may point to a failure of the single-site inversion algorithm to converge towards the global minimum of the cost function, possibly due to the presence of local minima.Our hypothesis is that the corresponding multisite cost functions avoid this pitfall because they are made more regular by the larger amount of simultaneous constraints on the parameters, "smoothing out" some of the problematic local minimums.Preliminary multisite optimization tests, using a few tens of random starting points, support this hypothesis, and further investigations will be needed to evaluate if this behaviour is valid for all PFTs.Indeed, we acknowledge some uncertainty regarding whether or not the optimized sets of parameters correspond to the very minimum of the cost function, as the efficiency of the variational optimization approach employed is conditional on a reasonable compliance with the linearity hypothesis.

Site-scale uncertainty
In addition to improving the agreement between modelled and measured fluxes, the optimization procedure is also useful to reduce the total uncertainty associated with the modelled output variables at each site, defined as σ total represents the summed contribution of two errors arising in the observation space: the measurement error and the error of the equations of the model (see Sect. 2.2).It is not directly altered through parameter optimization, although the model component may in principle vary with the parameter values.Following Desroziers et al. (2005), σ total is diagnosed at each site as the square root of the covariance between the time series of prior and posterior flux residuals (model minus observations).σ parameters is the parameter error contribution to the simulated fluxes, calculated at each site, before optimization, as the average daily standard deviation of the projection in observation space of the prior error covariance matrix P b , using the model's Jacobian matrix H, based on the definitions in Sect.2.2.The same is done after optimization, respectively using P a and H ∞ .Figure 4 reports the average values for simulated daily NEE and LE, showing individual site values and the corresponding PFT means as in Figs. 1 and 2. The reduction of the total NEE uncertainty varies from one PFT to another, ranging from 6 % in tropical evergreen broadleaf forests to 33 % in boreal evergreen needleleaf forests.As it is reduced by 65-95 % (not shown), we deduce from Eq. ( 7) that the weak relative decrease indicates a dominance of the observation error in the total uncertainty budget.This is for example consistent with the reported inaccuracies in the model structure for TropEBF ecosystems discussed in the previous section.
Regarding the LE flux, the mild changes from prior to posterior uncertainty mean that we might face a potentially large observation error component (model + measurements) -the latter being insensitive to parameter optimization; see Sect.2.3 -and overall that little statistical information has been gained from the optimization of the selected water cycle parameters.
One can notice that the posterior uncertainties are always slightly lower with the multisite optimization than with the site-specific approach, which is consistent with the fact that the number of assimilated data is larger in the former case than in the latter.Finally, we also found that the optimization suppresses at each site much of the temporal correlation of the flux errors, which are large in the prior ORCHIDEE model (see for instance the time correlogram in Fig. 1 of Kuppel et al., 2013).This results in a large decrease of the total yearly uncertainty from the prior model for all PFTs, by 77-86 % and 43-80 % for simulated NEE and LE fluxes, respectively (not shown).

Simulated GPP and respiration
Figure 5 shows the mean seasonal cycle, averaged over each PFT, for the gross carbon fluxes: photosynthesis (GPP; Fig. 5a) and ecosystem respiration (R eco ; Fig. 5b).The "observed" values are estimates based on a partition of the measured NEE (see Sect. 2.4).These gross carbon fluxes have not been used as constraints in the optimization procedure, but are useful as indicators of the model performance.One can first notice that the average increases of GPP at TropEBF and TempEBF sites are responsible for the NEE decrease observed in Fig. 1.It is worth noting that the results reported in Fig. 5 also confirm the inability of the model to simulate a sustained high photosynthesis rate during the dry season at TropEBF sites (see Sect. 3.1), while this feature appears in the observation estimates.At TempENF sites, the remarkable adjustment of the NEE cycle primarily derives from a reduced R eco at the peak of the growing season.Both GPP and R eco are consistently decreased in boreal forests and C3 grasslands sites, although the reduction is still lower than what would be needed to match the estimates.In addition, because the respiration rate is the sole reducing component in winter and because the photosynthesis rate is more largely decreased than R eco during the growing season, the net result is the reduction of the seasonal amplitude of NEE for these three PFTs.Finally, there is a large, yet insufficient, decrease of R eco after the optimization in temperate deciduous broadleaf forests, notably related to the scaling of the initial carbon pool content (Sect.3.1; Kuppel et al., 2012), while GPP is less drastically reduced, in close agreement with the observations.This evaluation at each site with gross carbon fluxes shows that the optimization procedure is able to provide a set of parameters which improves the simulation of both assimilation and respiration processes in the ORCHIDEE model for six out of the seven PFTs considered here, suggesting a partial distinction of both gross contributions from the constraint provided by the net carbon flux.

Global-scale evaluation
One of the objectives of assimilating flux data from a large number of sites, spanning a wide range of ecosystems, is to identify generic sets of parameters that improve the simulation of carbon and water balance at the regional-to-global scale.Indeed, there is no guarantee that a set of parameters improving the simulations at an ensemble of individual sites sharing broadly common biogeochemical and biophysical characteristics, but with a limited spatial footprint, will also be beneficial for simulations at much larger scales.In this context, global simulations allow evaluating how the constraint of eddy covariance data is propagated from one spatial scale to another, and how transferable the optimized parameterization is from grouped in situ optimizations to gridded simulations.

Seasonality of atmospheric CO 2 concentrations
Regarding the simulated mean seasonal cycle of atmospheric C CO 2 , the optimized set of parameters yields a median reduction of the model-data RMSD of 5.2 %.Among the 53 sample locations used in this study, there is a significant improvement at 27 of them with a RMSD decrease larger than 5 %, a notable degradation at 20 locations with a RMSD increase larger than 5 %, and less than a 5 % shift at the remaining 6 locations.In addition, a latitudinal clustering can be identified, as a large median improvement by 42.2 % (RMSD-wise) is found at the 3 northernmost locations (Alert, Ny-Ålesund, and Barrow) and by 33.5 % at the 18 locations of the Southern Hemisphere, while there is a median degradation by 5.6 % in the rest of the Northern Hemisphere.
Figure 6 shows the mean seasonal cycle of the simulated C CO 2 , compared to the extended record at three locations, one in each of the latitudinal areas defined above: Alert, South Pole, and Mauna Loa, respectively.We note that using the optimized parameter sets tends to reduce the seasonal amplitude of C CO 2 , and results in an earlier phasing for the "breathing of the biosphere" in the Northern Hemisphere.At station Alert, there is a significant adjustment of the simulated seasonal cycle, when changing from the default to the multisite parameterization of the ORCHIDEE model.This correction chiefly benefits the seasonal amplitude, which is decreased and becomes remarkably close to that observed.The analysis of the contribution of the 11 subcontinental regions in the simulated atmospheric signal (see Sect. 2.4), grouped in Fig. 6d in larger regions, indicates that the major terrestrial contribution to this result is from changes in C CO 2 due to the boreal Northern Hemisphere fluxes.It is consistent with the decrease of the NEE seasonal amplitude produced by the multisite optimization at sites in boreal evergreen needleleaf forests, boreal deciduous broadleaf forests and C3 grasslands (Figs. 1c,3a), dominant in this region.Separate global simulations using an optimized parameterization for one PFT at a time show that the degraded phasing at Alert produced by the multisite approach in Fig. 6a mainly stems from the contributions of BorENF and C3 grassland ecosystems (not shown).While a mild multisite phase deterioration from the prior parameterization is found at the site level in BorENF (Fig. 3a), it is not the case for C3 grassland and it may thus question the representativeness of the flux measurements sites used, with respect to high-latitude ecosystems in general.
At station South Pole, the model-data fit is also mostly enhanced after the optimization by a significant decrease of the seasonal amplitude of C CO 2 , which is more than twice as large in the prior simulation as in the measured data.Moreover, the "regionalized" analysis indicates that the corrections are primarily due to the reduced seasonal amplitude of C CO 2 components from temperate South America and southern Africa, and contributions from the boreal Northern Hemisphere are also noticeable (not shown).We therefore deduce that the optimization of C3 grassland parameters is the most influential factor explaining the improved simulation of C CO 2 at this station, but also that the influence of boreal needleleaf evergreen forests cannot be neglected here.
The reduction of the simulated cycle amplitude is too strong at station Mauna Loa, which, combined with earlier seasonality, leads to a poorer model-data fit after optimization.The remote location of Mauna Loa (North Pacific) makes it sensitive to influences from most of the Northern Hemisphere.We find that the main drivers of the simulated correction are flattened, earlier C CO 2 variations in temperate and boreal regions of North America and Asia, and Europe (not shown).These results thus reflect part of the reduction of the seasonal amplitude of NEE in boreal ecosystems and C3 grasslands noticed at the Alert and South Pole stations.The degraded model-data fit between optimized C CO 2 and data from Mauna Loa also suggests that the boreal correction of NEE amplitude is too strong or insufficiently compensated at a large scale by the amplification of the seasonal cycle at temperate latitudes visible at TempENF and TempDBF sites (Figs. 1c,3a).
Finally, using the multisite-optimized model overall brings a small improvement of the modelled interannual variability of C CO 2 , with a median reduction of 3.9 % for the RMSD between modelled and measured monthly anomaly (not shown).Forty-five locations, out of the 53 used in this study, display an improvement with RMSD decreases of up to 27 %, while at the remaining 8 locations the degradation of the simulated interannual variability remains small with RMSD increases always smaller than 1.5 % (not shown).These results suggest that despite the relative shortness (1-3 years) of most of the FluxNet data sets selected to optimize the ORCHIDEE model, the diversity of the covered weather situations gives a modest, yet consistent source of information to better reproduce interannual variations of carbon fluxes at the global scale.

Global-scale phenology index
Figure 7 reports for each optimized PFT the correlation factor between weekly values of measured NDVI and modelled FAPAR during the period 2000-2010 (see Sect. 2.4), for both the prior and optimized models.There is no result for BorDBF, whose vegetation fraction never exceeds 40 % in our case.All remaining six PFTs exhibit a higher median correlation factor when using the multisite parameterization, which means that the modelled leaf seasonal cycle better matches the global-scale observations.This median improvement seems to accurately reflect the overall trend for TempDBF-, BorENF-and C3grass-dominated pixels, while a larger interpixel variability is introduced in the case of temperate evergreen forests.The improved modelled seasonality is related to the more accurately simulated GPP at FluxNet sites after multisite optimization, the latter being in turn partly driven by the improvement of the seasonal variations of simulated LAI.The dominant feature seems to be a shorter growing season length for TempDBF, which is consistent with the site-level simulations of GPP seasonality for this PFT (Fig. 5a), and an earlier beginning of the growing season for C3 grasses (not shown).Note that this improvement also explains most of the increased correlation factors in temperate and boreal evergreen forests, since these PFTs do not present a climate-driven leaf phenology in the current formulation of the ORCHIDEE model.Consequently, deciduous and herbaceous PFTs are the only significant contributors to the seasonal cycle at such a coarse resolution, even when these ecosystems are secondary and/or the understory within an evergreen-dominated forest.Lastly, the score for TropEBF remains poor because the model wrongly simulates the leaf renewal and the hydric stress during the dry season, as discussed in Sects.3.1 and 3.4.

Limitations of the current approach: summary and discussion
The limitations of our model-data fusion method highlighted throughout the Results section are of three types, somewhat interlocked: (1) within the limits of the model structure, (2) how adequate the chosen set of optimized parameters was and (3) how close to the optimal values the optimization algorithm tuned these parameters.Taking these items in reverse order, we first acknowledge that using a variational optimization algorithm with a model with non-linearities might expose to miss the global minimum of the cost function, and indeed a few obvious convergence failure cases have been found for some single-site optimizations in TropEBF, TempENF, and boreal forests.Some functions of the ORCHIDEE model could potentially be linearized to generate a more accurate tangent linear model -and to advantageously avoid using finite differences for some phenological parameters (see Sect. a demanding effort of model recoding, but it has already been done for another LSM (Knorr et al., 2010).Alternatively, stochastic optimization approaches could yield better convergence, as they can circumvent the linearity constraint.While a single-site model-data fusion study with the same LSM showed advantageous results for a genetic Monte Carlo-based technique over its variational counterpart (Santaren et al., 2013), no major difference was found by Ziehn et al. (2012) between Monte Carlo and gradientbased approaches when optimizing a simpler LSM with atmospheric CO 2 observations.In the case of multisite optimization efforts, we suggest that the cost function "smoothing" discussed in Sect.3.2 could make the convergence efficiency less sensitive to the choice of the minimization approach, while keeping in mind the much lower computational time required in the variational case.
Second, the number of optimized parameters remains somewhat modest as compared to the diversity of processes modelled in the ORCHIDEE model.Our choice was partly driven by a model sensitivity criterion, while the actual leverage of an optimized parameter on model outputs also depends on the uncertainty associated with this very parameter (Dietze et al., 2014).This can result in selecting some parameters that are already reasonably well known but that have medium-to-high model sensitivity and thus with low overall leverage, while poorly known parameters with mild-to-low model sensitivity could have a comparatively higher value for the optimization.In addition, as our focus was on the carbon cycle, only a few water-and-energy-related parameters were considered.Notably, the correction of LE partly benefited from that of NEE via transpiration, but the soil evaporation optimization was neglected despite being a significant -and debated -player of the terrestrial water cycle (Schlesinger and Jasechko, 2014).
The third hindering factor to simulating carbon and water fluxes close to their true value is the "observation error", i.e. the uncertainty arising from the simplification needed to make the ecosystem functioning fit within explicit equations plus the error associated with the measurements, fluxes and meteorological forcing included.Although this error is rarely quantified in model-data fusion efforts, model-data fit analyses and uncertainty budgets showed in this study that the relative importance of this observation error greatly varies from one PFT to another -and is potentially dominated by the model error component in the case of simulations at fluxtower sites (Kuppel et al., 2013).It is highest in tropical evergreen broadleaf forests, where parameter optimization will likely be of limited help until a more realistic phenological scheme is implemented.Regarding the simulations of LE in general, the small number of related parameters optimized makes it difficult to assess to which extent the nearly unchanged flux uncertainty comes from the parameter scarcity or structural inaccuracies in the model, stressing again the need for a better consideration of water and energy cycles together with that of carbon in future model-data fusion efforts.

Conclusions
Generalizing the results of Kuppel et al. (2012) across ecosystems, this study has shown that a significant degree of improvement is introduced to the simulation of carbon and water fluxes, through a generic optimization approach with in situ measurements of NEE and LE fluxes, relying on the traditional PFT classification used in many land surface models.At the global scale, this optimization method allows first for a better simulation of the seasonal foliar cover.Second, the multisite parameter set has significant leverage upon the simulated seasonality of atmospheric C CO 2 , with performances somewhat spatially heterogeneous and depending on the PFT considered, while a small, yet encouraging improvement of the simulated interannuality of C CO 2 is found.The remaining discrepancies in C CO 2 indicate that combining atmospheric CO 2 concentration and a larger number of fluxtower observations, in a carbon cycle multidata assimilation system (CCDASs; e.g.Kaminski et al., 2013), would be beneficial.Using more site-specific years of flux data will also allow for a systematic in situ evaluation of the multisite parameters across time periods, regions and climate regimes by separating training sites from evaluation sites.Such a procedure was not applied in this study due to the small number of sites for some PFTs, but remains essential for testing a LSM used for climate projections.More generally, we suggest that the assimilation of FluxNet data should be considered as a baseline for the development of multidata assimilation systems where more complementary data streams are combined.In particular, daytime and night-time NEE could replace the daily values used here, and adding measurements of leaf area index, soil respiration fluxes (e.g.chamber measurements), biomass and litter/soil carbon pools, would help in better separating the processes and constraining environmental drivers, as would a simultaneous parameter optimization of both over-and understory PFT fractions.The FluxNet multisite approach can also be used to characterize the structural, parametric and total uncertainties associated with the simulated annual biospheric carbon balance at regional-toglobal scales, and to compare it with (1) the discrepancies of results between global ecosystem models (Sitch et al., 2008), and (2) the error carried by the terrestrial carbon fluxes estimated via inverse modelling with atmospheric transport models (e.g.Chevallier et al., 2010).The underlying problem is thus to evaluate what would be gained from simultaneously assimilating various data streams covering different spatial and temporal scales into a terrestrial ecosystem model, and how the PFT classification should be refined to maximize this improvement.In parallel, by using a diagonal prior covariance matrix for the parameter error, within an identical PFT and across PFTs, we implicitly assumed that all parameters could in principle be efficiently corrected as independent random distributions.This ignores the fact that a covariance structure interlinking the optimized parameterization would be necessary to translate the interconnectedness of ecophysiological processes within a given PFT.For instance, the allocation of carbon within the plant reservoirs depends on specific allometric relations and on photosynthesis rate; these relations would need to be embedded in the prior parameter error covariance matrix.Additionally, the influence of nearby individuals of other PFTs (e.g. the understory) should be accounted for when correcting parameters of a given PFT.Together with a simultaneous optimization of several PFTs, building standard spatialized parameter covariance tables from databases of plant traits and soil characteristics (e.g.Kattge et al., 2011) and "preliminary" posterior multisite parameter error covariance matrices (e.g.supplementary material of Kuppel et al., 2012) might soon become necessary to consistently apply model-data fusion to more sophisticated mechanistic ecosystem models.

Figure 1 .
Figure 1.Model-data (a) RMSD and (b) bias for the daily NEE time series at each site (filled circles), grouped and averaged by PFT (horizontal bars), in three cases: prior model (green), multisite optimization (blue) and single-site optimization (orange).(c) PFT-averaged mean seasonal cycle of NEE, for the training observations (black) and the three aforementioned cases, smoothed with a 15-day movingaverage window.

Figure 2 .Figure 3 .
Figure 2. Model-data (a) RMSD and (b) bias for the daily LE time series at each site (filled circles), grouped and averaged by PFT (horizontal bars), in three cases: prior model (green), multisite optimization (blue) and single-site optimization (orange).(c) PFT-averaged mean seasonal cycle of LE, for the training observations (black) and the three aforementioned cases, smoothed with a 15-day moving-average window.

Figure 4 .
Figure 4. Uncertainty of simulated daily (a) NEE and (b) LE fluxes.For each PFT, the horizontal lines give the average of the individual site values (filled circles) in three cases: prior model (green), multisite optimization (blue) and single-site optimization (orange).

Figure 5 .
Figure 5. PFT-averaged mean seasonal cycles of (a) the photosynthetic carbon flux and (b) the respiration flux, smoothed with a 15-day moving-average window.The simulations using prior (green), single-site (orange) and multisite (blue) parameterizations are compared to the evaluative observation-derived flux estimates (black).

Figure 6 .
Figure 6.Detrended mean seasonal cycle of the atmospheric CO 2 concentrations at (a) Alert, (b) South Pole and (c) Mauna Loa locations during the 1989-2009 period: the optimization-independent concentration records (black) are compared to simulations where the biospheric contribution is calculated using the ORCHIDEE model with the default (green) and multisite (blue) parameterizations, and the model-data RMSD given between brackets.(d) Regional contributions to the mean seasonal cycle simulated at Alert.

Figure 7 .
Figure 7.Correlation factor between weekly time series of modelled FAPAR and independent measurements of NDVI, for the 2000-2010 period.The results are grouped using the dominant PFT at each pixel, for global simulations with default (green) and multisite parameterization (blue).The central horizontal bar indicates the median value, the top and bottom of the boxes correspond to the first and last quartile, and the 5th and 95th percentiles are given by the "error bars".

Table 1 .
Parameters of ORCHIDEE optimized in this study.The prior values are given for each PFT, and the multisite posterior values are in bold font.A hyphen means that the parameter is not optimized, spin up that the spin-up value is taken, and site that the posterior value is site-specific.