A global carbon assimilation system using a modified ensemble Kalman filter

A Global Carbon Assimilation System based on the ensemble Kalman filter (GCAS-EK) is developed for assimilating atmospheric CO2 data into an ecosystem model to simultaneously estimate the surface carbon fluxes and atmospheric CO2 distribution. This assimilation approach is similar to CarbonTracker, but with several new developments, including inclusion of atmospheric CO2 concentration in state vectors, using the ensemble Kalman filter (EnKF) with 1-week assimilation windows, using analysis states to iteratively estimate ensemble forecast errors, and a maximum likelihood estimation of the inflation factors of the forecast and observation errors. The proposed assimilation approach is used to estimate the terrestrial ecosystem carbon fluxes and atmospheric CO2 distributions from 2002 to 2008. The results show that this assimilation approach can effectively reduce the biases and uncertainties of the carbon fluxes simulated by the ecosystem model.


Introduction
The carbon dioxide concentration in the atmosphere plays an essential role in the study of global change for its potential to warm up the atmosphere and the surface. A better estimation of carbon fluxes over global ecosystems would help in better understanding each nation's contribution to global warming and improve global warming science.
In the past decade, many efforts have been made to estimate the surface CO2 fluxes using both atmosphere-based top-down and land-based bottom-up methods. CarbonTracker (Peters et al., 2005, 2007) may be one of the most advanced among these efforts. It uses an ensemble square root filter to assimilate atmospheric CO2 mole fractions into an ecosystem model coupled with an atmospheric transport model.
The model state vectors in CarbonTracker are carbon fluxes only. However, the observed CO2 consists of both the initial state of atmospheric CO2 and recently released carbon fluxes, and therefore including CO2 concentration in the state vectors should improve the estimation of initial atmospheric CO2 (Miyazaki et al., 2011). This could lead to further improvement of carbon flux estimation. Kang et al. (2011) and Liu et al. (2012) also added CO2 concentrations to the state vectors due to their strong correlations with weather variables that are simultaneously assimilated. However, their efforts mainly focus on studying the performance of the assimilation methodology and observation settings by using idealized models only, not on assimilating real observations.
The length of the assimilation window in CarbonTracker is 5 weeks. This would include CO2 observations far from the analysis time. However, this may not necessarily improve the flux analysis compared to an instantaneous analysis due to the attenuation of the detailed information as discussed by Enting (2002). A shorter assimilation window reduces the attenuation of observed CO2 information, because the analysis system can use near-surface CO2 observations before the transport of CO2 blurs out the essential information of near-surface CO2 forcing (Kang et al., 2012).
Published by Copernicus Publications on behalf of the European Geosciences Union.

S. Zhang et al.: A global carbon assimilation system
It is well known that correct estimation of the forecast error statistics is crucial for the accuracy of any data assimilation algorithm. In all existing ensemble Kalman filter (EnKF) assimilations for estimating carbon fluxes, the ensemble forecast errors are estimated by the difference of perturbed forecasts and their ensemble mean. The perturbed forecast errors are defined as the perturbed forecast states minus the true state. Motivated by the fact that the analysis state is a better estimate of the true state than the forecast state, Wu et al. (2013) proposed a new estimator for the perturbed forecast errors by using the difference between the perturbed forecast states and the analysis state. Moreover, they demonstrated through a simulation study that the new estimator can lead to better assimilations for models with large errors. Since the errors of ecosystem models are generally large, the new estimation of the perturbed forecast errors is potentially useful to improve EnKF assimilation for estimating carbon fluxes.
Besides forecast errors, the observation errors also need to be accurately estimated. In the majority of schemes for estimating carbon fluxes, including CarbonTracker, the observation error variances are not estimated but empirically assigned. The quality of the estimation of observation error variances critically depends on whether the forecast error covariance matrix is appropriately estimated (Desroziers et al., 2005). However, appropriate estimation of the forecast error covariance matrix is a challenge in real applications.
In this paper, we propose several modifications to the conventional EnKF for assimilating atmospheric CO2 observations into ecosystem models. First, the model state contains both the surface carbon fluxes and atmospheric CO2 concentration as suggested by Miyazaki et al. (2011), Kang et al. (2011) and Liu et al. (2012). Second, the analysis state is used to adaptively estimate forecast errors as suggested by Wu et al. (2013) and Zheng et al. (2013), and both forecast and observation errors are inflated as suggested by Liang et al. (2012). Finally, the 1-week assimilation window is tested against longer windows. This modified EnKF is used to assimilate real CO2 concentration data into the Boreal Ecosystem Productivity Simulator (BEPS; Chen et al., 1999; Liu et al., 1999; Mo et al., 2008) for estimating the real terrestrial carbon fluxes with 3-hourly and 1
This paper consists of six sections. The models and data used in this study are introduced in Sect. 2, while the methodology is described in Sect. 3. Section 4 presents the validations of the new methodologies using the real observing system. A real data application of the proposed methodology is presented in Sect. 5. Conclusions and discussions are given in Sect. 6.

Surface carbon flux models
The surface carbon fluxes mainly arise from fossil fuel combustion, vegetation fire, oceanic exchange, and the biosphere. In this study, only the surface carbon fluxes from the biosphere are simulated using BEPS, while the rest are taken from data sets of CarbonTracker 2011 (http://www.esrl.noaa.gov/gmd/ccgg/carbontracker/).
BEPS is a process-based ecosystem model mainly developed to simulate forest ecosystem carbon budgets (Chen et al., 1999; Ju et al., 2006; Liu et al., 1999). For many reasons, including the complexity of ecosystem processes, spatial-temporal variabilities, and representative errors, parameters in process-based models often do not represent their true values when these models are used to calculate carbon budgets over large areas or for long time periods (Mo et al., 2008). Errors in these parameters lead to biases in model results (other uncertainties, such as lack of knowledge on historical land use change and land management, also influence model results). In this study, we try to reduce biases in the BEPS-simulated carbon fluxes by incorporating atmospheric CO2 concentration measurements with data assimilation methods. The prior carbon fluxes simulated by BEPS are at a spatial resolution of 1° × 1° and a temporal resolution of 1 h. On each model grid, BEPS calculates carbon fluxes of six different plant function types and outputs their sum, weighting the fluxes by the areal fractions of the plant function types. Figure 1 shows the plant function types with the largest weight on each grid.
The vegetation fire flux is taken from CarbonTracker 2011 data set, which is modeled using the Carnegie-Ames-Stanford Approach (CASA) biosphere model (Potter et al., 1993) based on the Global Fire Emission Database (GFED) (van der Werf et al., 2006).
The oceanic CO2 flux is taken from CarbonTracker 2011 optimized results, whose a priori estimates are based on two different data sets: namely, the ocean inversion flux result (Jacobson et al., 2007) and the pCO2-Clim prior estimate derived from the climatology of seawater pCO2 (Takahashi et al., 2009).
The fossil fuel combustion estimate is the data set preprocessed by CarbonTracker 2011 from the global total fossil fuel emission of the Carbon Dioxide Information and Analysis Center (CDIAC) (Boden et al., 2011) and the Open-source Data Inventory of Anthropogenic CO2 emission (ODIAC) data set (Oda and Maksyutov, 2011).

Atmospheric transport model
The global chemical transport Model for OZone And Related chemical Tracers (MOZART; Emmons et al., 2010) is used as the atmospheric transport model, with 28 vertical levels. The forcing meteorology is from the National Center for Atmospheric Research (NCAR) reanalysis of the National Centers for Environmental Prediction (NCEP) forecasts (Kalnay et al., 1996; Kistler et al., 2001). Since CO2 is chemically inert in the atmosphere, we turn off all the chemical processes and leave only transport of CO2 by atmospheric motions. Given the atmospheric CO2 concentration in the previous week and the surface carbon fluxes in the current week, MOZART is used to forecast gridded atmospheric CO2 concentration within the current week.

Observation
The atmospheric CO2 concentration measurements used in this study are those collected and preprocessed by the Observation Package (ObsPack) data product (Masarie et al., 2014); the observation sites are listed in Table 1. CO2 concentration measurements reflect the variability of the total surface carbon fluxes (i.e. fossil fuel combustion, vegetation fire, oceanic uptake and biosphere) as well as inter-exchange among CO2 air mass in the initial atmosphere.
The observation error variances are also provided in obspack_co2_1_CARBONTRACKER_CT2013_2014-05-08. They were subjectively chosen and manually tuned to fit into specific atmospheric transport models and observations (Peters et al., 2005, 2007). Since these values depend on the atmospheric transport model used in a carbon data assimilation system, they are used only as prior values in this study and are adaptively adjusted by the proposed assimilation scheme.

Methodology
Within the tth week, let $c_t$ be the set of gridded atmospheric CO2 concentrations every 3 h, $f_t$ be the set of prior carbon fluxes every 3 h, and $\lambda_t$ be a set of factors, constant over each area and within the week, for adjusting $f_t$. Then, the model state is defined as

$$x_t = \begin{pmatrix} c_t \\ \lambda_t \end{pmatrix}. \tag{1}$$

In this study, only land surface carbon fluxes need to be adjusted. The partition of the adjustment factors (i.e. $\lambda_t$) is based on 11 Transcom regions (Gurney et al., 2004) and 19 Olson ecosystem types, as in CarbonTracker. Thus, the size of the state vector in this study is 128 × 64 × 28 × 8 × 7 ($c_t$: lon × lat × lev × times/day × days) plus 145 ($\lambda_t$). We refer to this data assimilation scheme as the Global Carbon Assimilation System based on the ensemble Kalman filter (GCAS-EK).
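To make the size of the state vector concrete, the short calculation below multiplies out the dimensions quoted above. The variable names are illustrative only and do not come from the GCAS-EK code.

```python
# Dimensions quoted in the text for the concentration part of the state vector.
N_LON, N_LAT, N_LEV = 128, 64, 28
STEPS_PER_DAY, DAYS_PER_WEEK = 8, 7    # 3-hourly fields over one week
N_LAMBDA = 145                         # weekly flux adjustment factors

n_conc = N_LON * N_LAT * N_LEV * STEPS_PER_DAY * DAYS_PER_WEEK
n_state = n_conc + N_LAMBDA

print(n_conc)    # 12845056 concentration elements
print(n_state)   # 12845201 elements in total
```

The concentration part therefore dominates the state vector by about five orders of magnitude, while the 145 adjustment factors are the quantities that carry the flux information.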

EnKF with error inflations
Using the notations of Ide et al. (1997), the first EnKF algorithm used in this study consists of the following three main steps:

Forecast step
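The perturbed forecast states are estimated as

$$x^f_{t,i} = \begin{pmatrix} G\left(c^a_{t-1},\, \lambda^f_{t,i}\, f_t\right) \\ \lambda^f_{t,i} \end{pmatrix}, \qquad \lambda^f_{t,i} = \lambda^f_t + \xi_{t,i}, \tag{2}$$

where $i$ represents an ensemble member, $\lambda^f_t$ denotes the unperturbed prior adjustment factors, $\xi_{t,i}$ are vectors sampled from a distribution with mean zero and a given covariance matrix (taken from the prior covariance structure in CarbonTracker; see the CarbonTracker documentation and Peters et al., 2005, 2007), and $G$ is the atmospheric transport operator which maps $c_{t-1}$ and the $\lambda_t$-adjusted $f_t$ onto gridded CO2 concentration. Then the forecast state is estimated as

$$x^f_t = \frac{1}{m} \sum_{i=1}^{m} x^f_{t,i}, \tag{3}$$

where $m$ is the ensemble size.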
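The forecast step can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the GCAS-EK implementation: `toy_transport` is a hypothetical stand-in for the transport operator G (the real system runs MOZART), the prior adjustment factors are set to 1, and the perturbation covariance is an arbitrary diagonal matrix.

```python
import numpy as np

def toy_transport(c_prev, flux):
    """Hypothetical stand-in for G: relax the field toward its mean, then add the flux."""
    return 0.5 * c_prev + 0.5 * c_prev.mean() + flux

def forecast_ensemble(c_prev, lam_prior, prior_flux, region_of_cell, cov, m, rng):
    """Return the (n_state, m) perturbed forecast states and their ensemble mean."""
    # Zero-mean perturbations of the adjustment factors, one column per member.
    xi = rng.multivariate_normal(np.zeros(lam_prior.size), cov, size=m).T
    lam_ens = lam_prior[:, None] + xi
    # Each member transports the same previous field with its own adjusted flux.
    c_ens = np.column_stack([
        toy_transport(c_prev, lam_ens[region_of_cell, i] * prior_flux)
        for i in range(m)
    ])
    x_ens = np.vstack([c_ens, lam_ens])    # state = (concentrations, factors)
    return x_ens, x_ens.mean(axis=1)

rng = np.random.default_rng(0)
n_cells, n_regions, m = 6, 2, 20
region_of_cell = np.array([0, 0, 0, 1, 1, 1])   # which factor scales each cell
prior_flux = np.array([0.2, -0.1, 0.3, -0.4, 0.1, 0.0])
x_ens, x_mean = forecast_ensemble(
    c_prev=np.full(n_cells, 380.0),
    lam_prior=np.ones(n_regions),
    prior_flux=prior_flux,
    region_of_cell=region_of_cell,
    cov=0.25 * np.eye(n_regions),
    m=m,
    rng=rng,
)
print(x_ens.shape, x_mean.shape)
```

In this sketch all members start from the same previous concentration field, so the ensemble spread in concentration comes only from the perturbed fluxes; whether the operational system propagates one field or one field per member is not specified here.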

Error step
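The ensemble forecast errors and the observation error covariance matrix are estimated as $\sqrt{\theta_t}\,X^f_t$ and $\mu_t R_t$, respectively, where

$$X^f_t = \left(x^f_{t,1} - x^f_t,\; \ldots,\; x^f_{t,m} - x^f_t\right) \tag{4}$$

and $R_t$ is the prescribed observation error covariance matrix. $\theta_t$ and $\mu_t$ are the inflation factors of the forecast error and the observation error, respectively, which are estimated by minimizing the objective function (Liang et al., 2012; Zheng, 2009)

$$-2L_t(\theta,\mu) = \ln\det\left(\frac{\theta}{m-1} H_t X^f_t \left(H_t X^f_t\right)^{\mathrm T} + \mu R_t\right) + d_t^{\mathrm T}\left(\frac{\theta}{m-1} H_t X^f_t \left(H_t X^f_t\right)^{\mathrm T} + \mu R_t\right)^{-1} d_t, \qquad d_t = y^o_t - H_t x^f_t, \tag{5}$$

where $y^o_t$ is the vector of atmospheric CO2 concentration measurements and $H_t$ is a linear observation operator, which interpolates gridded CO2 concentrations to observation times and locations. Michalak et al. (2005) used a similar objective function for estimating the statistical parameters in the atmospheric inverse problems of surface fluxes.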
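The estimation of the two inflation factors can be illustrated with a small NumPy sketch. It assumes the Gaussian likelihood form of the objective function, with the ensemble covariance normalized by m − 1, and it replaces whatever numerical optimizer the authors used with a plain grid search; the function names and the synthetic inputs are hypothetical.

```python
import numpy as np

def neg2_log_likelihood(theta, mu, HX, R, d):
    """-2L(theta, mu) for innovation d with covariance theta*HX HX^T/(m-1) + mu*R."""
    m = HX.shape[1]
    S = theta * (HX @ HX.T) / (m - 1) + mu * R
    _, logdet = np.linalg.slogdet(S)
    return logdet + d @ np.linalg.solve(S, d)

def estimate_inflation(HX, R, d, grid):
    """Grid search (a simple stand-in for a numerical optimizer) over theta and mu."""
    best = min(
        (neg2_log_likelihood(theta, mu, HX, R, d), theta, mu)
        for theta in grid for mu in grid
    )
    objective, theta, mu = best
    return theta, mu, objective

rng = np.random.default_rng(1)
n_obs, m = 15, 30
HX = rng.normal(size=(n_obs, m))             # H applied to forecast perturbations
HX -= HX.mean(axis=1, keepdims=True)
R = 0.5 * np.eye(n_obs)                      # prescribed observation error covariance
d = rng.normal(scale=2.0, size=n_obs)        # innovation y_o - H x_f
grid = np.linspace(0.25, 8.0, 32)
theta, mu, objective = estimate_inflation(HX, R, d, grid)
print(theta, mu, objective)
```

Because the search includes the point (1, 1), the minimized objective can never be worse than the objective without inflation, which is the sense in which inflation improves the fit of the error statistics to the innovations.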

Analysis step
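The perturbed analysis states are estimated as

$$x^a_{t,i} = x^f_{t,i} + K_t\left(y^o_t + \varepsilon_{t,i} - H_t x^f_{t,i}\right), \qquad K_t = \frac{\theta_t}{m-1} X^f_t \left(H_t X^f_t\right)^{\mathrm T}\left(\frac{\theta_t}{m-1} H_t X^f_t \left(H_t X^f_t\right)^{\mathrm T} + \mu_t R_t\right)^{-1}, \tag{6}$$

where $\varepsilon_{t,i}$ is a normal random variable with mean zero and covariance matrix $\mu_t R_t$ (Burgers et al., 1998). The analysis state $x^a_t$ is estimated as the ensemble mean

$$x^a_t = \frac{1}{m}\sum_{i=1}^{m} x^a_{t,i}. \tag{7}$$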
Finally, set t = t + 1 and return to the forecast step (1) for the assimilation at the next time step. The assimilated surface carbon fluxes are from all sources because the observed CO2 concentrations arise from all sources. Then, the surface carbon fluxes from the biosphere are estimated by the assimilated total carbon fluxes minus carbon fluxes from other sources supplied by the forcing data.
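The analysis step follows the perturbed-observation EnKF of Burgers et al. (1998). A minimal NumPy sketch is given below; it assumes ensemble members are stored as columns and that the ensemble covariance is normalized by m − 1, and the toy dimensions are arbitrary.

```python
import numpy as np

def enkf_analysis(X_f, H, y_obs, R, theta, mu, rng):
    """Perturbed-observation EnKF update; columns of X_f are ensemble members."""
    m = X_f.shape[1]
    Xp = X_f - X_f.mean(axis=1, keepdims=True)       # forecast perturbations
    HXp = H @ Xp
    S = theta * (HXp @ HXp.T) / (m - 1) + mu * R     # inflated innovation covariance
    K = (theta / (m - 1)) * (Xp @ HXp.T) @ np.linalg.inv(S)
    # One noise draw per member, with the inflated observation error covariance.
    eps = rng.multivariate_normal(np.zeros(y_obs.size), mu * R, size=m).T
    X_a = X_f + K @ (y_obs[:, None] + eps - H @ X_f)
    return X_a, X_a.mean(axis=1)

rng = np.random.default_rng(3)
X_f = rng.normal(size=(4, 25))            # 4 state elements, 25 members
H = np.eye(2, 4)                          # observe the first two elements
y_obs = np.array([1.0, -1.0])
X_a, x_a = enkf_analysis(X_f, H, y_obs, 0.1 * np.eye(2), theta=1.5, mu=1.2, rng=rng)
print(X_a.shape, x_a)
```

Two limiting cases are a useful sanity check: with a very large observation error the analysis stays at the forecast, and with a very small one the observed part of the analysis mean is pulled onto the observations.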

Constructing error statistics using analysis
Let $x^t_t$ be the true state. Then the ensemble forecast error should be defined as $x^f_{t,i} - x^t_t$. However, $x^t_t$ is estimated by $x^f_t$ in Eq. (4). Since $x^a_t$ is derived by assimilating observations into the model, it is a better estimate of $x^t_t$ than $x^f_t$, especially when the model error is large (Wu et al., 2013). Therefore, after the analysis step (3) in Sect. 3.1, it is suggested to return to the error step (2) and substitute $x^f_t$ in Eq. (4) by $x^a_t$. This procedure is repeated until the corresponding objective function (Eq. 5) converges (Wu et al., 2013; Zheng et al., 2013). In this study, the iteration is stopped when the difference between the minima of $-2L_t(\theta, \mu)$ at the nth and (n + 1)th iterations is less than 1. A flowchart of the proposed assimilation scheme is shown in Fig. 2.
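The iteration described above can be sketched as a loop that re-centres the forecast perturbations on the latest analysis and re-estimates the inflation factors until the minimum of −2L changes by less than 1. This is an illustrative sketch only: it uses a grid search for the inflation factors and the deterministic form of the analysis mean (no observation noise), both of which are simplifying assumptions, and all names are hypothetical.

```python
import numpy as np

def neg2L(theta, mu, HXp, R, d):
    m = HXp.shape[1]
    S = theta * (HXp @ HXp.T) / (m - 1) + mu * R
    return np.linalg.slogdet(S)[1] + d @ np.linalg.solve(S, d)

def iterate_error_statistics(X_f, H, y_obs, R, grid, tol=1.0, max_iter=20):
    """Re-centre forecast perturbations on the analysis until min(-2L) settles."""
    m = X_f.shape[1]
    x_f = X_f.mean(axis=1)
    d = y_obs - H @ x_f
    center, previous = x_f, None           # first pass uses the forecast mean
    for n_iter in range(1, max_iter + 1):
        Xp = X_f - center[:, None]
        HXp = H @ Xp
        objective, theta, mu = min(
            (neg2L(t, u, HXp, R, d), t, u) for t in grid for u in grid
        )
        S = theta * (HXp @ HXp.T) / (m - 1) + mu * R
        K = (theta / (m - 1)) * (Xp @ HXp.T) @ np.linalg.inv(S)
        x_a = x_f + K @ d                  # analysis mean, without observation noise
        if previous is not None and abs(previous - objective) < tol:
            break
        center, previous = x_a, objective  # next pass: perturbations about the analysis
    return x_a, theta, mu, n_iter

rng = np.random.default_rng(5)
n_state, n_obs, m = 6, 4, 30
truth = np.linspace(1.0, 2.0, n_state)
X_f = (truth + 1.5)[:, None] + 0.3 * rng.normal(size=(n_state, m))   # biased forecast
H = np.eye(n_obs, n_state)
R = 0.1 * np.eye(n_obs)
y_obs = H @ truth + rng.normal(scale=np.sqrt(0.1), size=n_obs)
grid = np.linspace(0.5, 10.0, 20)
x_a, theta, mu, n_iter = iterate_error_statistics(X_f, H, y_obs, R, grid)
print(theta, mu, n_iter)
```

The toy forecast is deliberately biased so that perturbations about the forecast mean understate the true forecast error, which is the situation in which re-centring on the analysis is expected to help.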

Removing carbon mass imbalance
In this study, the background CO2 concentration field at the beginning of a week is the analysis state at the end of the previous week. It is then updated using the observations within the week; therefore, the estimated CO2 concentration at the beginning of the week is different from that at the end of the previous week. This results in inexact carbon mass balance. To remove this imbalance, a corrected atmospheric CO2 concentration is generated using the sequential forecast of CO2 concentration with the optimized carbon fluxes from the very beginning of the entire assimilation period. The corrected CO2 concentration is denoted by $c^{ca}_t$.
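The correction amounts to rerunning the transport model once from the initial field, forced week by week with the optimized fluxes, so that no analysis increment is ever added to the concentration itself. The sketch below illustrates this with a hypothetical mass-conserving transport on a periodic 1-D grid; it is a toy stand-in for MOZART, and all names and numbers are illustrative.

```python
import numpy as np

def toy_transport(c, flux):
    """Hypothetical mass-conserving mixing on a periodic 1-D grid, plus surface flux."""
    mixed = 0.5 * c + 0.25 * np.roll(c, 1) + 0.25 * np.roll(c, -1)
    return mixed + flux

def corrected_concentrations(c_initial, optimized_fluxes, transport=toy_transport):
    """Sequentially forecast from the very first field using the optimized fluxes only."""
    c = np.asarray(c_initial, dtype=float)
    fields = []
    for flux in optimized_fluxes:           # one entry per week
        c = transport(c, flux)
        fields.append(c)
    return fields

c0 = np.full(8, 380.0)
weekly_flux = np.array([0.3, -0.1, 0.0, 0.2, -0.4, 0.1, 0.0, 0.1])
fluxes = [weekly_flux.copy() for _ in range(4)]
c_ca = corrected_concentrations(c0, fluxes)

# The change in total mass equals the accumulated surface flux.
print(c_ca[-1].sum() - c0.sum(), sum(f.sum() for f in fluxes))
```

In this toy setting the change in total atmospheric mass matches the accumulated flux up to rounding, which is the exact carbon mass balance that the weekly concentration updates would otherwise break.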

Validation statistics
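Chi-square statistics (Tarantola, 2005) are used to test the error covariances constructed in this study. For the tth week, it is defined as

$$\chi^2_{2,\mathrm{Iter}} = d_t^{\mathrm T}\left(\frac{\hat\theta}{m-1} H_t \hat X^f_t \left(H_t \hat X^f_t\right)^{\mathrm T} + \hat\mu R_t\right)^{-1} d_t, \qquad d_t = y^o_t - H_t x^f_t, \tag{8}$$

where

$$\hat X^f_t = \left(x^f_{t,1} - x^a_t,\; \ldots,\; x^f_{t,m} - x^a_t\right) \tag{9}$$

and $\hat\theta$, $\hat\mu$ are the estimated inflation factors for the week.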
If the forecast and observation error covariance matrices are correctly estimated, $\chi^2_{2,\mathrm{Iter}}$ follows a Chi-square distribution with $n_{\mathrm{obs}}$ degrees of freedom, where $n_{\mathrm{obs}}$ is the number of observations within the tth week. Since the mean and the variance of $\chi^2_{2,\mathrm{Iter}}/n_{\mathrm{obs}}$ are 1 and $2/n_{\mathrm{obs}}$, respectively, the value of $\chi^2_{2,\mathrm{Iter}}/n_{\mathrm{obs}}$ should be close to 1. The Chi-square statistics for the error covariance matrices without using the analysis state can be defined similarly to Eq. (8), but with the analysis-based perturbation matrix replaced by the forecast-mean-based $X^f_t$ of Eq. (4). They are denoted as $\chi^2_0$, $\chi^2_1$ and $\chi^2_2$ for the cases of no inflation, inflation on forecast error only, and inflation on both forecast and observation errors, respectively. The closer $\chi^2_j/n_{\mathrm{obs}}$ (j = 0, 1, 2) is to 1, the better the corresponding error statistics.
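The stated mean and variance of the normalized statistic can be checked numerically. The sketch below assumes the innovation-based form of the statistic, draws synthetic innovations from a known covariance, and confirms that χ²/n_obs scatters around 1 with variance close to 2/n_obs; the helper name and all inputs are hypothetical.

```python
import numpy as np

def chi2_per_obs(d, HXp, R, theta, mu):
    """Normalized chi-square of innovation d under theta*HXp HXp^T/(m-1) + mu*R."""
    m = HXp.shape[1]
    S = theta * (HXp @ HXp.T) / (m - 1) + mu * R
    return float(d @ np.linalg.solve(S, d)) / d.size

rng = np.random.default_rng(2)
n_obs, m = 50, 40
HXp = rng.normal(size=(n_obs, m))
R = 0.5 * np.eye(n_obs)

# Draw innovations whose true covariance matches the assumed one (theta = mu = 1).
S_true = HXp @ HXp.T / (m - 1) + R
L_chol = np.linalg.cholesky(S_true)
values = np.array([
    chi2_per_obs(L_chol @ rng.normal(size=n_obs), HXp, R, 1.0, 1.0)
    for _ in range(2000)
])
print(values.mean(), values.var(), 2 / n_obs)
```

When the assumed covariance is too small the statistic drifts above 1, and when it is too large the statistic falls below 1, which is why its closeness to 1 serves as a diagnostic of the error statistics.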
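The root mean square error (RMSE) of estimated CO2 observations is defined as

$$\mathrm{RMSE} = \sqrt{\frac{1}{L}\sum_{i,l}\left(y^o_i(l) - y^{ca}_i(l)\right)^2}, \tag{10}$$

where $y^o_i(l)$ is the CO2 concentration observed at site $l$ and time $i$, $y^{ca}_i(l)$ is generated by interpolating $c^{ca}_t$ to the observation site $l$ and time $i$, and $L$ is the total number of CO2 concentration observations during the entire assimilation period. A smaller RMSE indicates a better assimilation scheme.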
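For reference, the RMSE over all matched pairs of observed and simulated concentrations can be computed as in the following short sketch. The function name and the sample values are illustrative and are not data from this study.

```python
import math

def rmse(observed, simulated):
    """Root mean square error over paired observed and simulated values."""
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    return math.sqrt(total / len(observed))

# Illustrative values in ppm.
obs = [381.2, 379.8, 383.5, 380.1]
sim = [382.0, 379.0, 384.5, 381.1]
print(rmse(obs, sim))
```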

Error covariance statistics
To validate the construction of error statistics used in this study, we plot the weekly time series of $\chi^2_{2,\mathrm{Iter}}/n_{\mathrm{obs}}$ (Eq. 8) from 2002 to 2003 in Fig. 3, which shows that the values are remarkably close to 1. In contrast, the weekly time series of $\chi^2_0/n_{\mathrm{obs}}$, $\chi^2_1/n_{\mathrm{obs}}$ and $\chi^2_2/n_{\mathrm{obs}}$ (for the cases of no inflation, inflation on forecast error only, and inflation on both forecast and observation errors) are not as close to 1 as $\chi^2_{2,\mathrm{Iter}}/n_{\mathrm{obs}}$. This indicates that the construction of error statistics using the analysis state iteratively (Sect. 3.2) is effective for correctly estimating the error statistics. Figure 3 also shows that $\chi^2_2/n_{\mathrm{obs}}$ is closer to 1 than $\chi^2_1/n_{\mathrm{obs}}$, and both are closer to 1 than $\chi^2_0/n_{\mathrm{obs}}$. This suggests that the inflations on forecast error and observation error are also both effective in improving the estimation of error statistics.

Inclusion of CO 2 concentration in state vectors
In this study, the CO2 concentration is included in state vectors. The benefit of this inclusion needs to be tested against the traditional approach without this inclusion. This issue is studied with the 1-week assimilation window.
For this purpose we design a comparative experiment as follows. In every week, the CO2 concentration (i.e. $c$) is not updated (Eq. 6). Instead the analysis CO2 concentration is derived by sequentially predicting atmospheric CO2 concentration forced by the updated flux within the week. The carbon mass is automatically balanced in this experiment. The results show that the RMSE of the analysis CO2 concentration observations (Eq. 10) is 8.5 % larger than that of the corrected analysis CO2 concentration described in Sect. 3.3. This suggests that inclusion of CO2 concentration in state vectors can significantly alter the CO2 mass balance and may have an advantage in optimizing the surface CO2 flux.
If the CO2 concentration is not included in state vectors, the analysis CO2 concentration at the beginning of each week is just the analysis CO2 concentration at the end of the previous week, so the CO2 concentration observations within the current week are not used to optimize the CO2 concentration at the beginning of each week. However, when the CO2 concentration is included in state vectors, all the observations within the current week and the previous weeks are used to estimate the CO2 concentration at the beginning of the current week. So the CO2 concentration at the beginning of each week estimated with the inclusion of CO2 concentration in state vectors could be more accurate than that estimated without the inclusion. Therefore, the estimated flux associated with the updated CO2 concentration at the beginning of the current week should have better quality. This is more clearly demonstrated by the smaller RMSE in Eq. (10) with the inclusion than without it.

Length of assimilation window
Different lengths of the assimilation window are used in various systems (5 weeks in CarbonTracker, 3 and 7 days in Miyazaki et al., 2011, and 6 h in Kang et al., 2012). We choose the 1-week assimilation window in our methodology for the following reasons. First, since most surface stations only have weekly observations, we need at least 1 week of data to cover the globe. Second, beyond 1 week the errors of the atmospheric transport model may be significant, and they are very difficult to quantify. Third, the detailed information of observations may be attenuated with time by atmospheric diffusion and advection (Enting, 2002).
For comparison to longer assimilation windows, the following alternative experiments with moving assimilation windows were carried out. In the first alternative experiment, the length of the moving window is set to be 2 weeks while the forecast time step is still 1 week. The CO2 concentration observation system is still the same as that described in Sect. 3, but is used to update the global carbon flux and the atmospheric CO2 concentration within the current week and the previous week. This procedure is similar to Eq. (6), while the ensemble forecast state of the first week in the assimilation window is set as its ensemble analysis state at the previous assimilation time step. Therefore, carbon fluxes and CO2 concentration in every week are optimized twice, with the observations in the current week and in the next week. The corrected analysis of CO2 concentration is also retrieved by rerunning the atmospheric transport model as described in Sect. 3.3. The second alternative experiment is similar to the first one, but with a 3-week moving window.
The linear trends of the observations and of the estimates with 1-week, 2-week and 3-week moving windows are 2.14, 2.17, 1.59 and 1.13 ppm yr⁻¹, respectively. It seems that the longer the moving window is, the more the estimated long-term growth rate departs from the measured one. To investigate the reason further, the annual mean carbon budgets on 11 Transcom regions are shown in Fig. 4. It can be seen that the longer the moving window is, the larger the carbon budget adjustments are. Long windows result in underestimation of the corresponding long-term growth rate.
To further investigate the long-time and long-distance impact of atmospheric transport on CO2 observations, the components of CO2 concentration at observation sites associated with different Transcom regions on each day before the observation time are calculated in the following way. For a given region and a given day before the observation time, prior fluxes in all other regions and on all other days are masked. Then the atmospheric transport model is run with a homogeneous initial atmospheric CO2 concentration and forced by the masked fluxes to obtain the corresponding CO2 concentration component.
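The masking experiment can be sketched as follows. This is a toy illustration, assuming a linear transport model: `run_transport` is a hypothetical 1-D mixing scheme standing in for MOZART, days are indexed from the start of the run rather than backwards from the observation time, and the region map and fluxes are synthetic.

```python
import numpy as np

def masked_flux(flux, region_of_cell, region, day):
    """Keep the flux of one region on one day; flux has shape (n_days, n_cells)."""
    masked = np.zeros_like(flux)
    cells = region_of_cell == region
    masked[day, cells] = flux[day, cells]
    return masked

def run_transport(c0, flux):
    """Hypothetical linear transport: periodic mixing plus the daily surface flux."""
    c = c0.copy()
    for daily in flux:
        c = 0.5 * c + 0.25 * np.roll(c, 1) + 0.25 * np.roll(c, -1) + daily
    return c

def component_at_site(flux, region_of_cell, region, day, site, background=0.0):
    """Concentration at a site caused by one region's flux on one day."""
    c0 = np.full(flux.shape[1], background)      # homogeneous initial field
    c_end = run_transport(c0, masked_flux(flux, region_of_cell, region, day))
    return c_end[site] - background

rng = np.random.default_rng(4)
n_days, n_cells, n_regions, site = 5, 12, 3, 7
flux = rng.normal(size=(n_days, n_cells))
region_of_cell = np.repeat(np.arange(n_regions), 4)
components = np.array([
    [component_at_site(flux, region_of_cell, r, day, site, background=380.0)
     for day in range(n_days)]
    for r in range(n_regions)
])
total = run_transport(np.full(n_cells, 380.0), flux)[site] - 380.0
print(np.isclose(components.sum(), total))
```

Because the toy transport is linear and a homogeneous field stays homogeneous under mixing, the components add up to the full signal at the site, so each one can be read as the share of the observed concentration attributable to a given region and day.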
These components at individual sites are then averaged in time to investigate the general impacts of carbon fluxes from different sources. The results at seven selected sites are shown in Fig. 5. For these sites, CO2 concentrations resulting from carbon fluxes within 25 days are mainly from local carbon fluxes within 7 days (and mostly within 3 days). Carbon fluxes from more than 7 days earlier, or from regions far from the observation locations, have very small impacts, indicating that the observations contain little information about them (i.e. the contribution is less than the observation error), even if the atmospheric transport model is accurate. In fact, the majority of observations (approximately 49) over continental sites used in this study have similar properties to these seven sites. If the errors of the transport and ecosystem models are considered, the information of fluxes from 1 week before may be even more difficult to estimate.
The choice of the assimilation window length is closely related to the spatial and temporal localization of forecast errors. For the observation network and the atmospheric transport model used in this study, the 1-week assimilation window seems most suitable.

Application and results
In this section we use the data assimilation methods described in Sect. 3 to estimate the land surface carbon fluxes from 2002 to 2008.

Adjustment to total carbon budget of BEPS
We first carry out a control run starting from 1 January 2002 with no adjustment of prior fluxes. The simulated CO2 concentrations are interpolated to observation times and locations, and compared with real observations in the year 2005. The result in Fig. 6 (top) shows that the simulated concentrations have a bias of 2.945 ppm and an RMSE of 4.525 ppm, implying an underestimation of carbon sinks by BEPS. Using GCAS-EK to estimate the ecosystem fluxes, we carry out another control run and comparisons. The bias and RMSE are reduced to 0.967 and 3.675 ppm, respectively (Fig. 6, bottom).
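Expressed as relative reductions, which follow by simple arithmetic from the figures quoted above, the improvement is

$$\frac{2.945 - 0.967}{2.945} \approx 0.67 \quad \text{(bias)}, \qquad \frac{4.525 - 3.675}{4.525} \approx 0.19 \quad \text{(RMSE)},$$

that is, the bias is reduced by roughly two thirds and the RMSE by roughly one fifth.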
It is worthwhile to point out that the underestimation of carbon sinks by BEPS is conditioned on the estimated carbon fluxes released by fossil fuel and fire, even if the ocean fluxes used in our assimilation system are accurate. As described in Sect. 2, the observed variability of CO2 concentration is due to the variability of carbon fluxes from all sources, including fossil fuel combustion, vegetation fire, oceanic uptake, and biosphere exchange. If non-biospheric carbon sources are underestimated, the carbon sinks from the biosphere simulated by BEPS would also be underestimated. Nevertheless, our adjustment to carbon sinks simulated by BEPS appears reasonable.

Multiyear average of the global carbon flux distribution
Figure 7 shows the distribution of the average global carbon budget from 2002 to 2008, where the two spatial patterns of carbon fluxes related to BEPS (BEPS and GCAS-EK) are similar, although they are quite different from that of CarbonTracker 2011.
Carbon budgets are calculated based on the BEPS ecosystem types and the 11 Transcom regions (Fig. 8). Similar to the global distribution maps (Fig. 7), GCAS-EK carbon budgets (Fig. 8) have almost the same source or sink character as those simulated by BEPS. However, they are quite different from those of CarbonTracker 2011 in many aspects. For example, for the C4 and the shrub in Australia, BEPS simulates carbon sources while CarbonTracker 2011 shows carbon sinks. Moreover, in North America, the carbon sink estimated by GCAS-EK is much larger than that simulated by BEPS. A further diagnostic (not shown here) reveals that, between October and April, the carbon sinks estimated by CarbonTracker 2011 are much larger than those estimated by GCAS-EK. But between May and September, the carbon sinks estimated by CarbonTracker 2011 and GCAS-EK are very close.

Interannual and seasonal variations
The interannual variations of the global total carbon budgets are shown in Fig. 9. It shows that CarbonTracker 2011 predicts the largest multiyear average carbon sink (−3.89 PgC yr⁻¹), compared with the smallest one simulated by BEPS (−2.23 PgC yr⁻¹). The assimilated mean carbon sink (−3.87 PgC yr⁻¹) is virtually identical to that estimated by CarbonTracker 2011. The carbon sinks simulated by BEPS and predicted by CarbonTracker 2011 obviously have more interannual oscillation than that assimilated by GCAS-EK. The monthly variations of the multiyear-averaged carbon budgets before and after the assimilation of BEPS results are compared with those of CarbonTracker 2011 in Fig. 10. Clearly, the seasonal variability of the carbon budgets by CarbonTracker 2011 is the largest. The assimilated fluxes based on BEPS have larger sinks in the summer and smaller sources in the winter than those before the assimilation.

Comparison to other flux estimations
Two independent gridded carbon flux estimates are compared with GCAS-EK estimates.
The first independent data set is the net carbon exchange of U.S. terrestrial ecosystems by Xiao et al. (2011), which is generated by integrating eddy covariance flux measurements and satellite observations from the Moderate Resolution Imaging Spectroradiometer (MODIS), and is referred to as EC-MOD. The original data set covers 2002 to 2006 with a spatial resolution of 1 km and a temporal resolution of 8 days.
The carbon budgets estimated by GCAS-EK were also compared to those by Lauvaux et al. (2012), the Penn State University (PSU) inversion and the Colorado State University (CSU) inversion (Schuh et al., 2013) for the Mid-Continent Intensive (MCI) area from June to December 2007. The spatial patterns by GCAS-EK and CarbonTracker 2011 are similar to those estimated by PSU, CSU (Schuh et al., 2013) and Lauvaux et al. (2012) (not shown here). The regional-averaged carbon sinks estimated by GCAS-EK and by CarbonTracker 2011 are 0.19 and 0.26 PgC, respectively, while the averaged carbon sinks estimated by PSU and CSU (Schuh et al., 2013) and by Lauvaux et al. (2012) are between 0.14 and 0.18 PgC, which is closer to the GCAS-EK estimate than to the CarbonTracker 2011 estimate.
Since the true values of carbon flux are unknown, closeness to the independent gridded carbon flux estimates does not by itself mean a better assimilation. However, these two examples indicate that the carbon fluxes estimated by GCAS-EK may provide some useful new information on global carbon flux estimation to the atmospheric inversion community. Therefore, the development of the new assimilation system is worthwhile.

Conclusion
We propose a methodology to assimilate atmospheric CO2 concentration into surface carbon fluxes simulated by an ecosystem model. In our framework, CO2 concentration is included in the state vector, and the assimilation window is restricted to 1 week. Both forecast and observation errors are inflated, and forecast error statistics are estimated in an adaptive procedure using the analysis states. Generally speaking, these adaptive estimations improve the accuracy of assimilated error statistics in EnKF, which leads to further improvement in the accuracy of analysis states. Importantly, pre-assigned values of the observation error variance are improved if these adaptive procedures are applied.
The application of the methodology to real data shows that the assimilated total carbon budgets by GCAS-EK are comparable to those reported by CarbonTracker 2011. However, there are significant regional differences between carbon flux distributions assimilated by GCAS-EK and CarbonTracker 2011, which may be attributed to the differences between the ecosystem models, atmospheric transport models, or the assimilation methodologies.
In our future study, we will investigate the sensitivity of assimilation results to the accuracy of ecosystem and transport models. Also, more observation data sets, such as remote-sensing CO2 column data, will be introduced into the GCAS-EK.

Figure 3. χ² statistics of the analysis state for four estimates of error covariance. Original refers to the case without inflations, one Inf refers to the case with inflation on forecast error covariance only, both Inf refers to the case with inflations on both forecast and observation error covariances, and iteration refers to the case with both inflations and further using analysis to improve forecast error statistics. The closer χ²/n_obs is to 1, the better the corresponding error estimates.

Figure 5. Mean components of CO2 concentration at observation sites (Site IDs: LEF_01P0, BAL_01D0, WLG_01D0, BKT_01D0, BHD_01D0, MKN_01D0 and ABP_01D0) from 11 Transcom regions in each of the 25 days before the observation time; x axis refers to days before the observation time; y axis refers to the amount of CO2 concentration in ppm. Different colors within a bar refer to CO2 concentration from 11 different Transcom regions; the 11 regions are North American boreal (N-Ame-B), North American temperate (N-Ame-T), South American tropical (S-Ame-Tr), South American temperate (S-Ame-T), northern Africa (N-Afr), southern Africa (S-Afr), Eurasia boreal (Era-B), Eurasia temperate (Era-T), tropical Asia (Tr-Asa), Australia (Aus) and Europe (Eur), respectively.

Figure 6. Comparisons between real observations and simulated concentrations by control runs: (top) control run forced by prior carbon fluxes; (bottom) control run forced by carbon fluxes assimilated by GCAS-EK. Both simulations start from 1 January 2002 and all simulated concentrations at observation locations and times in 2005 are compared here.

Figure 7. Distribution of the average global carbon budget from 2002 to 2008.

Figure 8. Annual mean carbon budgets (PgC yr⁻¹) on areas with six BEPS plant function types in Transcom regions from 2002 to 2008. The errors of GCAS-EK fluxes are the root mean square errors of the ensemble.

Figure 9. Comparison of interannual variations of global carbon budgets from 2002 to 2008 by three products: BEPS, GCAS-EK and CarbonTracker 2011.
Figure 1. Land areas of six plant function types used in ecosystem model BEPS.

Table 1. Listed are 92 observation sites used in this study. r refers to the prescribed observation error (µmol mol⁻¹).