A dual-pass carbon cycle data assimilation system to estimate surface CO2 fluxes and 3D atmospheric CO2 concentrations from spaceborne measurements of atmospheric CO2

Here we introduce a new version of the carbon cycle data assimilation system, Tan-Tracker (v1), which is based on the Nonlinear Least Squares Four-dimensional Variational Data Assimilation algorithm (NLS-4DVar) and the Goddard Earth Observing System atmospheric chemistry transport model (GEOS-Chem). Using a dual-pass assimilation framework that consists of a carbon dioxide (CO2) assimilation pass and a flux assimilation pass, we assimilated the atmospheric column-averaged CO2 dry air mole fraction (XCO2), while sequentially optimizing the CO2 concentration and surface carbon flux in windows of different lengths with the same initial time. When the CO2 assimilation pass is first performed, a shorter window of 3 days is applied to reduce the influence of the background flux on the initial CO2 concentration. This allows us to obtain a better initial CO2 concentration to drive the subsequent flux assimilation pass. In the following flux assimilation pass, a properly elongated window of 2 weeks absorbs enough observations to reduce the influence of the initial CO2 concentration deviation on the flux, resulting in better surface fluxes. In contrast, the joint assimilation system Tan-Tracker (v0) uses the same assimilation window for optimization of CO2 concentration and flux, making the uncertainties in CO2 concentration and flux indistinguishable. The proper orthogonal decomposition (POD)-4DVar algorithm applied with the older system is only a rough approximation of the one-step iteration of the NLS-4DVar algorithm; thus, it can be difficult to fully resolve the nonlinear relationship between flux and CO2 concentration. In this study, we designed a set of observing system simulation experiments (OSSEs) to assimilate artificial XCO2 observations, in an attempt to verify the performance of the newly developed dual-pass Tan-Tracker (v1).
Compared with the prior and joint system, the dual-pass system provided a better representation of the spatiotemporal distribution of the true flux and true CO2 concentration. We performed sensitivity tests of the flux assimilation window length and number of NLS-4DVar assimilation iterations. Our results indicated that the appropriate flux assimilation window length (14 days) and the appropriate maximum number of NLS-4DVar iterations (three) could be used to achieve optimal results. Thus, the Tan-Tracker (v1) system, based on a novel dual-pass assimilation framework, provides more accurate surface flux inversion estimates and is ultimately a better tool for carbon cycle research.

Geosci. Model Dev. Discuss., https://doi.org/10.5194/gmd-2019-54. Manuscript under review for journal Geosci. Model Dev. Discussion started: 13 May 2019. © Author(s) 2019. CC BY 4.0 License.


Introduction
Since the Industrial Revolution, humans have consumed fossil fuels and emitted large amounts of carbon dioxide (CO2). About 50% of the CO2 remains in the atmosphere. The continuous rise in global atmospheric CO2 concentrations disrupts the radiation balance of the Earth system, resulting in global climate change. The remaining CO2 is absorbed by the terrestrial ecosystem and oceans; however, there are still many uncertainties associated with these absorption mechanisms (Ballantyne et al., 2012; Le Quéré et al., 2017). Determining the carbon budget of the Earth's terrestrial ecosystems and oceans is important for the development of relevant climate policies and predictions of future scenarios, and has been the focus of extensive carbon cycle research (Stocker et al., 2013). In recent years, there has been an increase in multi-source atmospheric CO2 concentration measurements and model development. The surface carbon flux inversion method, which combines model and atmospheric CO2 information, has made great progress in carbon cycle data assimilation (Peters et al., 2005; Peters et al., 2007; Tian et al., 2014; Deng et al., 2016; Feng et al., 2016; Basu et al., 2013; Basu et al., 2018).
Many have attempted to optimize surface carbon flux estimates. For example, Carbon-Tracker (Peters et al., 2005; Peters et al., 2007) is a well-designed carbon assimilation system that uses Transport Model 5 (TM5) and the ensemble Kalman filter (EnKF) method (Evensen, 1994) to assimilate in situ CO2 observations. The Carbon Cycle Data Assimilation System (CCDAS) (Rayner et al., 2005; Kaminski et al., 2013) couples the Biosphere Energy-Transfer HYdrosphere (BETHY) model (Kaminski and Heimann, 2001) with the atmospheric transport model TM2, to assimilate satellite observations of photosynthetically active radiation and atmospheric CO2 concentration observations; the approach is a two-step process, in which the parameters of the carbon cycle model are first optimized to improve surface flux accuracy. Tan-Tracker (v0) (Tian et al., 2014) uses the Goddard Earth Observing System atmospheric chemistry transport model (GEOS-Chem) and the identity matrix as a joint dynamical model; a proper orthogonal decomposition (POD)-based four-dimensional variational assimilation algorithm (POD-4DVar) (Tian et al., 2011) is combined with a joint assimilation framework to integrate in situ CO2 concentration observations, with simultaneous optimization of the CO2 concentration and flux. This method has obtained good results; however, there are still some problems associated with the joint assimilation framework. Using the same window length for both variables limits the ability to distinguish the CO2 concentration from the flux. Additionally, the POD-4DVar algorithm is only a rough approximation of a one-step iteration of the Nonlinear Least Squares (NLS)-4DVar algorithm (Tian and Feng, 2015; Tian et al., 2018). Although the above assimilation systems have achieved reasonable results, the sparse and uneven spatial distribution of in situ stations greatly limits the flux optimization accuracy. Several unconventional data sources have therefore been explored. For example, Zhang et al. (2014) conducted an assimilation of aircraft observations from the Comprehensive Observation Network for Trace gases by Airline (CONTRAIL) based on Carbon-Tracker.
With the launch of the Greenhouse gases Observing SATellite (GOSAT) (Kuze et al., 2009) and the Orbiting Carbon Observatory-2 (OCO-2) satellite (Crisp et al., 2017), satellite data assimilation experiments have also been conducted based on the atmospheric column-averaged CO2 dry air mole fraction (XCO2) at higher temporal and spatial resolutions. Basu et al. (2013) used TM5 4DVar to assimilate GOSAT observations, and showed that satellite data provided an effective constraint for surface carbon source-sink inversion. Tian et al. (2014) used Tan-Tracker (v0) to conduct GOSAT observation assimilations using a set of observing system simulation experiments (OSSEs), and found that the optimized CO2 concentration and flux showed expected results. Deng et al. (2016) used GEOS-Chem and the 4DVar method to simultaneously assimilate GOSAT observations over land and ocean. This method provided a better representation of the CO2 surface flux than others that used only terrestrial observations; additionally, the results indicated that increasing the observation coverage further improved the sensitivity of the surface flux inversion. Feng et al. (2016) used the EnKF to assimilate GOSAT observations in Europe; the flux inversion results obtained displayed a larger amplitude change than those using in situ stations. Basu et al. (2018) applied 4DVar OSSEs to OCO-2 observations with multiple atmospheric transport models; they showed that the wider global coverage provided by OCO-2 observations enabled better surface flux representation than in situ observations. Overall, flux results depend on the atmospheric chemical transport model used. The abovementioned assimilation attempts using satellite data have reduced the uncertainty associated with flux estimates and provided some insight into surface carbon flux mechanisms. However, the assimilation of satellite column-average concentration observations of XCO2 is still in the exploratory stage.
Based on GEOS-Chem and NLS-4DVar (Tian et al., 2018) assimilation of XCO2 satellite observations, we introduce the Tan-Tracker (v1) carbon cycle data assimilation system. The novel dual-pass data assimilation framework consists of a CO2 assimilation pass and a flux assimilation pass, which have the same initial time but different assimilation window lengths. Specifically, the first CO2 assimilation pass uses a shorter window of 3 days to reduce the influence of background flux on the initial CO2 concentration. By minimizing the initial CO2 deviation, better initial CO2 concentrations are derived for the subsequent flux assimilation pass. In the following flux assimilation pass, a properly elongated window of 2 weeks absorbs enough observations to reduce the influence of the initial CO2 concentration deviation on the flux, resulting in a better representation of the surface flux. Compared with the joint Tan-Tracker (v0) assimilation system, the Tan-Tracker (v1) system uses a dual-pass framework to mitigate the effects of the initial CO2 concentration on surface flux, while using a more advanced assimilation algorithm, NLS-4DVar, to improve the accuracy of the optimized flux results. This paper is divided into four sections. Section 2 introduces the method and the framework of the Tan-Tracker (v1) system and its coupling to the NLS-4DVar algorithm. In Section 3, we describe the OSSE design using OCO-2 observations, and compare the Tan-Tracker (v1), Tan-Tracker (v0), and control experiment results with the true results. The total flux and optimized CO2 concentration obtained using Tan-Tracker (v1) had spatiotemporal distributions that were closer to the truth. A summary and conclusions are presented in Section 4.

Dual-pass Tan-Tracker (v1) assimilation system framework
The dual-pass carbon cycle data assimilation system Tan-Tracker (v1) is divided into two assimilation passes: a CO2 assimilation pass and a flux assimilation pass, in addition to an update section (Fig. 1). Based on the NLS-4DVar (Tian and Feng, 2015; Tian et al., 2018) assimilation method for satellite column-average CO2 concentration measurements of XCO2, we optimized the CO2 concentration and surface CO2 flux in assimilation windows of different lengths with the same initial time $t_0$ of the CO2 concentration. First, the CO2 assimilation pass is implemented. The shorter 3-day window reduces the influence of background flux on the initial CO2 concentration, minimizing the initial CO2 deviation to obtain a better initial CO2 concentration to drive the flux assimilation pass. Using the background flux, in which $F$ is the prior flux and $\lambda_b$ is a linear scale factor (Peters et al., 2005; Tian et al., 2014) for the assimilation window, we simulated the 3-day CO2 concentration $\mathbf{U}_b$ used as the background CO2. In the flux assimilation pass (the red portion shown in Fig. 1), we assume that there is no error in anthropogenic emissions and only optimize the terrestrial ecosystem and ocean fluxes.

Coupling of NLS-4DVar with Tan-Tracker (v1) assimilation framework
The NLS-4DVar algorithm is used to solve for the optimal initial perturbation $\mathbf{x}'_a$ that minimizes the incremental form of the 4DVar cost function, Eq. (5), where $\mathbf{x}' = \mathbf{x} - \mathbf{x}_b$ is the perturbation of the background field $\mathbf{x}_b$ at the initial time $t_0$, the superscript T denotes the matrix transpose, and the subscript b denotes the background value. As an ensemble-based assimilation approach, NLS-4DVar (Tian and Feng, 2015; Tian et al., 2018) assumes that the optimal analysis increment $\mathbf{x}'_a$ can be expressed by a linear combination of the pre-prepared initial perturbations (IPs). Substituting Eqs. (9), (10) and (11) into Eq. (5), the cost function can be rewritten as a nonlinear least squares problem, Eq. (12) (Dennis and Schnabel, 1996), using the approximations of Tian and Feng (2015), where $\mathbf{I}$ denotes the N×N identity matrix. The Gauss-Newton iteration for the nonlinear least squares problem (12) is defined following Dennis and Schnabel (1996). Substituting Eqs. (13) and (16) into Eq. (17), the cost function Eq. (5) can be rewritten in the least squares form of the control variable $\boldsymbol{\beta}$ (Tian and Feng, 2015).
Using an ensemble-estimated $\mathbf{B}_e$ to replace the background error covariance matrix $\mathbf{B}$ introduces spurious correlations that can be eliminated by a localization scheme. An efficient local correlation matrix decomposition approach (Zhang and Tian, 2018) can be used to quickly assimilate a large number of observations while preserving the quality of the assimilation results, especially for satellite data assimilation with high spatiotemporal resolution. In its implementation in NLS-4DVar, the correlation matrix between the model grids and observation positions is constructed from the fifth-order piecewise rational function $C_0$ of Gaspari and Cohn (1999).
In the Tan-Tracker (v1) assimilation system, the optimization variables for different assimilation passes differ. In the CO2 assimilation pass, the optimized state variable $\mathbf{x}$ is the CO2 concentration $\mathbf{U}$, and $\mathbf{x}'_a$ is the increment in the initial CO2 concentration. The IPs $\mathbf{P}_x$ are the initial perturbations of the CO2 concentration, and the OPs $\mathbf{P}_y$ are the perturbations of simulated XCO2 within the 3-day window; $H_k$ is the observation operator of XCO2 given in Eq. (31). For the flux assimilation pass, the state variable $\mathbf{x}$ is the scale factor $\boldsymbol{\lambda}$, and $\mathbf{x}'_a$ is the increment in the scale factor within the window. The IPs $\mathbf{P}_x$ and OPs $\mathbf{P}_y$ are the scale factor perturbations and simulated XCO2 perturbations, respectively, within the 2-week window. At this point, the observation operator $H_k$ can be considered as a two-part operator consisting of the chemistry transport model and the observation operator of the column-average concentration XCO2.
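The localization weights mentioned above can be illustrated with a short sketch. The polynomial coefficients are those of the compactly supported fifth-order function published by Gaspari and Cohn (1999); everything else is an assumption made for illustration: the function names, the haversine distance, and the convention that the weight falls to zero at twice `radius_km`. How the 2000 km localization radius of this study maps onto that convention is not stated in the text.

```python
import math


def gaspari_cohn(distance_km: float, radius_km: float) -> float:
    """Fifth-order piecewise rational function C0 (Gaspari and Cohn, 1999).

    Assumed convention for this sketch: radius_km is the half-width,
    so the weight reaches exactly zero at 2 * radius_km.
    """
    r = abs(distance_km) / radius_km
    if r <= 1.0:
        return (-0.25 * r**5 + 0.5 * r**4 + 0.625 * r**3
                - (5.0 / 3.0) * r**2 + 1.0)
    if r <= 2.0:
        return (r**5 / 12.0 - 0.5 * r**4 + 0.625 * r**3
                + (5.0 / 3.0) * r**2 - 5.0 * r + 4.0 - 2.0 / (3.0 * r))
    return 0.0


def great_circle_km(lat1, lon1, lat2, lon2, earth_radius_km=6371.0):
    """Spherical (haversine) distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2.0) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2.0) ** 2
    return 2.0 * earth_radius_km * math.asin(min(1.0, math.sqrt(a)))


# Weight between a hypothetical model grid point and a hypothetical observation.
d_ij = great_circle_km(40.0, 116.0, 35.0, 139.0)
rho_ij = gaspari_cohn(d_ij, radius_km=2000.0)
```

The function is continuous at both breakpoints, which is why it can taper ensemble covariances smoothly to zero instead of truncating them abruptly.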

Ensemble generation and update of the Tan-Tracker (v1) assimilation system
The NLS-4DVar assimilation algorithm is an ensemble-based algorithm that is used to approximate the analysis incremental solution space with the ensemble perturbation sample space. As such, the generation and update of the ensemble samples are essential for assimilation accuracy. According to the characteristics of the CO2 and flux assimilation passes, we designed different sampling and updating methods for the Tan-Tracker (v1) assimilation system.
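To make the ensemble-space idea concrete, the following is a minimal sketch of a Gauss-Newton iteration on the control variable, written under simplifying assumptions that are not taken from the manuscript: a single observation time, a diagonal observation error covariance `r_var * I`, no localization, and a toy nonlinear observation operator. The function name `nls4dvar_sketch` and all test values are hypothetical. The increment is written as `P_x @ beta`, the simulated observation perturbations `P_y` stand in for the Jacobian, and `(N - 1) * I` plays the role of the background term.

```python
import numpy as np


def nls4dvar_sketch(x_b, P_x, y_obs, obs_op, r_var, i_max=3):
    """Gauss-Newton iterations in ensemble space (illustrative only).

    x_b    : background state, shape (n,)
    P_x    : initial perturbations (IPs), shape (n, N)
    y_obs  : observations, shape (m,)
    obs_op : callable mapping a state to simulated observations
    r_var  : observation error variance (R = r_var * I is assumed)
    """
    n_ens = P_x.shape[1]
    y_b = obs_op(x_b)
    # Simulated observation perturbations (OPs), one column per member.
    P_y = np.column_stack([obs_op(x_b + P_x[:, j]) - y_b for j in range(n_ens)])
    innovation = y_obs - y_b
    # Gauss-Newton normal matrix with P_y as the approximate Jacobian.
    G = (n_ens - 1) * np.eye(n_ens) + P_y.T @ P_y / r_var
    beta = np.zeros(n_ens)
    for _ in range(i_max):
        simulated = obs_op(x_b + P_x @ beta) - y_b
        rhs = P_y.T @ (innovation - simulated) / r_var - (n_ens - 1) * beta
        beta = beta + np.linalg.solve(G, rhs)
    return x_b + P_x @ beta


# Toy experiment with made-up numbers.
rng = np.random.default_rng(0)
x_true = np.array([1.3, 0.8, 1.4])
x_b = np.array([1.0, 1.0, 1.0])
P_x = 0.5 * rng.standard_normal((3, 20))
obs_op = lambda x: x + 0.05 * x**2      # mildly nonlinear toy operator
y_obs = obs_op(x_true)                  # perfect observations

x_a = nls4dvar_sketch(x_b, P_x, y_obs, obs_op, r_var=1e-4, i_max=3)
err_background = np.linalg.norm(x_b - x_true)
err_analysis = np.linalg.norm(x_a - x_true)
```

With `i_max=1` the loop reduces to a single linearized update; additional iterations re-evaluate the nonlinear operator at the current estimate, which is the motivation given in the text for iterating.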
A historical moving sampling scheme (Wang et al., 2010; Tian et al., 2014) was used in the CO2 assimilation pass to select samples from a long-term historical CO2 simulation, and a resampling scheme was used in the new assimilation window.
The advantage of selecting samples from the historical simulation is that the appropriate sample size can be selected to ensure good results at a low computational cost.In this study, N = 160 was selected in the experiments to achieve a better assimilation effect.
To ensure better flux results and minimize computational cost, we chose an ensemble number of N = 36 in the flux assimilation pass, integrating from the same initial CO2 concentration within each window; all ensembles ran throughout the entire assimilation process. The ensemble generation scheme of the flux assimilation pass combines historical sampling and an ensemble update. The historical sampling was applied to the initial window, and the N = 36 initial ensemble members were selected by a moving strategy. Ensemble samples of subsequent windows were obtained using the ensemble update given by the Local Ensemble Transform Kalman Filter (Hunt et al., 2007; Tian and Xie, 2012), in which the updated ensemble perturbations $\mathbf{P}^a_x$ are obtained from the initial perturbations $\mathbf{P}_x$ and a transformation matrix $\mathbf{T}$ (Eq. 28). As the assimilation cycle progresses, the above ensemble update method usually reduces the dispersion of ensemble samples (Wang and Bishop, 2003), so that the ensemble space $\mathbf{P}^a_x$ becomes a distorted approximation of the solution space of $\mathbf{x}'_a$; this ultimately causes the assimilation to fail. Therefore, we applied an inflation factor (see Zheng et al. (2013) for more details) to the ensemble perturbation $\mathbf{P}^a_x$ to maintain the dispersion of the ensemble samples; this is referred to as adaptive ensemble inflation.
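A sketch of such an update is given below. It uses the symmetric square-root transform of the ensemble transform Kalman filter as described by Hunt et al. (2007), which is an assumption about the form of the transformation matrix; the manuscript's own expression is not reproduced here. The inflation is a fixed multiplicative factor chosen by hand, not the adaptive scheme of Zheng et al. (2013), and the observation error covariance is again assumed to be `r_var * I`. Function names are hypothetical.

```python
import numpy as np


def etkf_transform(P_y, r_var):
    """Symmetric square-root transform T = sqrt(N-1) * A^(-1/2),
    with A = (N-1) I + P_y^T R^-1 P_y and R = r_var * I (assumed)."""
    n_ens = P_y.shape[1]
    A = (n_ens - 1) * np.eye(n_ens) + P_y.T @ P_y / r_var
    eigval, eigvec = np.linalg.eigh(A)          # A is symmetric positive definite
    return np.sqrt(n_ens - 1) * (eigvec / np.sqrt(eigval)) @ eigvec.T


def update_perturbations(P_x, P_y, r_var, inflation=1.0):
    """Updated perturbations P_x @ T, optionally inflated by a fixed factor."""
    return inflation * (P_x @ etkf_transform(P_y, r_var))


# One state variable observed directly, four members (made-up numbers).
P_x = np.array([[1.0, -1.0, 2.0, -2.0]])
P_y = P_x.copy()                                # identity observation operator
P_a = update_perturbations(P_x, P_y, r_var=1.0)

var_b = (P_x @ P_x.T)[0, 0] / (P_x.shape[1] - 1)   # prior ensemble variance
var_a = (P_a @ P_a.T)[0, 0] / (P_a.shape[1] - 1)   # updated ensemble variance
P_inflated = update_perturbations(P_x, P_y, r_var=1.0, inflation=1.1)
```

In this scalar case the updated variance equals the Kalman value `var_b * r / (var_b + r)`, which illustrates why the spread shrinks cycle after cycle and why some form of inflation is needed to keep the ensemble from collapsing.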

Model settings and observations
The Tan-Tracker (v1) carbon cycle data assimilation system is based on the global three-dimensional (3D) atmospheric chemistry model GEOS-Chem (version: v11-01, http://acmg.seas.harvard.edu/geos), driven by meteorological inputs of the Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2) from the GEOS of the National Aeronautics and Space Administration (NASA, United States) Global Modeling and Assimilation Office. The original GEOS-Chem CO2 simulation was developed by Suntharalingam et al. (2004). A major update to the CO2 simulation was completed by Nassar et al. (2010). The latest update to the CO2 simulation was developed by Nassar et al. (2013) and appears in GEOS-Chem v10-01, which was released in 2015. In the following experiments, we used the same spatiotemporal resolution: a horizontal resolution of 2° × 2.5° (latitude × longitude), 47 vertical layers, a chemical time step of 20 min, a transport time step of 10 min, and an output interval of 3 h for the CO2 concentration.
The fluxes used to drive GEOS-Chem for the CO2 simulation were integrated and provided by the Harvard-NASA Emissions Component (HEMCO) model (Keller et al., 2014). There are seven emission inputs from the following sources: fossil fuel, ocean exchange, terrestrial ecosystem fluxes, biomass burning, ships, aviation, and chemical oxidation. Fossil fuel emissions were acquired from the Open-source Data Inventory of Anthropogenic CO2 (ODIAC) (Oda and Maksyutov, 2011) daily emissions data. Ocean exchange emissions were obtained from daily scaling data by Takahashi et al. (2009).
Terrestrial ecosystem fluxes, specifically balanced biosphere exchange with a seasonal cycle but zero net annual uptake, were taken from the hourly data provided by the Simple Biosphere (SiB3) model (Baker et al., 2006; Messerschmidt et al., 2013). Biomass burning emissions were obtained from the Global Fire Emissions Database v4 (GFED4) (Randerson et al., 2018) daily biomass burning data. Ship emissions were based on monthly scaling data from Endresen et al. (2007). Aviation emissions were derived from monthly scaling data (Olsen et al., 2013) from the Aviation Emissions Inventory Code (AEIC) (Simone et al., 2013). Sources of carbonaceous compound oxidation were taken from monthly data provided by Nassar et al. (2010). The settings and parameters for TT_v0 follow Tan-Tracker (Tian et al., 2014), with only the observation data, model version, and prior flux replaced. A comparison of the parameter settings of TT_v0 and TT_v1 is shown in Table 2. After the sensitivity test, the localization radii of the CO2 assimilation pass and flux assimilation pass were both selected to be 2000 km, and the numbers of localization truncation modes were $r_x = 50$ and $r_y = 30$ (see Zhang and Tian (2018) for details regarding the selection of localization-related parameters).

CO2 concentration
The CO2 concentration reflects the state of the atmospheric carbon pool and can be used as a basic indicator for verification in flux inversion. Here, we analyzed the CO2 concentration results in detail from the time series and spatial distributions. We used the time series of the daily root-mean-square error (RMSE) and the time series of the mean deviations to characterize the deviations of Ctrl, TT_v0, and TT_v1 from True (Fig. 3). Overall, the indicators in Figure 3 showed that the results after the assimilation were better than the background results.
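The two indicators can be computed as in the sketch below, which assumes concentration fields stored with time as the first axis and uses an unweighted average over all grid cells. Whether the study weights grid cells by area or air mass is not stated, so the unweighted mean is an assumption, and the function name and sample arrays are hypothetical.

```python
import numpy as np


def daily_rmse_and_bias(sim, truth):
    """Per-day RMSE and mean bias of `sim` against `truth`.

    Both inputs have shape (n_days, ...); all remaining axes
    (levels, latitudes, longitudes) are averaged without weights.
    """
    diff = np.asarray(sim, dtype=float) - np.asarray(truth, dtype=float)
    flat = diff.reshape(diff.shape[0], -1)
    rmse = np.sqrt(np.mean(flat**2, axis=1))
    bias = np.mean(flat, axis=1)
    return rmse, bias


# Two days on a tiny 2 x 2 grid, values in ppm (made-up numbers).
truth = np.zeros((2, 2, 2))
sim = np.array([[[1.0, 1.0], [1.0, 1.0]],       # uniform +1 ppm offset
                [[1.0, -1.0], [1.0, -1.0]]])    # errors that cancel on average
rmse, bias = daily_rmse_and_bias(sim, truth)
```

The second day shows why both indicators are reported together: the mean bias is zero because the errors cancel, while the RMSE still registers a 1 ppm deviation.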
The daily RMSE of XCO2 between the simulation/assimilation and artificial observations (Fig. 3a), representing the change in column-average concentration at the observed position, provides a comparison between the observation-minus-background difference (O-B) and the observation-minus-analysis difference (O-A) to explain the effectiveness of the assimilation. The results in Figure 3a showed that the two versions of the Tan-Tracker carbon cycle data assimilation system effectively absorbed observations for flux optimization, with TT_v1 showing slightly better performance than TT_v0 and superior performance with respect to that of Ctrl.
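The daily RMSE (Fig. 3b) and the daily mean bias (Fig. 3c) of the atmospheric 3D CO2 concentration between the simulation/assimilation results and True reflect the changes in the atmospheric carbon pool. Figure 3b shows the deviation of the simulation/assimilation results from True. The deviation between Ctrl and True decreased from 1.4 to 0.4 ppm in the initial period from January to February, and remained low (0.4 ppm) from March to June; this showed that the initial concentration deviation was reduced gradually, which could be considered as a model spin-up process. The deviation increased from 0.6 to 1.0 ppm from July to September, which indicated that there was a large deviation between the prior flux and the true flux in the Northern Hemisphere growing season. Finally, the deviation fell back to 0.3-0.4 ppm from October to December, indicating a decrease in the deviation between the prior flux and the true flux in the non-growing season of the Northern Hemisphere. The deviation between TT_v1 and True decreased from 1.4 to 0.2 ppm in the initial period from January to February, maintaining a lower value of 0.2 ppm from March to June. After a slight increase to 0.2-0.6 ppm from July to September, the deviation between TT_v1 and True finally fell back to 0.2 ppm from October to December.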
Figure 3c shows the daily mean bias between the simulation/assimilation results and True. The daily mean bias of TT_v1 dropped rapidly from −0.4 to 0 ppm and then remained low (−0.05 to 0.05 ppm); this performance was superior to that of Ctrl, which showed a larger bias amplitude. Thus, TT_v1 exhibited a faster spin-up convergence speed and a smaller deviation over the entire simulation time than Ctrl; these improvements were attributed to an adjustment in the optimized flux. The effect of the initial CO2 optimized by the CO2 assimilation pass occurred only at the initial time of each window; it thus made only a small adjustment to the state of the atmospheric carbon pool and mainly served to improve the accuracy of the optimized flux. This is consistent with the good continuity of the CO2 results (Figs. 3b and 3c). The results of TT_v0 were better than those of Ctrl but slightly inferior to those of TT_v1.
Figure 4 shows the spatial deviation between the simulation/assimilation results and True based on the RMSE spatial distribution of the vertically averaged CO2 concentration grid time series. Figure 4a displays the RMSE spatial distribution between Ctrl and True. Large values over land appeared in Western Siberia (1.0-1.2 ppm) and Eastern Siberia, Eastern Central Asia, Eastern North America, and Central South America (0.8-1.0 ppm). Large values over the ocean appeared in the Northern Hemisphere, with an increasing bias trend from the Southern Hemisphere to the Northern Hemisphere (0.2-0.5 ppm). The results of TT_v1 were better than those of Ctrl, with a large bias over land of 0.3-0.5 ppm; the increasing bias trend over the ocean was lower at 0.1-0.2 ppm. The results of TT_v0 were better than those of Ctrl and slightly inferior to those of TT_v1.

Flux
In real assimilation experiments, CO2 concentration results can be used as the main objective indicator of flux evaluation due to the lack of a real flux. However, in OSSEs, we can quantitatively compare the prior flux and the optimized flux with the true flux, which gives the most direct assessment. Below we present a detailed analysis of flux using time series, annual total amounts, and regional distributions.
Figure 5 shows the time series of the simulation/assimilation results of the monthly global total terrestrial ecosystem and ocean flux, and their deviations from True. Notably, similar to the spin-up process of the numerical model simulation, the first 4 months corresponded to the spin-up process of the flux assimilation pass. During the early stages of the spin-up phase (from January to February), a larger portion of the optimized flux increment was used to adjust the initial CO2 concentration deviation from the true simulation. As a result, the deviation between the optimized flux and the true flux was larger than the prior value (Fig. 5b); however, the CO2 concentration deviation continued to decrease (Fig. 3b). As the assimilation progressed, the concentration deviation became more stable. At this point, the uncertainties in CO2 concentration and flux could not be distinguished; as such, the assimilation continued to run, allowing for adjustments to the flux and concentration. Finally, the deviations caused during the corresponding flux and concentration optimization processes were minimized. Here, we mainly discuss the flux results from May to December after reaching equilibrium.
The prior flux (Ctrl) was in good agreement with the true flux (True) (Fig. 5a). Additionally, a significant seasonal cycle was evident (Fig. 5a). April to September is the growing season of the Northern Hemisphere, when the total flux of the global terrestrial ecosystem and oceans is negative, reaching its lowest value in July and August. From October to March, corresponding to the non-growing season in the Northern Hemisphere, the global flux was positive, and there was no obvious monthly change. The main deviation of the prior flux (Ctrl in Fig. 5b) appeared in the Northern Hemisphere growing season from June to August, reaching −4.0 PgC yr$^{-1}$. In addition, there was a significant deviation of about 0.2 PgC yr$^{-1}$ during the non-growing season of the Northern Hemisphere from October to December. The TT_v1 optimized flux of the dual-pass system showed significant improvement over Ctrl. The deviation was reduced to 0.0 PgC yr$^{-1}$ from June to August, and the deviation decreased to 0.1 PgC yr$^{-1}$ from October to December (Fig. 5b). The results from TT_v0 were better than those of Ctrl, but slightly inferior to those of TT_v1. Table 3 and Figure 6 show the assimilation/simulation deviations of the terrestrial ecosystem flux, ocean flux, and global total flux from True from May to December. Compared with Ctrl, the results of TT_v1 were better optimized for the terrestrial ecosystem flux and slightly improved for the ocean flux. In addition, the results of TT_v0 were better than those of Ctrl, but slightly inferior to those of TT_v1 for the terrestrial ecosystem flux and slightly superior to those of TT_v1 for the ocean flux.
We used the TransCom "super-regions" (Gurney et al., 2002) to calculate the regional total flux. Figure 7 shows the flux results of 11 land regions and the deviation from True. The results of TT_v1 had a positive effect on each region relative to the prior flux of Ctrl, with significant improvements in the mid-to-high latitudes of North America, Europe, and Eurasia, and the mid-latitudes of South America and Australia. The results in the equatorial region of South America and Asia did not show significant improvements. The prior flux in Africa was close to the true value, so the improvement there was not evident.
TT_v0 showed slightly improved results compared with Ctrl, but both were inferior to the performance of TT_v1.
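Regional totals of this kind are obtained by integrating the gridded flux over each region. The sketch below assumes fluxes in kgC m$^{-2}$ s$^{-1}$, a regular latitude-longitude grid described by its cell edges, a 365-day year, and an integer region mask; none of these details come from the manuscript, and the two-region mask is a stand-in for the TransCom map, which is not reproduced here. Note also that the GEOS-Chem 2° × 2.5° grid uses half-size polar cells, which the simple edges below ignore.

```python
import numpy as np

EARTH_RADIUS_M = 6.371e6
SECONDS_PER_YEAR = 365.0 * 86400.0   # assumed non-leap year
KG_PER_PG = 1.0e12


def cell_areas(lat_edges_deg, lon_edges_deg):
    """Areas (m^2) of latitude-longitude cells defined by their edges."""
    lat = np.deg2rad(np.asarray(lat_edges_deg, dtype=float))
    lon = np.deg2rad(np.asarray(lon_edges_deg, dtype=float))
    band = np.sin(lat[1:]) - np.sin(lat[:-1])
    dlon = lon[1:] - lon[:-1]
    return EARTH_RADIUS_M**2 * np.outer(band, dlon)


def regional_totals(flux, areas, region_mask):
    """Integrate flux (kgC m^-2 s^-1) to PgC yr^-1 for each region id."""
    rate = flux * areas * SECONDS_PER_YEAR / KG_PER_PG
    return {int(rid): float(rate[region_mask == rid].sum())
            for rid in np.unique(region_mask)}


# Illustrative 2 x 2.5 degree grid with a hypothetical two-region mask.
lat_edges = np.linspace(-90.0, 90.0, 91)
lon_edges = np.linspace(-180.0, 180.0, 145)
areas = cell_areas(lat_edges, lon_edges)

region_mask = np.zeros(areas.shape, dtype=int)
region_mask[45:, :] = 1                          # 0 = south, 1 = north
flux = np.full(areas.shape, 1.0e-9)              # uniform made-up flux
totals = regional_totals(flux, areas, region_mask)
```

Deviations from True for each region then follow by applying the same function to the true flux field and subtracting the results.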

Sensitivity experiments
The parameters of the carbon cycle data assimilation system Tan-Tracker (v1) are listed in Table 2. The main parameters are the assimilation window length and the maximum NLS-4DVar assimilation iteration number of the flux assimilation pass, as described below.
The flux assimilation pass window length determines the influence of the initial CO2 concentration and the transport time, thus affecting the flux inversion. The sensitivity experiments of the assimilation window used a window length of 7 days (denoted as v1_07), 14 days (denoted as v1_14), or 30 days (denoted as v1_30); the other TT_v1 parameters remained unchanged. The flux and concentration results are shown in Figure 8. From the time series of the total flux (Fig. 8a), it could be concluded that the assimilation experiments of all three windows had positive effects; however, the assimilation results of v1_14 were better than those of v1_07, which were better than those of v1_30. The CO2 concentration results (Fig. 8c) showed that the assimilation experiments of all three windows had positive effects. The assimilation results of v1_07 were roughly equivalent to those of v1_14, both of which were better than those of v1_30. Thus, flux assimilation is sensitive to the length of the assimilation window. The window of the appropriate length (14 days) had a small initial CO2 concentration deviation and an appropriate integration time, and was closest to the 16-day repeat cycle of the OCO-2 satellite, i.e., it was possible to absorb more observations to obtain good flux inversion results.
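The effect of the window length on the cycling can be pictured with a small scheduling sketch. It assumes that each cycle starts where the previous flux window ends and that the last window is clipped at the end of the experiment; both are assumptions made for illustration, as are the function name and the tuple layout. Only the 3-day CO2 window, the candidate flux windows of 7, 14, and 30 days, and the 2016 experiment period come from the text.

```python
from datetime import date, timedelta


def dual_pass_windows(start, end, flux_days=14, co2_days=3):
    """Return (t0, co2_window_end, flux_window_end) for each cycle.

    Window ends are exclusive. Both passes of a cycle share the same t0.
    """
    stop = end + timedelta(days=1)
    windows = []
    t0 = start
    while t0 <= end:
        co2_end = min(t0 + timedelta(days=co2_days), stop)
        flux_end = min(t0 + timedelta(days=flux_days), stop)
        windows.append((t0, co2_end, flux_end))
        t0 = flux_end                     # next cycle starts at the flux window end
    return windows


start, end = date(2016, 1, 1), date(2016, 12, 31)
schedule_07 = dual_pass_windows(start, end, flux_days=7)
schedule_14 = dual_pass_windows(start, end, flux_days=14)
schedule_30 = dual_pass_windows(start, end, flux_days=30)
```

Under these assumptions, a shorter flux window means more cycles per year and fewer observations per cycle, whereas a longer one has fewer cycles and a longer transport integration in each.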
As the maximum NLS-4DVar iteration number increases, the assimilation results tend to converge, especially for highly nonlinear systems. However, the computational cost increases with the number of iterations. The sensitivity experiments of the maximum NLS-4DVar iteration number used one (Imax = 1), two (Imax = 2), and three (Imax = 3) iterations, with the remaining parameters retaining the values of TT_v1. The resulting flux and concentration results are shown in Figure 9. The time series of the monthly total flux (Fig. 9a) and the CO2 concentration (Fig. 9c) showed that the assimilation results improved and tended to converge quickly as the maximum number of NLS-4DVar iterations increased. Considering the computational cost, we chose a maximum of three NLS-4DVar iterations as the final setting.

Conclusion
Zhang et al. (2014) conducted an assimilation of aircraft observations from the Comprehensive Observation Network for Trace gases by Airline (CONTRAIL) based on Carbon-Tracker.
CO2 concentration to drive the flux assimilation pass. In the following flux assimilation pass, a properly elongated window of 2 weeks absorbs enough observations to reduce the influence of the initial CO2 concentration deviation on the flux. The evolution of the CO2 concentration in the assimilation window is then dominated by the flux, which improves the accuracy of the surface flux estimates. The update section guarantees a connection between the two adjacent assimilation windows, in which the initial CO2 concentration and background flux of the CO2 assimilation pass are provided for the next window, allowing the background flux and flux ensembles of the flux assimilation pass to be updated. The CO2 assimilation pass is shown in the blue portion of Figure 1. Given that NLS-4DVar is an ensemble-based hybrid assimilation algorithm, we first prepared a set of 3-day-length CO2 concentration ensembles, where S denotes the ensembles and N is the ensemble number. In the CO2 assimilation pass, we used N = 160. The optimized initial CO2 concentration is then used as the initial CO2 of the flux assimilation pass.

and only optimize the terrestrial ecosystem flux and ocean flux, where $F$ is the prior flux, with bio referring to the flux from the terrestrial biosphere, and oce representing the flux from the ocean. Starting from the optimized initial CO2 $\mathbf{U}_{a,t_0}$ and forcing by a set of prepared flux ensembles, we obtain a set of 2-week CO2 ensembles of scale factors (see Section 2.3). Using an optimization variable for the flux and considering computational cost, we chose N = 36. The update section is shown as the black portion of Figure 1. Starting from the optimized initial CO2 concentration of the rth assimilation cycle, forcing by the optimized fluxes $F_{a,r}$, and integrating through the window of the flux assimilation pass to the end, we obtain the background initial CO2 concentration $\mathbf{U}_{b,t_0,r+1}$ of the (r+1)th assimilation cycle. Unlike the joint Tan-Tracker (v0) system, the background initial CO2 concentration of Tan-Tracker (v1) is obtained by running the model, as opposed to a direct assimilation, thus eliminating the problem of CO2 over-optimization. Similar to the approach of Peters et al. (2007), the (r+1)th background flux is obtained using the mean value of the scale factors of the two previous time steps.
the nonlinear forecast model integrating from $t_0$ to $t_k$, and $\mathbf{B}$ and $\mathbf{R}_k$ are the background error and observational error covariance matrices, respectively. For simplicity,

$\mathbf{x}'_a$ can be expressed by a linear combination of the pre-prepared initial perturbations (IPs). The background error covariance matrix $\mathbf{B}$ can be replaced with an ensemble perturbation estimate. $\mathbf{R}$ has the Cholesky factorization,

$d_{ij}$ is the spatial spherical distance between the ith model grid and the jth observation, $n_m$ is the number of model grid points, $n_o$ is the number of observations, and $r$ is the number of selected truncation modes.
perturbations and simulated XCO2 perturbations, respectively, within the 2-week window. At this point, the observation operator $H_k$

$\mathbf{P}^a_x$ represents the updated ensemble perturbations, and $\mathbf{T}$ is the transformation matrix. Equation (28) indicates that the updated ensemble perturbation $\mathbf{P}^a_x$ can be obtained from the initial perturbation $\mathbf{P}_x$ and a transformation matrix $\mathbf{T}$.
https://www.esrl.noaa.gov/gmd/ccgg/carbontracker/), interpolated from a global resolution of 2° × 3° (latitude × longitude), with 25 vertical layers, to the GEOS-Chem model grid resolution. We designed a set of OSSEs as shown in Table 1. Experiment True represents the true simulation, starting from the initial CO2 of the Carbon-Tracker global CO2 at time 20151101 (for short: CT20151101), running from 20151101 to 20161231. Forcing was driven by true fluxes: the terrestrial ecosystem flux of SiB3 in 2010 and the Takahashi ocean flux in 2010. The artificial observations were constructed from True as discussed in Section 3.1. We also designed a background simulation control run (denoted as Ctrl), an assimilation experiment Tan-Tracker (v0) (denoted as TT_v0), and an assimilation experiment Tan-Tracker (v1) (denoted as TT_v1), with the same initial CO2 CT20160101, the same running time from 20160101 to 20161231, and the same prior (background) fluxes: the terrestrial ecosystem flux of SiB3 in 2009 and the Takahashi ocean flux in 2009. The rest of the model settings remained the same as in True, with the difference being that TT_v0 and TT_v1 were assimilation experiments, assimilating artificial XCO2 observations.
the simulation/assimilation results from True. The deviation between Ctrl and True decreased from 1.4 to 0.4 ppm from January to February and remained low (0.4 ppm) from March to June; this shows that the initial concentration deviation was reduced gradually, which can be regarded as a model spin-up process. The deviation increased from 0.6 to 1.0 ppm from July to September, indicating a large deviation between the prior flux and the true flux during the Northern Hemisphere growing season. Finally, the deviation fell back to 0.3-0.4 ppm from October to December, indicating a smaller deviation between the prior flux and the true flux during the Northern Hemisphere non-growing season. The deviation between TT_v1 and True decreased from 1.4 to 0.2 ppm at the initial time from

Figure 5 shows the time series of the simulated and assimilated monthly global total ecosystem and ocean fluxes, and their deviations from True. Notably, similar to the spin-up process of the numerical model simulation, the first 4 months corresponded to the spin-up process of the flux assimilation pass. During the early stages of the spin-up phase (from January

6 show the assimilation/simulation deviations of the terrestrial ecosystem flux, ocean flux, and global total flux from True from May to December. Compared with Ctrl, the results of TT_v1 were better optimized for the terrestrial ecosystem flux and slightly improved for the ocean flux. The results of TT_v0 were also better than those of Ctrl; relative to TT_v1, they were slightly inferior for the terrestrial ecosystem flux and slightly superior for the ocean flux.
The flux assimilation pass window length determines the influence of the initial CO2 concentration and the time of transmission, thus affecting the flux inversion. The sensitivity experiments on the assimilation window used a window length of 7 days (denoted as v1_07), 14 days (denoted as v1_14), or 30 days (denoted as v1_30); the other TT_v1 parameters remained unchanged. The flux and concentration results are shown in Figure 8. From the time series of the total flux (Fig.

We designed a new version of a carbon cycle data assimilation system, Tan-Tracker (v1), based on the atmospheric chemical transport model GEOS-Chem and an advanced NLS-4DVar data assimilation algorithm. Using a dual-pass assimilation framework consisting of a CO2 assimilation pass and a flux assimilation pass, we assimilated atmospheric CO2 observations to obtain an optimized representation of the surface carbon flux. In contrast to the joint assimilation system Tan-Tracker (v0), Tan-Tracker (v1) optimizes the CO2 concentration and the surface carbon flux successively, in separate assimilation passes. Optimization of the CO2 concentration uses a shorter assimilation window to reduce the effects of the background flux, giving a more accurate initial CO2 concentration estimate. Flux optimization uses a longer assimilation window, allowing the system to absorb enough observations to optimize the flux while reducing the effects of the initial CO2 concentration deviation, resulting in more accurate surface flux estimates. We designed a set of OSSEs based on OCO-2 satellite data and compared the results with those of the Tan-Tracker (v0) joint assimilation system. The Tan-Tracker (v1) performance was superior to that of Tan-Tracker (v0) in resolving the CO2 concentration and surface flux estimates, and was far better than direct background simulations. Thus, the dual-pass assimilation strategy offers an advantage in satellite carbon cycle data assimilation. The results of the sensitivity experiment of window length and
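The window layout of the dual-pass framework can be sketched as follows. This is a minimal illustration under stated assumptions, not the Tan-Tracker implementation: the function name `dual_pass_windows` is hypothetical, windows are treated as half-open date intervals, and the rule for advancing from one cycle to the next (`step_days`, set here to the flux window length) is an assumption that the text does not specify. The 3-day CO2 window, the 14-day flux window, and the shared initial time come from the text.

```python
from datetime import date, timedelta


def dual_pass_windows(start, stop, co2_days=3, flux_days=14, step_days=14):
    """Yield (t0, co2_end, flux_end) for each assimilation cycle.

    Both passes share the initial time t0. The CO2 pass covers
    [t0, co2_end) and the flux pass covers [t0, flux_end).
    step_days is an ASSUMPTION: the text does not say how t0 advances.
    """
    t0 = start
    while t0 < stop:
        # Short window: optimize the initial CO2 concentration first.
        co2_end = min(t0 + timedelta(days=co2_days), stop)
        # Longer window with the same start: optimize the surface flux.
        flux_end = min(t0 + timedelta(days=flux_days), stop)
        yield t0, co2_end, flux_end
        t0 += timedelta(days=step_days)


# Example: the assimilation period used in the OSSEs (all of 2016).
for t0, co2_end, flux_end in dual_pass_windows(date(2016, 1, 1), date(2017, 1, 1)):
    pass  # run the CO2 pass on [t0, co2_end), then the flux pass on [t0, flux_end)
```

Under the same assumptions, the window-length sensitivity experiments correspond to calling the function with `flux_days` set to 7, 14, or 30.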

Figure 2. Spatial distribution of artificial observations $X_{\mathrm{CO_2},O}$.

Figure 4. Spatial distribution of the root-mean-square error (RMSE) of the vertically averaged CO2 concentration grid time series. RMSE between a. Ctrl and True; b. TT_v0 and True; c. TT_v1 and True.
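The per-grid-cell RMSE named in the Figure 4 caption can be computed as in the following sketch. It assumes that each run is available as a list of time steps, each holding one vertically averaged CO2 value (in ppm) per grid cell; the function name `rmse_per_cell` and this data layout are illustrative assumptions, not the authors' code.

```python
import math


def rmse_per_cell(run, truth):
    """RMSE over time for every grid cell.

    run, truth: lists of time steps; each time step is a list with one
    value per grid cell. Both must have identical shapes.
    """
    n_steps = len(truth)
    n_cells = len(truth[0])
    result = []
    for cell in range(n_cells):
        # Sum of squared differences along the time axis for this cell.
        squared = sum((run[t][cell] - truth[t][cell]) ** 2
                      for t in range(n_steps))
        result.append(math.sqrt(squared / n_steps))
    return result
```

Applying the function to Ctrl, TT_v0, and TT_v1 against True would give the three fields that the panels of Figure 4 display.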

Figure 5. Time series of the simulation/assimilation results of the monthly global total ecosystem and ocean flux and their deviation from True: a. monthly total flux; b. monthly delta flux.

Figure 6.Total flux from May to December and its deviation from True.

Figure 7. Monthly total flux of 11 land regions of TransCom "super-regions" and its deviation from True: a. flux of each region; b. deviation from True.

Figure 8. Window length sensitivity experiment results: a. monthly total flux; b. monthly total flux deviation; c. daily root-mean-square error (RMSE) of CO2 concentration between the simulation/assimilation results and True.

Figure 9. Sensitivity experiment results for the maximum number of Nonlinear Least Squares Four-dimensional Variational Data Assimilation (NLS-4DVar) iterations: a. monthly total flux; b. monthly total flux deviation; c. daily root-mean-square error (RMSE) of CO2 concentration between the simulation/assimilation results and True.