Assimilating the dynamic spatial gradient of a bottom-up carbon flux estimation as a unique observation in COLA (v2.0)

Liu, Zhiqiang; Zeng, Ning; Liu, Yun; Kalnay, Eugenia; Asrar, Ghassem; Cai, Qixiang; Han, Pengfei

doi:10.5194/gmd-2023-15

Preprints

https://doi.org/10.5194/gmd-2023-15

Submitted as: development and technical paper

21 Feb 2023

Status: this preprint was under review for the journal GMD but the revision was not accepted.

Abstract. Atmospheric inversion of high spatiotemporal surface CO₂ flux without dynamic constraints and sufficient observations is an ill-posed problem, and a priori flux from a "bottom-up" estimation is commonly used in "top-down" inversion systems for regularization purposes. Ensemble Kalman filter-based inversion algorithms usually weigh a priori flux to the background or directly replace the background with the a priori flux. However, the "bottom-up" flux estimations, especially the simulated terrestrial-atmosphere CO₂ exchange, are usually systematically biased at different spatiotemporal scales because of the deficiencies in understanding of some underlying processes. Here, we introduced a novel regularization algorithm into the Carbon in Ocean‒Land‒Atmosphere (COLA) data assimilation system, which assimilates a priori information as a unique observation (AAPO). The a priori information is not limited to "bottom-up" flux estimation. With the comprehensive assimilation regularization approach, COLA can apply the spatial gradient of the "bottom-up" flux estimation as a priori information to reduce the bias impact and enhance the dynamic information concerning the a priori "bottom-up" flux estimation. Benefiting from the enhanced signal-to-noise ratio in the spatial gradient, the global, regional, and gridded flux estimations using the AAPO algorithm are significantly better than those obtained by the traditional regularization approach, especially over highly uncertain tropical regions in the context of observing system simulation experiments (OSSEs). We suggest that the AAPO algorithm can be applied to other greenhouse gas (e.g., CH₄, NO₂) and pollutant data assimilation studies.

Received: 01 Feb 2023 – Discussion started: 21 Feb 2023

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: closed

RC1:
'Comment on gmd-2023-15', Anonymous Referee #1, 16 Apr 2023
General comments
This paper presents a new algorithm for ingesting information about surface CO2 fluxes from bottom-up estimations into a top-down atmospheric inversion system. The main innovation is that instead of using the bottom-up fluxes as a priori fluxes in the inversion, the spatial gradients of the fluxes are instead assimilated as observations, which the authors argue increases the signal-to-noise ratio and avoids contaminating the analysis with biases in the bottom-up fluxes.
While the idea of assimilating the spatial flux gradients rather than the fluxes themselves is interesting and worth investigating, I had a hard time assessing the manuscript on scientific quality and validity because of incomplete information, confusing use of terms, and distracting issues in the text and figures. More detailed comments are provided below. I believe addressing these issues goes beyond the scope of a major revision, and I recommend that the authors substantially rework the manuscript in terms of content and presentation. Only then can a proper assessment be made about the scientific quality.
1. Incomplete information
Many important pieces of information are missing from the manuscript, which makes it difficult to follow and fully understand the experiments and results. Sometimes the reader is referred to previous papers (e.g., Liu et al. (2019, 2022) and Kang et al. (2012)), and at other times it is not clear where to find the missing information.
I am not saying that the authors need to explain everything in this manuscript— for example, I think it is perfectly fine to refer to other papers for details about the LETKF algorithm— but there are many cases where the lack of information hinders the reader’s comprehension of the present study. Often this information can be provided in one or two short sentences, so text length should not be a problem. Here are just some examples of information that I miss in the current manuscript:
Ensemble size, and if the results are sensitive to ensemble size.

What the background (both fluxes and concentrations) at t=0 is.

How the ensemble members were perturbed at t=0 (i.e., what is the assumed initial background error covariances?).

The time resolution of the assimilated observations. The text says “For satellite data, we used a 10-second averaged …”, but I assume that the system did not actually assimilate 10-second satellite observations? Were the observations averaged to daily values before being assimilated? For in situ observations, were observations from all hours used, or only e.g. measurements that were identified as being representative of large well-mixed air masses?

The manuscript should be explicit about OCO-2 observations being column-integrated CO2 (XCO2) and explain how the synthetic XCO2 observations were calculated from the truth simulation.

The text refers to Liu et al. (2022) for information about the observation errors, but it would be nice if a short description could be provided here as well.

The inflation method is not sufficiently explained. I assume the mean of the additive perturbations was subtracted so as not to influence the ensemble mean (similar to what was done in Liu et al. (2022))? Were the magnitudes of the perturbations also scaled to a predefined value, and if so, what is the value?

The text mentions a short (1 day) and a long (7 days) assimilation window. How were the parameters adjusted within the long assimilation window? Liu et al. (2022) states that the parameters were adjusted on a daily basis within the long assimilation window— if it is the same in this study, it should be explicitly stated. Or are the parameters assumed to be constant within the assimilation window?

What are the temporal error correlations for the parameters within the same assimilation window, and do they allow the parameters to be adjusted differently on a daily basis?

What is the temporal resolution of the flux products? From my understanding VEGAS can simulate daily fluxes, but what about CASA, are those monthly fluxes? If so, how were they downscaled to daily values?

Were the fluxes/spatial gradients of fluxes assimilated before or after the CO2 observations? Does the order matter?

This is not an exhaustive list, and I do not think it should be the reviewers’ job to list everything that should be covered. The authors should think carefully about what information is relevant to include in the manuscript.
References
Kang, J.-S., E. Kalnay, T. Miyoshi, J. Liu, and I. Fung (2012): Estimation of surface carbon fluxes with an advanced data assimilation methodology. J. Geophys. Res. Atmospheres, 117, D24101, doi:10.1029/2012JD018259.
Liu, Y., E. Kalnay, N. Zeng, G. Asrar, Z. Chen, and B. Jia (2019): Estimating surface carbon fluxes based on a local ensemble transform Kalman filter with a short assimilation window and a long observation window: an observing system simulation experiment test in GEOS-Chem 10.1. Geosci. Model Dev., 12, 2899–2914, doi:10.5194/gmd-12-2899-2019.
Liu, Z., N. Zeng, Y. Liu, E. Kalnay, G. Asrar, B. Wu, Q. Cai, D. Liu, and P. Han (2022): Improving the joint estimation of CO2 and surface carbon fluxes using a constrained ensemble Kalman filter in COLA (v1.0). Geosci. Model Dev., 18, doi:10.5194/gmd-15-5511-2022.
2. Confusing use of terms
There are several terms that are used in a way that is confusing or adds little to no value. Again, here are just some examples:
“Unique” is used throughout the manuscript, for example, “as a unique observation” (Line 21), “a unique feature of a short assimilation window” (Line 80), and “Another unique feature” (Line 82). I do not see the point of stating that these features are unique. Assimilating a priori information as observations has been done before (a recent example is Kaminski et al., 2022). For the first case, maybe the authors meant “as a special observation” or something similar?

Similar to the first point, “comprehensive” is often used (Lines 22, 59, 111, 292), but it is not clear in what sense the described objects are comprehensive.

I found it confusing that fluxes from the CASA model are described as a priori fluxes even though they were not used as a prior in the inversions. Maybe refer to them as “bottom-up fluxes” instead?

Many acronyms are confusing. For example, in Summary and Discussion, AAPO is redefined on Line 290, but it is not clear in this context how the acronym was formed.

Many equations introduce new acronyms, for example TSG and PSG in Equation (5), and ASG in Equation (8). How about using a different variable (e.g., ∇f) or a subscript (e.g., “SG”) for the spatial gradient of the fluxes, and then superscripts A, T, and P for analysis, truth, and prior, respectively? This would reduce the number of acronyms the reader has to keep track of.

Equation (6) introduces the “S” superscript, which I assume stands for “SCF”, but it is not defined or used anywhere else in the manuscript, which is confusing. I would suggest dropping the superscript, or have “SG” and “f” as subscripts in Equations (5) and (6) instead. Generally, try to be consistent throughout the manuscript.

Similarly, what do “NP”, “P”, “ASG”, and “AP” in the experiment names stand for? (I have my guesses, but it should not be the reader’s job to figure this out.)

Line 138 says that “COLA is a flow-dependent ensemble-based DA”. The term “flow-dependent” makes sense in the context of e.g. atmospheric data assimilation, where the errors depend on (and often follow) the atmospheric flow, but it is not clear what exactly it means here. Given that there is no dynamic model for the parameters, how is the system flow-dependent?

The text uses “significant/significantly” in many places. I would recommend reserving these words for describing statistical significance when comparing results. If the authors performed statistical significance testing, they should state what tests were used and the significance level.

Lines 289–290 state that “the spatial gradient of a bottom-up model estimation is dynamically assimilated”. What is “dynamically” referring to here?

References
Kaminski, T., M. Scholze, P. Rayner, M. Voßbeck, M. Buchwitz, M. Reuter, W. Knorr, H. Chen, A. Agusti-Panareda, A. Löscher, and Y. Meijer (2022): Assimilation of atmospheric CO2 observations from space can support national CO2 emission inventories. Environmental Research Letters, 17, 014015, doi:10.1088/1748-9326/ac3cea.
3. Language and figures
The language could be improved in several instances, for example:
Articles are sometimes missing or incorrectly added, or the noun should be plural instead of singular or vice versa. Here are just a few examples: “and a priori flux from …” -> “and a priori fluxes from” (Line 15), “weigh a priori flux to the background” -> “weigh a priori fluxes to the background” (Line 16), “because of the deficiencies in understanding” -> “because of deficiencies in understanding” (Line 19), “LETKF estimates SCF as …” -> “LETKF estimates SCFs as …” (Line 76).

The text often uses subjective words such as “good” and “better”. I would suggest using more objective and descriptive words. For example, instead of saying “produces much better estimation …”, I would say something like “produces more accurate flux estimates in terms of x and y”.

When describing what was done, I suggest using past tense (“GEOS-Chem was run” (Line 160), “Four experiments were performed” (Line 205), etc.) rather than present tense.

The figures are not the clearest. Here are a few suggestions for improvements:
Figure 2: The blue-red colorbar has similar lightness at the upper/lower ranges of the colorbar, especially for the red part, which makes it hard to see the difference between e.g. 2.0 and 1.6. Maybe use a discrete colorbar, and change to a colormap with a larger range of lightness.

Figure 3: It is sometimes hard to discern the “truth” line in the top panel. Maybe make the black line thicker.

Also Figure 3: It would be easier to read the plot if the RMSE/BIAS values were listed in the same order as the legend showing the experiments.

4. Concerns about the experimental setup
I have several concerns about the experimental setup, which could affect the interpretation of the results. My main concern is that the assumptions are too idealistic to be applicable in real data cases. In particular, the OCO-2 observations in these perfect model OSSEs will directly reflect the surface CO2 fluxes and therefore provide a strong constraint on the spatiotemporal flux variations. In the real world, XCO2 observations are generally sensitive to other error sources such as systematic errors, representativeness errors (the swath width of OCO-2 is much smaller than the model grid points here and the satellite measurements are also affected by e.g. clouds), errors in the atmospheric CO2 background, and atmospheric transport errors. As a first step, how were representativeness errors included in the synthetic observations? Did the authors assume that there was an XCO2 observation in a 4°×5° model grid cell if there were more than x% XCO2 measurements in the grid cell? If so, what was the x threshold?
To make it easier to understand the results and compare them with other established inversion systems, I suggest the authors perform another set of OSSEs where only in situ observations are assimilated.
I am also wondering if some of the results could be due to the specific algorithms in COLA rather than being indicative of the performance of general inversion systems. For example, the uncertainty of the assimilated bottom-up fluxes/flux gradients was assumed to be proportional to the analysis uncertainty. Thus, if I understand this correctly, the more confident the data assimilation system is in its analysis, the more weight it will place on the assimilated bottom-up fluxes/flux gradients compared with the CO2/XCO2 observations. Is this a reasonable assumption? It seems to me that this would bias the EXP-P experiment toward the bottom-up fluxes, which seems to be what happens if you look at Figure 4, given that the analysis uncertainty will likely be reduced a lot when assimilating direct observations of the fluxes.
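The weighting concern above can be illustrated with a minimal scalar sketch. It assumes (these are simplifications introduced here, not details from the manuscript) that the background, a CO2 observation, and a bottom-up flux "observation" all measure the same scalar directly with independent Gaussian errors, and that the bottom-up error variance is `alpha**2` times the current ensemble variance. The function name and all numbers are hypothetical.

```python
import numpy as np

def posterior_weights(var_b, var_co2, alpha):
    """Weights of background, CO2 obs and bottom-up obs in a scalar Bayesian update.

    Assumption of this sketch: the bottom-up error variance is tied to the
    ensemble (background) variance via var_bu = alpha**2 * var_b.
    """
    var_bu = alpha ** 2 * var_b
    precisions = np.array([1.0 / var_b, 1.0 / var_co2, 1.0 / var_bu])
    # In a scalar update each source is weighted by its share of total precision.
    return precisions / precisions.sum()

# Shrinking ensemble variance, fixed CO2 observation error (made-up values).
results = {}
for var_b in (4.0, 1.0, 0.25):
    results[var_b] = posterior_weights(var_b, var_co2=1.0, alpha=1.0)
    w_b, w_co2, w_bu = results[var_b]
    print(f"var_b={var_b:4.2f}  background={w_b:.2f}  CO2={w_co2:.2f}  bottom-up={w_bu:.2f}")
```

In this toy setting the ratio of the bottom-up weight to the CO2 weight is `var_co2 / (alpha**2 * var_b)`, so it grows as the ensemble variance shrinks, which is the behaviour questioned above. Whether the same holds in the full multivariate, localized system would need to be checked by the authors.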
For the additive inflation method, given that the ensemble flux deviations and the additive flux perturbations can both be positive and negative, is it possible that the ensemble spread will actually decrease when adding the perturbations?
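A small numerical sketch of why this can happen: the variance of a sum is var(dev) + var(pert) + 2 cov(dev, pert), so a sufficiently negative sample covariance reduces the spread. The ensemble size, amplitudes, and variable names below are made up for illustration and are not taken from the manuscript.

```python
import numpy as np

rng = np.random.default_rng(2)
n_members = 20  # hypothetical ensemble size

# Ensemble flux deviations from the ensemble mean at one grid cell.
dev = rng.normal(size=n_members)
dev -= dev.mean()

# Additive perturbations with their mean removed, so the ensemble mean is unchanged.
pert = 0.5 * rng.normal(size=n_members)
pert -= pert.mean()

# Extreme case: perturbations exactly anti-correlated with the deviations.
pert_anti = -0.5 * dev

spread = np.std(dev, ddof=1)
spread_after = np.std(dev + pert, ddof=1)
spread_anti = np.std(dev + pert_anti, ddof=1)

# Variance identity that governs whether the spread grows or shrinks.
cov = np.cov(dev, pert)[0, 1]
var_sum = np.var(dev, ddof=1) + np.var(pert, ddof=1) + 2.0 * cov
```

In the anti-correlated case the spread is halved rather than inflated. For randomly drawn perturbations the sign of the sample covariance varies from draw to draw, so with a small ensemble a decrease at individual grid cells cannot be ruled out.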
Another point about the inflation method: the daily fluxes vary mostly due to varying weather and advancing seasonal cycles (this is especially true if the fluxes were downscaled from e.g. monthly values). Thus, there are likely strong temporal correlations in the daily flux variations. I wonder if the way of choosing additive inflation anomalies is optimal, considering for example the subspace spanned by the ensemble members after adding the inflation perturbations. A quick check could be to perform a principal component analysis on the additive ensemble perturbations from the inflation method and see how the explained variance declines.
Related to the previous point, would this scheme not put less emphasis on fluxes in regions with smaller seasonal cycles and/or less variation in weather (in particular temperature and solar radiation)? Is this desirable? It would be informative if there was a figure (for example in an appendix) showing some examples of the additive inflation flux anomalies, or the leading modes and their explained variance from the principal component analysis.
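The suggested check could look roughly like the following sketch, which computes the explained variance of the principal components of a perturbation matrix via a singular value decomposition. The matrix here is synthetic (two shared spatial patterns plus weak noise) and only stands in for the real additive inflation anomalies; the sizes and the helper name are assumptions.

```python
import numpy as np

def explained_variance_ratio(perturbations):
    """Fraction of variance explained by each principal component.

    perturbations: (n_members, n_state) array of additive inflation anomalies.
    """
    centred = perturbations - perturbations.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(centred, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()

rng = np.random.default_rng(3)
n_members, n_state = 20, 500  # hypothetical sizes

# Synthetic anomalies dominated by two shared patterns, mimicking strongly
# correlated day-to-day flux differences, plus a small noise term.
patterns = rng.normal(size=(2, n_state))
amplitudes = rng.normal(size=(n_members, 2))
perts = amplitudes @ patterns + 0.05 * rng.normal(size=(n_members, n_state))

ratio = explained_variance_ratio(perts)
print("variance explained by the two leading modes:", ratio[:2].sum())
```

If the real anomalies behaved like this synthetic case, with nearly all variance in a few modes, the added perturbations would span only a low-dimensional subspace regardless of the nominal ensemble size.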
5. Comparison with traditional methods
The manuscript claims to compare the results from the new algorithm with a “traditional” method. Here the “traditional” method is basically Equation (2), whereby fluxes are forecast forward in time based on the analysis from the previous two time steps and the bottom-up fluxes at the forecast time. However, I would not call this the traditional method as (i) to my knowledge, this method is only used by CarbonTracker/CarbonTracker Europe and maybe one or two other systems; and (ii) the implementation here is not the same as the one in CarbonTracker or in the cited reference. For one thing, the dynamic model in CarbonTracker is applied on flux parameters, which are scaling factors on bottom-up fluxes, and not on the fluxes themselves (as in this manuscript). The idea is that the bottom-up fluxes capture the high-frequency variations in the fluxes, for example due to shifting weather, while the flux parameters correct for long-term systematic errors. It may therefore be reasonable to smooth the flux parameters in time and relax them back to the prior values. In this manuscript, however, Equation (2) would partly relax the fluxes back to the previous days’ values, which would likely degrade the forecast fluxes in most cases (considering the advancement of the seasonal cycle and movement of weather systems) compared with the persistence model.
This, together with the concerns raised in the previous points, makes it unsuitable in my opinion to refer to EXP-P as using “traditional” methods. I do agree that there should be a comparison with established CO2 inversion methods to assess the added value of the new algorithms introduced in this paper. Thus, I recommend that the authors set up a more “standard” inversion experiment and compare the results from this inversion with their results.
Specific comments
Lines 18–19: “systematically biased at different spatiotemporal scales because of the deficiencies in understanding of some underlying processes”. It could also be due to uncertain parameter values and/or initial conditions in variables related to the carbon cycle.
Lines 26–28: “We suggest that the AAPO algorithm can be applied to other greenhouse gas (e.g., CH4, NO2) and pollutant data assimilation studies.” This was not shown in the study, so I suggest removing it from the abstract.
Lines 35–36: “of the Bayesian synthesis (…) and data assimilation (DA) techniques”. I wonder why the authors separate between Bayesian synthesis and data assimilation techniques here. Most data assimilation methods in widespread use are based on Bayesian inference (Rayner et al., 2019).
References
Rayner, P. J., A. M. Michalak, and F. Chevallier (2019): Fundamentals of data assimilation applied to biogeochemistry. Atmos. Chem. Phys., 19, 13911–13932, doi:10.5194/acp-19-13911-2019.
Lines 37–38: “However, the top-down estimation could be ill-posed because of … and systematic errors of the transport model and satellite retrieval”. I would argue that the most common reason for ill-posed problems in inverse modeling is that there is not a unique solution, which is mainly caused by a lack of observational constraint and insufficient regularization. I would not consider systematic errors to be the primary cause of ill-posed problems. From a quick scan I could not find the term “ill-posed” mentioned in any of the cited references.
Lines 66–67: “The COLA DA system consists of …, and the assimilated observations”. I would not include the assimilated observations in the COLA DA system (by this logic, you should also include the meteorological driving data, the bottom-up fluxes, etc.).
Lines 76 and 91: “LETKF estimates SCF as evolving parameters” and “COLA treats SCFs as stationary parameters”. I get what the authors mean, but for readers who are not familiar with the method, this could sound contradictory.
Lines 76–77: “by augmenting it with the state vector CO2”. I would rather say that “the state vector, which contains atmospheric CO2 concentrations, was augmented to also include the SCF parameters” or something to that effect.
Line 77: “the LETKF prefers a short assimilation window”. I would argue that it depends on the application, and in some applications an ensemble Kalman Smoother with a longer assimilation window is more desirable.
Line 94: “There are two widely used a priori regularization approaches for EnKF-based carbon inversion systems.” As previously mentioned, I believe this is misrepresenting the regularization approaches commonly used in inversions.
Lines 98–99: “This approach omits useful information on the temporal dependency of SCFs”. I do not understand the authors’ point here—how does this omit information on the temporal dependence, given that the temporal variations are captured by the bottom-up fluxes?
Line 106: “we propose a new a priori regularization method to better follow the DA principle”. The DA principle, as I see it, is to optimally combine a priori information with observations. I do not see how this new regularization method better follows the data assimilation principle, considering that it does not use an informative prior for the fluxes.
Lines 127–128: “we define the SCF spatial gradient at a given grid as the SCF difference with its surrounding grids divided by the distance between them.” Is this considering only grid points orthogonally adjacent, or also diagonally adjacent (I assume the former)? The gradient of a 2D field has two components, in this example x and y, but this is not mentioned anywhere. I suspect that the authors accounted for this given that Equations (5) and (8) have an overhead arrow for the gradient variables, but the meaning of the arrow is not mentioned anywhere. Additionally, the vertical bars are not defined—I assume they are used to denote the Euclidean norm?
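To make the ambiguity concrete, here is a minimal sketch of one possible reading of that definition: centred differences using only the orthogonally adjacent cells, two components (zonal and meridional), and the Euclidean norm as the magnitude. The helper name, the regular 4°×5° grid, the spherical-Earth distances, and the periodic wrap-around in longitude are all assumptions of this sketch, not the manuscript's implementation.

```python
import numpy as np

EARTH_RADIUS_M = 6.371e6  # assumed spherical Earth

def flux_gradient(flux, lat_deg, lon_deg):
    """Return (d/dx, d/dy, magnitude) of a 2-D flux field on a regular lat-lon grid.

    flux has shape (nlat, nlon); hypothetical helper for illustration only.
    """
    lat = np.deg2rad(lat_deg)
    dlat = np.deg2rad(lat_deg[1] - lat_deg[0])
    dlon = np.deg2rad(lon_deg[1] - lon_deg[0])

    # Meridional component: centred differences inside, one-sided at the edges.
    dfdy = np.gradient(flux, axis=0) / (EARTH_RADIUS_M * dlat)

    # Zonal component: centred differences, periodic in longitude.
    dx = EARTH_RADIUS_M * np.cos(lat)[:, None] * dlon
    dfdx = (np.roll(flux, -1, axis=1) - np.roll(flux, 1, axis=1)) / (2.0 * dx)

    # Euclidean norm of the two-component gradient vector.
    return dfdx, dfdy, np.hypot(dfdx, dfdy)

lat = np.arange(-88.0, 89.0, 4.0)    # cell centres, avoiding the poles
lon = np.arange(-180.0, 180.0, 5.0)
rng = np.random.default_rng(0)
flux = rng.normal(size=(lat.size, lon.size))

gx, gy, gmag = flux_gradient(flux, lat, lon)
gx_b, gy_b, _ = flux_gradient(flux + 3.0, lat, lon)  # add a spatially uniform bias
```

Under these assumptions a spatially uniform bias cancels in the differences, which matches the stated motivation for assimilating gradients. A bias that varies in space would not cancel, and a diagonal-neighbour stencil would give different values, so the manuscript should state which definition was used.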
Lines 168–170: "A nature run is driven by the F_OA from Rödenbeck et al. (2014), F_FE from the Open-source Data Inventory of Anthropogenic CO2 emissions (ODIAC) (Oda et al., 2018), and the F_IR ..." Do these other flux components (ocean, fossil fuel, and fire) matter if they are identical in the truth and inversion experiments (considering that the atmospheric transport is perfect)?
Equations (7)–(9): Do t1 and t2 represent the actual time (as the text suggests), or are those time steps? If it is the latter, why was t2-t1 used rather than t2-t1+1? (The latter would correspond to division by the number of samples.)
Figure 3: The lower panel of Figure 3 shows that the RMSE and bias of EXP-NP are equal (1.36 GtC/yr). Normally in such a situation I would think that almost all of the RMSE can be explained by the bias, but this is not the case when looking at the lines in Figure 3. Do the RMSE and bias happen to be the same here because the denominator uses N-1 rather than N (where N is the number of samples) in the definitions of RMSE and MB (see previous point)?
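The relation behind this question can be checked numerically. With a denominator of N in both definitions, RMSE² equals MB² plus the variance of the errors, so RMSE can equal |MB| only if all errors are identical. With N−1 in both, MB is scaled by N/(N−1) but RMSE only by the square root of that factor, so the relation no longer holds. The sample size and error values below are made up, and the `ddof` argument is merely a convenient way to switch between the two denominators.

```python
import numpy as np

def mean_bias(est, truth, ddof=0):
    """Mean bias with denominator N - ddof."""
    err = np.asarray(est) - np.asarray(truth)
    return err.sum() / (err.size - ddof)

def rmse(est, truth, ddof=0):
    """Root-mean-square error with denominator N - ddof."""
    err = np.asarray(est) - np.asarray(truth)
    return np.sqrt((err ** 2).sum() / (err.size - ddof))

rng = np.random.default_rng(1)
truth = np.zeros(52)  # hypothetical number of samples
est = truth + 1.0 + rng.normal(scale=0.5, size=truth.size)

# Denominator N: RMSE^2 = MB^2 + error variance, hence RMSE >= |MB|.
mb_n, rmse_n = mean_bias(est, truth), rmse(est, truth)
err_var = np.var(est - truth)

# Denominator N-1 with a constant error of 1: MB now exceeds RMSE.
const = truth + 1.0
mb_n1, rmse_n1 = mean_bias(const, truth, ddof=1), rmse(const, truth, ddof=1)
```

So an RMSE equal to the bias despite visibly scattered errors is at least compatible with N−1 denominators, which is why the definitions in Equations (7)–(9) should be stated unambiguously.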
Technical corrections
Line 76: Remove parentheses around citation.
Line 77: “Similar to the other EnKF” -> “Similar to other EnKF variations”
Line 79: “long training period”. Do the authors mean “a long assimilation window”? “Training” is not defined in the manuscript.
Line 97: “and the subscripts b, p, and t” -> “and the superscripts b and p and subscript t”
Figures 6 and 7: Change "Kg" to "kg".
Line 232: “show an increase in RMSE of FTA concerning EXP-NP” Do the authors mean “compared with” instead of “concerning”?
As mentioned there could be other issues that I have not raised here, which can be properly assessed only after the authors have included the missing information and clarified the manuscript.
Citation: https://doi.org/10.5194/gmd-2023-15-RC1
- AC1: 'Reply on RC1', Zhiqiang Liu, 19 Jul 2023
  
  Many thanks for your constructive comments. Please see the attached file for the detailed responses.
  
  Citation: https://doi.org/10.5194/gmd-2023-15-AC1
RC2:
'Comment on gmd-2023-15', Anonymous Referee #2, 28 Apr 2023

Please see the attached PDF.

Citation: https://doi.org/10.5194/gmd-2023-15-RC2
- AC2: 'Reply on RC2', Zhiqiang Liu, 19 Jul 2023
  
  Many thanks for your constructive comments. Please see the attached file for the detailed responses.
  
  Citation: https://doi.org/10.5194/gmd-2023-15-AC2

Status: closed

RC1:
'Comment on gmd-2023-15', Anonymous Referee #1, 16 Apr 2023
General comments
This paper presents a new algorithm for ingesting information about surface CO2 fluxes from bottom-up estimations into a top-down atmospheric inversion system. The main innovation is that instead of using the bottom-up fluxes as a priori fluxes in the inversion, the spatial gradients of the fluxes are instead assimilated as observations, which the authors argue increases the signal-to-noise ratio and avoids contaminating the analysis with biases in the bottom-up fluxes.
While the idea of assimilating the spatial flux gradients rather than the fluxes themselves is interesting and worth investigating, I had a hard time assessing the manuscript on scientific quality and validity because of incomplete information, confusing use of terms, and distracting issues in the text and figures. More detailed comments are provided below. I believe addressing these issues goes beyond the scope of a major revision, and recommend the authors to substantially rework the manuscript in terms of content and presentation. Only then can a proper assessment be made about the scientific quality.
1. Incomplete information
Many important pieces of information are missing from the manuscript, which makes it difficult to follow and fully understand the experiments and results. Sometimes the reader is referred to previous papers (e.g., Liu et al. (2019, 2022) and Kang et al. (2012)), and at other times it is not clear where to find the missing information.
I am not saying that the authors need to explain everything in this manuscript— for example, I think it is perfectly fine to refer to other papers for details about the LETKF algorithm— but there are many cases where the lack of information hinders the reader’s comprehension of the present study. Often this information can be provided in one or two short sentences, so text length should not be a problem. Here are just some examples of information that I miss in the current manuscript:
Ensemble size, and if the results are sensitive to ensemble size.

What the background (both fluxes and concentrations) at t=0 is.

How the ensemble members were perturbed at t=0 (i.e., what is the assumed initial background error covariances?).

The time resolution of the assimilated observations. The text says “For satellite data, we used a 10-second averaged …”, but I assume that the system did not actually assimilate 10-second satellite observations? Were the observations averaged to daily values before being assimilated? For in situ observations, were observations from all hours used, or only e.g. measurements that were identified as being representative of large well-mixed air masses?

The manuscript should be explicit about OCO-2 observations being column-integrated CO2 (XCO2) and explain how the synthetic XCO2 observations were calculated from the truth simulation.

The text refers to Liu et al. (2022) for information about the observation errors, but it would be nice if a short description could be provided here as well.

The inflation method is not sufficiently explained. I assume the mean of the additive perturbations were subtracted to not influence the ensemble mean (similar to what was done in Liu et al. (2022))? Were the magnitudes of the perturbations also scaled to a predefined value, and if so, what is the value?

The text mentions a short (1 day) and a long (7 days) assimilation window. How were the parameters adjusted within the long assimilation window? Liu et al. (2022) states that the parameters were adjusted on a daily basis within the long assimilation window— if it is the same in this study, it should be explicitly stated. Or are the parameters assumed to be constant within the assimilation window?

What are the temporal error correlations for the parameters within the same assimilation window, and do they allow the parameters to be adjusted differently on a daily basis?

What is the temporal resolution of the flux products? From my understanding VEGAS can simulate daily fluxes, but what about CASA, are those monthly fluxes? If so, how were they downscaled to daily values?

Were the fluxes/spatial gradients of fluxes assimilated before or after the CO2 observations? Does the order matter?

This is not an exhaustive list, and I do not think it should be the reviewers’ job to list everything that should be covered. The authors should think carefully about what information is relevant to include in the manuscript.
References
Kang, J.-S., E. Kalnay, T. Miyoshi, J. Liu, and I. Fung (2012): Estimation of surface carbon fluxes with an advanced data assimilation methodology. Geophys. Res. Atmospheres, 117, D24101, doi:10.1029/2012JD018259.
Liu, Y., E. Kalnay, N. Zeng, G. Asrar, Z. Chen, and B. Jia (2019): Estimating surface carbon fluxes based on a local ensemble transform Kalman filter with a short assimilation window and a long observation window: an observing system simulation experiment test in GEOS-Chem 10.1. Geosci. Model Dev., 12, 2899–2914, doi:10.5194/gmd-12-2899-2019.
Liu, Z., N. Zeng, Y. Liu, E. Kalnay, G. Asrar, B. Wu, Q. Cai, D. Liu, and P. Han (2022): Improving the joint estimation of CO2 and surface carbon fluxes using a constrained ensemble Kalman filter in COLA (v1.0). Geosci Model Dev, 18, doi:10.5194/gmd-15-5511-2022.
2. Confusing use of terms
There are several terms that are used in a way that is confusing or add little to no value. Again, here are just some examples:
“Unique” is used throughout the manuscript, for example, “as a unique observation” (Line 21), “a unique feature of a short assimilation window” (Line 80), and “Another unique feature” (Line 82). I do not see the point of stating that these features are unique. For assimilating a priori information as observations, this has been done before (a recent example is Kaminski et al., 2022). For the first case, maybe the authors meant “as a special observation” or something similar? Similar to the first point, “comprehensive” is often used (Lines 22, 59, 111, 292), but it is not clear in what sense the described objects are comprehensive.

I found it confusing that fluxes from the CASA model are described as a priori fluxes even though they were not used as a prior in the inversions.

Maybe refer to them as “bottom-up fluxes” instead?

Many acronyms are confusing. For example, in Summary and Discussion, AAPO is redefined on Line 290, but it is not clear in this context how the acronym was formed.

Many equations introduce new acronyms, for example TSG and PSG in Equation (5), and ASG in Equation (8). How about using a different variable (e.g., ∇f) or a subscript (e.g., “SG”) for the spatial gradient of the fluxes, and then superscripts A, T, and P for analysis, truth, and prior, respectively? This would reduce the number of acronyms the reader has to keep track of.

Equation (6) introduces the “S” superscript, which I assume stands for “SCF”, but it is not defined or used anywhere else in the manuscript, which is confusing. I would suggest dropping the superscript, or have “SG” and “f” as subscripts in Equations (5) and (6) instead. Generally, try to be consistent throughout the manuscript.

Similarly, what do “NP”, “P”, “ASG”, and “AP” in the experiment names stand for? (I have my guesses, but it should not be the reader’s job to figure this out.)

Line 138 says that “COLA is a flow-dependent ensemble-based DA”. The term “flow-dependent” makes sense in the context of e.g. atmospheric data assimilation, where the errors depend on (and often follow) the atmospheric flow, but it is not clear what exactly it means here. Given that there is no dynamic model for the parameters, how is the system flow-dependent?

The text uses “significant/significantly” in many places. I would recommend reserving these words for describing statistical significance when comparing results. If the authors performed statistical significance testing, they should state what tests were used and the significance level.

Lines 289–290 state that “the spatial gradient of a bottom-up model estimation is dynamically assimilated”. What is “dynamically” referring to here?

References
Kaminski, T., M. Scholze, P. Rayner, M. Voßbeck, M. Buchwitz, M. Reuter, W. Knorr, H. Chen, A. Agusti-Panareda, A. Löscher, and Y. Meijer (2022): Assimilation of atmospheric CO2 observations from space can support national CO2 emission inventories. Environmental Research Letters, 17, 014015, doi:10.1088/1748-9326/ac3cea.
3. Language and figures
The language could be improved in several instances, for example:
Articles are sometimes missing or incorrectly added, or the noun should be plural instead of singular or vice versa. Here are just a few examples: “and a priori flux from …” -> “and a priori fluxes from” (Line 15), “weigh a priori flux to the background” -> “weigh a priori fluxes to the background” (Line 16), “because of the deficiencies in understanding” -> “because of deficiencies in understanding” (Line 19), “LETKF estimates SCF as …” -> “LETKF estimates SCFs as …” (Line 76).

The text often uses subjective words such as “good” and “better”. I would suggest using more objective and descriptive words. For example, instead of saying “produces much better estimation …”, I would say something like “produces more accurate flux estimates in terms of x and y”.

When describing what was done, I suggest using past tense (“GEOS-Chem was run” (Line 160), “Four experiments were performed” (Line 205), etc.) rather than present tense.

The figures are not the clearest. Here are a few suggestions for improvements:
Figure 2: The blue-red colorbar has similar lightness at the upper/lower ranges of the colorbar, especially for the red part, which makes it hard to see the difference between e.g. 2.0 and 1.6. Maybe use a discrete colorbar, and change to a colormap with a larger range of lightness.

Figure 3: It is sometimes hard to discern the “truth” line in the top panel. Maybe make the black line thicker.

Also Figure 3: It would be easier to read the plot if the RMSE/BIAS values were listed in the same order as the legend showing the experiments.

4. Concerns about the experimental setup
I have several concerns about the experimental setup, which could affect the interpretation of the results. My main concern is that the assumptions are too idealistic to be applicable in real data cases. In particular, the OCO-2 observations in these perfect model OSSEs will directly reflect the surface CO2 fluxes and therefore provide a strong constraint on the spatiotemporal flux variations. In the real world, XCO2 observations are generally sensitive to other error sources such as systematic errors, representativeness errors (the swath width of OCO-2 is much smaller than the model grid points here and the satellite measurements are also affected by e.g. clouds), errors in the atmospheric CO2 background, and atmospheric transport errors. As a first step, how were representativeness errors included in the synthetic observations? Did the authors assume that there was an XCO2 observation in a 4°×5° model grid cell if there were more than x% XCO2 measurements in the grid cell? If so, what was the x threshold?
To make it easier to understand the results and compare them with other established inversion systems, I suggest the authors perform another set of OSSEs where only in situ observations are assimilated.
I am also wondering if some of the results could be due to the specific algorithms in COLA rather than being indicative of the performance of general inversion systems. For example, the uncertainty of the assimilated bottom-up fluxes/flux gradients was assumed to be proportional to the analysis uncertainty. Thus, if I understand this correctly, the more confident the data assimilation system is in its analysis, the more weight it will place on the assimilated bottom-up fluxes/flux gradients compared with the CO2/XCO2 observations. Is this a reasonable assumption? It seems to me that this would bias the EXP-P experiment toward the bottom-up fluxes, which seems to be what happens if you look at Figure 4, given that the analysis uncertainty will likely be reduced a lot when assimilating direct observations of the fluxes.
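The weighting concern in the comment above can be illustrated with a scalar toy problem. This is a minimal sketch under strong assumptions that are not taken from the manuscript: a single flux value, an XCO2-like observation that observes the flux directly with a fixed error variance `r_obs`, and a bottom-up pseudo-observation whose error variance is `c` times the current flux variance `p_var`. The function name and all numbers are hypothetical.

```python
def relative_weights(p_var, r_obs, c):
    """Inverse-variance weights of a scalar update with two direct observations.

    p_var : variance of the current flux estimate
    r_obs : fixed error variance of the XCO2-like observation (assumed)
    c     : proportionality factor, bottom-up error variance = c * p_var (assumed)
    """
    r_bottom_up = c * p_var
    precisions = {
        "background": 1.0 / p_var,
        "xco2": 1.0 / r_obs,
        "bottom_up": 1.0 / r_bottom_up,
    }
    total = sum(precisions.values())
    return {name: value / total for name, value in precisions.items()}


# As the flux variance shrinks, the bottom-up pseudo-observation gains weight
# relative to the fixed-error observation.
for p_var in (4.0, 1.0, 0.25):
    w = relative_weights(p_var, r_obs=1.0, c=1.0)
    print(p_var, round(w["bottom_up"] / w["xco2"], 3))
```

In this toy setting the ratio of the two observation weights is `r_obs / (c * p_var)`, so it grows without bound as the system becomes more confident.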
For the additive inflation method, given that the ensemble flux deviations and the additive flux perturbations can both be positive and negative, is it possible that the ensemble spread will actually decrease when adding the perturbations?
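The question above can be checked with a small numerical sketch. It assumes a one-dimensional ensemble and hand-picked perturbations, neither of which comes from the manuscript or from the COLA code, and only shows that the spread after adding perturbations depends on how they correlate with the existing ensemble deviations.

```python
import numpy as np


def ensemble_spread(members):
    """Sample standard deviation of the ensemble members."""
    return float(np.std(members, ddof=1))


def spread_after_additive_inflation(members, perturbations):
    """Spread after adding mean-removed perturbations to the members."""
    centered = perturbations - perturbations.mean()  # leave the ensemble mean unchanged
    return ensemble_spread(members + centered)


members = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])          # hypothetical flux ensemble
anti_correlated = np.array([1.0, 0.5, 0.0, -0.5, -1.0])  # opposes the deviations
aligned = -anti_correlated                               # reinforces the deviations

print("original spread       :", ensemble_spread(members))
print("anti-correlated added :", spread_after_additive_inflation(members, anti_correlated))
print("aligned added         :", spread_after_additive_inflation(members, aligned))
```

Since var(d + e) = var(d) + var(e) + 2 cov(d, e), the spread shrinks whenever the covariance term is negative enough, which can happen by chance in a small ensemble.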
Another point about the inflation method: the daily fluxes vary mostly due to varying weather and advancing seasonal cycles (this is especially true if the fluxes were downscaled from e.g. monthly values). Thus, there are likely strong temporal correlations in the daily flux variations. I wonder if the way of choosing additive inflation anomalies is optimal, considering, for example, the subspace spanned by the ensemble members after adding the inflation perturbations. A quick check could be to perform a principal component analysis on the additive ensemble perturbations from the inflation method and see how the explained variance declines.
Related to the previous point, would this scheme not put less emphasis on fluxes in regions with smaller seasonal cycles and/or less variations in weather (in particular temperature and solar radiation)? Is this desirable? It would be informative if there was a figure (for example in an appendix) showing some examples of the additive inflation flux anomalies, or the leading modes and their explained variance from the principal component analysis.
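The principal component check suggested above could be sketched as follows. The perturbations here are synthetic (one dominant spatial pattern plus weak noise) and stand in for the actual additive inflation anomalies, which are not available here; the array shape and the function name are assumptions.

```python
import numpy as np


def explained_variance_ratio(perturbations):
    """Fraction of variance explained by each principal component.

    perturbations : array of shape (n_members, n_gridcells)
    """
    anomalies = perturbations - perturbations.mean(axis=0, keepdims=True)
    # Singular values come back in descending order; squared values are
    # proportional to the variance carried by each mode.
    singular_values = np.linalg.svd(anomalies, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


rng = np.random.default_rng(0)
n_members, n_cells = 20, 200
pattern = np.sin(np.linspace(0.0, np.pi, n_cells))  # one seasonal-like spatial pattern
amplitudes = rng.normal(size=(n_members, 1))
synthetic = amplitudes * pattern + 0.05 * rng.normal(size=(n_members, n_cells))

ratio = explained_variance_ratio(synthetic)
print("leading modes:", np.round(ratio[:5], 4))
```

If the real inflation anomalies behave like this synthetic case, with most of the variance in one or two modes, the perturbations would add few new directions to the ensemble subspace.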
5. Comparison with traditional methods
The manuscript claims to compare the results from the new algorithm with a “traditional” method. Here the “traditional” method is basically Equation (2), whereby fluxes are forecast forward in time based on the analysis from the previous two time steps and the bottom-up fluxes at the forecast time. However, I would not call this the traditional method as (i) to my knowledge, this method is only used by CarbonTracker/CarbonTracker Europe and maybe one or two other systems; and (ii) the implementation here is not the same as the one in CarbonTracker or in the cited reference. For one thing, the dynamic model in CarbonTracker is applied on flux parameters, which are scaling factors on bottom-up fluxes, and not on the fluxes themselves (as in this manuscript). The idea is that the bottom-up fluxes capture the high-frequency variations in the fluxes, for example due to shifting weather, while the flux parameters correct for long-term systematic errors. It may therefore be reasonable to smooth the flux parameters in time and relax them back to the prior values. In this manuscript, however, Equation (2) would partly relax the fluxes back to the previous days’ values, which would likely degrade the forecast fluxes in most cases (considering the advancement of the seasonal cycle and movement of weather systems) compared with the persistence model.
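The distinction drawn above can be written out as a sketch. The equal weights of one third follow the CarbonTracker-style formulation as I recall it, and the symbols (λ for scaling factors, f for fluxes, superscripts a, b, and p for analysis, background, and prior) are assumptions; Equation (2) of the manuscript may use different weights or notation.

```latex
% CarbonTracker-style dynamic model: applied to scaling factors on bottom-up fluxes
\lambda_{t}^{b} = \frac{\lambda_{t-2}^{a} + \lambda_{t-1}^{a} + \lambda^{p}}{3},
\qquad
f_{t}^{b} = \lambda_{t}^{b} \, f_{t}^{p}

% Form discussed in this comment: applied to the fluxes themselves
f_{t}^{b} = \frac{f_{t-2}^{a} + f_{t-1}^{a} + f_{t}^{p}}{3}
```

In the first form the day-to-day variability enters through the bottom-up flux at time t, while in the second form two thirds of the forecast is tied to the fluxes of the previous two steps.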
This, together with the concerns raised in the previous points, makes it unsuitable in my opinion to refer to EXP-P as using “traditional” methods. I do agree that there should be a comparison with established CO2 inversion methods to assess the added value of the new algorithms introduced in this paper. Thus, I recommend that the authors set up a more “standard” inversion experiment and compare the results from this inversion with their results.
Specific comments
Lines 18–19: “systematically biased at different spatiotemporal scales because of the deficiencies in understanding of some underlying processes”. It could also be due to uncertain parameter values and/or initial conditions in variables related to the carbon cycle.
Lines 26–28: “We suggest that the AAPO algorithm can be applied to other greenhouse gas (e.g., CH4, NO2) and pollutant data assimilation studies.” This was not shown in the study, so I suggest removing it from the abstract.
Lines 35–36: “of the Bayesian synthesis (…) and data assimilation (DA) techniques”. I wonder why the authors separate between Bayesian synthesis and data assimilation techniques here. Most data assimilation methods in widespread use are based on Bayesian inference (Rayner et al., 2019).
References
Rayner, P. J., A. M. Michalak, and F. Chevallier (2019): Fundamentals of data assimilation applied to biogeochemistry. Atmos. Chem. Phys., 19, 13911–13932, doi:10.5194/acp-19-13911-2019.
Lines 37–38: “However, the top-down estimation could be ill-posed because of … and systematic errors of the transport model and satellite retrieval”. I would argue that the most common reason for ill-posed problems in inverse modeling is that there is not a unique solution, which is mainly caused by a lack of observational constraint and insufficient regularization. I would not consider systematic errors to be the primary cause of ill-posed problems. From a quick scan I could not find the term “ill-posed” mentioned in any of the cited references.
Lines 66–67: “The COLA DA system consists of …, and the assimilated observations”. I would not include the assimilated observations in the COLA DA system (by this logic, you should also include the meteorological driving data, the bottom-up fluxes, etc.).
Lines 76 and 91: “LETKF estimates SCF as evolving parameters” and “COLA treats SCFs as stationary parameters”. I get what the authors mean, but for readers who are not familiar with the method, this could sound contradictory.
Lines 76–77: “by augmenting it with the state vector CO2”. I would rather say that “the state vector, which contains atmospheric CO2 concentrations, was augmented to also include the SCF parameters” or something to that effect.
Line 77: “the LETKF prefers a short assimilation window”. I would argue that it depends on the application, and in some applications an ensemble Kalman Smoother with a longer assimilation window is more desirable.
Line 94: “There are two widely used a priori regularization approaches for EnKF-based carbon inversion systems.” As previously mentioned, I believe this is misrepresenting the regularization approaches commonly used in inversions.
Lines 98–99: “This approach omits useful information on the temporal dependency of SCFs”. I do not understand the authors’ point here—how does this omit information on the temporal dependence, given that the temporal variations are captured by the bottom-up fluxes?
Line 106: “we propose a new a priori regularization method to better follow the DA principle”. The DA principle, as I see it, is to optimally combine a priori information with observations. I do not see how this new regularization method better follows the data assimilation principle, considering that it does not use an informative prior for the fluxes.
Lines 127–128: “we define the SCF spatial gradient at a given grid as the SCF difference with its surrounding grids divided by the distance between them.” Is this considering only grid points orthogonally adjacent, or also diagonally adjacent (I assume the former)? The gradient of a 2D field has two components, in this example x and y, but this is not mentioned anywhere. I suspect that the authors accounted for this given that Equations (5) and (8) have an overhead arrow for the gradient variables, but the meaning of the arrow is not mentioned anywhere. Additionally, the vertical bars are not defined—I assume they are used to denote the Euclidean norm?
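To make the question above concrete, here is a minimal sketch of a two-component gradient and its Euclidean norm using central differences with orthogonally adjacent grid points only. A uniform grid spacing is assumed for simplicity (on a latitude-longitude grid the zonal spacing varies with latitude), and the function name is hypothetical; this is not the manuscript's definition.

```python
import numpy as np


def flux_gradient(flux, dy, dx):
    """Return the x and y gradient components and their Euclidean norm.

    flux : 2D array indexed as [y, x]
    dy, dx : uniform grid spacings along the two axes (assumed constant)
    """
    dfdy, dfdx = np.gradient(flux, dy, dx)  # one component per array axis
    magnitude = np.hypot(dfdx, dfdy)        # Euclidean norm of the gradient vector
    return dfdx, dfdy, magnitude


# Synthetic planar field f = 2x + 3y, whose gradient is (2, 3) everywhere.
y, x = np.mgrid[0:4, 0:5].astype(float)
flux = 2.0 * x + 3.0 * y
dfdx, dfdy, magnitude = flux_gradient(flux, dy=1.0, dx=1.0)
print(dfdx[0, 0], dfdy[0, 0], magnitude[0, 0])
```

Stating explicitly which neighbours are used and whether the norm or the individual components are assimilated would remove the ambiguity.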
Lines 168–170: "A nature run is driven by the F_OA from Rödenbeck et al. (2014), F_FE from the Open-source Data Inventory of Anthropogenic CO2 emissions (ODIAC) (Oda et al., 2018), and the F_IR ..." Do these other flux components (ocean, fossil fuel, and fire) matter if they are identical in the truth and inversion experiments (considering that the atmospheric transport is perfect)?
Equations (7)–(9): Do t1 and t2 represent the actual time (as the text suggests), or are those time steps? If it is the latter, why was t2-t1 used rather than t2-t1+1? (The latter would correspond to division by the number of samples.)
Figure 3: The lower panel of Figure 3 shows that the RMSE and bias of EXP-NP are equal (1.36 GtC/yr). Normally in such a situation I would think that almost all of the RMSE can be explained by the bias, but this is not the case when looking at the lines in Figure 3. Do the RMSE and bias happen to be the same here because the denominator uses N-1 rather than N (where N is the number of samples) in the definitions of RMSE and MB (see previous point)?
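The relationship behind the comment above can be sketched numerically. With the same denominator N in both definitions, RMSE² equals bias² plus the variance of the errors, so RMSE equals the absolute bias only when the error is constant in time. The series below are synthetic; 1.36 is used only as an illustrative offset.

```python
import numpy as np


def bias_rmse_spread(estimate, truth):
    """Mean bias, RMSE, and error standard deviation, all with denominator N."""
    error = np.asarray(estimate, dtype=float) - np.asarray(truth, dtype=float)
    bias = float(error.mean())
    rmse = float(np.sqrt(np.mean(error ** 2)))
    spread = float(error.std())  # population standard deviation (ddof=0)
    return bias, rmse, spread


truth = np.zeros(12)
constant_offset = truth + 1.36                           # error identical at every step
varying = truth + 1.36 + np.tile([0.5, -0.5], 6)         # same bias, fluctuating error

for name, series in (("constant", constant_offset), ("varying", varying)):
    bias, rmse, spread = bias_rmse_spread(series, truth)
    print(name, round(bias, 3), round(rmse, 3), round(spread, 3))
```

If the analysis visibly fluctuates around the truth, as the lines in Figure 3 suggest, an RMSE equal to the bias would be unexpected under these definitions.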
3. Technical corrections
Line 76: Remove parentheses around citation.
Line 77: “Similar to the other EnKF” -> “Similar to other EnKF variations”
Line 79: “long training period”. Do the authors mean “a long assimilation window”? “Training” is not defined in the manuscript.
Line 97: “and the subscripts b, p, and t” -> “and the superscripts b and p and subscript t”
Figures 6 and 7: Change "Kg" to "kg".
Line 232: “show an increase in RMSE of FTA concerning EXP-NP” Do the authors mean “compared with” instead of “concerning”?
As mentioned, there could be other issues that I have not raised here, which can be properly assessed only after the authors have included the missing information and clarified the manuscript.
Citation: https://doi.org/10.5194/gmd-2023-15-RC1
- AC1: 'Reply on RC1', Zhiqiang Liu, 19 Jul 2023
  
  Many thanks for your constructive comments. Please see the attached file for the detailed responses.
  
  Citation: https://doi.org/10.5194/gmd-2023-15-AC1
RC2: 'Comment on gmd-2023-15', Anonymous Referee #2, 28 Apr 2023

Please see the attached PDF.

Citation: https://doi.org/10.5194/gmd-2023-15-RC2
- AC2: 'Reply on RC2', Zhiqiang Liu, 19 Jul 2023
  
  Many thanks for your constructive comments. Please see the attached file for the detailed responses.
  
  Citation: https://doi.org/10.5194/gmd-2023-15-AC2


Model code and software

Assimilating a priori as special observation (AAPO) algorithm. Zhiqiang Liu, Ning Zeng, and Yun Liu. https://doi.org/10.5281/zenodo.7592827


Short summary

We introduced a novel algorithm that assimilates better a priori knowledge to improve the estimation of global surface carbon flux. The algorithm aims to separate the first-order systematic biases in the a priori "bottom-up" flux estimations from the inversion framework, from a comprehensive data assimilation perspective.

