Review: Zheng et al., WRF-CO2 4DVar
The paper presents a newly developed WRF-CO2 4DVar assimilation system. The authors have documented the full system, which is based on known optimization methods, and have provided a reasonable demonstration of it. The adjoint model has been carefully developed and implemented for tracer applications, and therefore provides an important step toward a full 4DVar system for regional flux inversions. The main concerns and technical comments listed hereafter need to be addressed before final publication.
- Missing definition of error covariances (both observations and prior flux errors): Here the system has been described and tested without any details on the construction of error covariances. The inverse framework needs to address that problem. The authors should include a paragraph about it.
In addition, one of the tests used an inconsistent observation error covariance matrix R: the errors are fully correlated (a bias of 50% everywhere), whereas the R matrix contains diagonal terms only (no correlation). The test should reconcile or address this problem.
- Boundary conditions: The inversion system needs to address this problem, which is not trivial and could significantly increase the cost of the 4DVar. In addition, the optimization becomes more complicated because the two unknowns (boundary inflow and surface fluxes) could both be adjusted to correct the concentrations. The authors need to address that problem and, if they do not solve for the boundary inflow, should describe a path toward implementing this part of the code. At this point, the WRF-CO2 4DVar is not usable for real-data inversions because of this gap. This is a major weakness in the current study.
- Adjoint evaluation: Instead of a qualitative comparison of adjoint sensitivity and back-trajectories, the authors can compute the actual HYSPLIT footprints, and combine them with prior fluxes to compare to the adjoint sensitivity results. This analysis is fast considering the short simulation period and that the tools to compute tower footprints are publicly available. This evaluation would reinforce the confidence in the adjoint model.
P1-L16-21: This paragraph needs to include more references. Only two papers are cited, somewhat arbitrarily. Please add the references corresponding to the different statements. Chevallier et al., 2005 is a good reference, but some of the other major papers should also be cited here.
P1-L22: An ensemble approach would also need a CTM. Correct the statement.
P1-L23: What about the boundary conditions? You are developing a regional system within a bounded domain. The state vector also includes prior information related to the boundary inflow. Revise the sentence.
P2-L1: Same comment related to boundary conditions. Some bibliography on this problem is needed here. Background conditions are critical to limited-domain inversions.
P2-L2-3: Why do you bring up the dimensions of the Jacobian in the introduction without discussing them further in this paragraph?
P2-L5-6: This is only true for single-tracer simulations in Eulerian mode. One can run multiple tracers in one simulation, or run millions of particles at once in a Lagrangian framework. Please revise this statement.
P2-L6-11: The idea here is to explain the advantage of using a variational approach. While the arguments are correct, the justification lacks clarity and a more rigorous description of the variational approach (minimization using the gradients, adjoint model, ...). There are also some disadvantages to the variational approach that should be stated here (no explicit posterior uncertainty, problems of convergence, ...). This is an important paragraph that highlights the value of your work. You should revise this text to explain the major technical features of variational methods, and link it with the previous examples of 3D/4D-Var studies in the following paragraphs.
P2-L19: CarbonTracker is using a lagged Ensemble Kalman Smoother, not a variational approach. Revise the statement.
P2-L13-26: This paragraph focuses entirely on 4D-Var systems. What about other variational approaches?
P2-L29: More examples are needed here. Gerbig et al. (2009) is not an inversion but an overview of regional inverse modeling strategies. Include previous regional inversion studies with their achievements.
P2-L31: GEOS-Chem is a global system that is not optimal for regional applications. Multiple examples of regional inversions have been published over the years.
P2-L34: Include references for LPDM-based inversion studies.
P2-L35: "assimilated meteorology". Do you mean "meteorological analyses"?
P3-L1-10: These are the studies you need to describe up front. This paragraph should be merged with the previous one.
P3-L12: This statement should come before describing inversions and methods.
P3-L13: The differences among CMS products are one illustration, but other examples would be better suited here. Consider inter-comparison studies in particular (e.g. Peylin et al. (2013) for CO2).
P3-L14-15: "high priority". Confusing. What do you mean here? Please rephrase.
P3-L29: The problem of convective transport for tracers is not trivial, and most of the convective schemes in WRF do not even produce mass fluxes explicitly. Some of them use a simple parameterization with 1D variables, while other schemes only provide mass fluxes at the cloud base/top, which is insufficient for convective mass transport. Few of the schemes have a full 3D mass flux, and mass conservation is not even guaranteed. Please explain in more detail which schemes would allow for convective transport of tracers with "limited new code development".
P3-L34: Justify this statement with a quantity or a reference.
P4-L1: Typo. Subscript for "2" missing in "CO2".
P4-L1-4: Provide a simple quantity instead of several sentences. For example, calculate the impact on the solar energy when considering higher CO2 concentrations at the local scale. Or cite a study describing it.
P4-L10: Typo. "estimated"
P4-L6-12: This is even more relevant for turbulence with eddy turn-over times in minutes.
P4-L22: Replace "observational data" by "atmospheric observations" or "observations"
P4-L22 (also L23): Add "the" to "flux", or replace "flux" by "fluxes".
P4-L25: Replace "emission" by "fluxes". "emissions" refer to positive fluxes only.
P5-Eq 2-4: Describe the variables used in the equation (x, x0, y, H, and M) and refer to Table 1 for the full list of symbols.
P5-L10: Why "essentially"? Is it something else?
P5-L11: Replace "emission" by "flux"
P5-L12: Is this a lagged approach? Or independent flux estimates? Refer to later sections to describe the approach.
P7-L17: Replace "CO2 related processes" by "physical and dynamical processes involved in the atmospheric transport of CO2"
P7-L20: Typo. "to keep"
P7-L22: To be clear for the readers who are not familiar with WRF, the chemistry module was added, which is more accurate than stating that WRF-Chem "replaced" WRF. WRF is still used in WRF-Chem.
P7-L24-25: The choice of tracer has no impact on the code. One can substitute CO or CH4 for CO2. Assuming there is no chemistry involved, the transport of an inert and mass-free tracer does not depend on the selected gas. Revise the sentence.
P7-L27: It means that you removed the GHG option and used a passive tracer. It would be clear to the future users if you state that. WRF-CO2 assumes that VPRM is used for the biogenic component. Otherwise it becomes equivalent to the original passive tracer mode. Clarify.
P8-L25: Typo. "was"
Section 2.4.2: This section is clear and helpful for future code development. One comment here: Additional thoughts on joint optimization of meteo and CO2 data would be useful. How could a combined assimilation be implemented using your 4DVar system?
P9-L29: The argument for selecting the PBL scheme is understandable, but lacks some scientific background. Is ACM2 a good option for CO2 turbulent mixing in the PBL? Several schemes have been tested and have known systematic errors. The selection process should be based on model performance rather than technical reasons. You should at least comment on the performance of the ACM2 PBL scheme. Add some metrics or published studies to assess the ACM2 scheme.
P9-L31: To clarify this sentence, you need to explain that this scheme is parameterized based on precipitation rate. It is not the actual mass flux from the Grell scheme but rather a crude representation of the vertical convective transport. Clarify in the text.
P10-L27: Biospheric fluxes vary diurnally from negative to positive values depending on the time of day. For this reason, inversion problems need to address the two components separately, either by time of day (night versus day) or by component (respiration versus photosynthesis). A daily mean flux is not meaningful once transported into concentration space, because the timing of the atmospheric mixing is coupled to the timing of the fluxes; the two cannot simply be averaged over an entire day. The 4DVar could be used for 3-hourly fluxes, which would be more accurate and avoid biases due to day/night components. This problem has been discussed in several papers (e.g. Gourdji et al., 2012 - https://www.biogeosciences.net/9/457/2012/bg-9-457-2012.html). Whereas this problem will not be critical in a pseudo-data study, it will be critical in real-data inversions.
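To make the aggregation issue concrete, here is a toy calculation. It is only a sketch with invented numbers, not values from the manuscript: the hourly flux profile and the tower sensitivities are hypothetical, chosen to mimic a shallow nocturnal boundary layer (high sensitivity) and a deep daytime one (low sensitivity).

```python
import numpy as np

hours = np.arange(24)

# Hypothetical biospheric flux (umol m-2 s-1): constant respiration plus
# daytime photosynthetic uptake between 06:00 and 18:00.
respiration = 2.0
uptake = np.where((hours >= 6) & (hours < 18),
                  -10.0 * np.sin(np.pi * (hours - 6) / 12.0), 0.0)
flux = respiration + uptake

# Hypothetical tower sensitivity to surface fluxes (ppm per umol m-2 s-1):
# large at night (shallow mixing), small during the day (deep mixing).
sensitivity = np.where((hours >= 6) & (hours < 18), 0.02, 0.10)

# Enhancement when flux and mixing are both resolved hourly.
signal_hourly = float(np.sum(sensitivity * flux))

# Enhancement when the flux is replaced by its daily mean.
signal_daily_mean = float(np.sum(sensitivity * flux.mean()))

aggregation_error = signal_daily_mean - signal_hourly
print(signal_hourly, signal_daily_mean, aggregation_error)
```

With these assumed values the hourly calculation gives a positive enhancement (nighttime respiration dominates at the tower), whereas the daily-mean flux gives a negative one, so even the sign of the modeled signal changes.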
P10-L31 to P11-L4: The boundary conditions defined by a global model are highly uncertain and cannot be ignored in the optimization process. Several studies have discussed that problem (Trusilova et al., 2010; Schuh et al., 2010; Goeckede et al., 2011; Lauvaux et al., 2012). See the general comment. If the problem is not solved in this paper, it should be highlighted as a major shortcoming of this study.
Figure 5: Conventional contour lines to illustrate pressure systems and frontal structures would be much easier for readers to interpret.
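One possible path, sketched here under strong simplifying assumptions, is to augment the state vector with a few boundary-inflow parameters and solve for them jointly with the fluxes. The example below is a toy linear Gaussian problem, not the WRF-CO2 system: the Jacobians `H_flux` and `H_bc`, the dimensions, and the variances are all hypothetical, and the pseudo-observations are noise-free.

```python
import numpy as np

rng = np.random.default_rng(2)
n_obs, n_flux, n_bc = 60, 8, 2

# Hypothetical Jacobians: sensitivity of the observations to surface
# fluxes and to boundary-inflow parameters.
H_flux = rng.random((n_obs, n_flux))
H_bc = rng.random((n_obs, n_bc))

x_flux_true = rng.standard_normal(n_flux)
x_bc_true = np.array([1.0, 0.5])   # true boundary-inflow offsets (invented)

# Noise-free pseudo-observations containing both contributions.
y = H_flux @ x_flux_true + H_bc @ x_bc_true

def posterior_mean(H, y, x_prior, prior_var, obs_var):
    """Posterior mean for a linear Gaussian problem with diagonal,
    homogeneous prior and observation error covariances."""
    n = H.shape[1]
    A = H.T @ H / obs_var + np.eye(n) / prior_var
    b = H.T @ y / obs_var + x_prior / prior_var
    return np.linalg.solve(A, b)

# Inversion that ignores the boundary inflow (fluxes only).
x_flux_only = posterior_mean(H_flux, y, np.zeros(n_flux), 1e2, 1e-4)

# Inversion with the augmented state vector [fluxes, boundary parameters].
H_joint = np.hstack([H_flux, H_bc])
x_joint = posterior_mean(H_joint, y, np.zeros(n_flux + n_bc), 1e2, 1e-4)

err_flux_only = np.linalg.norm(x_flux_only - x_flux_true)
err_joint = np.linalg.norm(x_joint[:n_flux] - x_flux_true)
print(err_flux_only, err_joint)
```

In this idealized setting the flux-only inversion aliases the boundary signal into the fluxes, while the joint inversion recovers them. A real system would be less favorable, since boundary and flux signals can be far more collinear than in this random example.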
P11-L22: Not only variances but covariances as well.
P11-L29 to P12-L7: This evaluation is limited to wind speed and direction but is not directly representative of CO2 concentration errors. Clarify here how these errors could be used to inform about transport model error variances and covariances.
From Figure 6, the wind speed is biased high, which is consistent with other studies. How would this problem be considered in the 4DVar framework?
Section 3.2: The evaluation is convincing and seems to confirm that the adjoint has been correctly implemented, but please explain why the differences are considered "accurate". Cite other references for the threshold. For example, why is "10^-10" acceptable?
Section 3.3: This section remains qualitative and not highly informative. The adjoint sensitivity seems to agree with the overall shape of the footprint. This comparison would be more convincing if the footprints were computed using HYSPLIT and combined with the prior fluxes. A short simulation (24 hours in your case) is really inexpensive for a Lagrangian model and would provide an independent evaluation of your adjoint transport.
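As a point of reference for the threshold question, a commonly used check is the dot-product test, which compares <M dx, dy> with <dx, M^T dy> and expects agreement near rounding error (machine epsilon is about 2.2e-16 in double precision, with some growth from accumulated operations). The sketch below uses a random matrix as a stand-in for the tangent linear model; it is not the WRF-CO2 operator, and the dimensions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in tangent linear model: a dense random matrix M (hypothetical).
# Its exact adjoint is the transpose M.T.
n_state, n_obs = 200, 50
M = rng.standard_normal((n_obs, n_state))

def tangent_linear(dx):
    return M @ dx

def adjoint(dy):
    return M.T @ dy

dx = rng.standard_normal(n_state)
dy = rng.standard_normal(n_obs)

lhs = np.dot(tangent_linear(dx), dy)   # <M dx, dy>
rhs = np.dot(dx, adjoint(dy))          # <dx, M^T dy>

# Normalize by the vector norms so the residual does not blow up when the
# inner product itself happens to be close to zero.
scale = np.linalg.norm(tangent_linear(dx)) * np.linalg.norm(dy)
relative_residual = abs(lhs - rhs) / scale
machine_eps = np.finfo(np.float64).eps
print(relative_residual, machine_eps)
```

Reporting the residual relative to machine precision, as in the last two lines, would let readers judge whether a given tolerance reflects rounding alone or a small inconsistency in the adjoint code.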
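The suggested comparison amounts to multiplying each sensitivity field by the prior fluxes and summing over space and time. A minimal sketch follows, assuming both fields have been regridded to the same (time, lat, lon) grid. All arrays here are synthetic placeholders; no HYSPLIT or WRF-CO2 output is read, and the units are only indicative.

```python
import numpy as np

rng = np.random.default_rng(1)
n_time, n_lat, n_lon = 24, 30, 40

# Hypothetical prior fluxes (umol m-2 s-1) on the model grid.
prior_flux = rng.normal(0.0, 2.0, size=(n_time, n_lat, n_lon))

# Synthetic stand-in for a Lagrangian footprint (ppm per umol m-2 s-1).
footprint = rng.random((n_time, n_lat, n_lon)) * 1e-3

# Synthetic stand-in for the adjoint sensitivity on the same grid: the
# footprint with a 5% multiplicative perturbation, mimicking model differences.
adjoint_sensitivity = footprint * (
    1.0 + 0.05 * rng.standard_normal(footprint.shape))

def enhancement(sensitivity, flux):
    """Concentration enhancement at the receptor: sum of sensitivity x flux."""
    return float(np.sum(sensitivity * flux))

def pattern_correlation(a, b):
    """Pearson correlation between two sensitivity fields."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])

dc_lpdm = enhancement(footprint, prior_flux)
dc_adjoint = enhancement(adjoint_sensitivity, prior_flux)
corr = pattern_correlation(footprint, adjoint_sensitivity)
print(dc_lpdm, dc_adjoint, corr)
```

Reporting the two enhancements and the pattern correlation for each tower would turn the visual comparison into a quantitative one.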
P15-L1: The definition of R is inconsistent for case 1. If all pixels are multiplied by 1.5, the error correlations are equal to 1 over the entire domain. But you defined the R matrix with an independent error space. Clarify.
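The inconsistency can be made explicit with a small example. The sketch below uses five invented pixel values (not from the manuscript) and treats the single scaling factor as the only source of error, which is an assumption made for illustration.

```python
import numpy as np

# Hypothetical true values at five pixels (arbitrary units, all positive).
truth = np.array([1.0, 2.0, 0.5, 3.0, 1.5])

# Case-1-like perturbation: every pixel multiplied by the same factor 1.5.
error = 1.5 * truth - truth   # equals 0.5 * truth

# With one shared scaling error, the implied covariance is the outer
# product of the error vector with itself, a rank-one matrix.
cov_implied = np.outer(error, error)

std = np.sqrt(np.diag(cov_implied))
corr_implied = cov_implied / np.outer(std, std)

# Diagonal matrix with the same variances, i.e. assuming independent errors.
cov_diagonal = np.diag(np.diag(cov_implied))

rank_implied = np.linalg.matrix_rank(cov_implied)
rank_diagonal = np.linalg.matrix_rank(cov_diagonal)
print(corr_implied)
print(rank_implied, rank_diagonal)
```

All implied correlations equal 1 and the covariance has rank one, whereas the diagonal matrix has full rank: the two describe very different error structures even though their variances are identical.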