the Creative Commons Attribution 4.0 License.
Observational operator for fair model calibration with ground NO2 measurements
Abstract. Measurements collected from ground monitoring stations have gained popularity as a valuable data source for calibrating numerical models and correcting model errors through data assimilation. Both model calibration and assimilation are driven by the penalty quantified by the simulation-minus-observation difference. However, this penalty is undermined by a spatial scale disparity between model simulations and observations. Chemical transport models (CTMs) divide the atmosphere into grid cells, yet their spatial resolution may not align with the limited range of in-situ measurements, particularly for short-lived air pollutants. Within a broad grid pattern, air pollutant concentrations can exhibit significant heterogeneity due to their rapid generation and dissipation. Ground observations processed with traditional methods (including nearest search and grid mean) are less representative when compared to model simulations. This study develops a new land-use-based representative (LUBR) observational operator to generate spatially representative gridded observations for model calibration and evaluation. It incorporates high-resolution urban-rural land use data to address intra-grid variability. The LUBR operator is validated to consistently provide insights that align with satellite OMI measurements. It is an effective solution to accurately quantify these spatial scale mismatches and further resolve them via assimilation. Model calibrations with 2015–2017 NO2 measurements in China demonstrate that biases and errors differ substantially depending on whether the LUBR operator or another operator is used. The results highlight the importance of considering fine-scale urban-rural differences when comparing models and observations, especially for short-lived pollutants like NO2.
Status: closed
- RC1: 'Comment on gmd-2023-216', Anonymous Referee #1, 14 Mar 2024
Representation error has posed a challenge in achieving consistent comparisons between models and ground-based observations. This issue arises because model grids are relatively coarse, whereas site-specific observations are only locally representative, especially for short-lived pollutants in heterogeneous environments. This manuscript addresses this problem for NO2 by introducing a land-use-based representative (LUBR) observational operator, enabling the processed NO2 observations to better represent the means of 0.5x0.625 grid cells. The algorithm is shown to be effective for short-lived NO2 and is well evaluated in the paper. The method is helpful for accurately interpreting the bias between models and ground-based observations and is applicable to data assimilation research. I recommend this manuscript for publication once the issues outlined below are addressed.
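To make the contrast between the operators concrete, here is a minimal Python sketch for a single grid cell. It is not the authors' implementation: the `Site` class, the function names, the fallback rule, and all numbers are hypothetical, and the weighting follows only the referee summaries on this page (urban and rural site values combined according to the cell's urban fraction). The "dynamic urban/rural factor" mentioned later in the reviews is not reproduced.

```python
from dataclasses import dataclass
from math import hypot
from statistics import mean


@dataclass
class Site:
    lon: float
    lat: float
    no2: float    # observed surface NO2 (units are illustrative)
    urban: bool   # hypothetical urban/rural label for the site


def nearest_search(sites, lon_c, lat_c):
    """Use the single site closest to the grid-cell centre.

    Distance is taken in degree space, a simplification of this sketch.
    """
    return min(sites, key=lambda s: hypot(s.lon - lon_c, s.lat - lat_c)).no2


def grid_mean(sites):
    """Unweighted mean of all sites inside the grid cell."""
    return mean(s.no2 for s in sites)


def land_use_weighted(sites, urban_fraction):
    """Weight urban and rural site means by the cell's urban area fraction."""
    urban = [s.no2 for s in sites if s.urban]
    rural = [s.no2 for s in sites if not s.urban]
    if not urban or not rural:
        # Assumption of this sketch: with only one class observed,
        # fall back to the plain mean.
        return grid_mean(sites)
    return urban_fraction * mean(urban) + (1.0 - urban_fraction) * mean(rural)


# Made-up cell: three urban sites, one rural site, 20 % urban area.
sites = [
    Site(0.10, 0.05, 44.0, True),
    Site(0.20, 0.10, 40.0, True),
    Site(0.15, 0.20, 36.0, True),
    Site(-0.30, -0.25, 10.0, False),
]
print("nearest search:   ", nearest_search(sites, 0.0, 0.0))
print("grid mean:        ", grid_mean(sites))
print("land-use weighted:", land_use_weighted(sites, 0.2))
```

With these toy numbers the urban-dominated site network pulls the nearest-search and grid-mean values well above the land-use-weighted value, which is the kind of representation error the manuscript targets.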
Major comments:
- An assumption underlying the LUBR algorithm is that observations from urban/rural sites can represent the average conditions of the entire urban/rural area within a grid cell, which is not necessarily accurate. In other words, the algorithm only partially corrects the representation error, a point that needs clarification.
- In Section 2.3, the authors compare modeled surface NO2 with ground-based observations, and the modeled NO2 column with OMI observations. They note an inconsistent performance of the model in simulating surface and column NO2. In their interpretation throughout the paper, satellite observations are considered more representative, and the model-to-satellite bias is treated as the true bias for simulating NO2. However, this assumption may not be accurate, for the reasons listed below. It is important to address these issues throughout the paper, although they do not compromise the paper’s overall conclusion. (1) Satellite observations have their own representativeness issues and should be treated carefully. OMI provides observations only for the 1-2 pm overpass window and is most reliable under clear-sky conditions, when chemistry and meteorology might differ from the monthly means. OMI retrievals require an a priori NO2 profile shape, which can be a major source of retrieval error. A consistent comparison between OMI and GEOS-Chem requires applying the same sampling to the modeled NO2 and replacing the a priori NO2 profile shape in the OMI retrieval with one simulated by GEOS-Chem. Only after these steps can the bias between the resampled model and the reprocessed retrieval be considered the actual bias between the model and satellite observations. It appears that the authors did not perform this preprocessing before determining the model-retrieval bias. This should be corrected. (2) Even with a correctly determined bias between model and satellite observations, this bias will not necessarily align with the bias between model and ground-based observations. This is because satellites measure the column density of NO2, capturing information not just from the surface but also from the troposphere and stratosphere (I assume they use the total column density, which includes the stratospheric contribution; this needs to be clarified in the paper). Thus, it is entirely reasonable for the column bias to differ from the surface bias. The authors should not regard the column bias as the true bias for ground-level comparisons.
- Additional information is needed regarding the OMI product. Why was the OMAEROe product chosen? How does this product perform in comparison to ground-based NO2 column observations and to other, more popular OMI NO2 products, such as OMNO2 from NASA or the OMI product from KNMI?
- I am curious whether the urban/rural factor exhibits any seasonality, considering the longer lifetime of NO2 in winter compared to summer. Can soil NOx emissions during summer (dominant in rural areas?) influence the urban/rural factor? Please consider adding a discussion on this.
- I don’t see the point of Figure 4, as it appears to convey ideas similar to those presented in Figure 5 or 6. Please consider removing one of these figures to keep the evaluation section concise.
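The preprocessing requested in the comment on satellite comparisons above can be sketched as follows. This is an illustrative outline only, not the procedure of the manuscript or of any OMI product: the function names, the layer values, and the sampling window are assumptions. It shows (a) a model column computed as the kernel-weighted sum of model partial columns, which is the usual way a column averaging kernel is applied, and (b) averaging model output only at times when a valid retrieval exists. Interpolating the model layers onto the retrieval layers is assumed to have been done already.

```python
def column_from_profile(partial_columns, averaging_kernel=None):
    """Sum model partial columns, optionally weighted by an averaging kernel."""
    if averaging_kernel is None:
        return sum(partial_columns)
    if len(averaging_kernel) != len(partial_columns):
        raise ValueError("kernel and profile must share the same layers")
    return sum(a * x for a, x in zip(averaging_kernel, partial_columns))


def mean_at_overpass(values, local_hours, valid, window=(13, 14)):
    """Average model values only where the satellite could have observed.

    `valid` marks times with a usable retrieval (for example clear sky);
    `window` is the assumed local-time overpass window in hours.
    """
    start, end = window
    picked = [v for v, h, ok in zip(values, local_hours, valid)
              if ok and start <= h < end]
    if not picked:
        raise ValueError("no model samples coincide with valid retrievals")
    return sum(picked) / len(picked)


# Made-up three-layer profile, surface layer first.
partial = [4.0, 2.0, 1.0]
kernel = [0.5, 1.0, 1.5]   # reduced sensitivity near the surface
print(column_from_profile(partial))           # plain model column
print(column_from_profile(partial, kernel))   # column as the satellite would see it

# Made-up model columns with their local hours and retrieval flags.
print(mean_at_overpass([5.0, 6.0, 9.0, 7.0], [13, 13, 2, 13],
                       [True, False, True, True]))
```

In this toy case the kernel-weighted column is smaller than the plain column because the near-surface layer, where most of the NO2 sits, is down-weighted. That illustrates why a bias computed without this step is hard to interpret.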
Citation: https://doi.org/10.5194/gmd-2023-216-RC1
- AC1: 'Reply on RC1', Jianbing Jin, 08 May 2024
- RC2: 'Comment on gmd-2023-216', Anonymous Referee #2, 28 Mar 2024
General comments
The authors develop an observational operator to improve the agreement between NO2 measurements from monitoring stations and low-resolution chemistry transport simulations. The operator uses VIIRS nighttime lights to estimate the urban and rural fractions in the grid cells and adjusts the NO2 values based on the fraction of urban/rural monitoring sites. The authors test the approach using GEOS-Chem simulations of China. They show that the operator reduces biases in grid cells compared to other operators (nearest search and grid means). The paper is within the scope of GMD and provides some advances and tools that can be of interest to GMD readers. However, the paper has a few severe shortcomings that need to be addressed.
- The manuscript does not have a clear structure, which makes it difficult (and sometimes impossible) to follow the arguments of the authors. The "method" section includes many elements that would fit better in the "introduction" (motivation for the study) or the "result and discussion" section. For example, Section 2.3 is about the validation of GEOS-Chem with satellites and monitoring stations. Likewise, the result section includes elements that would fit better in the method section. The manuscript is not very concise, with many statements repeated in different places. The language is sometimes difficult to understand and inappropriate for a scientific paper (e.g., "achieving perfection", "intriguing phenomena", "truth revealed").
- The authors make heavy use of the term "model calibration". I have not seen the term anywhere before, nor could I find a reference where the term is explained. The authors cite Zhu et al. (2021), who do not use the term, and Kalnay (2002), who writes that "…representativeness errors can be systematic or random. Systematic errors and biases should be determined by calibration or other means such as time averages." (p199). I interpret this as calibration of the observations and not the model. The suggested LUBR operator is also applied to adjust the observations and not the model. I think it would be a good idea to use a more common term.
- The validation of GEOS-Chem with OMI NO2 in Section 2.3 is not reproducible from the provided information. The authors mention their previous study without adding a reference. OMI observations are mentioned for the first time in Section 2.3. It is not clear which OMI product (NASA, Dutch or a custom product) is actually used for validation. Moreover, when comparing OMI NO2 column data with model simulations, it is necessary to update the air mass factors (AMFs) using the averaging kernels or scattering weights provided by the product. OMI NO2 can be biased for various reasons (NO2 profile shapes, surface reflectance and aerosols), which are not considered in the validation. Importantly, the OMI standard products tend to underestimate NO2 columns in China, which can explain the discrepancy (e.g., Lin et al. 2015, https://acp.copernicus.org/articles/15/11217/2015/).
- The authors conclude from their validation with OMI data that the GEOS-Chem simulations are overestimated, which, as written above, might be wrong. However, this assumption is used to argue for the improvements in the RMSE and MAE from the LUBR operator. Given its importance in the paper, the potential problems with the OMI data need to be addressed and the impact on the statistical evaluation (Section 3.2.2) reassessed.
- The references often do not support the statements in the manuscript (see specific comments for examples). The authors should check their references, especially when making general statements. Several links are not working.
- The authors provide Python code and NO2 measurements, but no GEOS-Chem fields. It is therefore not possible to test the new operator even on a small dataset. It would be great if at least a test dataset could be provided that demonstrates the application of the operator. In the current version, the datasets are very small and it would be easier to add them as a supplement directly to the manuscript.
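As an illustration of what a small, self-contained test of the statistical evaluation could look like, the sketch below computes the mean bias, MAE, and RMSE of simulation-minus-observation differences for two sets of gridded observations. The function name and all numbers are made up for demonstration; they are not taken from the manuscript or its data, and the operator labels are only placeholders.

```python
from math import sqrt


def evaluation_stats(simulated, observed):
    """Mean bias, MAE and RMSE of simulation-minus-observation differences."""
    if len(simulated) != len(observed) or not simulated:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [s - o for s, o in zip(simulated, observed)]
    n = len(diffs)
    return {
        "bias": sum(diffs) / n,
        "mae": sum(abs(d) for d in diffs) / n,
        "rmse": sqrt(sum(d * d for d in diffs) / n),
    }


# Toy values for three grid cells (not real data).
simulated = [20.0, 30.0, 25.0]
gridded_obs = {
    "operator A": [17.0, 34.0, 25.0],
    "operator B": [26.0, 41.0, 33.0],
}
for name, obs in gridded_obs.items():
    stats = evaluation_stats(simulated, obs)
    print(name, {k: round(v, 2) for k, v in stats.items()})
```

The same simulation yields different statistics depending on which gridded observations it is compared with, which is why the choice of observational operator matters for the evaluation.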
Specific comments
P2L23: Many measurement techniques (incl. remote sensing) are not direct measurements.
P2L29: The references do not support the statement that "satellite remote sensing […] made it possible to observe near-surface air pollutant […] from space". Zhang et al. 2020 describes the EMI NO2 retrieval algorithm and Jin et al. 2023a describes top-down NH3 emission estimates based on IASI. Possible references are Xu et al. 2019 (https://doi.org/10.1016/j.scitotenv.2018.11.125), Kim et al. 2021 (https://doi.org/10.1016/j.rse.2021.112573) and many other studies.
P3L1ff: The rest of this section discusses monitoring stations, but what is the role of the satellite observations introduced in the previous paragraph?
P3L4ff: A third approach is to use only stations that are spatially and temporally representative of the model grid.
P3L15: What does it mean when an approach is "unfair"?
P3L16f: A more common interpretation of grid cells is that they represent the mean state in a region. I do not think that Tessum et al. (2017) claim that a grid cell corresponds to a distinct spatial location.
P3L17f: A spatial resolution of 0.5° is not very high for regional chemistry simulations, which nowadays are often run at about 10 km resolution.
P3L23ff: Wu et al. 2021b do not claim that anthropogenic NO2 emissions primarily occur in the troposphere. They actually write about the spatial variability of NO2 emissions: "traffic-related pollutants in urban environments can vary substantially within a few meters (Pattinson et al., 2014; Targino et al., 2016)."
P4L19f / Figure 1: The country borders in the figure do not follow the rules for GMD papers: "Please adhere to United Nations naming conventions for maps used in your manuscript. In order to depoliticize scientific articles, authors should avoid the drawing of borders or use of contested topographical names." (https://www.geoscientific-model-development.net/submission.html#mapsaerials)
P5L1ff: The first sentences of the paragraph partly repeat the introduction.
P6L16f: The model validation with monitoring stations should not be in the supplement, but in the main part of the paper.
P7L13f: The vertical profiles of NO2 and PM2.5 should be quite different, exactly because of their different lifetimes. Therefore, you cannot argue that incorrect vertical profiles cannot be the reason for the differences.
P7L23f: Are the stations grouped by GEOS-Chem grid cells? Please also clarify whether Figure 3 depicts GEOS-Chem or measurement stations.
P7L29ff: The motivation and meaning of the "dynamic urban/rural factor" need to be explained in more detail here.
P10L5ff: Since it is unclear whether the OMI NO2 observations are unbiased (see previous comment), I think the statement that the simulations overestimate atmospheric NO2 needs to be reassessed. The statement about the "truth revealed by the OMI comparison" is also very bold and should be rephrased.
P11L10f: The model calibration should be explained in the method section.
P15L15ff: Please provide an explanation why "grid mean" and "nearest search" have different statistics.
Technical corrections (incomplete)
P3L27 ("grid pattern"): Do you mean grid cell?
P5L9 (also P9L8ff): grids -> grid cells
Figure 1: "lightning" -> "night lights" and "blue" -> "purple"
P6L9f: The statement repeats P3L19f
P7L22: "locales" -> "sites"
Figure 3a: The colors of the dashed black and blue lines are reversed.
References: Please check your references. Several of the links in the references are badly formatted or are not working.
Citation: https://doi.org/10.5194/gmd-2023-216-RC2
- AC2: 'Reply on RC2', Jianbing Jin, 08 May 2024