Global, high-resolution mapping of tropospheric ozone – explainable machine learning and impact of uncertainties
- 1Jülich Supercomputing Centre, Jülich Research Centre, Wilhelm-Johnen-Straße, 52425 Jülich, Germany
- 2Institute of Geodesy and Geoinformation, University of Bonn, Nußallee 17, 53115 Bonn, Germany
- 3Aachen Institute for Advanced Study in Computational Engineering Science (AICES), RWTH Aachen University, Schinkelstrasse 2a, 52056 Aachen, Germany
- 4Data Science in Earth Observation, Technical University of Munich, Lise-Meitner-Str. 9, 85521 Ottobrunn, Germany
- 5Methods for Model-based Development in Computational Engineering, RWTH Aachen University, Eilfschornsteinstr. 18, 52062 Aachen, Germany
Abstract. Tropospheric ozone is a toxic greenhouse gas with a highly variable spatial distribution which is challenging to map on a global scale. Here we present a data-driven ozone mapping workflow generating a transparent and reliable product. We map the global distribution of tropospheric ozone from sparse, irregularly placed measurement stations to a high-resolution regular grid using machine learning methods. The produced map contains the average tropospheric ozone concentration of the years 2010–2014 with a resolution of 0.1° × 0.1°. The machine learning model is trained on AQ-Bench, a precompiled benchmark dataset consisting of multi-year ground-based ozone measurements combined with an abundance of high-resolution geospatial data.
Going beyond standard mapping methods, this work focuses on two key aspects to increase the integrity of the produced map. Using explainable machine learning methods we ensure that the trained machine learning model is consistent with commonly accepted knowledge about tropospheric ozone. To assess the impact of data and model uncertainties on our ozone map, we show that the machine learning model is robust against typical fluctuations in ozone values and geospatial data. By inspecting the feature space, we ensure that the model is only applied in regions where it is reliable.
We provide a rationale for the tools we use to conduct a thorough global analysis. The methods presented here can thus be easily transferred to other mapping applications to ensure the transparency and reliability of the maps produced.
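For orientation, the core of such a data-driven mapping workflow can be sketched in a few lines. This is a minimal illustration only, assuming a hypothetical tabular export of the station data and grid features (file and column names are made up and are not the actual AQ-Bench schema):

```python
# Minimal sketch of the mapping idea: fit a random forest on station data,
# then apply it to every grid cell. File and column names are hypothetical.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

stations = pd.read_csv("aq_bench_stations.csv")                  # hypothetical file
features = ["abs_lat", "alt", "relative_alt", "nightlight_5km"]  # illustrative subset
X, y = stations[features], stations["o3_average_2010_2014"]

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
rf = RandomForestRegressor(n_estimators=500, random_state=0)
rf.fit(X_train, y_train)
rmse = np.sqrt(mean_squared_error(y_test, rf.predict(X_test)))
print(f"hold-out RMSE: {rmse:.2f} ppb")

# Mapping step: the same features exist on a regular 0.1° x 0.1° grid,
# so the trained model can be evaluated at every grid cell.
grid = pd.read_csv("grid_features_0p1deg.csv")                   # hypothetical file
grid["o3_predicted"] = rf.predict(grid[features])
```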
Clara Betancourt et al.
Status: closed
-
CEC1: 'Comment on gmd-2022-2', Juan Antonio Añel, 23 Feb 2022
Dear authors,
After checking your manuscript, it has come to our attention that it does not comply with our Code and Data Policy.
https://www.geoscientific-model-development.net/policies/code_and_data_policy.html
You have archived your code in GitLab. However, GitLab is not a suitable repository. Therefore, please publish your code in one of the appropriate repositories listed in our policy. Accordingly, in a potentially revised version of your manuscript you must include a modified 'Code and Data Availability' section with the DOI of the code.
Please, reply as soon as possible to this comment with the new link to the repository, so that it is available for the peer-review process, as it should be.
Regards,
Juan A. Añel
Geosci. Model Dev. Exec. Editor
- AC1: 'Reply on CEC1', Clara Betancourt, 03 Mar 2022
-
RC1: 'Comment on gmd-2022-2', Anonymous Referee #1, 25 Feb 2022
I appreciate the effort that the authors have put into assessing the spatial uncertainties associated with their predictions. Unfortunately, this is often not a priority in global mapping papers, so I welcome this focus on uncertainty very much. The manuscript is well structured, but overall fairly lengthy and written in a relaxed, almost conversational style - which I like, but at times it is perhaps a bit too much. Another suggestion would be to move most of the tables and figures to the supplement to improve readability. With 12 figures and 5 tables in the main text, I sometimes found it hard to navigate and find the most relevant results.
I must admit that I know close to nothing about ozone and what variables structure its spatial patterns in the troposphere, but essentially the model is predicting O3 using latitude, altitude, and human development (nearby nightlight), which collectively explain the majority of the variation in the model. However, latitude and altitude are both merely proxies for temperature and radiation, which are probably the actual main drivers of O3 levels (L153). Why not use them directly as predictors in the model? There are various high-quality global layers available.
With the very strong effect of absolute latitude in the model (~25% global importance), the predicted pattern will strongly reflect absolute latitude – which is clearly visible in the final map (fig 11).
Next, I have a couple of doubts about the current “area of applicability” (AOA) approach, in which features are weighted by their respective importances in the model. By weighting the features, you are essentially using absolute latitude and altitude (twice, including relative altitude) to define the feature space in which you apply the model. But as noted above, it is not latitude or altitude that define this space, but temperature and radiation (L153). A major drawback, I think, of scaling the features used in the AOA analysis by their respective importances is that you are now basically assuming you can predict everywhere on the same latitude, save for some places where (some of) the less important features fall outside the sampled space.
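To make the weighting concern concrete, here is a minimal sketch of an importance-weighted nearest-neighbour distance in feature space, in the spirit of the Meyer and Pebesma AOA index (the feature order and weights are illustrative, not the authors' implementation):

```python
# Sketch of an AOA-style dissimilarity: distance of each prediction point to
# its nearest training point in a standardized, importance-weighted feature space.
import numpy as np

def weighted_min_distance(pred_X, train_X, importances):
    mu, sigma = train_X.mean(axis=0), train_X.std(axis=0)
    train_w = (train_X - mu) / sigma * importances
    pred_w = (pred_X - mu) / sigma * importances
    # Brute-force pairwise distances; adequate for a small sketch.
    diffs = pred_w[:, None, :] - train_w[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=-1)).min(axis=1)

# Illustrative weights: a feature with weight 0.25 (e.g. absolute latitude)
# dominates the distance, so points sharing that latitude look "close" even
# when less important features differ - which is exactly the concern above.
importances = np.array([0.25, 0.15, 0.10, 0.05])
```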
It would be useful if the authors included a comparison with previously published tropospheric ozone predictions based on mechanistic models. Of course, the authors did perform a substantial test of their model's validity and stability - but these results really only hold for this particular model and dataset, and do not provide insight into how the results compare to other tropospheric ozone predictions.
Further points
- The statement on L440 (“The effect of ‘absolute latitude’ on predictions is consistent with what is known about ozone formation – ozone generally increases toward the equator”) seems to be the opposite of the patterns that are predicted (Fig 11), with lower values near the equator, and higher values in temperate regions?
- The selection of the threshold distance for ‘non-predictable’ data (i.e., the upper whisker of all the CV distances) is seemingly arbitrary. It is in line with the AOA paper by Meyer and Pebesma, but that paper does not provide a statistical reasoning for picking this particular threshold either (see the sketch after this list).
- The manuscript includes various subjective interpretations of the results. I believe the manuscript would benefit from a more objective wording. Some examples:
- L310: “East Asia is a special case because the ozone value distribution is rather narrow there”
- L325: “Very high ‘nightlight in 5km area’ values”
- L406: “are considered significant”
- Figure C1: “really high values”
- The authors use “not shown” rather often, a total of 8 times in the entire manuscript. I guess this is OK, but if the authors feel that the data/results are not necessary to show, arguably the entire section can be removed. If not, I would suggest placing the evidence for the statement in the supplemental materials.
- Figure 3: are these ‘example data points’ points in the AQ-bench dataset or in the raster data you’re predicting?
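Regarding the whisker-based cutoff questioned above, a minimal sketch of how such a threshold is typically derived from the cross-validation distances (synthetic numbers, purely for illustration):

```python
# Sketch: "upper whisker" threshold (Q3 + 1.5 * IQR) over cross-validation
# distances; grid cells with larger distances would fall outside the AOA.
import numpy as np

def upper_whisker_threshold(cv_distances):
    q1, q3 = np.percentile(cv_distances, [25, 75])
    return q3 + 1.5 * (q3 - q1)

rng = np.random.default_rng(0)
cv_distances = rng.gamma(shape=2.0, scale=0.5, size=5000)   # synthetic example
threshold = upper_whisker_threshold(cv_distances)
print(f"threshold: {threshold:.2f}")
# The 1.5 * IQR rule is a boxplot convention, not a statistically derived
# criterion - which is the reviewer's point.
```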
- AC2: 'Reply on RC1', Clara Betancourt, 14 Apr 2022
-
RC2: 'Comment on gmd-2022-2', Anonymous Referee #2, 13 Mar 2022
The authors demonstrate a machine learning approach to generate high-resolution surface ozone concentration products and evaluate the uncertainties from models and data sources. Many techniques are used in this study, and they are generally explained well. The surface ozone products can potentially be used for other studies if the produced ozone mapping is robust. The manuscript is written well in a conversational way, and I can see that the authors try to add novelty in explaining the machine learning results, but overinterpretation should be avoided. There are a few major concerns about the motivation of the study and the usage of the final ozone products that I think should be addressed carefully.
Major comments:
- High-resolution ozone mapping is a highlight of this study, but are there any differences between directly using interpolations of the original ozone product (TOAR) and the products generated here? I think the authors try to extend the application to regions where measurement sites are not available, but it is clear that the trained results are limited by the number of measurement sites (mentioned in Sect. 3.2.1).
- High-resolution ozone mapping may introduce extra uncertainties because the input features or surface ozone concentrations may have large biases. Since surface ozone varies on a regional scale, slightly decreasing the resolution may reduce uncertainties. This should be discussed to strengthen the motivation of the study.
- The averaged surface ozone concentrations over 2010–2014 are reproduced, and the authors also mention that the products are static in Sect. 4.4. Geographic variables are used to drive the machine learning model, and many of them (e.g. latitude, altitude), rather than physical or chemical variables, show high importance for the simulated ozone. The relationships between some variables are intuitive, but the issue is that simulated future ozone concentrations may be quite similar to those simulated over 2010–2014, because the geographic variables with high importance are static – they will not change in the future. The temporal relationships may not be captured by the machine learning model. It may be useful to justify the usage of average ozone earlier in the data description section, and to state the benefits of using the final ozone products.
Other comments:
- Line 11: “By inspecting the feature space, …”. Not clear in the abstract.
- Line 59: The key issues in the current mapping field and the benefits of using machine learning approaches need to be clearly pointed out.
- Line 79: Please justify the usage of annual mean surface ozone concentrations. I suppose that using monthly data would make the model more robust as more data are involved in the training?
- Line 76: Are only 5577 data points used for machine learning?
- A more specific title is needed for Figure 1 instead of saying ‘average ozone values’.
- Line 86: I am not convinced by the association between ‘latitude’ and ozone photochemistry.
- In Table 1, many land cover variables are used, so do they principally reflect ozone dry deposition? Some discussion is needed here.
- NOx emissions and columns are used. What about other ozone precursor emissions?
- Line 106: It is too confident to state that the random forest is the most suitable; apparently, it is not.
- Figure 3: Do the data points outside the area of applicability simply mean that they have extremely high or low values that are not easy to predict? As you scale feature values with SHAP values, it is likely that the threshold used to filter large values largely depends on altitude (see the SHAP sketch after this list).
- Line 303: I cannot judge whether RMSEs in the range of 3.84 to 4.04 ppb are large or small, even though the authors say this is acceptable. I think it would be better to show the temporal standard deviation of the surface ozone concentrations along with the annual mean surface ozone mixing ratios in Fig. 1 for the readers.
- Line 327: SHAP value discussions are in Sect. 4.2. I suggest that the authors avoid using many forward references, and merge some discussions in the corresponding sections.
- The evaluation figure (Fig. 10) is important, and I suggest moving it forward.
- Two panels should be indicated in Fig. 11. It would be interesting to show the readers the predicted surface ozone mixing ratios across the globe, even if the authors identify some areas as inapplicable.
- Line 445: I think this is overinterpreted, as you are using nightlight conditions to explain monthly or annual mean ozone variation. Ozone chemical production or destruction depends on NOx concentrations, and NO titration is one aspect. It is fine if some relationships cannot be explained, and I do not expect the relationships derived from SHAP values to explain every feature, because the machine learning model is not process-based.
- How do the authors view the relative importance of the number of training data versus the training strategies (e.g. model types, feature selection) in ozone mapping? The number of training data may be the more important factor, as shown in the study, and this aspect needs to be discussed.
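As a reference for the SHAP-related comments above, a minimal sketch of how per-feature SHAP values are typically obtained for a tree ensemble with the open-source shap package (synthetic data, not the authors' model):

```python
# Sketch: SHAP values for a random forest regressor on synthetic data.
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))                             # synthetic features
y = 30.0 + 5.0 * np.abs(X[:, 0]) + rng.normal(size=500)   # synthetic target

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
shap_values = shap.TreeExplainer(rf).shap_values(X)       # shape (500, 4)

# The mean absolute SHAP value per feature is the usual "global importance";
# per-sample SHAP values show how each feature pushes an individual prediction
# up or down relative to the mean. These are attributions of the fitted model,
# not causal, process-based relationships.
global_importance = np.abs(shap_values).mean(axis=0)
print(global_importance)
```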
- AC3: 'Reply on RC2', Clara Betancourt, 14 Apr 2022