Articles | Volume 15, issue 11
https://doi.org/10.5194/gmd-15-4331-2022
© Author(s) 2022. This work is distributed under
the Creative Commons Attribution 4.0 License.
Global, high-resolution mapping of tropospheric ozone – explainable machine learning and impact of uncertainties
Clara Betancourt
Jülich Supercomputing Centre, Jülich Research Centre, Wilhelm-Johnen-Straße, 52425 Jülich, Germany
Timo T. Stomberg
Institute of Geodesy and Geoinformation, University of Bonn, Niebuhrstraße 1a, 53113 Bonn, Germany
Ann-Kathrin Edrich
Aachen Institute for Advanced Study in Computational Engineering Science (AICES), RWTH Aachen University, Schinkelstrasse 2a, 52062 Aachen, Germany
Methods for Model-based Development in Computational Engineering, RWTH Aachen University, Eilfschornsteinstr. 18, 52062 Aachen, Germany
Ankit Patnala
Jülich Supercomputing Centre, Jülich Research Centre, Wilhelm-Johnen-Straße, 52425 Jülich, Germany
Martin G. Schultz
Jülich Supercomputing Centre, Jülich Research Centre, Wilhelm-Johnen-Straße, 52425 Jülich, Germany
Ribana Roscher
Institute of Geodesy and Geoinformation, University of Bonn, Niebuhrstraße 1a, 53113 Bonn, Germany
Data Science in Earth Observation, Technical University of Munich, Lise-Meitner-Str. 9, 85521 Ottobrunn, Germany
Julia Kowalski
Methods for Model-based Development in Computational Engineering, RWTH Aachen University, Eilfschornsteinstr. 18, 52062 Aachen, Germany
Scarlet Stadtler
CORRESPONDING AUTHOR
Jülich Supercomputing Centre, Jülich Research Centre, Wilhelm-Johnen-Straße, 52425 Jülich, Germany
Related authors
Clara Betancourt, Timo Stomberg, Ribana Roscher, Martin G. Schultz, and Scarlet Stadtler
Earth Syst. Sci. Data, 13, 3013–3033, https://doi.org/10.5194/essd-13-3013-2021, 2021
Short summary
With the AQ-Bench dataset, we contribute to shared data usage and machine learning methods in the field of environmental science. The AQ-Bench dataset contains air quality data and metadata from more than 5500 air quality observation stations all over the world. The dataset offers a low-threshold entrance to machine learning on a real-world environmental dataset. AQ-Bench thus provides a blueprint for environmental benchmark datasets.
Clara Betancourt, Christoph Küppers, Tammarat Piansawan, Uta Sager, Andrea B. Hoyer, Heinz Kaminski, Gerhard Rapp, Astrid C. John, Miriam Küpper, Ulrich Quass, Thomas Kuhlbusch, Jochen Rudolph, Astrid Kiendler-Scharr, and Iulia Gensch
Atmos. Chem. Phys., 21, 5953–5964, https://doi.org/10.5194/acp-21-5953-2021, 2021
Short summary
For the first time, we included stable isotopes in the Lagrangian particle dispersion model FLEXPART to investigate firewood home heating aerosol. This is an innovative source apportionment methodology since comparison of stable isotope ratio model predictions with observations delivers quantitative understanding of atmospheric processes. The main outcome of this study is that the home heating aerosol in residential areas was not of remote origin.
Biplob Dey, Toke Due Sjøgren, Peeyush Khare, Georgios I. Gkatzelis, Yizhen Wu, Sindhu Vasireddy, Martin Schultz, Alexander Knohl, Riikka Rinnan, Thorsten Hohaus, and Eva Y. Pfannerstill
EGUsphere, https://doi.org/10.5194/egusphere-2025-3779, 2025
This preprint is open for discussion and under review for Biogeosciences (BG).
Short summary
Trees release reactive gases that affect air quality and climate. We studied how these emissions from European beech and English oak change under realistic scenarios of combined and single heat and ozone stress. Heat increased emissions, while ozone reduced most of them. When stressors were combined, the effects were complex and varied by species. Machine learning identified key stress-related compounds. Our findings show that future tree stress may alter air quality and climate interactions.
Alexander Hermanns, Anne Caroline Lange, Julia Kowalski, Hendrik Fuchs, and Philipp Franke
EGUsphere, https://doi.org/10.5194/egusphere-2025-450, 2025
Short summary
For air quality analyses, data assimilation models split available data into assimilation and validation data sets. The former is used to generate the analysis, the latter to verify the simulations. A preprocessor classifying the observations by the data characteristics is developed based on clustering algorithms. The assimilation and validation data sets are compiled by equally allocating data of each cluster. The resulting improvement of the analysis is evaluated with EURAD-IM.
Ramiyou Karim Mache, Sabine Schröder, Michael Langguth, Ankit Patnala, and Martin G. Schultz
EGUsphere, https://doi.org/10.5194/egusphere-2025-1399, 2025
Short summary
The TOAR-classifier model is a data-driven tool that allows for an objective classification of air quality measuring stations as urban, rural, or suburban. Such classification is important in the analysis of air pollutant trends and regional signatures. The model is employed in the second Tropospheric Ozone Assessment Report but can also be used in other research work.
Yugo Kanaya, Roberto Sommariva, Alfonso Saiz-Lopez, Andrea Mazzeo, Theodore K. Koenig, Kaori Kawana, James E. Johnson, Aurélie Colomb, Pierre Tulet, Suzie Molloy, Ian E. Galbally, Rainer Volkamer, Anoop Mahajan, John W. Halfacre, Paul B. Shepson, Julia Schmale, Hélène Angot, Byron Blomquist, Matthew D. Shupe, Detlev Helmig, Junsu Gil, Meehye Lee, Sean C. Coburn, Ivan Ortega, Gao Chen, James Lee, Kenneth C. Aikin, David D. Parrish, John S. Holloway, Thomas B. Ryerson, Ilana B. Pollack, Eric J. Williams, Brian M. Lerner, Andrew J. Weinheimer, Teresa Campos, Frank M. Flocke, J. Ryan Spackman, Ilann Bourgeois, Jeff Peischl, Chelsea R. Thompson, Ralf M. Staebler, Amir A. Aliabadi, Wanmin Gong, Roeland Van Malderen, Anne M. Thompson, Ryan M. Stauffer, Debra E. Kollonige, Juan Carlos Gómez Martin, Masatomo Fujiwara, Katie Read, Matthew Rowlinson, Keiichi Sato, Junichi Kurokawa, Yoko Iwamoto, Fumikazu Taketani, Hisahiro Takashima, Monica Navarro Comas, Marios Panagi, and Martin G. Schultz
Earth Syst. Sci. Data Discuss., https://doi.org/10.5194/essd-2024-566, 2025
Revised manuscript accepted for ESSD
Short summary
The first comprehensive dataset of tropospheric ozone over oceans/polar regions is presented, including 77 ship/buoy and 48 aircraft campaign observations (1977–2022, 0–5000 m altitude), supplemented by ozonesonde and surface data. Air masses isolated from land for 72+ hours are systematically selected as essentially oceanic. Among the 11 global regions, they show daytime decreases of 10–16% in the tropics, while near-zero depletions are rare, unlike in the Arctic, implying different mechanisms.
Sebastian H. M. Hickman, Makoto Kelp, Paul T. Griffiths, Kelsey Doerksen, Kazuyuki Miyazaki, Elyse A. Pennington, Gerbrand Koren, Fernando Iglesias-Suarez, Martin G. Schultz, Kai-Lan Chang, Owen R. Cooper, Alexander T. Archibald, Roberto Sommariva, David Carlson, Hantao Wang, J. Jason West, and Zhenze Liu
EGUsphere, https://doi.org/10.5194/egusphere-2024-3739, 2025
Short summary
Machine learning is being more widely used across environmental and climate science. This work reviews the use of machine learning in tropospheric ozone research, focusing on three main application areas in which significant progress has been made. Common challenges in using machine learning across the three areas are highlighted, and future directions for the field are indicated.
Hantao Wang, Kazuyuki Miyazaki, Haitong Zhe Sun, Zhen Qu, Xiang Liu, Antje Inness, Martin Schultz, Sabine Schröder, Marc Serre, and J. Jason West
EGUsphere, https://doi.org/10.5194/egusphere-2024-3723, 2025
Short summary
We compare six datasets of global ground-level ozone, developed using geostatistical, machine learning, or reanalysis methods. The datasets show important differences from one another in ozone magnitude, greater than 5 ppb, and trends, globally and regionally. Compared with measurements, performance varies among datasets, and most overestimate ozone, particularly at lower concentrations. These differences among datasets highlight uncertainties for applications to health and other impacts.
Matthias Rauter and Julia Kowalski
Geosci. Model Dev., 17, 6545–6569, https://doi.org/10.5194/gmd-17-6545-2024, 2024
Short summary
Snow avalanches can form large powder clouds that substantially exceed the velocity and reach of the dense core. Only a few complex models exist to simulate this phenomenon, and the respective hazard is hard to predict. This work provides a novel flow model that focuses on simple relations while still encapsulating the significant behaviour. The model is applied to reconstruct two catastrophic powder snow avalanche events in Austria.
Bing Gong, Michael Langguth, Yan Ji, Amirpasha Mozaffari, Scarlet Stadtler, Karim Mache, and Martin G. Schultz
Geosci. Model Dev., 15, 8931–8956, https://doi.org/10.5194/gmd-15-8931-2022, 2022
Short summary
Inspired by the success of deep learning in various domains, we test the applicability of video prediction methods by generative adversarial network (GAN)-based deep learning to predict the 2 m temperature over Europe. Our video prediction models have skill in predicting the diurnal cycle of 2 m temperature up to 12 h ahead. Complemented by probing the relevance of several model parameters, this study confirms the potential of deep learning in meteorological forecasting applications.
Felix Kleinert, Lukas H. Leufen, Aurelia Lupascu, Tim Butler, and Martin G. Schultz
Geosci. Model Dev., 15, 8913–8930, https://doi.org/10.5194/gmd-15-8913-2022, 2022
Short summary
We examine the effects of spatially aggregated upstream information as input for a deep learning model forecasting near-surface ozone levels. Using aggregated data from one upstream sector (45°) improves the forecast by ~ 10 % for 4 prediction days. Three upstream sectors improve the forecasts by ~ 14 % on the first 2 d only. Our results serve as an orientation for other researchers or environmental agencies focusing on pointwise time-series predictions, for example, due to regulatory purposes.
Swantje Preuschmann, Tanja Blome, Knut Görl, Fiona Köhnke, Bettina Steuri, Juliane El Zohbi, Diana Rechid, Martin Schultz, Jianing Sun, and Daniela Jacob
Adv. Sci. Res., 19, 51–71, https://doi.org/10.5194/asr-19-51-2022, 2022
Short summary
The main aspect of the paper is to obtain transferable principles for the development of digital knowledge transfer products. As such products are still unstandardised, the authors explored challenges and approaches for product developments. The authors report what they see as useful principles for developing digital knowledge transfer products, by describing the experience of developing the Net-Zero-2050 Web-Atlas and the "Bodenkohlenstoff-App".
Konstantin Schürholt, Julia Kowalski, and Henning Löwe
The Cryosphere, 16, 903–923, https://doi.org/10.5194/tc-16-903-2022, 2022
Short summary
This companion paper deals with numerical particularities of partial differential equations underlying 1D snow models. In this first part we neglect mechanical settling and demonstrate that the nonlinear coupling between diffusive transport (heat and vapor), phase changes and ice mass conservation contains a wave instability that may be relevant for weak layer formation. Numerical requirements are discussed in view of the underlying homogenization scheme.
Anna Simson, Henning Löwe, and Julia Kowalski
The Cryosphere, 15, 5423–5445, https://doi.org/10.5194/tc-15-5423-2021, 2021
Short summary
This companion paper deals with numerical particularities of partial differential equations underlying one-dimensional snow models. In this second part we include mechanical settling and develop a new hybrid (Eulerian–Lagrangian) method for solving the advection-dominated ice mass conservation on a moving mesh alongside Eulerian diffusion (heat and vapor) and phase changes. The scheme facilitates a modular and extendable solver strategy while retaining controls on numerical accuracy.
T. Stomberg, I. Weber, M. Schmitt, and R. Roscher
ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci., V-3-2021, 317–324, https://doi.org/10.5194/isprs-annals-V-3-2021-317-2021, 2021
Lukas Hubert Leufen, Felix Kleinert, and Martin G. Schultz
Geosci. Model Dev., 14, 1553–1574, https://doi.org/10.5194/gmd-14-1553-2021, 2021
Short summary
MLAir provides a coherent end-to-end structure for a typical time series analysis workflow using machine learning (ML). MLAir is adaptable to a wide range of ML use cases, focusing in particular on deep learning. The user has a free hand with the ML model itself and can select from different methods during preprocessing, training, and postprocessing. MLAir offers tools to track the experiment conduction, documents necessary ML parameters, and creates a variety of publication-ready plots.
Felix Kleinert, Lukas H. Leufen, and Martin G. Schultz
Geosci. Model Dev., 14, 1–25, https://doi.org/10.5194/gmd-14-1-2021, 2021
Short summary
With IntelliO3-ts v1.0, we present an artificial neural network as a new forecasting model for daily aggregated near-surface ozone concentrations with a lead time of up to 4 d. We used measurement and reanalysis data from more than 300 German monitoring stations to train, fine tune, and test the model. We show that the model outperforms standard reference models like persistence models and demonstrate that IntelliO3-ts outperforms climatological reference models for the first 2 d.
Cited articles
Amante, C. and Eakins, B. W.: ETOPO1 arc-minute global relief model:
procedures, data sources and analysis, Tech. rep., NOAA National Geophysical
Data Center, Boulder, Colorado, https://doi.org/10.7289/V5C8276M, 2009. a, b
Bastin, J.-F., Finegold, Y., Garcia, C., Mollicone, D., Rezende, M., Routh, D.,
Zohner, C. M., and Crowther, T. W.: The global tree restoration potential,
Science, 365, 76–79, https://doi.org/10.1126/science.aax0848, 2019. a
Betancourt, C., Stomberg, T., Stadtler, S., Roscher, R., and Schultz, M. G.: AQ-Bench, B2SHARE [data set], https://doi.org/10.23728/b2share.30d42b5a87344e82855a486bf2123e9f, 2020. a
Betancourt, C., Stadtler, S., Stomberg, T., Edrich, A.-K., Patnala, A., Roscher, R., Kowalski, J., and Schultz, M. G.: Global fine resolution mapping of ozone metrics through explainable machine learning, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-7596, https://doi.org/10.5194/egusphere-egu21-7596, 2021. a
Betancourt, C., Stomberg, T., Edrich, A.-K., Patnala, A., and Stadtler, S.: Global, high-resolution mapping of tropospheric ozone – explainable machine learning and impact of uncertainties – Source Code, B2SHARE [code], https://doi.org/10.34730/af084443e1c444feb12d83a93a65fa33, 2022. a
Blanke, S.: Hyperactive: An optimization and data collection toolbox for
convenient and fast prototyping of computationally expensive models, v2.3.0, GitHub [code],
https://github.com/SimonBlanke/Hyperactive, last access: 4 December
2021. a
Brasseur, G., Orlando, J. J., and Tyndall, G. S. (Eds.): Atmospheric chemistry
and global change, Oxford University Press, New York, US, 1st Edn., ISBN-10 0195105214, 1999. a
Breiman, L.: Random forests, Mach. Learn., 45, 5–32,
https://doi.org/10.1023/A:1010933404324, 2001. a
Briggs, D. J., Collins, S., Elliott, P., Fischer, P., Kingham, S., Lebret, E.,
Pryl, K., Van Reeuwijk, H., Smallbone, K., and Van Der Veen, A.: Mapping
urban air pollution using GIS: a regression-based approach, Int. J. Geogr.
Inf. Sci., 11, 699–718, https://doi.org/10.1080/136588197242158, 1997. a, b
Chevalier, A., Gheusi, F., Delmas, R., Ordóñez, C., Sarrat, C., Zbinden, R., Thouret, V., Athier, G., and Cousin, J.-M.: Influence of altitude on ozone levels and variability in the lower troposphere: a ground-based study for western Europe over the period 2001–2004, Atmos. Chem. Phys., 7, 4311–4326, https://doi.org/10.5194/acp-7-4311-2007, 2007. a
CIESIN: Gridded Population of the World, Version 3 (GPWv3): Population Count
Grid, Center for International Earth Science Information Network
– CIESIN – Columbia University, United Nations Food and Agriculture Programme
– FAO, and Centro Internacional de Agricultura Tropical – CIAT,
CIAT, Palisades, NY, NASA Socioeconomic Data and Applications Center (SEDAC),
https://doi.org/10.7927/H4639MPP, 2005. a
Cobourn, W. G., Dolcine, L., French, M., and Hubbard, M. C.: A Comparison of
Nonlinear Regression and Neural Network Models for Ground-Level Ozone
Forecasting, J. Air. Waste Manage., 50, 1999–2009,
https://doi.org/10.1080/10473289.2000.10464228, 2000. a
Comrie, A. C.: Comparing Neural Networks and Regression Models for
Ozone Forecasting, J. Air. Waste Manage., 47, 653–663,
https://doi.org/10.1080/10473289.1997.10463925, 1997. a
DeLang, M. N., Becker, J. S., Chang, K.-L., Serre, M. L., Cooper, O. R.,
Schultz, M. G., Schröder, S., Lu, X., Zhang, L., Deushi, M., Josse, B.,
Keller, C. A., Lamarque, J.-F., Lin, M., Liu, J., Marécal, V., Strode,
S. A., Sudo, K., Tilmes, S., Zhang, L., Cleland, S. E., Collins, E. L.,
Brauer, M., and West, J. J.: Mapping Yearly Fine Resolution Global Surface
Ozone through the Bayesian Maximum Entropy Data Fusion of Observations and
Model Output for 1990–2017, Environ. Sci. Technol., 55, 4389–4398,
https://doi.org/10.1021/acs.est.0c07742, 2021. a, b
Duda, R. O., Hart, P. E., and Stork, D. G.: Pattern Classification, chap. 10,
John Wiley & Sons, Inc., New York, US, 2nd Edn., ISBN-10 0471056693, 2001. a
Ester, M., Kriegel, H.-P., Sander, J., and Xu, X.: A density-based algorithm
for discovering clusters in large spatial databases with noise, in: KDD-96
Proceedings, Portland, OR, US, second International
Conference on Knowledge Discovery and Data Mining (KDD), 2–4 August 1996, 34, 226–231, 1996. a
European Union: Directive 2008/50/EC of the European Parliament and of the
Council of 21 May 2008 on ambient air quality and cleaner air for Europe,
Official Journal of the European Union, OJ L, 1–44,
http://data.europa.eu/eli/dir/2008/50/oj (last access: 31 May 2022), 2008. a
Fleming, Z. L., Doherty, R. M., Von Schneidemesser, E., Malley, C. S., Cooper,
O. R., Pinto, J. P., Colette, A., Xu, X., Simpson, D., Schultz, M. G.,
Lefohn, A. S., Hamad, S., Moolla, R., Solberg, S., and Feng, Z.: Tropospheric
Ozone Assessment Report: Present-day ozone distribution and trends relevant
to human health, Elem. Sci. Anth., 6, 12, https://doi.org/10.1525/elementa.273, 2018. a, b, c
Gaudel, A., Cooper, O. R., Ancellet, G., Barret, B., Boynard, A., Burrows,
J. P., Clerbaux, C., Coheur, P. F., Cuesta, J., Cuevas, E., Doniki, S.,
Dufour, G., Ebojie, F., Foret, G., Garcia, O., Granados Muños, M. J.,
Hannigan, J. W., Hase, F., Huang, G., Hassler, B., Hurtmans, D., Jaffe, D.,
Jones, N., Kalabokas, P., Kerridge, B., Kulawik, S. S., Latter, B., Leblanc,
T., Le Flochmoën, E., Lin, W., Liu, J., Liu, X., Mahieu, E.,
McClure-Begley, A., Neu, J. L., Osman, M., Palm, M., Petetin, H.,
Petropavlovskikh, I., Querel, R., Rahpoe, N., Rozanov, A., Schultz, M. G.,
Schwab, J., Siddans, R., Smale, D., Steinbacher, M., Tanimoto, H., Tarasick,
D. W., Thouret, V., Thompson, A. M., Trickl, T., Weatherhead, E., Wespes, C.,
Worden, H. M., Vigouroux, C., Xu, X., Zeng, G., and Ziemke, J.: Tropospheric
Ozone Assessment Report: Present-day distribution and trends of tropospheric
ozone relevant to climate and global atmospheric chemistry model evaluation,
Elem. Sci. Anth., 6, 39, https://doi.org/10.1525/elementa.291, 2018. a
Gawlikowski, J., Tassi, C. R. N., Ali, M., Lee, J., Humt, M., Feng, J., Kruspe,
A., Triebel, R., Jung, P., Roscher, R., Shahzad, M., Yang, W., Bamler, R.,
and Zhu, X. X.: A Survey of Uncertainty in Deep Neural Networks, arXiv [preprint], arXiv:2107.03342v1, 2021. a
Guth, S. and Sapsis, T. P.: Machine Learning Predictors of Extreme Events
Occurring in Complex Dynamical Systems, Entropy, 21, 925, https://doi.org/10.3390/e21100925,
2019. a
Hamon, R., Junklewitz, H., and Sanchez, I.: Robustness and explainability of
artificial intelligence, Tech. Rep. JRC119336, Publications Office of the
European Union, Luxembourg, Luxembourg, https://doi.org/10.2760/57493, 2020. a
Heuvelink, G. B. M., Angelini, M. E., Poggio, L., Bai, Z., Batjes, N. H.,
van den Bosch, R., Bossio, D., Estella, S., Lehmann, J., Olmedo, G. F., and
Sanderman, J.: Machine learning in space and time for modelling soil organic
carbon change, Eur. J. Soil Sci., 72, 1607–1623, https://doi.org/10.1111/ejss.12998,
2020. a
Hoek, G., Beelen, R., de Hoogh, K., Vienneau, D., Gulliver, J., Fischer, P.,
and Briggs, D.: A review of land-use regression models to assess spatial
variation of outdoor air pollution, Atmos. Environ., 42, 7561–7578,
https://doi.org/10.1016/j.atmosenv.2008.05.057, 2008. a
Hoogen, J. V. D., Geisen, S., Routh, D., Ferris, H., Traunspurger, W., Wardle,
D. A., de Goede, R. G. M., Adams, B. J., Ahmad, W., Andriuzzi, W. S.,
Bardgett, R. D., Bonkowski, M., Campos-Herrera, R., Cares, J. E., Caruso, T.,
de Brito Caixeta, L., Chen, X., Costa, S. R., Creamer, R., Mauro da
Cunha Castro, J., Dam, M., Djigal, D., Escuer, M., Griffiths, B. S.,
Gutiérrez, C., Hohberg, K., Kalinkina, D., Kardol, P., Kergunteuil, A.,
Korthals, G., Krashevska, V., Kudrin, A. A., Li, Q., Liang, W., Magilton, M.,
Marais, M., Martín, J. A. R., Matveeva, E., Mayad, E. H., Mulder, C.,
Mullin, P., Neilson, R., Nguyen, T. A. D., Nielsen, U. N., Okada, H., Rius,
J. E. P., Pan, K., Peneva, V., Pellissier, L., Carlos Pereira da Silva, J.,
Pitteloud, C., Powers, T. O., Powers, K., Quist, C. W., Rasmann, S., Moreno,
S. S., Scheu, S., Setälä, H., Sushchuk, A., Tiunov, A. V., Trap, J.,
van der Putten, W., Vestergård, M., Villenave, C., Waeyenberge, L., Wall,
D. H., Wilschut, R., Wright, D. G., Yang, J.-I., and Crowther, T. W.: Soil
nematode abundance and functional group composition at a global scale,
Nature, 572, 194–198, https://doi.org/10.1038/s41586-019-1418-6, 2019. a
Irrgang, C., Boers, N., Sonnewald, M., Barnes, E. A., Kadow, C., Staneva, J.,
and Saynisch-Wagner, J.: Towards neural Earth system modelling by integrating
artificial intelligence in Earth system science, Nat. Mach. Intell., 3,
667–674, https://doi.org/10.1038/s42256-021-00374-3, 2021. a
Janssens-Maenhout, G., Crippa, M., Guizzardi, D., Dentener, F., Muntean, M., Pouliot, G., Keating, T., Zhang, Q., Kurokawa, J., Wankmüller, R., Denier van der Gon, H., Kuenen, J. J. P., Klimont, Z., Frost, G., Darras, S., Koffi, B., and Li, M.: HTAP_v2.2: a mosaic of regional and global emission grid maps for 2008 and 2010 to study hemispheric transport of air pollution, Atmos. Chem. Phys., 15, 11411–11432, https://doi.org/10.5194/acp-15-11411-2015, 2015. a
Keller, C. A. and Evans, M. J.: Application of random forest regression to the calculation of gas-phase chemistry within the GEOS-Chem chemistry model v10, Geosci. Model Dev., 12, 1209–1225, https://doi.org/10.5194/gmd-12-1209-2019, 2019. a
Keller, C. A., Evans, M. J., Kutz, J. N., and Pawson, S.: Machine learning and
air quality modeling, in: Proceedings of the 2017 IEEE International
Conference on Big Data (Big Data), IEEE, Boston, MA, USA, 4570–4576,
https://doi.org/10.1109/BigData.2017.8258500, 2017. a
Kleinert, F., Leufen, L. H., and Schultz, M. G.: IntelliO3-ts v1.0: a neural network approach to predict near-surface ozone concentrations in Germany, Geosci. Model Dev., 14, 1–25, https://doi.org/10.5194/gmd-14-1-2021, 2021. a
Krause, D.: JUWELS: Modular Tier-0/1 Supercomputer at Jülich
Supercomputing Centre, Journal of large-scale research facilities (JLSRF), 5,
1–8, https://doi.org/10.17815/jlsrf-5-171, 2019. a
Krotkov, N. A., McLinden, C. A., Li, C., Lamsal, L. N., Celarier, E. A., Marchenko, S. V., Swartz, W. H., Bucsela, E. J., Joiner, J., Duncan, B. N., Boersma, K. F., Veefkind, J. P., Levelt, P. F., Fioletov, V. E., Dickerson, R. R., He, H., Lu, Z., and Streets, D. G.: Aura OMI observations of regional SO2 and NO2 pollution changes from 2005 to 2015, Atmos. Chem. Phys., 16, 4605–4629, https://doi.org/10.5194/acp-16-4605-2016, 2016. a
Lary, D. J., Faruque, F. S., Malakar, N., Moore, A., Roscoe, B., Adams, Z. L.,
and Eggelston, Y.: Estimating the global abundance of ground level presence
of particulate matter (PM2.5), Geospatial Health, 8, S611–S630,
https://doi.org/10.4081/gh.2014.292, 2014. a
Lee, K., Lee, H., Lee, K., and Shin, J.: Training confidence-calibrated
classifiers for detecting out-of-distribution samples, arXiv [preprint],
arXiv:1711.09325, 2017. a
Li, J., Siwabessy, J., Huang, Z., and Nichol, S.: Developing an Optimal Spatial
Predictive Model for Seabed Sand Content Using Machine Learning,
Geostatistics, and Their Hybrid Methods, Geosciences, 9, 4,
https://doi.org/10.3390/geosciences9040180, 2019. a
Lundberg, S. M. and Lee, S.-I.: A Unified Approach to Interpreting Model
Predictions, in: Advances in Neural Information Processing Systems 30
(NeurIPS 2017 proceedings), edited by: Guyon, I., Luxburg, U. V., Bengio, S.,
Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R., 4765–4774,
Long Beach, CA, USA,
http://papers.nips.cc/paper/7062-a-unified-approach-to-interpreting-model-predictions.pdf (last access: 31 May 2022),
2017. a, b, c
Lundberg, S. M., Erion, G. G., and Lee, S.-I.: Consistent individualized
feature attribution for tree ensembles, arXiv [preprint],
arXiv:1802.03888, 2018. a
Lundberg, S. M., Erion, G., Chen, H., DeGrave, A., Prutkin, J. M., Nair, B.,
Katz, R., Himmelfarb, J., Bansal, N., and Lee, S.-I.: From local explanations
to global understanding with explainable AI for trees, Nature machine
intelligence, 2, 56–67, https://doi.org/10.1038/s42256-019-0138-9, 2020. a, b
Mattson, M. D. and Godfrey, P. J.: Identification of road salt contamination
using multiple regression and GIS, Environ. Manage., 18, 767–773,
https://doi.org/10.1007/BF02394639, 1994. a
Meyer, H.: Machine learning as a tool to “map the world”? On remote
sensing and predictive modelling for environmental monitoring, 17th
Biodiversity Exploratories Assembly, Wernigerode, Germany [keynote], 4 March
2020. a
Meyer, H. and Pebesma, E.: Predicting into unknown space? Estimating the area
of applicability of spatial prediction models, Methods Ecol. Evol., 12,
1620–1633, https://doi.org/10.1111/2041-210X.13650, 2021. a, b, c, d
Meyer, H., Reudenbach, C., Hengl, T., Katurji, M., and Nauss, T.: Improving
performance of spatio-temporal machine learning models using forward feature
selection and target-oriented validation, Environ. Modell. Softw., 101, 1–9,
https://doi.org/10.1016/j.envsoft.2017.12.001, 2018. a, b, c, d
Mills, G., Pleijel, H., Malley, C. S., Sinha, B., Cooper, O. R., Schultz,
M. G., Neufeld, H. S., Simpson, D., Sharps, K., Feng, Z., Gerosa, G.,
Harmens, H., Kobayashi, K., Saxena, P., Paoletti, E., Sinha, V., and Xu, X.:
Tropospheric Ozone Assessment Report: Present-day tropospheric ozone
distribution and trends relevant to vegetation, Elem. Sci. Anth., 6, 47,
https://doi.org/10.1525/elementa.302, 2018. a, b, c
Monks, P. S., Archibald, A. T., Colette, A., Cooper, O., Coyle, M., Derwent, R., Fowler, D., Granier, C., Law, K. S., Mills, G. E., Stevenson, D. S., Tarasova, O., Thouret, V., von Schneidemesser, E., Sommariva, R., Wild, O., and Williams, M. L.: Tropospheric ozone and its precursors from the urban to the global scale from air quality to short-lived climate forcer, Atmos. Chem. Phys., 15, 8889–8973, https://doi.org/10.5194/acp-15-8889-2015, 2015. a, b
Nussbaum, M., Spiess, K., Baltensweiler, A., Grob, U., Keller, A., Greiner, L., Schaepman, M. E., and Papritz, A.: Evaluation of digital soil mapping approaches with large sets of environmental covariates, SOIL, 4, 1–22, https://doi.org/10.5194/soil-4-1-2018, 2018. a
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel,
O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J.,
Passos, A., Cournapeau, D., Brucher, M., Perrot, M., and Duchesnay, E.:
Scikit-learn: Machine Learning in Python, J. Mach. Learn. Res., 12,
2825–2830,
2011. a
Petermann, E., Meyer, H., Nussbaum, M., and Bossew, P.: Mapping the geogenic
radon potential for Germany by machine learning, Sci. Total Environ., 754,
142291, https://doi.org/10.1016/j.scitotenv.2020.142291, 2021. a, b, c
Ploton, P., Mortier, F., Réjou-Méchain, M., Barbier, N., Picard, N.,
Rossi, V., Dormann, C., Cornu, G., Viennois, G., Bayol, N., Lyapustin, A., Gourlet-Fleury, S., and Pélissier, R.: Spatial
validation reveals poor predictive performance of large-scale ecological
mapping models, Nat. Commun., 11, 1–11, https://doi.org/10.1038/s41467-020-18321-y,
2020. a, b
Ren, X., Mi, Z., and Georgopoulos, P. G.: Comparison of Machine Learning and
Land Use Regression for fine scale spatiotemporal estimation of ambient air
pollution: Modeling ozone concentrations across the contiguous United States,
Environ. Int., 142, 105827, https://doi.org/10.1016/j.envint.2020.105827, 2020. a, b
Roscher, R., Bohn, B., Duarte, M. F., and Garcke, J.: Explainable Machine
Learning for Scientific Insights and Discoveries, IEEE Access, 8,
42200–42216, https://doi.org/10.1109/ACCESS.2020.2976199, 2020. a
Sayeed, A., Choi, Y., Eslami, E., Jung, J., Lops, Y., Salman, A. K., Lee,
J.-B., Park, H.-J., and Choi, M.-H.: A novel CMAQ-CNN hybrid model to
forecast hourly surface-ozone concentrations 14 days in advance, Sci. Rep.,
11, 1–8, https://doi.org/10.1038/s41598-021-90446-6, 2021. a
Schmitz, S., Towers, S., Villena, G., Caseiro, A., Wegener, R., Klemp, D., Langer, I., Meier, F., and von Schneidemesser, E.: Unravelling a black box: an open-source methodology for the field calibration of small air quality sensors, Atmos. Meas. Tech., 14, 7221–7241, https://doi.org/10.5194/amt-14-7221-2021, 2021.
Schultz, M. G., Akimoto, H., Bottenheim, J., Buchmann, B., Galbally, I. E.,
Gilge, S., Helmig, D., Koide, H., Lewis, A. C., Novelli, P. C.,
Plass-Dülmer, C., Ryerson, T. B., Steinbacher, M., Steinbrecher, R., Tarasova, O.,
Tørseth, K., Thouret, V., and Zellweger, C.: The
Global Atmosphere Watch reactive gases measurement network, Elem. Sci. Anth.,
3, 000067, https://doi.org/10.12952/journal.elementa.000067, 2015.
Schultz, M. G., Schröder, S., Lyapina, O., Cooper, O., Galbally, I.,
Petropavlovskikh, I., von Schneidemesser, E., Tanimoto, H., Elshorbany, Y.,
Naja, M., Seguel, R., Dauert, U., Eckhardt, P., Feigenspahn, S., Fiebig, M.,
Hjellbrekke, A.-G., Hong, Y.-D., Christian Kjeld, P., Koide, H., Lear, G.,
Tarasick, D., Ueno, M., Wallasch, M., Baumgardner, D., Chuang, M.-T.,
Gillett, R., Lee, M., Molloy, S., Moolla, R., Wang, T., Sharps, K., Adame,
J. A., Ancellet, G., Apadula, F., Artaxo, P., Barlasina, M., Bogucka, M.,
Bonasoni, P., Chang, L., Colomb, A., Cuevas, E., Cupeiro, M., Degorska, A.,
Ding, A., Fröhlich, M., Frolova, M., Gadhavi, H., Gheusi, F., Gilge, S.,
Gonzalez, M. Y., Gros, V., Hamad, S. H., Helmig, D., Henriques, D.,
Hermansen, O., Holla, R., Huber, J., Im, U., Jaffe, D. A., Komala, N.,
Kubistin, D., Lam, K.-S., Laurila, T., Lee, H., Levy, I., Mazzoleni, C.,
Mazzoleni, L., McClure-Begley, A., Mohamad, M., Murovic, M., Navarro-Comas,
M., Nicodim, F., Parrish, D., Read, K. A., Reid, N., Ries, L., Saxena, P.,
Schwab, J. J., Scorgie, Y., Senik, I., Simmonds, P., Sinha, V., Skorokhod,
A., Spain, G., Spangl, W., Spoor, R., Springston, S. R., Steer, K.,
Steinbacher, M., Suharguniyawan, E., Torre, P., Trickl, T., Weili, L.,
Weller, R., Xu, X., Xue, L., and Zhiqiang, M.: Tropospheric Ozone Assessment
Report: Database and Metrics Data of Global Surface Ozone Observations, Elem.
Sci. Anth., 5, 58, https://doi.org/10.1525/elementa.244, 2017.
Shapley, L.: A Value for n-Person Games, vol. II of Contributions to the
Theory of Games, Princeton University Press,
Princeton, NJ, USA, chap. 17, 307–318, https://doi.org/10.1515/9781400881970-018, 1953.
Sofen, E. D., Bowdalo, D., and Evans, M. J.: How to most effectively expand the global surface ozone observing network, Atmos. Chem. Phys., 16, 1445–1457, https://doi.org/10.5194/acp-16-1445-2016, 2016.
Stadtler, S., Betancourt, C., and Roscher, R.: Explainable Machine Learning
Reveals Capabilities, Redundancy, and Limitations of a Geospatial Air Quality
Benchmark Dataset, Machine Learning and Knowledge Extraction, 4, 150–171,
https://doi.org/10.3390/make4010008, 2022.
Wallace, J. and Hobbs, P.: Atmospheric Science: An Introductory Survey, vol. 92
of International Geophysics Series, Elsevier Academic Press,
Burlington, MA, USA, 2nd Edn., https://doi.org/10.1016/C2009-0-00034-8, 2006.
Wang, S., Ma, Y., Wang, Z., Wang, L., Chi, X., Ding, A., Yao, M., Li, Y., Li, Q., Wu, M., Zhang, L., Xiao, Y., and Zhang, Y.: Mobile monitoring of urban air quality at high spatial resolution by low-cost sensors: impacts of COVID-19 pandemic lockdown, Atmos. Chem. Phys., 21, 7199–7215, https://doi.org/10.5194/acp-21-7199-2021, 2021.
Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J., Appleton, G., Axton, M.,
Baak, A., Blomberg, N., Boiten, J.-W., da Silva Santos, L. B., Bourne, P. E.,
Bouwman, J., Brookes, A. J., Clark, T., Crosas, M., Dillo, I., Dumon, O.,
Edmunds, S., Evelo, C. T., Finkers, R., Gonzalez-Beltran, A., Gray, A. J.,
Groth, P., Goble, C., Grethe, J. S., Heringa, J., ’t Hoen, P. A., Hooft,
R., Kuhn, T., Kok, R., Kok, J., Lusher, S. J., Martone, M. E., Mons, A.,
Packer, A. L., Persson, B., Rocca-Serra, P., Roos, M., van Schaik, R.,
Sansone, S.-A., Schultes, E., Sengstag, T., Slater, T., Strawn, G., Swertz,
M. A., Thompson, M., van der Lei, J., van Mulligen, E., Velterop, J.,
Waagmeester, A., Wittenburg, P., Wolstencroft, K., Zhao, J., and Mons, B.:
The FAIR Guiding Principles for scientific data management and stewardship,
Sci. Data, 3, 160018, https://doi.org/10.1038/sdata.2016.18, 2016.
Young, P. J., Naik, V., Fiore, A. M., Gaudel, A., Guo, J., Lin, M. Y., Neu,
J. L., Parrish, D. D., Rieder, H. E., Schnell, J. L., Tilmes, S., Wild, O.,
Zhang, L., Ziemke, J. R., Brandt, J., Delcloo, A., Doherty, R. M., Geels, C.,
Hegglin, M. I., Hu, L., Im, U., Kumar, R., Luhar, A., Murray, L., Plummer,
D., Rodriguez, J., Saiz-Lopez, A., Schultz, M. G., Woodhouse, M. T., and
Zeng, G.: Tropospheric Ozone Assessment Report: Assessment of global-scale
model performance for global and regional ozone distributions, variability,
and trends, Elem. Sci. Anth., 6, 10, https://doi.org/10.1525/elementa.265, 2018.
Short summary
Ozone is a toxic greenhouse gas with high spatial variability. We present a machine-learning-based ozone-mapping workflow generating a transparent and reliable product. Going beyond standard mapping methods, this work combines explainable machine learning with uncertainty assessment to increase the integrity of the produced map.