Submitted as: model experiment description paper
11 May 2022
Status: this preprint is currently under review for the journal GMD.

Representing chemical history in ozone time-series predictions – a model experiment study building on the MLAir (v1.5) deep learning framework

Felix Kleinert1,2, Lukas Hubert Leufen1,2, Aurelia Lupascu3, Tim Butler3, and Martin G. Schultz1
  • 1Forschungszentrum Jülich GmbH, Jülich Supercomputing Centre (JSC), Jülich, Germany
  • 2Institute of Geosciences, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
  • 3Institute for Advanced Sustainability Studies, Potsdam, Germany

Abstract. Tropospheric ozone is a secondary air pollutant that is harmful to living beings and crops. Predicting ozone concentrations at specific locations is thus important for initiating protection measures such as emission reductions or warnings to the population. Ozone levels at a specific location result from emission and sink processes, mixing, and chemical transformation along an air parcel's trajectory. Current ozone forecasting systems generally rely on computationally expensive chemistry transport models (CTMs). Recently, however, several studies have demonstrated the potential of deep learning for this task. While a few of these studies trained their models on gridded model data, most efforts focus on forecasting time series from individual measurement locations. In this study, we present a hybrid approach which is based on time series forecasting (up to four days) but uses spatially aggregated meteorological and chemical data from upstream wind sectors to represent some aspects of the chemical history of air parcels arriving at the measurement location. To demonstrate the value of this additional information, we extracted pseudo-observation data for Germany from a CTM, which avoids extra complications with irregularly spaced and missing data. However, our method can be extended so that it can be applied to observational time series. Using one upstream sector alone improves the forecasts by 10 % during all four days. Using three sectors improves the mean squared error (MSE) skill score by 14 % during the first two days of the prediction, but this improvement depends on the upstream wind direction. Our method shows its best performance in the northern half of Germany for the first two prediction days.
Based on the data's seasonality and simulation period, we shed some light on our models' open challenges with i) spatial structures in terms of decreasing skill scores from the northern German plain to the mountainous south and ii) concept drifts related to an unusually cold winter season. Here we expect that the inclusion of explainable artificial intelligence methods could reveal additional insights in future versions of our model.
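
The abstract reports improvements in terms of an MSE skill score without giving its formula. A minimal sketch, assuming the common definition 1 - MSE(forecast) / MSE(reference), is shown below. The function names and the ozone values are hypothetical and purely illustrative; the reference forecast used in the study is not specified in this section.

```python
def mse(obs, pred):
    """Mean squared error between two equally long, non-empty sequences."""
    if len(obs) != len(pred) or not obs:
        raise ValueError("obs and pred must be non-empty and equally long")
    return sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)


def mse_skill_score(obs, forecast, reference):
    """Skill score 1 - MSE(forecast) / MSE(reference).

    Positive values mean the forecast beats the reference,
    0 means no improvement, and 1 is a perfect forecast.
    (Assumed textbook definition, not taken from the paper.)
    """
    mse_ref = mse(obs, reference)
    if mse_ref == 0:
        raise ValueError("reference is perfect; skill score undefined")
    return 1.0 - mse(obs, forecast) / mse_ref


# Made-up ozone values (ppb), not data from the study.
obs = [40.0, 42.0, 38.0, 45.0]
reference = [38.0, 44.0, 40.0, 43.0]  # every value off by 2 -> MSE = 4
forecast = [39.0, 43.0, 39.0, 44.0]   # every value off by 1 -> MSE = 1
score = mse_skill_score(obs, forecast, reference)  # 1 - 1/4 = 0.75
```

Under this definition, a score of 0.75 means the forecast removes three quarters of the squared error of the reference.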

Status: open (until 06 Jul 2022)

Short summary
We examine the effects of spatially aggregated upstream information as input for a deep learning model forecasting near-surface ozone levels. Using aggregated data from one upstream sector (45°) improves the forecast by ~10 % on all four prediction days. Three upstream sectors improve the forecasts by ~14 % on the first two days only. Our results serve as an orientation for other researchers or environmental agencies focusing on pointwise time-series predictions, for example for regulatory purposes.
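
The idea of aggregating data over a 45° upstream sector can be illustrated with a small sketch. It is not the authors' implementation: the planar geometry, the unweighted mean, and all names (`bearing_deg`, `in_upstream_sector`, `upstream_mean`) are assumptions made for illustration. It relies only on the meteorological convention that wind direction denotes the direction the wind blows from, so the upstream sector is centred on that direction.

```python
import math


def bearing_deg(dx, dy):
    """Compass bearing (0 = north, 90 = east) of an offset (dx east, dy north)."""
    return math.degrees(math.atan2(dx, dy)) % 360.0


def in_upstream_sector(dx, dy, wind_from_deg, width_deg=45.0):
    """True if the offset lies within the sector centred on the upwind direction."""
    # Signed angular difference mapped into [-180, 180)
    diff = (bearing_deg(dx, dy) - wind_from_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= width_deg / 2.0


def upstream_mean(cells, wind_from_deg, width_deg=45.0):
    """Unweighted mean over cells (dx, dy, value) inside the upstream sector.

    Offsets are relative to the station; returns None if no cell qualifies.
    """
    values = [
        v for dx, dy, v in cells
        if (dx != 0 or dy != 0)
        and in_upstream_sector(dx, dy, wind_from_deg, width_deg)
    ]
    return sum(values) / len(values) if values else None


# Hypothetical grid cells around a station: (km east, km north, ozone in ppb)
cells = [
    (-10.0, 0.0, 50.0),   # due west: upstream for a westerly wind
    (10.0, 0.0, 30.0),    # due east: downstream
    (-10.0, 3.0, 54.0),   # about 17 degrees off the wind axis: inside the sector
    (-10.0, 10.0, 99.0),  # 45 degrees off the wind axis: outside the sector
]
west_wind_mean = upstream_mean(cells, wind_from_deg=270.0)  # mean of 50 and 54
```

A real setup would also need a distance range for the sector and a choice of which wind field defines the upstream direction; neither is specified in this summary.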