Data clustering to optimise the representativity of observational data in air quality data assimilation: a case study with EURAD-IM (version 5.9.1 DA)

Hermanns, Alexander; Lange, Anne Caroline; Kowalski, Julia; Fuchs, Hendrik; Franke, Philipp

doi:10.5194/gmd-18-9417-2025

Articles | Volume 18, issue 23

https://doi.org/10.5194/gmd-18-9417-2025

© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.

Development and technical paper | 03 Dec 2025

Data sets

KSC - Observational Data Clustering Preprocessor Alexander Hermanns https://doi.org/10.5281/zenodo.14711881

Model code and software

KSC - Observational Data Clustering Preprocessor Alexander Hermanns https://doi.org/10.5281/zenodo.14711881

Short summary

For air quality analyses, data assimilation models split the available observations into an assimilation data set and a validation data set. The former is used to generate the analysis, the latter to verify the simulations. A preprocessor based on clustering algorithms is developed that classifies the observations by their data characteristics. The assimilation and validation data sets are then compiled by allocating the data of each cluster equally to both sets. The resulting improvement of the analysis is evaluated with an air quality model.
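
The cluster-then-split idea in the summary can be illustrated with a small sketch. It is not the KSC preprocessor itself: plain k-means stands in for the unspecified "clustering algorithms", and the station names, the two features per station (a mean concentration and a diurnal amplitude) and the helper names `kmeans` and `split_by_cluster` are all hypothetical. The sketch only shows how allocating each cluster evenly keeps both data sets representative of every observation type.

```python
import random
from collections import defaultdict


def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means on tuples of floats; returns one cluster label per point.

    Stand-in for the clustering step; the actual algorithm is an assumption.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels


def split_by_cluster(station_ids, labels, seed=0):
    """Allocate the stations of every cluster evenly to the two data sets."""
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for station, label in zip(station_ids, labels):
        clusters[label].append(station)
    assimilation, validation = [], []
    for label in sorted(clusters):
        members = clusters[label]
        rng.shuffle(members)
        # Clusters with an odd size give the extra station to assimilation.
        half = (len(members) + 1) // 2
        assimilation.extend(members[:half])
        validation.extend(members[half:])
    return assimilation, validation


# Hypothetical stations: (mean concentration, diurnal amplitude), made-up values.
features = {
    "S1": (12.0, 3.0), "S2": (13.5, 2.5), "S3": (11.0, 3.5), "S4": (12.5, 2.8),
    "S5": (45.0, 20.0), "S6": (48.0, 22.0), "S7": (44.0, 19.0), "S8": (47.0, 21.0),
}
station_ids = list(features)
labels = kmeans([features[s] for s in station_ids], k=2)
assimilation, validation = split_by_cluster(station_ids, labels)
print("assimilation:", sorted(assimilation))
print("validation:  ", sorted(validation))
```

A purely random split of the same eight stations could place most of the high-concentration stations in one set. Splitting within each cluster rules that out by construction, which is the property the summary relies on.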