Articles | Volume 18, issue 23
https://doi.org/10.5194/gmd-18-9417-2025
Development and technical paper | 03 Dec 2025

Data clustering to optimise the representativity of observational data in air quality data assimilation: a case study with EURAD-IM (version 5.9.1 DA)

Alexander Hermanns, Anne Caroline Lange, Julia Kowalski, Hendrik Fuchs, and Philipp Franke

Short summary
For air quality analyses, data assimilation models split available data into assimilation and validation data sets. The former is used to generate the analysis, the latter to verify the simulations. A preprocessor classifying the observations by the data characteristics is developed based on clustering algorithms. The assimilation and validation data sets are compiled by equally allocating data of each cluster. The resulting improvement of the analysis is evaluated with an air quality model.
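The allocation step described in the summary can be illustrated with a minimal sketch. It assumes each observation site already carries a cluster label; the function name `split_by_cluster`, the example labels, and the rule that an odd-sized cluster gives its extra site to the assimilation set are all hypothetical choices made for illustration, not the preprocessor's actual implementation. The clustering itself is not shown, since the summary does not name a specific algorithm.

```python
import random
from collections import defaultdict


def split_by_cluster(labels, seed=0):
    """Split site indices into assimilation and validation sets, cluster by cluster.

    labels[i] is the cluster label of site i (hypothetical input; how the
    labels are produced is outside the scope of this sketch).
    """
    rng = random.Random(seed)

    # Group site indices by their cluster label.
    members = defaultdict(list)
    for idx, label in enumerate(labels):
        members[label].append(idx)

    assimilation, validation = [], []
    for label in sorted(members):
        sites = members[label]
        rng.shuffle(sites)  # random choice of sites within the cluster
        # Assumption: an odd-sized cluster gives its extra site to assimilation.
        half = (len(sites) + 1) // 2
        assimilation.extend(sites[:half])
        validation.extend(sites[half:])
    return sorted(assimilation), sorted(validation)


# Illustrative labels only; real clusters would come from the observation data.
labels = ["urban", "rural", "urban", "traffic", "rural", "urban", "traffic", "rural"]
assim, valid = split_by_cluster(labels)
print("assimilation sites:", assim)
print("validation sites:", valid)
```

Splitting within each cluster, rather than over the pooled sites, keeps every cluster represented in both sets, which is the property the summary refers to as equal allocation.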