Articles | Volume 19, issue 12
https://doi.org/10.5194/gmd-19-5765-2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
TOAR-classifier v2: a data-driven classification tool for global air quality stations
Ramiyou Karim Mache
CORRESPONDING AUTHOR
independent researcher
formerly at: Jülich Supercomputing Centre, Forschungszentrum Jülich, 52425 Jülich, Germany
Sabine Schröder
Jülich Supercomputing Centre, Forschungszentrum Jülich, 52425 Jülich, Germany
Michael Langguth
independent researcher
formerly at: Jülich Supercomputing Centre, Forschungszentrum Jülich, 52425 Jülich, Germany
Ankit Patnala
Jülich Supercomputing Centre, Forschungszentrum Jülich, 52425 Jülich, Germany
Martin G. Schultz
Jülich Supercomputing Centre, Forschungszentrum Jülich, 52425 Jülich, Germany
Department of Mathematics and Computer Science, University of Cologne, Cologne, Germany
Related authors
Bing Gong, Michael Langguth, Yan Ji, Amirpasha Mozaffari, Scarlet Stadtler, Karim Mache, and Martin G. Schultz
Geosci. Model Dev., 15, 8931–8956, https://doi.org/10.5194/gmd-15-8931-2022, 2022
Short summary
Inspired by the success of deep learning in various domains, we test the applicability of video prediction methods by generative adversarial network (GAN)-based deep learning to predict the 2 m temperature over Europe. Our video prediction models have skill in predicting the diurnal cycle of 2 m temperature up to 12 h ahead. Complemented by probing the relevance of several model parameters, this study confirms the potential of deep learning in meteorological forecasting applications.
Sindhu Vasireddy, Michael Langguth, and Martin Schultz
EGUsphere, https://doi.org/10.5194/egusphere-2026-1562, 2026
This preprint is open for discussion and under review for Geoscientific Model Development (GMD).
Short summary
This study evaluates a transformer model for hourly air quality forecasting using past pollution, weather, and anthropogenic metadata (emissions, land use). It outperforms Copernicus Atmosphere Monitoring Service forecasts, especially in urban regions, with lower bias and improved stability. Trained in Germany, it transfers to South Korea with minimal adaptation, preserving geochemical relationships and showing strong cross-regional generalization.
Biplob Dey, Toke Due Sjøgren, Peeyush Khare, Georgios I. Gkatzelis, Yizhen Wu, Sindhu Vasireddy, Martin Schultz, Alexander Knohl, Riikka Rinnan, Thorsten Hohaus, and Eva Y. Pfannerstill
Biogeosciences, 23, 1423–1457, https://doi.org/10.5194/bg-23-1423-2026, 2026
Short summary
Trees release reactive gases that affect air quality and climate. We studied how these emissions from European beech and English oak change under realistic scenarios of combined and single heat and ozone stress. Heat increased emissions, while ozone reduced most of them. When stressors were combined, the effects were complex and varied by species. Machine learning identified key stress-related compounds. Our findings show that future tree stress may alter air quality and climate interactions.
Sebastian H. M. Hickman, Makoto M. Kelp, Paul T. Griffiths, Kelsey Doerksen, Kazuyuki Miyazaki, Elyse A. Pennington, Gerbrand Koren, Fernando Iglesias-Suarez, Martin G. Schultz, Kai-Lan Chang, Owen R. Cooper, Alex Archibald, Roberto Sommariva, David Carlson, Hantao Wang, J. Jason West, and Zhenze Liu
Geosci. Model Dev., 18, 8777–8800, https://doi.org/10.5194/gmd-18-8777-2025, 2025
Short summary
Machine learning is being more widely used across environmental and climate science. This work reviews the use of machine learning in tropospheric ozone research, focusing on three main application areas in which significant progress has been made. Common challenges in using machine learning across the three areas are highlighted, and future directions for the field are indicated.
Hantao Wang, Kazuyuki Miyazaki, Haitong Zhe Sun, Zhen Qu, Xiang Liu, Antje Inness, Martin Schultz, Sabine Schröder, Marc Serre, and J. Jason West
Atmos. Chem. Phys., 25, 15969–15990, https://doi.org/10.5194/acp-25-15969-2025, 2025
Short summary
We compare six datasets of global ground-level ozone, developed using geostatistical, machine learning, or reanalysis methods. The datasets show important differences from one another in ozone magnitude, greater than 5 ppb, and trends, globally and regionally. Compared with measurements, performance varies among datasets, and most overestimate ozone, particularly at lower concentrations. These differences among datasets highlight uncertainties for applications to health and other impacts.
Yugo Kanaya, Roberto Sommariva, Alfonso Saiz-Lopez, Andrea Mazzeo, Theodore K. Koenig, Kaori Kawana, James E. Johnson, Aurélie Colomb, Pierre Tulet, Suzie Molloy, Ian E. Galbally, Rainer Volkamer, Anoop Mahajan, John W. Halfacre, Paul B. Shepson, Julia Schmale, Hélène Angot, Byron Blomquist, Matthew D. Shupe, Detlev Helmig, Junsu Gil, Meehye Lee, Sean C. Coburn, Ivan Ortega, Gao Chen, James Lee, Kenneth C. Aikin, David D. Parrish, John S. Holloway, Thomas B. Ryerson, Ilana B. Pollack, Eric J. Williams, Brian M. Lerner, Andrew J. Weinheimer, Teresa Campos, Frank M. Flocke, J. Ryan Spackman, Ilann Bourgeois, Jeff Peischl, Chelsea R. Thompson, Ralf M. Staebler, Amir A. Aliabadi, Wanmin Gong, Roeland Van Malderen, Anne M. Thompson, Ryan M. Stauffer, Debra E. Kollonige, Juan Carlos Gómez Martin, Masatomo Fujiwara, Katie Read, Matthew Rowlinson, Keiichi Sato, Junichi Kurokawa, Yoko Iwamoto, Fumikazu Taketani, Hisahiro Takashima, Mónica Navarro-Comas, Marios Panagi, and Martin G. Schultz
Earth Syst. Sci. Data, 17, 4901–4932, https://doi.org/10.5194/essd-17-4901-2025, 2025
Short summary
The first comprehensive dataset of tropospheric ozone over oceans/polar regions is presented, including 77 ship/buoy and 48 aircraft campaign observations (1977–2022, 0–5000 m altitude), supplemented by ozonesonde and surface data. Air masses isolated from land for 72+ hours are systematically selected as essentially oceanic. Among the 11 global regions, they show daytime decreases of 11–16 % in the tropics, while near-zero depletions are rare, unlike in the Arctic, implying different mechanisms.
Yan Ji, Bing Gong, Michael Langguth, Amirpasha Mozaffari, and Xiefei Zhi
Geosci. Model Dev., 16, 2737–2752, https://doi.org/10.5194/gmd-16-2737-2023, 2023
Short summary
Formulating short-term precipitation forecasting as a video prediction task, a novel deep learning architecture (convolutional long short-term memory generative adversarial network, CLGAN) is proposed. A benchmark dataset is built on minute-level precipitation measurements. Results show that with the GAN component the model generates predictions sharing statistical properties with observations, resulting in it outperforming the baseline in dichotomous and spatial scores for heavy precipitation.
Adrian Rojas-Campos, Michael Langguth, Martin Wittenbrink, and Gordon Pipa
Geosci. Model Dev., 16, 1467–1480, https://doi.org/10.5194/gmd-16-1467-2023, 2023
Short summary
Our paper presents an alternative approach for generating high-resolution precipitation maps based on the nonlinear combination of the complete set of variables of the numerical weather predictions. This process combines the super-resolution task with the bias correction in a single step, generating high-resolution corrected precipitation maps with a lead time of 3 h. We used deep learning algorithms to combine the input information and increase the accuracy of the precipitation maps.
Felix Kleinert, Lukas H. Leufen, Aurelia Lupascu, Tim Butler, and Martin G. Schultz
Geosci. Model Dev., 15, 8913–8930, https://doi.org/10.5194/gmd-15-8913-2022, 2022
Short summary
We examine the effects of spatially aggregated upstream information as input for a deep learning model forecasting near-surface ozone levels. Using aggregated data from one upstream sector (45°) improves the forecast by ~ 10 % for 4 prediction days. Three upstream sectors improve the forecasts by ~ 14 % on the first 2 d only. Our results serve as an orientation for other researchers or environmental agencies focusing on pointwise time-series predictions, for example, due to regulatory purposes.
Swantje Preuschmann, Tanja Blome, Knut Görl, Fiona Köhnke, Bettina Steuri, Juliane El Zohbi, Diana Rechid, Martin Schultz, Jianing Sun, and Daniela Jacob
Adv. Sci. Res., 19, 51–71, https://doi.org/10.5194/asr-19-51-2022, 2022
Short summary
The main aspect of the paper is to obtain transferable principles for the development of digital knowledge transfer products. As such products are still unstandardised, the authors explored challenges and approaches for product developments. The authors report what they see as useful principles for developing digital knowledge transfer products, by describing the experience of developing the Net-Zero-2050 Web-Atlas and the "Bodenkohlenstoff-App".
Clara Betancourt, Timo T. Stomberg, Ann-Kathrin Edrich, Ankit Patnala, Martin G. Schultz, Ribana Roscher, Julia Kowalski, and Scarlet Stadtler
Geosci. Model Dev., 15, 4331–4354, https://doi.org/10.5194/gmd-15-4331-2022, 2022
Short summary
Ozone is a toxic greenhouse gas with high spatial variability. We present a machine-learning-based ozone-mapping workflow generating a transparent and reliable product. Going beyond standard mapping methods, this work combines explainable machine learning with uncertainty assessment to increase the integrity of the produced map.
Short summary
The TOAR-classifier model is a data-driven tool that allows for an objective classification of air quality measuring stations as urban, rural, or suburban. Such classification is important in the analysis of air pollutant trends and regional signatures. The model is employed in the second Tropospheric Ozone Assessment Report but can also be used in other research work.