Articles | Volume 18, issue 9
https://doi.org/10.5194/gmd-18-2701-2025
Methods for assessment of models | 15 May 2025

Similarity-based analysis of atmospheric organic compounds for machine learning applications

Hilda Sandström and Patrick Rinke


Short summary
Machine learning has the potential to aid the identification of organic molecules involved in aerosol formation. Yet progress is stalled by a lack of curated atmospheric molecular datasets. Here, we used similarity algorithms to compare atmospheric compounds with large molecular datasets used in machine learning and found minimal overlap. Our result underlines the need for collaborative efforts to curate atmospheric molecular data and so support machine learning models in the atmospheric sciences.