Articles | Volume 18, issue 9
https://doi.org/10.5194/gmd-18-2701-2025
Methods for assessment of models | 15 May 2025

Similarity-based analysis of atmospheric organic compounds for machine learning applications

Hilda Sandström and Patrick Rinke

Data sets

Similarity-Based Analysis of Atmospheric Organic Compounds for Machine Learning Applications, Hilda Sandström, https://doi.org/10.5281/zenodo.14671496

Model code and software

Atmospheric Compound Similarity Analysis, Hilda Sandström, https://zenodo.org/records/14224079

atmospheric_compound_similarity_analysis, hilsan, https://github.com/hilsan/atmospheric_compound_similarity_analysis

Short summary
Machine learning has the potential to aid the identification of organic molecules involved in aerosol formation. Yet progress is stalled by a lack of curated atmospheric molecular datasets. Here, we used similarity algorithms to compare atmospheric compounds with large molecular datasets used in machine learning and found minimal overlap. Our result underlines the need for collaborative efforts to curate atmospheric molecular data to facilitate machine learning models in atmospheric sciences.