Autoencoder-based feature extraction for the automatic detection of snow avalanches in seismic data

Simeon, Andri; Pérez-Guillén, Cristina; Volpi, Michele; Seupel, Christine; van Herwijnen, Alec

doi:https://doi.org/10.5194/gmd-2024-76

Preprints

Submitted as: development and technical paper

27 May 2024

Status: a revised version of this preprint is currently under review for the journal GMD.

Abstract. Monitoring snow avalanche activity is essential for operational avalanche forecasting and the successful implementation of mitigation measures to ensure safety in mountain regions. To facilitate and automate the monitoring process, avalanche detection systems equipped with seismic sensors can provide a cost-effective solution. Still, automatically differentiating avalanche signals from other sources in seismic data remains challenging, mainly due to the complexity of seismic signals generated by avalanches, the complex signal transmission through the ground, the relatively rare occurrence of avalanches, and the presence of multiple sources in the continuous seismic data. One approach to automating avalanche detection is to apply machine learning methods. So far, research in this area has mainly focused on extracting standard domain-specific signal attributes in the time and frequency domains as input features for statistical models. In this study, we propose a novel application of deep learning autoencoder models for the automatic and unsupervised extraction of features from seismic recordings. These new features are then fed into classifiers for discriminating snow avalanches. To this end, we trained three random forest classifiers based on different feature extraction approaches. The first set of 32 features was automatically extracted from the time-series signals by an autoencoder consisting of convolutional layers and a recurrent long short-term memory unit. The second autoencoder applies a series of fully connected layers to extract 16 features from the spectrum of the signals. As a benchmark, a third random forest was trained with typical waveform, spectral and spectrogram attributes used to discriminate seismic events. We extracted all these features from 10-second windows of the seismograms recorded with an array of five seismometers installed in an avalanche test site located above Davos, Switzerland. The database used to train and test the models contained 84 avalanches and 828 noise events (unrelated to avalanches) recorded during the winter seasons of 2020–2021 and 2021–2022. Finally, we assessed the performance of each classifier, compared the results, and proposed different aggregation methods to improve the predictive performance of the developed seismic detection algorithms. The classifiers achieved an avalanche F1-score of 0.61 (seismic attributes), 0.49 (temporal autoencoder) and 0.60 (spectral autoencoder) and avalanche recall of 0.68, 0.71 and 0.71, respectively. Overall, the macro F1-score ranged from 0.70 (temporal autoencoder) to 0.78 (seismic attributes). After applying a post-processing step to event-based predictions, the avalanche recall of the three models significantly increased, reaching values between 0.82 and 0.91. The developed approach could potentially be used as an operational, near-real-time avalanche detection system. Yet, given the relatively high number of false alarms, the current automated seismic classification algorithms need further development before they can be used as stand-alone methods to detect avalanches effectively.
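
As a rough consistency check on the scores above: because the F1-score is the harmonic mean of precision and recall, the avalanche precision (which the abstract does not state) can be back-calculated from each reported F1-score and recall. The Python sketch below does this. The function name `implied_precision` is illustrative, and since the inputs are rounded to two decimals, the results are only approximate.

```python
def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2 * P * R / (P + R) for the precision P."""
    denom = 2.0 * recall - f1
    if denom <= 0:
        raise ValueError("F1 must be smaller than 2 * recall")
    return f1 * recall / denom


# Avalanche-class (F1-score, recall) pairs as reported in the abstract.
reported = {
    "seismic attributes": (0.61, 0.68),
    "temporal autoencoder": (0.49, 0.71),
    "spectral autoencoder": (0.60, 0.71),
}

# Precision is NOT given in the abstract; it is inferred from rounded values.
precision = {name: implied_precision(f1, r) for name, (f1, r) in reported.items()}

for name, p in precision.items():
    print(f"{name}: implied avalanche precision ~ {p:.2f}")
```

Under these assumptions the implied precision is about 0.55 (seismic attributes), 0.37 (temporal autoencoder) and 0.52 (spectral autoencoder), so roughly 45 % to 63 % of avalanche predictions would be false alarms, in line with the abstract's closing remark.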

Received: 12 Apr 2024 – Discussion started: 27 May 2024

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Status: final response (author comments only)

CEC1: 'Comment on gmd-2024-76', Juan Antonio Añel, 15 Jun 2024

Dear authors,

Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".

https://www.geoscientific-model-development.net/policies/code_and_data_policy.html
The problem is that you have not published the code and data necessary to replicate your manuscript. Our policy clearly states that all the code and data used in a manuscript must be published at the submission time in one of the acceptable repositories listed in our policy, and that the Code and Data Availability section must contain the details (links and DOIs) for such repositories. Instead this section in your manuscript reads "The code and data to develop the final models used in this study will be made available on GitLab and EnviDat"
You have provided internally an internet address containing part of these assets (not all of them, according to my understanding). This is not enough. First, the WSL server is not a repository that complies with the standards required for scientific publication; second, all the information must be available to every potential reader in Discussions to facilitate the peer-review and comments by the community, and sharing it privately with the editors fails to comply with the Discussions peer-review process.
Therefore, please publish your code in one of the appropriate repositories and reply to this comment with the relevant information (link and DOI) as soon as possible, as we cannot accept manuscripts in Discussions that do not comply with our policy. The current situation with your manuscript is therefore irregular.
If you do not fix this problem, we will have to reject your manuscript for publication in our journal.
Also, you must include in any revised version of your manuscript the modified 'Code and Data Availability' section, containing the links and DOI of the repository containing code and data.
Finally, in the current git that you have provided for the assets, there is no license listed. If you do not include a license, the code is not "free software/open-source"; it continues to be your property and nobody can use it, even though you have made it public. Therefore, when uploading the code and data to the new repository, you should add a license. You may want to choose a free software/open-source (FLOSS) license. We recommend the GPLv3. You only need to include the file 'https://www.gnu.org/licenses/gpl-3.0.txt' as LICENSE.txt with your code. Also, you can choose other options that acceptable repositories provide: GPLv2, Apache License, MIT License, etc.
Juan A. Añel

Geosci. Model Dev. Executive Editor

Citation: https://doi.org/10.5194/gmd-2024-76-CEC1
- AC1: 'Reply on CEC1', Andri Simeon, 21 Jun 2024
  
  Dear Juan A. Añel
  
  We apologise for the misunderstanding and any inconvenience it might have caused.
  In the meantime, we have added the code and data used in the manuscript to Zenodo. As suggested, we also included a license text.
  DOI: 10.5281/zenodo.12162570
  Link: https://zenodo.org/records/12162570
  The GitLab repository is found at: https://gitlabext.wsl.ch/simeonan/code-egu-paper
  
  Sincerely,
  Andri Simeon
  
  Citation: https://doi.org/10.5194/gmd-2024-76-AC1
RC1: 'Comment on gmd-2024-76', Anonymous Referee #1, 15 Jul 2024

The authors have applied deep learning autoencoder models for the automatic and unsupervised extraction of features from seismic records. These extracted features were then used in classifiers to identify snow avalanches. This study presents a novel and relevant approach to enhance machine learning predictions, which could be useful not only for identifying snow avalanches but also for detecting other types of natural events. The overall methodology is well-defined, and the manuscript is well-written and easy to follow. I recommend the publication of this manuscript after the following issues are addressed.

Major:
1) Given that the models tend to miss the onset of an avalanche, the authors should have included a scenario where only verified avalanches were used, excluding non-verified ones during the training of the autoencoders. While I am not suggesting that this must be incorporated in the revised version, as this conclusion emerged only after the study was completed, it is still worth mentioning.
2) How were the machine learning algorithms implemented, including details such as programming languages and libraries used?

Minor:
Line 218: “The best model from the cross-validation procedure (Table F2) was composed of convolutions with kernel size 20 (or 0.1 s) and stride 10.” Was the MSE (Mean Squared Error) the primary metric used for classification?
Line 228: “As an activation function, we use the leaky rectified linear unit (leaky ReLU; (Xu et al., 2015))”. Was the activation function unchanged during the hyperparameter optimization process?
Line 318: Please replace “This for” by “For this”
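
The equivalence quoted in the Line 218 comment (a kernel of 20 samples spanning 0.1 s) implies a sampling rate of 200 Hz, so each 10-second window mentioned in the abstract would hold 2000 samples. A minimal Python sketch of this arithmetic follows. The output-length formula assumes a single unpadded, undilated 1D convolution, which the text does not confirm, and the helper name `conv1d_output_length` is illustrative.

```python
def conv1d_output_length(n_in: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Output length of a 1D convolution without dilation."""
    return (n_in + 2 * padding - kernel) // stride + 1


kernel_samples = 20    # kernel size from the quoted manuscript line
kernel_seconds = 0.1   # its stated duration
stride_samples = 10    # stride from the quoted manuscript line
window_seconds = 10    # window length from the abstract

# Implied sampling rate and window size in samples.
sampling_rate_hz = round(kernel_samples / kernel_seconds)
window_samples = window_seconds * sampling_rate_hz

# Time step between consecutive kernel positions.
stride_seconds = stride_samples / sampling_rate_hz

# Assumption: one convolution layer applied without padding.
n_out = conv1d_output_length(window_samples, kernel_samples, stride_samples)

print(sampling_rate_hz, window_samples, stride_seconds, n_out)
```

With these assumed settings, a stride of 10 samples corresponds to 0.05 s, and one such layer would map a 2000-sample window to 199 time steps.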

Citation: https://doi.org/10.5194/gmd-2024-76-RC1
- AC2: 'Reply on RC1', Andri Simeon, 28 Aug 2024
  
  We would like to express our gratitude to the reviewer for taking the time to evaluate and provide valuable feedback on our manuscript. The responses to the individual comments are found in the attached PDF.
  
  Citation: https://doi.org/10.5194/gmd-2024-76-AC2
RC2: 'Comment on gmd-2024-76', Anonymous Referee #2, 27 Jul 2024

The paper deals with snow avalanche detection using autoencoded seismic data. It uses one study site in Davos, where events were picked and the performance was tested. The article is well written and of interest for publication. However, the following should be addressed.
Discuss the effects of various avalanche types in relation to models and autoencoders. Explain how the findings can be applied to other study sites and what the specific considerations are in this context. This could also be highlighted in the comparison with the former studies.
The conclusions and the further use should be clearer.
Please avoid repetitions throughout the article.

Citation: https://doi.org/10.5194/gmd-2024-76-RC2
- AC3: 'Reply on RC2', Andri Simeon, 28 Aug 2024
  
  We are thankful to the reviewer for the valuable suggestions on how to improve our manuscript. We appreciate your evaluation of our work, and we will revise the paper following the suggestions. Please find the responses to the individual comments in the attached PDF.
  
  Citation: https://doi.org/10.5194/gmd-2024-76-AC3

Model code and software

Autoencoder-based feature extraction for the automatic detection of snow avalanches in seismic data, Andri Simeon, https://gitlabext.wsl.ch/simeonan/code-egu-paper

Viewed

Total article views: 879 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
546	290	43	879	49	61

Views and downloads (calculated since 27 May 2024)

Month	HTML	PDF	XML	Total
May 2024	50	16	2	68
Jun 2024	125	25	13	163
Jul 2024	78	24	8	110
Aug 2024	65	22	6	93
Sep 2024	29	11	0	40
Oct 2024	17	6	0	23
Nov 2024	36	5	2	43
Dec 2024	15	11	0	26
Jan 2025	24	18	5	47
Feb 2025	15	6	0	21
Mar 2025	16	14	1	31
Apr 2025	21	19	1	41
May 2025	13	5	0	18
Jun 2025	26	58	4	88
Jul 2025	14	48	1	63
Aug 2025	2	2	0	4

Viewed (geographical distribution)

Total article views: 876 (including HTML, PDF, and XML). Thereof 876 with geography defined and 0 with unknown origin.

Latest update: 07 Aug 2025

Short summary

Avalanche seismic detection systems are key for forecasting, but distinguishing avalanches from other seismic sources remains challenging. We propose novel autoencoder models to automatically extract features and compare them with standard seismic attributes. These features are then used to classify avalanches and noise events. The autoencoder feature classifiers have the highest sensitivity to detect avalanches, while the standard seismic classifier performs better overall.

