the Creative Commons Attribution 4.0 License.
Autoencoder-based feature extraction for the automatic detection of snow avalanches in seismic data
Abstract. Monitoring snow avalanche activity is essential for operational avalanche forecasting and the successful implementation of mitigation measures to ensure safety in mountain regions. To facilitate and automate the monitoring process, avalanche detection systems equipped with seismic sensors can provide a cost-effective solution. Still, automatically differentiating avalanche signals from other sources in seismic data remains challenging, mainly due to the complexity of seismic signals generated by avalanches, the complex signal transmission through the ground, the relatively rare occurrence of avalanches, and the presence of multiple sources in the continuous seismic data. One approach to automating avalanche detection is to apply machine learning methods. So far, research in this area has mainly focused on extracting standard domain-specific signal attributes in the time and frequency domains as input features for statistical models. In this study, we propose a novel application of deep learning autoencoder models for the automatic and unsupervised extraction of features from seismic recordings. These new features are then fed into classifiers for discriminating snow avalanches. To this end, we trained three random forest classifiers based on different feature extraction approaches. The first set of 32 features was automatically extracted from the time-series signals by an autoencoder consisting of convolutional layers and a recurrent long short-term memory unit. The second autoencoder applies a series of fully connected layers to extract 16 features from the spectrum of the signals. As a benchmark, a third random forest was trained with typical waveform, spectral and spectrogram attributes used to discriminate seismic events. We extracted all these features from 10-second windows of the seismograms recorded with an array of five seismometers installed in an avalanche test site located above Davos, Switzerland.
The database used to train and test the models contained 84 avalanches and 828 noise events (unrelated to avalanches) recorded during the winter seasons of 2020–2021 and 2021–2022. Finally, we assessed the performance of each classifier, compared the results, and proposed different aggregation methods to improve the predictive performance of the developed seismic detection algorithms. The classifiers achieved an avalanche F1-score of 0.61 (seismic attributes), 0.49 (temporal autoencoder) and 0.60 (spectral autoencoder) and an avalanche recall of 0.68, 0.71 and 0.71, respectively. Overall, the macro F1-score ranged from 0.70 (temporal autoencoder) to 0.78 (seismic attributes). After applying a post-processing step to event-based predictions, the avalanche recall of the three models significantly increased, reaching values between 0.82 and 0.91. The developed approach could potentially be used as an operational, near-real-time avalanche detection system. Yet, given the relatively high number of false alarms, the current automated seismic classification algorithms still need further development before they can be used as the sole method to detect avalanches effectively.
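The relationship between the scores reported above can be illustrated with a minimal sketch. It computes per-class precision, recall and F1-score, and the macro F1-score, from window-level labels. The labels are made-up toy values, not the study's data, and the helper names `class_scores` and `macro_f1` are hypothetical and not taken from the authors' code.

```python
from typing import Dict, Sequence


def class_scores(y_true: Sequence[int], y_pred: Sequence[int], label: int) -> Dict[str, float]:
    """Precision, recall and F1-score, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)  # correct detections
    fp = sum(1 for t, p in pairs if t != label and p == label)  # false alarms
    fn = sum(1 for t, p in pairs if t == label and p != label)  # missed events
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int], labels: Sequence[int] = (0, 1)) -> float:
    """Unweighted mean of the per-class F1-scores."""
    return sum(class_scores(y_true, y_pred, lab)["f1"] for lab in labels) / len(labels)


# Toy labels for illustration only: 1 = avalanche, 0 = noise.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

avalanche = class_scores(y_true, y_pred, label=1)
noise = class_scores(y_true, y_pred, label=0)
print(f"avalanche recall:   {avalanche['recall']:.2f}")
print(f"avalanche F1-score: {avalanche['f1']:.2f}")
print(f"macro F1-score:     {macro_f1(y_true, y_pred):.2f}")
```

Because the F1-score is the harmonic mean of precision and recall, an avalanche F1-score below the avalanche recall, as in all three reported models, implies that precision is lower than recall. This is consistent with the relatively high number of false alarms noted in the abstract.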
Status: final response (author comments only)
CEC1: 'Comment on gmd-2024-76', Juan Antonio Añel, 15 Jun 2024
Dear authors,
Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".
https://www.geoscientific-model-development.net/policies/code_and_data_policy.html
The problem is that you have not published the code and data necessary to replicate your manuscript. Our policy clearly states that all the code and data used in a manuscript must be published at the submission time in one of the acceptable repositories listed in our policy, and that the Code and Data Availability section must contain the details (links and DOIs) for such repositories. Instead, this section in your manuscript reads "The code and data to develop the final models used in this study will be made available on GitLab and EnviDat".
You have internally provided an internet address containing part of these assets (not all of them, according to my understanding). This is not enough. First, the WSL server is not a repository that complies with the standards required for scientific publication; second, all the information must be available to every potential reader in Discussions to facilitate the peer review and comments by the community, and sharing it privately with the editors fails to comply with the Discussions peer-review process.
Therefore, please publish your code in one of the appropriate repositories, and reply to this comment with the relevant information (link and DOI) as soon as possible, as we cannot accept manuscripts in Discussions that do not comply with our policy. The current situation with your manuscript is therefore irregular.
If you do not fix this problem, we will have to reject your manuscript for publication in our journal.
Also, you must include in a potentially reviewed version of your manuscript the modified 'Code and Data Availability' section, containing the links and DOI of the repository containing code and data.
Finally, in the current git repository that you have provided for the assets, there is no license listed. If you do not include a license, the code is not "free software/open-source"; it continues to be your property and nobody can use it, even though you have made it public. Therefore, when uploading the code and data to the new repository, you should add a license. You may want to choose a free software/open-source (FLOSS) license. We recommend the GPLv3. You only need to include the file 'https://www.gnu.org/licenses/gpl-3.0.txt' as LICENSE.txt with your code. Alternatively, you can choose other options that acceptable repositories provide: GPLv2, Apache License, MIT License, etc.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/gmd-2024-76-CEC1
AC1: 'Reply on CEC1', Andri Simeon, 21 Jun 2024
Dear Juan A. Añel
We apologise for the misunderstanding and any inconvenience it might have caused.
In the meantime, we have added the code and data used in the manuscript to Zenodo. As suggested, we also included a license text.
DOI: 10.5281/zenodo.12162570
Link: https://zenodo.org/records/12162570
The GitLab repository is found at: https://gitlabext.wsl.ch/simeonan/code-egu-paper
Sincerely,
Andri Simeon
Citation: https://doi.org/10.5194/gmd-2024-76-AC1
RC1: 'Comment on gmd-2024-76', Anonymous Referee #1, 15 Jul 2024
The authors have applied deep learning autoencoder models for the automatic and unsupervised extraction of features from seismic records. These extracted features were then used in classifiers to identify snow avalanches. This study presents a novel and relevant approach to enhance machine learning predictions, which could be useful not only for identifying snow avalanches but also for detecting other types of natural events. The overall methodology is well-defined, and the manuscript is well-written and easy to follow. I recommend the publication of this manuscript after the following issues are addressed.
Major:
1) Given that the models tend to miss the onset of an avalanche, the authors should have included a scenario where only verified avalanches were used, excluding non-verified ones during the training of the autoencoders. While I am not suggesting that this must be incorporated in the revised version, as this conclusion emerged only after the study was completed, it is still worth mentioning.
2) How were the machine learning algorithms implemented, including details such as programming languages and libraries used?
Minor:
Line 218: “The best model from the cross-validation procedure (Table F2) was composed of convolutions with kernel size 20 (or 0.1 s) and stride 10.” Was the MSE (Mean Squared Error) the primary metric used for classification?
Line 228: “As an activation function, we use the leaky rectified linear unit (leaky ReLU; (Xu et al., 2015))”. Was the activation function unchanged during the hyperparameter optimization process?
Line 318: Please replace “This for” with “For this”.
Citation: https://doi.org/10.5194/gmd-2024-76-RC1
AC2: 'Reply on RC1', Andri Simeon, 28 Aug 2024
RC2: 'Comment on gmd-2024-76', Anonymous Referee #2, 27 Jul 2024
The paper deals with snow avalanche detection using autoencoded seismic data. It uses one study site in Davos, where events were picked and the performance was tested. The article is well written and of interest for publication. However, the following should be addressed.
Discuss the effects of various avalanche types in relation to models and autoencoders. Explain how the findings can be applied to other study sites and what the specific considerations are in this context. This could also be highlighted in the comparison with former studies.
The conclusions and the further use should be clearer.
Please, avoid repetitions throughout the article.
Citation: https://doi.org/10.5194/gmd-2024-76-RC2
AC3: 'Reply on RC2', Andri Simeon, 28 Aug 2024
We are thankful to the reviewer for the valuable suggestions on how to improve our manuscript. We appreciate your evaluation of our work and will revise the paper following the suggestions. Please find the responses to the individual comments in the attached PDF.
Model code and software
Autoencoder-based feature extraction for the automatic detection of snow avalanches in seismic data
Andri Simeon
https://gitlabext.wsl.ch/simeonan/code-egu-paper
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
384 | 108 | 30 | 522 | 16 | 14 |