the Creative Commons Attribution 4.0 License.
GEOMAPLEARN 1.0: Detecting geological structures from geological maps with machine learning
Abstract. The increasing availability of large geological datasets together with modern methods of data analysis facilitates a data science approach to geology in which inferences are drawn from geological data using automated methods based on statistics and machine learning. Such methods offer the potential for faster and less subjective interpretations of geological data than are possible from a human interpreter, but translating the understanding of a trained geologist to an algorithm is not straightforward. In this paper, we present automated workflows for detecting geological folds from map data using both unsupervised and supervised machine learning. For the unsupervised case, we use regular expression matching to identify map patterns suggestive of folds along lines crossing the map. We then use the hdbscan clustering algorithm to cluster these possible fold identifications into a smaller number of distinct folds, the number of which is not known a priori. For the supervised learning case, we use synthetic models of folds to train a convolutional neural network to identify folds using map and topographic data. We test both methods on synthetic and real datasets.
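As a rough illustration of the regular-expression idea described in the abstract, the Python sketch below looks for a flank-core-flank repetition of map units along a single line crossing the map. It is not the GEOMAPLEARN implementation: the single-letter encoding of map units (with 'A' taken as the oldest), the pattern itself, and the function name `find_fold_candidates` are assumptions made for this example only. The subsequent clustering of candidates into distinct folds is not shown.

```python
import re

# Assumed encoding (hypothetical): each map unit crossed by the line is one
# letter, ordered by age so that 'A' is the oldest unit. Consecutive repeats
# of the same unit are assumed to have been merged already.
# The lookahead lets overlapping flank-core-flank triplets all be found.
FOLD_PATTERN = re.compile(r"(?=(\w)(\w)\1)")


def find_fold_candidates(units: str):
    """Return (index_of_core, kind) for each unit that is bounded on both
    sides by the same, different unit along the line."""
    candidates = []
    for match in FOLD_PATTERN.finditer(units):
        flank, core = match.group(1), match.group(2)
        if flank == core:
            continue  # same unit three times in a row: no repetition pattern
        # Older core than flanks suggests an anticline, younger a syncline.
        kind = "anticline" if core < flank else "syncline"
        candidates.append((match.start() + 1, kind))
    return candidates


if __name__ == "__main__":
    print(find_fold_candidates("CBABCDC"))
```

On a real map, such a repetition can also arise from topography or faulting, so matches of this kind are only candidates; this is presumably why the abstract describes the patterns as "suggestive of folds" and adds a clustering step.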
Status: open (until 22 Jul 2024)
-
CEC1: 'Comment on gmd-2024-35', Juan Antonio Añel, 15 Jun 2024
Dear authors,
Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".
https://www.geoscientific-model-development.net/policies/code_and_data_policy.html
To replicate your work it is necessary to use some shapefiles (the Lavelanet and Esternay map shapefiles). However, you have not published them. You state that these were provided to you by the BRGM, but this is your primary affiliation, so it turns out that they are assets provided by your own institution. We cannot accept this. Our policy is clear: all the code and data necessary to produce a manuscript must be published, at the time the manuscript is submitted, in one of the repositories that are acceptable according to our policy. Therefore, please publish your shapefiles in one of the appropriate repositories, and reply to this comment with the relevant information (link and DOI) as soon as possible, as we cannot accept manuscripts in Discussions that do not comply with our policy. As it stands, the current situation with your manuscript is irregular.
Also, you must include in any revised version of your manuscript a modified 'Data Availability' section containing the requested information (link and DOI of the public repository containing the data).
If you think that your case for not sharing the data falls under one of the exceptions that we can consider (publishing the data is out of your control because a law, regulation or mandate forbids it), please reply to this comment with the evidence of it. The Topical Editor and I can guide you on this.
Please note that if you do not fix this problem, we will have to reject your manuscript for publication in our journal.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/gmd-2024-35-CEC1
-
AC1: 'Reply on CEC1', David Oakley, 20 Jun 2024
Dear Editors,
Thank you for bringing this issue to our attention. Although the shapefiles that we used are produced by the BRGM, they were not produced by us or as part of our present work, and we do not control their licensing. To rectify the situation, we propose to redo our analyses of the Lavelanet and Esternay areas using open-access map shapefiles, specifically those of the Bd Charm-50 dataset available at: http://infoterre.brgm.fr/formulaire/telechargement-cartes-geologiques-departementales-150-000-bd-charm-50. These maps are very similar to the ones we had originally used, so we do not anticipate any major changes to our results. When we have completed these analyses, we will publish a new version of the Zenodo repository containing the data and associated codes for the Lavelanet and Esternay sites.
Sincerely,
David Oakley
Citation: https://doi.org/10.5194/gmd-2024-35-AC1
-
RC1: 'Comment on gmd-2024-35', Anonymous Referee #1, 25 Jun 2024
The authors present a very interesting study evaluating the potential of both unsupervised and supervised machine learning approaches for the automatic detection of fold structures from geological maps.
The consideration of an unsupervised machine learning approach is especially interesting. Nonetheless, it would be advantageous to clarify a couple of points and to say more about generalizability.
Major Comments:
- The title of the paper refers to the detection of geological structures, whereas the rest of the document focuses only on the detection of fold structures. The authors make clear that an extension to more general settings is desired for the future. It is clear how this can be achieved for the supervised approach, but it might prove very challenging for the unsupervised technique. For isolated geological structures, this is likely possible, but a combination of different structures might lead to problems in the unsupervised approach and potentially also for the supervised techniques. Therefore, it would be highly advantageous to have an example combining more than one geological feature, so that the potential capabilities of both approaches can be judged in a more general setting. Without this example, it is challenging to see whether the presented unsupervised approach in particular is extendable to complex structures or limited to simpler settings.
- For the unsupervised approach, the first step extracts rays. For the application to other studies, it would be interesting to know what distance between the rays is generally desirable. Are there any rules of thumb? How would too large a distance affect the results? And would too small a distance significantly impact the efficiency/cost of the approach?
- For the supervised approach, it would be good to extend the results/discussion to address whether the results depend on the hyperparameters. Furthermore, additional details regarding the architecture of the U-Net should be provided (e.g., number of hidden layers, number of neurons per layer, learning rate, ...).
Minor Comments:
- Abstract: Not every reader might be familiar with the hdbscan clustering algorithm. Therefore, it would be useful to add a very brief explanation in the abstract.
- Abstract: The abstract is a bit generic and could be made more specific to better highlight the novelty of the approach presented in the paper.
- Introduction: The authors present previous work for both the automatic detection of geological structures from geological maps and the automatic classification of lithologies from remote sensing and geophysical data. It would be useful to extend this to include also the usage of machine learning approaches in the field of geological modeling. Especially since in this field also unsupervised approaches (Wang et al., 2017) have been tested, and it would be interesting to know if these approaches could potentially work also in the current settings.
- Equations 1 and 2: It might be better to move the description of equations 1 and 2 above the equations. Otherwise, these equations might be confusing for the reader at first since the type of notation might not be expected.
- Figure 1A: It would be good to add a verb to "Intersection of grid and map polygons" to unify this point with the rest of the figure.
- Figures: The resolution of some figures is relatively low. It might be advantageous to switch from bitmaps to vector graphics.
References:
- Wang, Hui, et al. "A segmentation approach for stochastic geological modeling using hidden Markov random fields." Mathematical Geosciences 49 (2017): 145-177.
Citation: https://doi.org/10.5194/gmd-2024-35-RC1
Model code and software
GEOMAPLEARN David Oakley and Thierry Coowar https://doi.org/10.5281/zenodo.11073379
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
250 | 52 | 15 | 317 | 7 | 7 |