the Creative Commons Attribution 4.0 License.
Perspectives of Physics-Based Machine Learning for Geoscientific Applications Governed by Partial Differential Equations
Daniel Caviedes Voullième
Susanne Buiter
Harrie-Jan Hendriks Franssen
Harry Vereecken
Ana González-Nicolás
Florian Wellmann
Abstract. An accurate assessment of the physical states of the Earth system is an essential component of many scientific, societal and economic considerations. These assessments are becoming an increasingly challenging computational task since we aim to resolve models with high resolutions in space and time, to consider complex coupled partial differential equations, and to estimate uncertainties, which often requires many realizations. Machine learning methods are becoming a very popular approach for the construction of surrogate models to address these computational issues. However, they also face major challenges in producing explainable, scalable, interpretable and robust models. In this manuscript, we evaluate the perspectives of geoscience applications of physics-based machine learning, which combines physics-based and data-driven methods to overcome the limitations of each approach taken alone. Through three designated examples (from the fields of geothermal energy, geodynamics, and hydrology), we show that the non-intrusive reduced basis method as a physics-based machine learning approach is able to produce highly precise surrogate models that are explainable, scalable, interpretable, and robust.
Denise Degen et al.
Status: open (extended)
-
RC1: 'Comment on gmd-2022-309', Anonymous Referee #1, 20 Apr 2023
This paper provides an overview of physics-based ML methods, mainly the non-intrusive reduced basis method, for geoscientific problems and presents a workflow for implementing physics-based methods. The authors provide a geothermal, a geodynamic, and a hydrological example as three benchmarks to show that this type of method has the potential to address general challenges in geoscience.
The paper is supposed to be a “perspective” paper, but it does not provide a sufficient overview of the field and existing methods. The numerical examples are also limited to a single method.
Major comments:
- Section 4 “Challenges” does not provide enough information, but rather repeats the challenges already mentioned in the introduction. It is also not clear how these challenges will be solved by the new methods.
- There is a similar issue in the conclusion section.
- For the benchmark examples, the learning objective and the test metric are not clearly stated. For example, the input and output of the non-intrusive RB method and the target to be learned should be emphasized in the text.
- It would be better to list the computational costs of the traditional and new methods for a clear comparison, since this paper focuses on speeding up classical methods.
- Some paragraphs and sentences are hard to read and need to be revised.
Minor comments:
- Lines 767-769: repetitive sentence.
- The format of equations should be consistent, e.g. all centered (equation 11 is left-aligned).
Citation: https://doi.org/10.5194/gmd-2022-309-RC1
-
AC1: 'Reply on RC1', Denise Degen, 26 Apr 2023
Reviewer Comment: This paper provides an overview of physics-based ML methods, mainly the non-intrusive reduced basis method, for geoscientific problems and presents a workflow for implementing physics-based methods. The authors provide a geothermal, a geodynamic, and a hydrological example as three benchmarks to show that this type of method has the potential to address general challenges in geoscience.
The paper is supposed to be a “perspective” paper, but it does not provide a sufficient overview of the field and existing methods. The numerical examples are also limited to a single method.
- Thank you very much for your comments regarding our manuscript.
- The idea behind the paper is to present the potential of physics-based machine learning for geoscientific applications governed by partial differential equations. Narrowing down the possible application field (as presented through the three examples) is necessary since the evaluation of the different physics-based machine learning methods strongly depends on the amount of available data and physical knowledge. We deliberately restrict our paper to physics-based machine learning since several review papers already exist for data-driven machine learning (Jordan and Mitchell, 2015; Kotsiantis et al., 2007; Mahesh, 2020). Within the field of physics-based machine learning, we chose to present physics-informed neural networks (PINNs) and the non-intrusive reduced basis (NIRB) method since we consider them two important end members, although several further methods combine characteristics of both approaches. The important difference, in our view, is that PINNs originated from the field of machine learning and treat physics only as a constraint, whereas NIRB originated from the field of applied mathematics and aims at building the model mostly with physics in mind. PINNs are therefore ideal in a situation where some data and some physical knowledge are available. In the field we cover in this perspectives paper, data is often sparse, whereas physical knowledge is available. Hence, methods such as NIRB are preferable, which is why we performed the benchmarks with the NIRB method. We realize that these points might not have become clear in the manuscript. Therefore, we will better illustrate these aspects in a revised manuscript. Furthermore, we will list other physics-based machine learning methods and provide further references to them, to highlight that many different physics-based machine learning methods exist. We will also provide references to review papers on data-driven methodologies.
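As a generic illustration of what "physics as a constraint" means for PINNs, the training loss is typically written as a data misfit plus a penalty on the PDE residual. This is a textbook-style sketch, not a formula from the manuscript; the symbols $u_\theta$, $\mathcal{N}$, $\lambda$, $N_d$, and $N_r$ are assumptions introduced here.

```latex
\mathcal{L}(\theta) =
  \frac{1}{N_d}\sum_{i=1}^{N_d} \left| u_\theta(x_i) - u_i \right|^2
  + \lambda\,\frac{1}{N_r}\sum_{j=1}^{N_r} \left| \mathcal{N}[u_\theta](\tilde{x}_j) \right|^2
```

Here $u_\theta$ is the network, $(x_i, u_i)$ are the observations, $\mathcal{N}[u] = 0$ denotes the governing PDE evaluated at collocation points $\tilde{x}_j$, and $\lambda$ weights the physics term. Boundary and initial condition terms are omitted for brevity.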
Reviewer Comment:
Major comments:
- Section 4 “Challenges” does not provide enough information, but rather repeats the challenges already mentioned in the introduction. It is also not clear how these challenges will be solved by the new methods.
- There is a similar issue in the conclusion section.
- In the revised manuscript, we will emphasize more strongly how the described methods can be used to solve the presented challenges.
Reviewer Comment: For the benchmark examples, the learning objective and the test metric are not clearly stated. For example, the input and output of the non-intrusive RB method and the target to be learned should be emphasized in the text.
- Thank you for pointing out that the objectives need to be better illustrated. For clarification (which we will also include in a revised version): in general, the NIRB method consists of two steps:
- For the proper orthogonal decomposition, the input is a set of snapshots (that is, full-dimensional simulations for specific material properties and time steps), and the output is a set of eigenvectors, which are the basis functions.
- For the machine learning method (in our case either a neural network or a Gaussian process regression), which follows the proper orthogonal decomposition, the input is the material properties. The training data are the matrix product of the snapshots and the basis functions. Hence, we obtain the weighting factors for the basis functions determined above.
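The two steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `full_model` is a hypothetical closed-form toy replacing the full-order simulation, a single scalar parameter `k` stands in for the material properties, and polynomial least squares replaces the neural network or Gaussian process regression.

```python
import numpy as np


def full_model(k, x):
    """Hypothetical stand-in for an expensive full-order PDE solve (toy closed form)."""
    return np.sin(np.pi * x) / k + k * x * (1.0 - x)


def build_pod_basis(snapshots, rel_tol=1e-10):
    """Step 1: proper orthogonal decomposition via a thin SVD.

    Columns of `snapshots` are full-dimensional solutions; the returned
    columns are the orthonormal basis functions.
    """
    u, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    n_modes = int(np.sum(s / s[0] > rel_tol))  # drop numerically zero modes
    return u[:, :n_modes]


def fit_coefficient_map(params, coeffs, degree=5):
    """Step 2: regress the reduced coefficients on the parameter.

    Assumption: polynomial least squares replaces the NN / GPR regressor.
    """
    vand = np.vander(params, degree + 1)
    weights, *_ = np.linalg.lstsq(vand, coeffs.T, rcond=None)
    return weights  # shape (degree + 1, n_modes)


def surrogate(k, basis, weights):
    """Online stage: predict the weighting factors, then expand in the basis."""
    degree = weights.shape[0] - 1
    a = np.vander(np.atleast_1d(float(k)), degree + 1) @ weights  # (1, n_modes)
    return basis @ a[0]


x = np.linspace(0.0, 1.0, 101)
k_train = np.linspace(1.0, 2.0, 20)
snapshots = np.column_stack([full_model(k, x) for k in k_train])  # (n_dof, n_train)

basis = build_pod_basis(snapshots)
train_coeffs = basis.T @ snapshots  # snapshots projected onto the basis
weights = fit_coefficient_map(k_train, train_coeffs)

k_test = 1.37  # parameter value not contained in the training set
u_true = full_model(k_test, x)
u_pred = surrogate(k_test, basis, weights)
rel_err = np.linalg.norm(u_pred - u_true) / np.linalg.norm(u_true)
print(f"modes kept: {basis.shape[1]}, relative L2 error: {rel_err:.2e}")
```

The regressor never sees the governing equations directly; the physics enters only through the snapshots and the basis derived from them, which is why the method is called non-intrusive.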
Reviewer Comment: It would be better to list the computational costs of the traditional and new methods for a clear comparison, since this paper focuses on speeding up classical methods.
- So far, we only presented the computational cost of the classical methods and the new methods in the text of the respective examples. Additionally, we will provide a table listing and comparing the computational cost for a better illustration.
Reviewer Comment: Some paragraphs and sentences are hard to read and need to be revised.
- We will revise the entire document carefully regarding readability.
Reviewer Comment:
Minor comments:
- Lines 767-769: repetitive sentence.
- We will remove the sentence.
- The format of equations should be consistent, e.g. all centered (equation 11 is left-aligned).
- We will adjust the formatting.
References:
- Jordan, M. I., & Mitchell, T. M. (2015). Machine learning: Trends, perspectives, and prospects. Science, 349(6245), 255-260.
- Kotsiantis, S. B., Zaharakis, I., & Pintelas, P. (2007). Supervised machine learning: A review of classification techniques. Emerging artificial intelligence applications in computer engineering, 160(1), 3-24.
- Mahesh, B. (2020). Machine learning algorithms-a review. International Journal of Science and Research (IJSR), 9, 381-386.
Citation: https://doi.org/10.5194/gmd-2022-309-AC1
-
CEC1: 'Comment on gmd-2022-309', Juan Antonio Añel, 05 May 2023
Dear authors,
Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".
https://www.geoscientific-model-development.net/policies/code_and_data_policy.html
We have detected several issues regarding this in your manuscript. I should note that your manuscript should not have been accepted in Discussions, given this lack of compliance with our policy. The current situation with your manuscript is therefore irregular. If you do not promptly fix the problems listed below, we will have to reject your manuscript for publication in our journal.
First, we do not accept that it is necessary to contact authors to get access to the assets necessary to replicate the work described in the manuscript. This is crystal clear in our policy. Therefore, you must publish the SULEC code in one of the repositories that we accept, without restriction, open to anyone, and under a license that allows anyone to run it to be able to test and replicate your work.
Secondly, in the "Code Availability" section, the link to the Zenodo repository for DwarfElephant 1.0 is broken; please, fix it.
For ParFlow, you point to a GitHub repository. However, GitHub is not a suitable repository for scientific publication. GitHub itself instructs authors to use other alternatives for long-term archival and publishing, such as Zenodo. Fortunately, for ParFlow, there are several Zenodo repositories that maybe you could use. This depends on the ParFlow version that you are using for your work (by the way, you do not specify in the text, and you must do it). If there is not a ParFlow repository for the version that you use here, then you should create it.
Therefore, please, publish the SULEC and ParFlow codes in one of the appropriate repositories, and reply to this comment with the relevant information for them (link and DOI) as soon as possible, as it should be available for the Discussions stage. Also, please, include the relevant primary input/output data.
Also, do not forget to include in a potentially reviewed version of your manuscript the modified 'Code and Data Availability' section with the new DOIs.
Juan A. Añel
Geosci. Model Dev. Exec. Editor
Citation: https://doi.org/10.5194/gmd-2022-309-CEC1
-
AC2: 'Reply on CEC1', Denise Degen, 10 May 2023
Dear Juan Antonio Añel,
Thank you for your comment regarding the Data and Code Availability Section. We agree that all data needed to re-create the surrogate models in our manuscript, for test or benchmarking purposes, should be provided. We thank you for the opportunity to adjust this section during the review process. Below, we describe which data are needed to construct the surrogate models.
Our manuscript neither develops the three mentioned software packages (SULEC, DwarfElephant, ParFlow) further nor uses them for model testing. Instead, the aim of the paper is to construct surrogate models through physics-based machine learning based on output data from these three software packages. It is thus these output datasets, together with the Non-Intrusive RB code, that are essential for reproducing the surrogate models.
In order to ensure that the study can be reproduced and access to the data is guaranteed, we published the training and validation data sets and their associated model parameters, as well as the Non-Intrusive RB code for the construction of all three surrogate models in the following Zenodo repository (https://doi.org/10.5281/zenodo.7918740). The training and validation data sets are constructed with the three above-mentioned software packages. For the construction of the geothermal data set we used DwarfElephant v1.0 (available on Zenodo, https://doi.org/10.5281/zenodo.4074777), for the hydrology data set ParFlow v3.10 (available on Zenodo, https://doi.org/10.5281/zenodo.6413322), and for the geodynamic data set SULEC v6.1. A publication of SULEC for this manuscript will not be possible. However, as it is not the salt model itself that is the focus of the study, but instead the construction of the surrogate model based on the SULEC output data, the content of this manuscript is fully reproducible with the SULEC output data given in the repository. Furthermore, the presented geodynamic benchmark example is conceptually simple and reproducible in other open-source software packages such as Aspect v.2.4 (available on Zenodo, https://doi.org/10.5281/zenodo.6903424) with the data provided in the manuscript.
In a revised manuscript, we will adjust the Data and Code Availability Section accordingly.
Best regards, on behalf of the co-authors,
Denise Degen
Citation: https://doi.org/10.5194/gmd-2022-309-AC2
-
CEC2: 'Reply on AC2', Juan Antonio Añel, 10 May 2023
Dear authors,
Many thanks for your reply. I think that it solves our concerns. However, the information that you have included in your reply here in Discussions is not included in the text of your manuscript, where it is quite obscure. Actually, the structure of your manuscript is quite "non-standard". There is no section on Data and Methods explaining what models and data you use, for what, which versions, etc.
I think that a potentially reviewed version of your manuscript would benefit from a clearer structure and the above-mentioned section, and readers would appreciate it.
Regards,
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/gmd-2022-309-CEC2
Viewed
- HTML: 453
- PDF: 211
- XML: 17
- Total: 681
- BibTeX: 8
- EndNote: 5