the Creative Commons Attribution 4.0 License.
Perspectives of physics-based machine learning strategies for geoscientific applications governed by partial differential equations
Denise Degen
Daniel Caviedes Voullième
Susanne Buiter
Harrie-Jan Hendricks Franssen
Harry Vereecken
Ana González-Nicolás
Florian Wellmann
- Final revised paper (published on 19 Dec 2023)
- Preprint (discussion started on 27 Mar 2023)
Interactive discussion
Status: closed
RC1: 'Comment on gmd-2022-309', Anonymous Referee #1, 20 Apr 2023
This paper provides an overview of physics-based ML methods, mainly the non-intrusive reduced basis method, for geoscientific problems and presents a workflow to implement physics-based methods. The authors provide a geothermal example, a geodynamic example, and a hydrological example as three benchmarks to validate that this type of method has the potential to solve general challenges in geoscience.
The paper is supposed to be a “perspective” paper, but it does not provide a sufficient overview of the field and existing methods. The numerical examples are also limited to one method.
Major comments:
- Section 4 “Challenges” does not provide enough information, but rather repeats points already mentioned in the introduction. It is also not clear how these challenges will be solved by new methods.
- There is a similar issue in conclusion section.
- For benchmark examples, the objective of learning and test metric are not clearly pointed out. For example, the input and output of the non-intrusive RB method and the target to learn should be emphasized in the text.
- It’s better to list computational costs for traditional methods and new methods to have a clear comparison as this paper focuses on speed-up of classical methods.
- Some paragraphs and sentences are hard to read and need to be revised.
Minor comments:
- Line 767-769 repetitive sentence.
- Format of equation should be consistent, such as all centered (equation 11 is on the left).
Citation: https://doi.org/10.5194/gmd-2022-309-RC1
AC1: 'Reply on RC1', Denise Degen, 26 Apr 2023
Reviewer Comment: This paper provides an overview of physics-based ML methods, mainly the non-intrusive reduced basis method, for geoscientific problems and presents a workflow to implement physics-based methods. The authors provide a geothermal example, a geodynamic example, and a hydrological example as three benchmarks to validate that this type of method has the potential to solve general challenges in geoscience.
The paper is supposed to be a “perspective” paper, but it does not provide a sufficient overview of the field and existing methods. The numerical examples are also limited to one method.
- Thank you very much for your comments regarding our manuscript.
- The idea behind the paper is to present the potential of physics-based machine learning for geoscientific applications governed by partial differential equations. Narrowing down the possible application field (as presented through the three examples) is necessary since the evaluation of the different physics-based machine learning methods strongly depends on the amount of available data and physical knowledge. We deliberately restrict our paper to physics-based machine learning since several review papers already exist for data-driven machine learning (Jordan and Mitchell, 2015; Kotsiantis et al., 2007; Mahesh, 2020).
Within the field of physics-based machine learning, we chose to present physics-informed neural networks (PINNs) and the non-intrusive reduced basis (NIRB) method since we consider them two important end members; several more methods combine characteristics of both approaches. The important difference, in our view, is that PINNs originated in the field of machine learning and treat physics only as a constraint, whereas NIRB originated in the field of applied mathematics and aims at building the model mostly with physics in mind. PINNs are therefore ideal in situations where some data and some physical knowledge are available. In the field we cover in this perspectives paper, data are often sparse, whereas physical knowledge is available. Hence, methods such as NIRB are preferable, which motivates our choice to perform the benchmarks with the NIRB method.
We realize that these points might not have come across clearly in the manuscript. Therefore, we will better illustrate these aspects in a revised manuscript. Furthermore, we will list other physics-based machine learning methods and provide further references to them, to highlight that many different physics-based machine learning methods exist. We will also provide references to review papers on data-driven methodologies.
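To make the end-member distinction concrete, here is a minimal sketch of the PINN idea, with physics entering only as a penalty term in a composite loss. This is our own hypothetical toy construction, not taken from the manuscript: a linear sine basis stands in for the neural network so that the data-plus-physics loss can be minimized by ordinary least squares, and all observation points and noise levels are invented.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy problem: -u'' = pi^2 sin(pi x) on (0,1), u(0)=u(1)=0; exact u = sin(pi x).
exact = lambda x: np.sin(np.pi * x)
source = lambda x: np.pi**2 * np.sin(np.pi * x)

# Hypothetical sparse, noisy observations (the "some data" situation).
x_data = np.array([0.2, 0.4, 0.6, 0.8])
d = exact(x_data) + 0.01 * rng.standard_normal(x_data.size)

# Linear surrogate u(x) = sum_k c_k sin(k*pi*x): a linear stand-in for the
# neural network, so least squares replaces gradient descent on the loss.
K = 5
k = np.arange(1, K + 1)
basis = lambda x: np.sin(np.pi * np.outer(x, k))       # u evaluated columnwise
neg_d2 = lambda x: basis(x) * (np.pi * k) ** 2         # columns of -u''

# PINN-style composite loss: ||u(x_d) - d||^2 + lam * ||-u''(x_c) - f(x_c)||^2,
# enforced at collocation points x_c. Minimized in one linear solve.
x_col = np.linspace(0.05, 0.95, 20)
lam = 1.0
A = np.vstack([basis(x_data), np.sqrt(lam) * neg_d2(x_col)])
b = np.concatenate([d, np.sqrt(lam) * source(x_col)])
c, *_ = np.linalg.lstsq(A, b, rcond=None)

x_test = np.linspace(0.0, 1.0, 101)
err = np.max(np.abs(basis(x_test) @ c - exact(x_test)))
print(err)  # small: the physics penalty pins the fit to the PDE solution
```

Setting the physics weight `lam` to zero recovers a purely data-driven fit, while making it large pushes the fit toward the PDE solution; this trade-off is exactly what the PINN loss encodes when physics is treated as a constraint.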
Reviewer Comment:
Major comments:
- Section 4 “Challenges” does not provide enough information, but rather repeats points already mentioned in the introduction. It is also not clear how these challenges will be solved by new methods.
- There is a similar issue in conclusion section.
- In the revised manuscript, we will emphasize more strongly how the described methods can be used to solve the presented challenges.
Reviewer Comment: For benchmark examples, the objective of learning and test metric are not clearly pointed out. For example, the input and output of the non-intrusive RB method and the target to learn should be emphasized in the text.
- Thank you for the comment that the objectives need to be better illustrated. For clarification (which we will also include in a revised version): in general, the NIRB method consists of two steps:
- For the proper orthogonal decomposition, we have a set of snapshots as inputs (meaning full dimensional simulations for specific material properties and time steps) and obtain eigenvectors, which are the basis functions.
- For the machine learning method (in our case either a neural network or a Gaussian process regression), which follows the proper orthogonal decomposition, we have the material properties as input. As training data, we have the matrix product of the snapshots and the basis functions. Hence, we obtain the weighting factors for the basis functions determined above.
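The two steps above can be sketched in a few lines of NumPy. This is a hypothetical toy illustration, not the code used in the manuscript: the snapshots are fabricated analytically, and a polynomial least-squares fit stands in for the neural network or Gaussian process regression.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical snapshot set: each column is one full-dimensional "simulation"
# for one material-property sample mu (a stand-in for the FE solver output).
n_dof, n_train = 200, 30
x = np.linspace(0.0, 1.0, n_dof)
mu_train = rng.uniform(0.5, 2.0, n_train)

def snapshot(mu):
    # fabricated parametric field spanning three spatial modes
    return (np.sin(np.pi * x) + mu * np.sin(2 * np.pi * x)
            + 0.5 * mu**2 * np.sin(3 * np.pi * x))

S = np.stack([snapshot(mu) for mu in mu_train], axis=1)

# Step 1: proper orthogonal decomposition of the snapshot matrix. The left
# singular vectors are the basis functions (eigenvectors of S S^T).
U, _, _ = np.linalg.svd(S, full_matrices=False)
V = U[:, :3]                                  # retained POD basis

# Training targets: weighting factors = (basis functions)^T @ snapshots.
coeffs = V.T @ S                              # shape (3, n_train)

# Step 2: regress material properties -> weighting factors. A cubic
# least-squares fit stands in for the neural network / Gaussian process.
features = lambda mu: np.vander(np.atleast_1d(mu), 4)
W, *_ = np.linalg.lstsq(features(mu_train), coeffs.T, rcond=None)

# Online stage: cheap surrogate evaluation at a new parameter value.
mu_new = 1.3
u_rb = V @ (features(mu_new) @ W).ravel()     # predicted weights -> full field
err = np.max(np.abs(u_rb - snapshot(mu_new)))
print(err)                                    # small reconstruction error
```

The online stage never touches the full-dimensional solver again: a new parameter value only requires evaluating the regression and one small matrix-vector product.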
Reviewer Comment: It’s better to list computational costs for traditional methods and new methods to have a clear comparison as this paper focuses on speed-up of classical methods.
- So far, we have only presented the computational cost of the classical and new methods in the text of the respective examples. Additionally, we will provide a table listing and comparing the computational costs for a clearer comparison.
Reviewer Comment: Some paragraphs and sentences are hard to read and need to be revised.
- We will revise the entire document carefully regarding readability.
Reviewer Comment:
Minor comments:
- Line 767-769 repetitive sentence.
- We will remove the sentence.
- Format of equation should be consistent, such as all centered (equation 11 is on the left).
- We will adjust the formatting.
References:
- Jordan, M. I., & Mitchell, T. M. (2015). Machine learning: Trends, perspectives, and prospects. Science, 349(6245), 255-260.
- Kotsiantis, S. B., Zaharakis, I., & Pintelas, P. (2007). Supervised machine learning: A review of classification techniques. Emerging artificial intelligence applications in computer engineering, 160(1), 3-24.
- Mahesh, B. (2020). Machine learning algorithms-a review. International Journal of Science and Research (IJSR), 9, 381-386.
Citation: https://doi.org/10.5194/gmd-2022-309-AC1
CEC1: 'Comment on gmd-2022-309', Juan Antonio Añel, 05 May 2023
Dear authors,
Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".
https://www.geoscientific-model-development.net/policies/code_and_data_policy.html
Actually, we have detected several issues regarding this in your manuscript. I should note that, actually, your manuscript should not have been accepted in Discussions, given this lack of compliance with our policy. Therefore, the current situation with your manuscript is irregular. In this way, if you do not fix the below-listed problems in a prompt manner, we will have to reject your manuscript for publication in our journal.
First, we do not accept that it is necessary to contact authors to get access to the assets necessary to replicate the work described in the manuscript. This is crystal clear in our policy. Therefore, you must publish the SULEC code in one of the repositories that we accept, without restriction, open to anyone, and under a license that allows anyone to run it to be able to test and replicate your work.
Secondly, in the "Code Availability" section, the link to the Zenodo repository for DwarfElephant 1.0 is broken; please, fix it.
For ParFlow, you point to a GitHub repository. However, GitHub is not a suitable repository for scientific publication. GitHub itself instructs authors to use other alternatives for long-term archival and publishing, such as Zenodo. Fortunately, for ParFlow, there are several Zenodo repositories that you could perhaps use. This depends on the ParFlow version that you are using for your work (by the way, you do not specify it in the text, and you must do so). If there is no ParFlow repository for the version that you use here, then you should create it.
Therefore, please, publish the SULEC and ParFlow codes in one of the appropriate repositories, and reply to this comment with the relevant information for them (link and DOI) as soon as possible, as it should be available for the Discussions stage. Also, please, include the relevant primary input/output data.
Also, do not forget to include in a potentially reviewed version of your manuscript the modified 'Code and Data Availability' section with the new DOIs.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/gmd-2022-309-CEC1
AC2: 'Reply on CEC1', Denise Degen, 10 May 2023
Dear Juan Antonio Añel,
Thank you for your comment regarding the Data and Code Availability Section. We agree that all data should be provided that allow re-creating the surrogate models in our manuscript, for test or benchmarking purposes. We thank you for the opportunity to adjust this section during the review process. We will describe below which data are needed to construct the surrogate models.
Our manuscript does not further develop the three mentioned software packages (SULEC, DwarfElephant, ParFlow), nor does it use this software for model testing. Instead, the aim of the paper is to construct surrogate models through physics-based machine learning based on output data from these three software packages. It is thus these output datasets, together with the Non-Intrusive RB code, that are essential for reproducing the surrogate models.
In order to ensure that the study can be reproduced and access to the data is guaranteed, we published the training and validation data sets and their associated model parameters, as well as the Non-Intrusive RB code for the construction of all three surrogate models in the following Zenodo repository (https://doi.org/10.5281/zenodo.7918740). The training and validation data sets are constructed with the three above-mentioned software packages. For the construction of the geothermal data set we used DwarfElephant v1.0 (available on Zenodo, https://doi.org/10.5281/zenodo.4074777), for the hydrology data set ParFlow v3.10 (available on Zenodo, https://doi.org/10.5281/zenodo.6413322), and for the geodynamic data set SULEC v6.1. A publication of SULEC for this manuscript will not be possible. However, as it is not the salt model itself that is the focus of the study, but instead the construction of the surrogate model based on the SULEC output data, the content of this manuscript is fully reproducible with the SULEC output data given in the repository. Furthermore, the presented geodynamic benchmark example is conceptually simple and reproducible in other open-source software packages such as Aspect v.2.4 (available on Zenodo, https://doi.org/10.5281/zenodo.6903424) with the data provided in the manuscript.
In a revised manuscript, we will adjust the Data and Code Availability Section accordingly.
Best regards, on behalf of the co-authors,
Denise Degen
Citation: https://doi.org/10.5194/gmd-2022-309-AC2
CEC2: 'Reply on AC2', Juan Antonio Añel, 10 May 2023
Dear authors,
Many thanks for your reply. I think that it solves our concerns. However, the information that you have included in your reply here in Discussions is not included in the text of your manuscript, where it is quite obscure. Actually, the structure of your manuscript is quite "non-standard". There is no section on Data and Methods explaining what models and data you use, for what, which versions, etc.
I think that a potentially reviewed version of your manuscript would benefit from a clearer structure and the above-mentioned section, and readers would appreciate it.
Regards,
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/gmd-2022-309-CEC2
RC2: 'Comment on gmd-2022-309', Anonymous Referee #2, 11 Aug 2023
In this article, the authors discuss the potential of combining physics-based and data-driven methods for geophysical applications. One physics-based machine learning method, namely the non-intrusive reduced basis method, is highlighted and tested in three case studies.
The manuscript is overall rather well written. However, for a perspective article I am a bit surprised that so few methods are presented and only one is tested, especially since, in my opinion, the manuscript is very long.
General comments
1) In general, I think that clear statements of the objectives are missing in section 2. From what I understand, we are looking for a surrogate model, but for what kind of applications: forecast? inversions? parameter estimation? other? In my opinion this is very important as what works for an application might not work for another. The requirements for the surrogate model may also differ from one application to the other.
2) I am under the impression that I have a different definition of machine learning (or data-driven) methods than the one used in the present manuscript. From what I know, machine learning methods are methods where a problem is solved without being explicitly programmed, using a large dataset. By contrast, with physics-based methods the same problem is solved using a model derived from physical laws. Consequently, methods such as POD or RB, which only require data to work and which are independent of the physics, are typically machine learning methods, and I do not think they can be classified as physics-based. This needs to be clarified.
3) In the test series, I understand that the data comes from simulations. However, in figure 1 we see that in some cases the data comes from measurements. Can we have more details about these measurements? Furthermore in this figure, physics-based methods are used with measurements only. This is weird as in the description in section 2.1, "physics-based" techniques such as POD or RB are built using simulated data.
4) Here and there, for example L253, it is mentioned that data-driven models do not preserve the input-output relationships. This statement seems a bit bold since the input and output of the problems are not properly defined. I would highly recommend to explicitly define them.
5) In the numerical illustrations, only one method is tested, so we have no idea how good the performance is. I would really advise adding a baseline method for comparison.
6) In the numerical experiments, only one initial condition is used. Readers from communities such as meteorology might be surprised by this choice. Perhaps it would be interesting to discuss the sensitivity to initial conditions and to compare it to the sensitivity to other parameters.
Specific remarks
L80: "physically consistent" What is meant exactly here? Physically consistent can have many different meanings.
L 89/90: "It is critical that the solutions produced by the surrogate models do not change the general behavior if, for instance, the accuracy of the models is increased." I do not understand this sentence.
L98: "Physics-driven approaches preserve the governing equations," What does it mean to "preserve an equation"?
L 131: "The RB method is fairly unknown in the field of Geosciences." Please tune down this subjective statement.
L 134/135: "Alternatively, also a POD can be used for the sampling instead of the greedy algorithm." In that case, RB would be the same as POD, right?
Citation: https://doi.org/10.5194/gmd-2022-309-RC2
AC3: 'Reply on RC2', Denise Degen, 15 Sep 2023
Dear Reviewer,
Thank you very much for taking the time to review our manuscript. Please find below a point-by-point answer to your comments.
In this article, the authors discuss the potential of combining physics-based and data-driven methods for geophysical applications. One physics-based machine learning method, namely the non-intrusive reduced basis method, is highlighted and tested in three case studies.
The manuscript is overall rather well written. However, for a perspective article I am a bit surprised that so few methods are presented and only one is tested, especially since, in my opinion, the manuscript is very long.
- Thank you very much for your comments and for taking the time to review the manuscript. We presented only two physics-based machine learning methods since they represent two end-member cases. However, we see that this might not have been apparent in the manuscript. Therefore, we will extend the explanation and add an additional Section 2.3.3 describing other physics-based machine learning methods and explaining how they behave with respect to these end members.
General comments
1) In general, I think that clear statements of the objectives are missing in section 2. From what I understand, we are looking for a surrogate model, but for what kind of applications: forecast? inversions? parameter estimation? other? In my opinion this is very important as what works for one application might not work for another. The requirements for the surrogate model may also differ from one application to the other.
- We focus mainly on parameter estimation, uncertainty quantification, and global sensitivity analysis. This will be specified in Section 2, as it influences the requirements for the surrogate models.
2) I am under the impression that I have a different definition of machine learning (or data-driven) methods than the one used in the present manuscript. From what I know, machine learning methods are methods where a problem is solved without being explicitly programmed, using a large dataset. By contrast, with physics-based methods the same problem is solved using a model derived from physical laws. Consequently, methods such as POD or RB, which only require data to work and which are independent of the physics, are typically machine learning methods, and I do not think they can be classified as physics-based. This needs to be clarified.
- Both the POD and the intrusive RB method, described in Section 2.1, use a Galerkin projection, which requires access to the discretized version of the PDE in the form of the stiffness matrix and load vector; they are therefore not independent of the physics. However, the POD can also be used without a Galerkin projection, independently of the physics, in which case it would indeed be right to no longer classify the method as physics-based. To clarify this, additional explanations will be added.
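A minimal sketch of this distinction, assuming a hypothetical toy problem (1D two-material steady diffusion, not one of the models in the paper): the Galerkin step assembles the reduced system directly from the full stiffness matrix and load vector, which is exactly the access to the discretized PDE described above.

```python
import numpy as np

# Toy full-order model: -d/dx( k(x; mu) du/dx ) = 1 on (0,1), u = 0 at both
# ends, finite differences, with unit conductivity on the left half of the
# domain and conductivity mu on the right half (hypothetical example).
n = 100
h = 1.0 / (n + 1)
mid = np.arange(n + 1) < (n + 1) // 2          # midpoint indicator, left half

def stiffness(k_half):
    # assemble the FD stiffness matrix from conductivities at cell midpoints
    A = np.diag(k_half[:-1] + k_half[1:])
    A -= np.diag(k_half[1:-1], 1) + np.diag(k_half[1:-1], -1)
    return A / h**2

A1 = stiffness(mid.astype(float))              # affine decomposition:
A2 = stiffness((~mid).astype(float))           # A(mu) = A1 + mu * A2
f = np.ones(n)                                 # load vector

full_solve = lambda mu: np.linalg.solve(A1 + mu * A2, f)

# Offline: snapshots and POD basis.
S = np.stack([full_solve(mu) for mu in np.linspace(0.5, 5.0, 10)], axis=1)
U, _, _ = np.linalg.svd(S, full_matrices=False)
V = U[:, :5]

# Online: Galerkin projection of the *discretized operator*. This is the step
# that makes the method physics-based, since it needs the stiffness matrix
# and load vector, not just input/output samples.
mu_new = 1.3
A_r = V.T @ (A1 + mu_new * A2) @ V             # 5 x 5 reduced stiffness
f_r = V.T @ f                                  # reduced load vector
u_rb = V @ np.linalg.solve(A_r, f_r)

err = np.max(np.abs(u_rb - full_solve(mu_new)))
print(err)                                     # small
```

The online solve is a 5 x 5 system instead of 100 x 100; by contrast, the non-intrusive variant would replace the projected solve with a regression from mu to the reduced coefficients, needing only the snapshot outputs.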
3) In the test series, I understand that the data comes from simulations. However, in figure 1 we see that in some cases the data comes from measurements. Can we have more details about these measurements? Furthermore in this figure, physics-based methods are used with measurements only. This is weird as in the description in section 2.1, "physics-based" techniques such as POD or RB are built using simulated data.
- We will add a clarification of what is meant by measured data in the caption of Figure 1. For the explanation of POD and RB, we refer to the previous comment.
4) Here and there, for example L253, it is mentioned that data-driven models do not preserve the input-output relationships. This statement seems a bit bold since the input and output of the problems are not properly defined. I would highly recommend to explicitly define them.
- The statement originates from the Model Order Reduction community, and in the original manuscript we did not realize that input and output are naturally defined differently depending on the community. We will clarify the definition of input and output throughout the manuscript.
5) In the numerical illustrations, only one method is tested, in such a way that we have no idea of how good the performances are. I would really advise to add a baseline method for comparison.
- To offer a comparison to a baseline method, we will also construct the surrogate models via a neural network and detail the implications in the newly introduced Section 3.5.
6) In the numerical experiments, only one initial condition is used. Readers from communities such as meteorology might be surprised by this choice. Perhaps it would be interesting to discuss the sensitivity to initial conditions and to compare it to the sensitivity to other parameters.
- In Section 3.5, we will describe how different initial and boundary conditions can be considered.
Specific remarks
L80: "physically consistent" What is meant exactly here? Physically consistent can have many different meanings.
- We will specify the meaning of physically consistent.
L 89/90: "It is critical that the solutions produced by the surrogate models do not change the general behavior if, for instance, the accuracy of the models is increased." I do not understand this sentence.
- The sentence will be revised.
L98: "Physics-driven approaches preserve the governing equations," What does it mean to "preserve an equation"?
- We will clarify the meaning of this statement.
- Physics-driven approaches preserve the governing equations, meaning that they, for instance, conserve mass, momentum, and energy in the same way as the original discretized version does. They also maintain the characteristic that for a given set of model parameters (e.g., material properties), they produce information about the state variables (e.g., temperature, pressure) at, for instance, every node of the discretized model.
L 131: "The RB method is fairly unknown in the field of Geosciences." Please tune down this subjective statement.
- We will remove the subjective statement.
L 134/135: "Alternatively, also a POD can be used for the sampling instead of the greedy algorithm." In that case, RB would be the same as POD, right?
- Unfortunately, the POD method with a Galerkin projection is sometimes called, in the literature, RB with POD sampling, POD, or POD + Galerkin. A note of caution is found in lines 140-141. We will remove the statement from lines 134/135 because it reflects an inconsistency in the definitions available in the literature but does not impact the results presented in this manuscript. By removing the statement and leaving only the note of caution, we aim to avoid further confusion.
Best regards, on behalf of the co-authors,
Denise Degen
Citation: https://doi.org/10.5194/gmd-2022-309-AC3