Bridging physics and deep learning is a topical challenge. While deep learning frameworks open avenues in physical science, the design of physically consistent deep neural network architectures is an open issue. In the spirit of physics-informed neural networks (NNs), the PDE-NetGen package provides new means to automatically translate physical equations, given as partial differential equations (PDEs), into neural network architectures. PDE-NetGen combines symbolic calculus and a neural network generator. The latter exploits NN-based implementations of PDE solvers using Keras. With some knowledge of a problem, PDE-NetGen is a plug-and-play tool to generate physics-informed NN architectures. They provide computationally efficient yet compact representations to address a variety of issues, including, among others, adjoint derivation, model calibration, forecasting and data assimilation as well as uncertainty quantification. As an illustration, the workflow is first presented for the 2D diffusion equation, then applied to the data-driven and physics-informed identification of uncertainty dynamics for the Burgers equation.

Machine learning and deep learning are of fast-growing interest in geoscience to address open issues, including sub-grid parameterization.

A variety of learning architectures have shown their ability to encode the physics of a problem, especially deep learning schemes, which typically involve millions
of unknown parameters, although the theoretical reasons for this success remain an open question.

Designing or learning an NN representation for a given physical process remains a difficult issue. If the learning fails, it may be unclear how to improve the architecture of the neural network. It also seems wasteful to run computationally expensive numerical experiments on a large-scale dataset to learn processes that are already well represented. Advection in fluid dynamics is a typical example of such a process, which does not require a complex non-linear data-driven representation. Overall, one would expect to accelerate the learning process, and make it more robust, by combining the known physical equations with the unknown physics within the same NN architecture.

From the geoscience point of view, a key question is to bridge physical representations and neural
network ones so that we can decompose both known and unknown equations according to the elementary computational
units made available by state-of-the-art frameworks (e.g. Keras, TensorFlow). In other words, we aim to translate physical equations into the computational vocabulary available to neural networks.
PDE-NetGen

The paper is organized as follows. In the next section, we detail the proposed neural network generator,
with an illustration of the workflow on a diffusion equation.
In Sect.

Introducing physics in the design of neural network topology is challenging
since physical processes can rely on very different partial differential
equations, e.g. eigenvalue problems for waves or constrained
evolution equations in fluid dynamics under an iso-volumetric assumption.
The neural network code generator presented here focuses on physical processes given
as evolution equations:

We first explain how the derivatives are embedded into NN layers, then we detail the workflow of PDE-NetGen for a simple example.

Since the NN
generator is designed for evolution equations,
the core of the generator is the automatic translation
of partial derivatives with respect to spatial coordinates into layers.
The correspondence between
the finite-difference discretization and the convolutional layer
gives a practical way to translate a PDE into an NN.

The finite-difference method amounts to replacing the derivative of a function by a difference quotient that depends only on
values of the function (see e.g.
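This correspondence between a finite-difference stencil and a convolution with a fixed kernel can be sketched as follows (an illustrative NumPy sketch, not PDE-NetGen code; a Conv1D layer with frozen weights computes the same operation):

```python
import numpy as np

# The centred second-order stencil for d2u/dx2 is a convolution with the
# fixed kernel [1, -2, 1] / dx**2.
def second_derivative_fd(u, dx):
    """Periodic second derivative via the explicit stencil."""
    return (np.roll(u, -1) - 2.0 * u + np.roll(u, 1)) / dx**2

def second_derivative_conv(u, dx):
    """Same operator expressed as a convolution with a fixed kernel,
    as a Conv1D layer with frozen weights would compute it."""
    kernel = np.array([1.0, -2.0, 1.0]) / dx**2
    u_pad = np.concatenate([u[-1:], u, u[:1]])   # periodic padding
    return np.convolve(u_pad, kernel, mode="valid")

x = np.linspace(0.0, 2.0 * np.pi, 128, endpoint=False)
dx = x[1] - x[0]
u = np.sin(x)
# Both routes agree, and both approximate the exact derivative -sin(x).
assert np.allclose(second_derivative_fd(u, dx), second_derivative_conv(u, dx))
```

In the generated networks, such kernels are stored as non-trainable convolution weights, so the known physics is hard-wired rather than learned.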

In PDE-NetGen, the finite-difference implementation appears as a linear operator

The operator

Note that we chose to design PDE-NetGen considering the finite-difference method, but alternatives using automatic differentiation can be considered as introduced by

Then, the time integration can be implemented either by a solver
or by a ResNet architecture of a given time scheme, e.g. an Euler scheme or a fourth-order Runge–Kutta (RK4) scheme

These two components, namely the translation of partial derivatives into NN layers and a ResNet implementation of the time integration, are the building blocks of the proposed NN topology generator as exemplified in the next section.
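The ResNet view of time stepping can be made concrete with a small sketch (illustrative, not the generated code): one Euler step u_{k+1} = u_k + dt * F(u_k) is exactly a residual connection around the network F that encodes the PDE trend, and an RK4 step chains four evaluations of the same residual branch.

```python
# Time integration as residual blocks around the trend function F = rhs.
def euler_step(rhs, u, dt):
    """Explicit Euler: identity shortcut + scaled residual branch."""
    return u + dt * rhs(u)

def rk4_step(rhs, u, dt):
    """Fourth-order Runge-Kutta step built from the same residual blocks."""
    k1 = rhs(u)
    k2 = rhs(u + 0.5 * dt * k1)
    k3 = rhs(u + 0.5 * dt * k2)
    k4 = rhs(u + dt * k3)
    return u + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

# Toy trend du/dt = -u, whose exact solution is exp(-t).
rhs = lambda u: -u
u = 1.0
for _ in range(100):
    u = rk4_step(rhs, u, 0.01)
# After integrating to t = 1, u is close to exp(-1).
```

In the generated Keras models, `rhs` is itself a network assembled from the convolutional derivative layers of the previous section.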

We now present the workflow for the NN generator
given a symbolic PDE using the heterogeneous 2D diffusion equation as a test bed:

Starting from a list of coupled evolution equations given as a PDE, a first preprocessing of the system determines the prognostic functions, the constant functions, the exogenous functions and the constants. The exogenous functions are the functions which depend on time and space but whose evolution is not described by the system of evolution equations. For instance, a forcing term in dynamics is an exogenous function.
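This preprocessing can be sketched with plain SymPy (a hedged sketch under my own naming; the actual PDE-NetGen classes and API may differ), here for a 1-D diffusion equation du/dt = d/dx(nu(x) du/dx):

```python
import sympy as sym

t, x = sym.symbols('t x')
u = sym.Function('u')(t, x)    # prognostic function: appears under d/dt
nu = sym.Function('nu')(x)     # constant function: time-independent
evolution = sym.Eq(sym.Derivative(u, t),
                   sym.Derivative(nu * sym.Derivative(u, x), x))

# Classify the functions the way the generator's preprocessing does:
# prognostic if differentiated in time on the left-hand side, constant
# if time does not appear among its arguments.
prognostic = {evolution.lhs.args[0]}
constants = {f for f in evolution.rhs.atoms(sym.Function) if t not in f.args}
```

A function of time and space that is neither prognostic nor constant would be classified as exogenous, e.g. a forcing term.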

For the diffusion equation Eq. (

Neural network generator for a heterogeneous 2D diffusion equation.

The core of the NN generator is given by the

The preprocessing of the diffusion equation Eq. (

Part of the Python code of the

All partial derivatives with respect to spatial coordinates are detected
and then replaced by an intermediate variable in the system of evolution
equations. The resulting system is assumed to be algebraic, which means that it contains only addition,
subtraction, multiplication and exponentiation (with at most a real exponent).
For each evolution equation, the abstract syntax tree is translated into a sequence of
layers which can be automatically converted into NN layers in a given NN framework. For the current version of PDE-NetGen, we consider Keras
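The substitution step can be illustrated with SymPy (names such as `D0`, `D1` are mine, not the package's): each spatial derivative is replaced by a fresh symbol, leaving a purely algebraic expression whose syntax tree maps directly onto NN layers.

```python
import sympy as sym

t, x = sym.symbols('t x')
u = sym.Function('u')(t, x)
# Example right-hand side: u * du/dx + d2u/dx2 (Burgers-like trend).
rhs = u * sym.Derivative(u, x) + sym.Derivative(u, x, 2)

# Detect every spatial derivative and replace it by an intermediate symbol.
derivatives = sorted(rhs.atoms(sym.Derivative), key=sym.default_sort_key)
subs = {d: sym.Symbol(f'D{i}') for i, d in enumerate(derivatives)}
algebraic_rhs = rhs.subs(subs)
# algebraic_rhs now contains only +, * and ** on symbols; each Di symbol
# maps to a convolutional layer implementing the corresponding stencil.
```

Walking the abstract syntax tree of `algebraic_rhs` then yields a sequence of additions and multiplications that translate one-to-one into Keras layers.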

At the end, a Python code is rendered from templates by using the

Two applications are now considered. First we validate the NN generator on a known physical problem: the diffusion equation (Eq.

In the Python implementation in Fig.

Unified modelling language (UML) class diagram showing the interaction between the

The time integration of the diffusion equation is shown
in Fig.

The heterogeneity of the diffusion tensors produces
an anisotropic diffusion of the Dirac (see Fig.

Starting from a Dirac

The next section illustrates the situation in which only part of the dynamics is known, while the remaining physics is learned from the data.

As an illustration of the PDE-NetGen package, we consider a problem encountered in uncertainty prediction: the parametric Kalman filter (PKF)

The idea of the PKF is to mimic the dynamics of the error covariance matrices throughout the analysis and forecast cycle of data assimilation in a Kalman setting (Kalman filter equations for the uncertainty). It relies on
the approximation of the true covariance matrices by some parametric covariance model. When considering a covariance model based on a diffusion equation,
the parameters are the variance

For the non-linear advection–diffusion equation, known as the Burgers equation,

In this system of PDEs, the term

Within a data-driven framework, one would typically explore a direct identification of the dynamics of the diffusion coefficient

Implementation of the closure by defining each unknown quantity as an instance of the class

The unknown closure function is represented by a neural network
(a Keras model) which implements the expansion

The above approach, which consists of constructing an exogenous function given by an NN to be determined, may seem tedious for an experimenter who is not accustomed to NNs. Fortunately, PDE-NetGen offers an alternative that can
be used in the particular case in which candidates for a closure take the form of an expression with
partial derivatives, as is the case for Eq. (
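The idea behind this alternative can be sketched as follows (a hedged sketch; the coefficient names and the least-squares fit are mine, whereas PDE-NetGen trains the coefficients as Keras layer weights by gradient descent): the closure is written as a linear combination a1*T1 + a2*T2 + ... of candidate terms, where each term Ti is computed by fixed-stencil convolutions and only the scalar coefficients ai are trainable.

```python
import numpy as np

def fit_closure_coefficients(terms, target):
    """Fit the trainable scalars ai in closure = sum_i ai * Ti.

    terms: list of 1-D arrays T_i evaluated on the grid;
    target: array the closure should approximate.
    """
    A = np.stack(terms, axis=1)
    coeffs, *_ = np.linalg.lstsq(A, target, rcond=None)
    return coeffs

# Synthetic illustration: recover known coefficients from two candidates.
x = np.linspace(0.0, 1.0, 200)
T1, T2 = np.sin(2 * np.pi * x), np.cos(2 * np.pi * x)
target = 3.0 * T1 - 0.5 * T2
a = fit_closure_coefficients([T1, T2], target)
# a recovers the generating coefficients [3.0, -0.5].
```

This keeps the learned part of the model down to a handful of interpretable scalars, in contrast to the fully exogenous NN closure.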

Example of a Keras implementation for an RK4 time scheme: given time step

Uncertainty estimated from a large ensemble of

Examples of implementation for the exogenous NN and for the trainable layers are provided in the package PDE-NetGen as Jupyter notebooks for the case of the Burgers equation.

For the numerical experiment, the Burgers equation is solved on
a one-dimensional periodic domain of length
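A minimal reference integration of the Burgers equation on a periodic domain can be sketched as follows (grid size, viscosity and time step are illustrative choices of mine, not the paper's settings): du/dt = -u du/dx + kappa d2u/dx2, with centred differences and an RK4 step.

```python
import numpy as np

n, L, kappa = 256, 1.0, 2e-3
dx = L / n
x = np.arange(n) * dx

def burgers_rhs(u):
    ux = (np.roll(u, -1) - np.roll(u, 1)) / (2 * dx)        # centred du/dx
    uxx = (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx**2  # centred d2u/dx2
    return -u * ux + kappa * uxx

def step(u, dt):
    """One RK4 time step of the Burgers trend."""
    k1 = burgers_rhs(u)
    k2 = burgers_rhs(u + 0.5 * dt * k1)
    k3 = burgers_rhs(u + 0.5 * dt * k2)
    k4 = burgers_rhs(u + dt * k3)
    return u + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

u = np.sin(2 * np.pi * x / L)
dt = 2e-4
for _ in range(500):
    u = step(u, dt)
# The sine steepens into a front while the viscosity keeps it smooth.
```

An ensemble of such integrations from perturbed initial conditions provides the reference uncertainty statistics against which the PKF closure is trained.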

To train the parameters

The resulting dataset involves 40 000 samples. To train the learnable parameters

Figure

In the Burgers dynamics, a priori knowledge was introduced
to propose an NN implementing the closure in Eq. (

In the general case, the choice of the terms to be introduced in the closure may be guided by known
physical properties that need to be verified by the system. For example,
conservation or symmetry properties that leave the system invariant
can guide the choice of possible terms.
For Burgers dynamics,

When no priors are available, one may consider modelling the
closure using state-of-the-art deep neural network architectures,
which have shown impressive prediction performance, e.g. CNNs, ResNets

The aim of the illustration proposed for Burgers dynamics is not to introduce a deep learning architecture
for the closure, but to facilitate the construction of a deep learning architecture taking into account the known physics: the
focus is on the hybridization between physics and machine learning. Though the closure
itself may not result in a deep architecture, the overall generated model leads to a deep
architecture. For instance, the implementation using the exogenous NN uses around

We have introduced a neural network generator, PDE-NetGen, which provides new means to bridge physical priors given as symbolic PDEs and learning-based NN frameworks. This package derives and implements a finite-difference version of a system of evolution equations, wherein the derivative operators are replaced by appropriate convolutional layers including the boundary conditions. The package has been developed in Python using the symbolic mathematics library SymPy and the deep learning library Keras.

We have illustrated the usefulness of PDE-NetGen through two applications: a neural network implementation of a 2D heterogeneous diffusion equation and the uncertainty prediction in the Burgers equation. The latter involves unknown closure terms, which are learned from data using the proposed neural network framework. Both illustrations show the potential of such an approach, which could be useful for improving the training in complex applications by taking into account the physics of the problem.

This work opens new avenues to make the most of existing physical knowledge and of recent advances in data-driven settings, more particularly neural networks, for geophysical applications. This includes a wide range of applications, for which such physically consistent neural network frameworks could either lead to the reduction of computational cost (e.g. GPU implementation embedded in deep learning frameworks) or provide new numerical tools to derive key operators (e.g. adjoint operator using automatic differentiation). These neural network representations also offer new means to complement known physics with the data-driven calibration of unknown terms. This is regarded as key to advancing state-of-the-art simulations, forecasting and the reconstruction of geophysical dynamics through model–data coupled frameworks.

For self-consistency, we detail how the theoretical closure is obtained

It can be shown that

The PDE-NetGen package is free and open source.
It is distributed under the CeCILL-B free software licence.
The source code is provided through a GitHub repository at

OP and RF designed the study, conducted the analysis and wrote the paper. OP developed the code.

The authors declare that they have no conflict of interest.

The UML class diagram has been generated with UMLet

This work was supported by the French national programme LEFE/INSU (Étude du filtre de KAlman PAramétrique, KAPA). RF has been partially supported by Labex Cominlabs (grant SEACS), CNES (grant OSTST-MANATEE) and ANR through the programmes EUR Isblue, Melody and OceaniX.

This paper was edited by Adrian Sandu and reviewed by two anonymous referees.