Automating Finite Element Methods for Geodynamics via Firedrake

Firedrake is an automated system for solving partial differential equations using the finite element method. By applying sophisticated performance optimisations through automatic code-generation techniques, it provides a means to create accurate, efficient, flexible, easily extensible, scalable, transparent and reproducible research software that is ideally suited to simulating a wide range of problems in geophysical fluid dynamics. Here, we demonstrate the applicability of Firedrake for geodynamical simulation, with a focus on mantle dynamics. The accuracy and efficiency of the approach is confirmed via comparisons against a suite of analytical and benchmark cases of systematically increasing complexity, whilst parallel scalability is demonstrated up to 12288 compute cores, where the problem size and the number of processing cores are simultaneously increased. In addition, Firedrake's flexibility is highlighted via straightforward application to different physical (e.g. complex nonlinear rheologies, compressibility) and geometrical (2-D and 3-D Cartesian and spherical domains) scenarios. Finally, a representative simulation of global mantle convection is examined, which incorporates 230 Myr of plate motion history as a kinematic surface boundary condition, confirming Firedrake's suitability for addressing research problems at the frontiers of global mantle dynamics research.

Other components of Firedrake, which have not been showcased in this manuscript but may be beneficial to various future research endeavours, are discussed in Section 8.

Firedrake
The Firedrake project is an automated system for solving partial differential equations using the finite element method (e.g. Rathgeber et al., 2016). Using a high-level language that reflects the mathematical description of the governing equations (e.g. Alnes et al., 2014), the user specifies the finite element problem symbolically. The high-performance implementation of assembly operations for the discrete operators is then generated 'automatically' by a sequence of specialised compiler passes that apply symbolic mathematical transformations to the input equations to ultimately produce C (and C++) code (Rathgeber et al., 2016; Homolya et al., 2018). Firedrake compiles and executes this code to create linear or nonlinear systems, which are solved by PETSc (Balay et al., 1997, 2021b). As stated by Rathgeber et al. (2016), in comparison with conventional finite element libraries, and even more so with handwritten code, Firedrake provides a higher productivity mechanism for solving finite element problems whilst simultaneously applying sophisticated performance optimisations that few users would have the resources to code by hand.
Firedrake builds on the concepts and some of the code of the FEniCS project (e.g. Logg et al., 2012), particularly its representation of variational problems via the Unified Form Language (UFL) (Alnes et al., 2014). We note that the applicability of FEniCS for geodynamical problems has already been demonstrated (e.g. Vynnytska et al., 2013; Wilson et al., 2017). Both frameworks have the goal of saving users from manually writing low-level code for assembling the systems of equations that discretise their model physics. An important architectural difference is that while FEniCS has components written in C++ and Python, Firedrake is completely written in Python, including its run-time environment (it is only the automatically generated assembly code that is in C/C++, although it does leverage the PETSc library, written in C, to solve the assembled systems, albeit through its Python interface). This provides a highly flexible user interface with ease of introspection of data structures.
We note that the Python environment also allows the deployment of hand-written C kernels, should the need arise to perform discrete mesh-based operations that cannot be expressed in the finite element framework, such as sophisticated slope limiters or bespoke sub-grid physics.
Firedrake offers several highly desirable features rendering it well-suited to problems in geophysical fluid dynamics. As will be illustrated through a series of examples below, of particular importance in the context of this manuscript are Firedrake's support for a range of different finite element discretisations, including a highly efficient implementation of those based on extruded meshes, programmable nonlinear solvers, and composable operator-aware solver preconditioners. As the importance of reproducibility in the computational geosciences is increasingly recognised, we note that Firedrake integrates with Zenodo and GitHub to provide users with the ability to generate a set of DOIs corresponding to the exact set of Firedrake components used to conduct a particular simulation, in full compliance with FAIR (Findable, Accessible, Interoperable, Reusable) principles.

Dependencies
Firedrake treats finite element problems as a composition of several abstract processes, using separate packages for each.
The framework imposes a clear separation of concerns between the definition of the problem (UFL, Firedrake Language), the generation of computational kernels used to assemble the coefficients of the discrete equations (TSFC, FInAT), the parallel execution of this kernel (PyOP2) over a given mesh topology (DMPlex), and the solution of the resulting linear or nonlinear systems (PETSc). These layers allow various types of optimisation to be applied at different stages of the solution process. The key components of this software stack are next described.
1. Unified Form Language (UFL) - as we will see in the examples below, a core part of finite element problems is the specification of the weak form of the governing PDEs. UFL, a domain-specific symbolic language with well-defined and mathematically consistent semantics that is embedded in Python, provides an elegant solution to this problem. It was pioneered by the FEniCS project (Logg et al., 2012), although Firedrake has added several extensions.
2. Firedrake Language - in addition to the weak form of the PDEs, finite element problems require the user to select appropriate finite elements, specify the mesh to be employed, set field values for initial and boundary conditions, and specify the sequence in which solves occur. Firedrake implements its own language for these tasks, which was designed to be to a large extent compatible with DOLFIN (Logg et al., 2012), the runtime API of the FEniCS project. We note that Firedrake implements various extensions to DOLFIN, whilst some features of DOLFIN are not supported by Firedrake.
3. FInAT (Kirby and Mitchell, 2019) - incorporates all information required to evaluate the basis functions of the different finite element families supported by Firedrake. In earlier versions of Firedrake this was done through tabulation of the basis functions evaluated at Gauss points (FIAT: Kirby, 2004). FInAT, however, provides this information to the form compiler as a combination of symbolic expressions and numerical values, allowing for further optimisations. FInAT allows Firedrake to support a wide range of finite elements, including continuous, discontinuous, H(div) and H(curl) discretisations, and elements with continuous derivatives such as the Argyris and Bell elements.
4. Two-Stage Form Compiler (TSFC) - a form compiler takes a high-level description of the weak form of PDEs (here in UFL) and produces low-level code that carries out the finite element assembly. Firedrake uses TSFC, which was developed specifically for the Firedrake project (Homolya et al., 2018), to generate its local assembly kernels. TSFC operates in two stages: UFL is first translated to an intermediate symbolic tensor algebra language, which is then translated into assembly kernels written in C. In comparison with the form compilers of FEniCS (FFC and UFLACS), TSFC aims to maintain the algebraic structure of the input expression for longer, which opens up additional opportunities for optimisation.
5. PyOP2 - a key component of Firedrake's software stack is PyOP2, a high-level framework that optimises the parallel execution of computational kernels on unstructured meshes (Rathgeber et al., 2012; Markall et al., 2013). Where the local assembly kernels generated by TSFC calculate the values of a local tensor from local input tensors, all associated with the degrees of freedom of a single element, PyOP2 wraps this code in an additional layer responsible for the extraction and addition of these local tensors out of/into global structures such as vectors and sparse matrices. It is also responsible for the maintenance of halo layers, the overlapping regions in a parallel decomposed problem. PyOP2 allows for a clean separation of concerns between the specification of the local kernel functions, in which the numerics of the method are encoded, and their efficient parallel execution. More generally, this separation of concerns is the key novel abstraction that underlies the design of the Firedrake system.
6. DMPlex - PyOP2 has no concept of the topological construction of a mesh. Firedrake derives the required maps through DMPlex, a data management abstraction that represents unstructured mesh data, which is part of the PETSc project (Knepley and Karpeev, 2009). This allows Firedrake to leverage the DMPlex partitioning and data migration interfaces to perform domain decomposition at run-time, whilst supporting multiple mesh file formats. Moreover, Firedrake reorders mesh entities to ensure computational efficiency.
7. Linear and nonlinear solvers - Firedrake passes solver problems on to PETSc (Balay et al., 1997, 2021a), a well-established, high-performance solver library that provides access to several of its own and third-party implementations of solver algorithms. The Python interface to PETSc (Dalcin et al., 2011) makes integration with Firedrake straightforward.
We note that employing PETSc for both its solver library and for DMPlex has the additional advantage that the set of library dependencies required by Firedrake is kept small (Rathgeber et al., 2016).

Governing Equations
Our focus here is on mantle convection, the slow creeping motion of Earth's mantle over geological timescales. The equations governing mantle convection are derived from the conservation laws of mass, momentum and energy. The simplest mathematical formulation assumes incompressibility and the Boussinesq approximation (McKenzie et al., 1973), under which the non-dimensional momentum and continuity equations are given by:

\nabla \cdot \sigma + Ra_0 \, T \, \hat{k} = 0,  (1)

\nabla \cdot u = 0,  (2)

where σ is the stress tensor, u is the velocity and T the temperature. \hat{k} is the unit vector in the direction opposite to gravity and Ra_0 denotes the Rayleigh number, a dimensionless number that quantifies the vigor of convection:

Ra_0 = \frac{\rho_0 \, \alpha \, \Delta T \, g \, d^3}{\mu_0 \, \kappa}.  (3)

Here, ρ_0 denotes the reference density, α the thermal expansion coefficient, ΔT the characteristic temperature change across the domain, g the gravitational acceleration, d the characteristic length, μ_0 the reference dynamic viscosity and κ the thermal diffusivity. Note that the above non-dimensional equations are obtained through the following characteristic scales: length d; time d^2/κ; and temperature ΔT.
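For orientation, the magnitude of Ra_0 can be checked directly from Equation (3). The short Python sketch below uses representative, assumed whole-mantle parameter values (they are illustrative only, not values used in this study):

```python
# Back-of-the-envelope evaluation of Ra_0 = rho_0 * alpha * dT * g * d^3 / (mu_0 * kappa).
# All parameter values are illustrative, assumed whole-mantle estimates.
rho_0 = 4.5e3    # reference density [kg m^-3] (assumed)
alpha = 3.0e-5   # thermal expansion coefficient [K^-1] (assumed)
dT = 3000.0      # characteristic temperature change [K] (assumed)
g = 9.8          # gravitational acceleration [m s^-2]
d = 2.89e6       # characteristic length: mantle depth [m] (assumed)
mu_0 = 1.0e22    # reference dynamic viscosity [Pa s] (assumed)
kappa = 1.0e-6   # thermal diffusivity [m^2 s^-1] (assumed)

Ra_0 = rho_0 * alpha * dT * g * d ** 3 / (mu_0 * kappa)
print(f"Ra_0 ~ {Ra_0:.1e}")  # of order 10^7: vigorous convection
```

Values of this order are well above the critical Rayleigh number, consistent with the vigorously convecting regimes explored in the benchmark cases below.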
When simulating incompressible flow, the full stress tensor, σ, is decomposed into deviatoric and volumetric components:

\sigma = \tau - p I,  (4)

where τ is the deviatoric stress tensor, p is the dynamic pressure and I is the identity matrix. Substituting Eq. (4) into Eq. (1) and utilizing the constitutive relation

\tau = 2 \mu \dot{\epsilon},  (5)

which relates the deviatoric stress tensor, τ, to the strain-rate tensor, \dot{\epsilon} = \mathrm{sym}(\nabla u), yields:

\nabla \cdot (2 \mu \dot{\epsilon}) - \nabla p + Ra_0 \, T \, \hat{k} = 0.  (6)

The viscous flow problem can thus be posed in terms of pressure, p, velocity, u, and temperature, T. The evolution of the thermal field is controlled by an advection-diffusion equation:

\frac{\partial T}{\partial t} + u \cdot \nabla T - \nabla \cdot (\kappa \nabla T) = 0.  (7)

These governing equations are sufficient to solve for the three unknowns, together with adequate boundary and initial conditions.

Finite Element Discretisation and Solution Strategy
For the derivation of the finite element discretisation of Equations (6), (2), and (7) we start by writing these in their weak form.
We select appropriate function spaces V, W, and Q that contain, respectively, the solution fields for velocity u, pressure p, and temperature T, and also contain the test functions v, w and q. The weak form is then obtained by multiplying these equations with the test functions and integrating over the domain Ω:

\int_\Omega \nabla v : 2 \mu \dot{\epsilon}(u) \, dx - \int_\Omega (\nabla \cdot v) \, p \, dx - \int_\Omega Ra_0 \, T \, v \cdot \hat{k} \, dx = 0 \quad \text{for all } v \in V,  (8)

\int_\Omega w \, \nabla \cdot u \, dx = 0 \quad \text{for all } w \in W,  (9)

\int_\Omega q \, \frac{\partial T}{\partial t} \, dx + \int_\Omega q \, u \cdot \nabla T \, dx + \int_\Omega \nabla q \cdot (\kappa \nabla T) \, dx = 0 \quad \text{for all } q \in Q.  (10)

Note that we have integrated by parts the viscosity term in (6), the divergence term in (2), and the diffusion term in (7), but have omitted the corresponding boundary terms, which will be considered in the following section.

Equations (8)-(10) are a more general, mathematically rigorous representation of the continuous PDEs in strong form (Equations 6, 2 and 7), provided suitable function spaces with sufficient regularity are chosen (see, for example, Zienkiewicz and Taylor, 1991; Elman et al., 2005). Galerkin finite element discretisation proceeds by restricting these function spaces to finite-dimensional subspaces. These are typically constructed by dividing the domain into cells or elements, and restricting to piecewise polynomial subspaces with various continuity requirements between cells.
In all examples presented in this paper, we use Continuous Galerkin (CG) finite elements, specifically the Q2-Q1 element pair for velocity and pressure, and a Q2 discretisation for temperature.
We note that there are many other choices of finite element function spaces available in Firedrake, although they are not considered herein (see Gibson et al., 2019, for an overview). All that is required for their implementation is that a basis can be found for the function space, such that each solution can be written as a linear combination of basis functions. For example, if we have a basis φ_i of the finite-dimensional function space Q_h of temperature solutions, then we can write each temperature solution as

T = \sum_i T_i \, \phi_i,  (11)

where the T_i represent coefficients that we can collect into a discrete solution vector T. Using a Lagrangian polynomial basis, the coefficients T_i correspond to values at the nodes, where each node i is associated with one basis function φ_i, but this is not generally true for other choices of finite element bases.
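The nodal property of a Lagrangian basis can be illustrated with a small stand-alone NumPy sketch (not Firedrake code): for piecewise-linear 'hat' functions on a 1-D mesh, the expansion coefficients T_i coincide exactly with the values of the discrete field at the mesh nodes.

```python
import numpy as np

def hat(i, x, nodes):
    """Piecewise-linear Lagrange ('hat') basis function phi_i on a 1-D mesh."""
    phi = np.zeros_like(x)
    if i > 0:
        left = (x >= nodes[i - 1]) & (x <= nodes[i])
        phi[left] = (x[left] - nodes[i - 1]) / (nodes[i] - nodes[i - 1])
    if i < len(nodes) - 1:
        right = (x >= nodes[i]) & (x <= nodes[i + 1])
        phi[right] = (nodes[i + 1] - x[right]) / (nodes[i + 1] - nodes[i])
    return phi

nodes = np.linspace(0.0, 1.0, 6)   # a small 1-D mesh
T_i = np.sin(np.pi * nodes)        # expansion coefficients
x = np.linspace(0.0, 1.0, 501)     # x[::100] coincides with the nodes
T_h = sum(T_i[i] * hat(i, x, nodes) for i in range(len(nodes)))  # T = sum_i T_i phi_i

# For a Lagrangian basis the coefficients are exactly the nodal values:
print(np.allclose(T_h[::100], T_i))
```

For other bases (e.g. modal or hierarchical ones), the coefficients have no such pointwise interpretation, which is the distinction made in the text.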
In curved domains, boundaries can only be approximated with a finite number of triangles, tetrahedra, quadrilaterals or hexahedra. In a sense, this can be seen as a piecewise linear (or bi-/tri-linear) approximation, where the domain boundary is approximated by straight lines (edges) between vertices. A more accurate representation of the domain is obtained by allowing higher-order polynomials that describe the physical embedding of the element within the domain. A typical choice is to use a so-called isoparametric representation, in which the polynomial order of the embedding is the same as that of the discretised functions that are solved for.
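The gain from a higher-order geometric representation can be quantified with a stand-alone NumPy sketch (illustrative only): approximating an arc of the unit circle by a straight edge between its two endpoint vertices versus a quadratic, isoparametric-style edge through an additional midpoint node.

```python
import numpy as np

theta = np.linspace(0.0, np.pi / 8, 201)   # one curved "edge" on the unit circle
arc = np.column_stack([np.cos(theta), np.sin(theta)])

def lagrange_edge(t_nodes, p_nodes, t):
    """Evaluate the Lagrange interpolant through (t_nodes, p_nodes) at parameters t."""
    out = np.zeros((len(t), p_nodes.shape[1]))
    for i, ti in enumerate(t_nodes):
        li = np.ones_like(t)
        for j, tj in enumerate(t_nodes):
            if j != i:
                li *= (t - tj) / (ti - tj)
        out += np.outer(li, p_nodes[i])
    return out

# Linear edge: straight line between the two endpoint vertices
linear = lagrange_edge(theta[[0, -1]], arc[[0, -1]], theta)
# Quadratic edge: adds the arc midpoint, as in an isoparametric Q2 geometry
quadratic = lagrange_edge(theta[[0, 100, -1]], arc[[0, 100, -1]], theta)

err_lin = np.abs(np.linalg.norm(linear, axis=1) - 1.0).max()
err_quad = np.abs(np.linalg.norm(quadratic, axis=1) - 1.0).max()
print(err_lin, err_quad)  # the quadratic edge follows the circle far more closely
```

The radial deviation of the quadratic edge is more than an order of magnitude smaller than that of the straight edge, which is why isoparametric elements are attractive for the cylindrical and spherical domains considered later.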
Finally, we note that it is common to use a subscript h for the discrete, finite-dimensional function subspaces, and Ω_h for the discretised approximation of the domain Ω by the mesh; since the remainder of this manuscript focusses on the details and implementation of this discretisation, we simply drop the h subscripts from here on.

Boundary conditions
In the Cartesian examples considered below, zero-slip and free-slip boundary conditions for (8) and (9) are imposed through strong Dirichlet boundary conditions for velocity u. This is achieved by restricting the velocity function space V to a subspace V_0 of vector functions for which all components (zero-slip) or only the normal component (free-slip) are zero at the boundary.
Since this restriction also applies to the test functions v, the weak form only needs to be satisfied for all test functions v ∈ V_0 that satisfy the homogeneous boundary conditions. Therefore, the omitted boundary integral

\int_{\partial\Omega} v \cdot 2 \mu \dot{\epsilon}(u) \, \hat{n} \, ds,  (12)

that arises when integrating the viscosity term in Equation (8) by parts, automatically vanishes for zero-slip boundary conditions, as v = 0 at the domain boundary, ∂Ω. In the case of a free-slip boundary condition, for which the tangential components of v are non-zero, the boundary term does not vanish, but by omitting that term in (8) we weakly impose a zero shear stress condition. The boundary term obtained by integrating the divergence term in (2) by parts vanishes, since the normal component of velocity is zero at the boundary. At the top and bottom boundaries, where temperature is prescribed through strong Dirichlet conditions, the boundary term associated with integrating the diffusion term in (7) by parts also vanishes. In Cartesian domains this boundary term does not vanish at the lateral boundaries, but by omitting it from (10) we weakly impose a homogeneous Neumann (zero-flux) boundary condition at these boundaries. The temperature solution itself is sought among functions that satisfy the required inhomogeneous boundary conditions.
In curved domains, such as the 2-D cylindrical and 3-D spherical cases examined below, imposing free-slip boundary conditions is complicated by the fact that it is not straightforward to decompose the degrees of freedom of the velocity space V into normal and tangential components for many finite element discretisations. For Lagrangian-based discretisations we could define normal vectors at the Lagrangian nodes on the surface and decompose accordingly, but these normal vectors would have to be averaged due to the piecewise approximation of the curved surface. To avoid such complications for our examples in cylindrical and spherical geometries, we employ a symmetric Nitsche penalty method (Nitsche, 1971), where the velocity space is not restricted and, thus, retains all discrete solutions with a non-zero normal component. This entails adding the following three surface integrals to Equation (8):

-\int_{\partial\Omega} (v \cdot \hat{n}) \, \hat{n} \cdot 2 \mu \dot{\epsilon}(u) \, \hat{n} \, ds \;-\; \int_{\partial\Omega} (u \cdot \hat{n}) \, \hat{n} \cdot 2 \mu \dot{\epsilon}(v) \, \hat{n} \, ds \;+\; \int_{\partial\Omega} C_{\text{Nitsche}} \, \mu \, (u \cdot \hat{n})(v \cdot \hat{n}) \, ds.  (13)
The first of these corresponds to the normal component of Equation (12), associated with integration by parts of the viscosity term. The tangential component, as before, is omitted and weakly imposes a zero shear stress condition. The second term ensures symmetry of Equation (8) with respect to u and v. The third term penalizes the normal component of u and involves a penalty parameter C_Nitsche > 0 that should be sufficiently large to ensure coercivity of F_Stokes as a bilinear form in u and v. Lower bounds for C_{Nitsche,f} on each face f can be derived for simplicial (Shahbazi, 2005) and quadrilateral/hexahedral (Hillewaert, 2013) meshes, respectively:

Simplicial meshes: C_{\text{Nitsche},f} = C_{\text{ip}} \, \frac{(p+1)(p+d)}{d} \, \frac{A_f}{V_{c_f}},  (14)

Quadrilateral/hexahedral meshes: C_{\text{Nitsche},f} = C_{\text{ip}} \, (p+1)^2 \, \frac{A_f}{V_{c_f}},  (15)

where A_f is the facet area of face f, V_{c_f} the cell volume of the adjacent cell c_f, d the spatial dimension, and p the polynomial degree of the velocity discretisation. Here, we introduce an additional factor, C_ip, to account for spatial variation of the viscosity μ in the adjacent cell, and domain curvature, which are not taken into account in the standard lower bounds (which use C_ip = 1). In all free-slip cylindrical and spherical examples presented below, we use C_ip = 100.
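As an illustration, the penalty can be evaluated on a uniform 2-D quadrilateral mesh. The helper functions below encode commonly quoted forms of these trace-inequality bounds, C_ip (p+1)(p+d)/d · A_f/V_{c_f} for simplices and C_ip (p+1)² · A_f/V_{c_f} for quadrilaterals/hexahedra; these expressions are stated here as assumptions and should be checked against Shahbazi (2005) and Hillewaert (2013).

```python
# Illustrative evaluation of Nitsche penalty lower bounds on a uniform mesh.
# The encoded constants are assumptions; verify against the original references.

def c_nitsche_simplex(p, d, A_f, V_c, C_ip=1.0):
    """Assumed lower bound for simplicial cells: C_ip * (p+1)(p+d)/d * A_f / V_c."""
    return C_ip * (p + 1) * (p + d) / d * A_f / V_c

def c_nitsche_quad(p, A_f, V_c, C_ip=1.0):
    """Assumed lower bound for quadrilateral/hexahedral cells: C_ip * (p+1)^2 * A_f / V_c."""
    return C_ip * (p + 1) ** 2 * A_f / V_c

h = 1.0 / 40          # uniform mesh spacing (illustrative)
A_f, V_c = h, h * h   # facet "area" (edge length) and cell volume in 2-D
p = 2                 # polynomial degree of a Q2 velocity discretisation

print(c_nitsche_quad(p, A_f, V_c))            # ~ 9/h: the penalty scales like 1/h
print(c_nitsche_quad(p, A_f, V_c, C_ip=100))  # with the additional safety factor C_ip = 100
```

The key point visible here is the scaling: the penalty grows with polynomial degree and with mesh refinement (through A_f/V_{c_f} ~ 1/h), which is what keeps the discrete form coercive.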

Temporal discretisation and solution process for temperature
For temporal integration, we apply a simple θ scheme to the energy equation (10):

F_{\text{energy}}(q; T^{n+1}) := \int_\Omega q \, \frac{T^{n+1} - T^n}{\Delta t} \, dx + \int_\Omega q \, u \cdot \nabla T^\theta \, dx + \int_\Omega \nabla q \cdot (\kappa \nabla T^\theta) \, dx = 0,  (18)

where the temperature

T^\theta = \theta \, T^{n+1} + (1 - \theta) \, T^n  (19)

is interpolated between the temperature solutions T^n and T^{n+1} at the beginning and end of the (n+1)-th time step using a parameter 0 ≤ θ ≤ 1.
In all examples that follow, we use a Crank-Nicolson scheme, where θ = 0.5. To simplify the solution process, we solve for velocity and pressure, u and p, in a separate step, before solving for the new temperature T^{n+1}.
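The accuracy trade-off between these choices can be verified on a scalar model problem, dT/dt = -λT (a stand-alone NumPy sketch, independent of Firedrake): backward Euler (θ = 1) converges at first order in Δt, while Crank-Nicolson (θ = 0.5) converges at second order.

```python
import numpy as np

def theta_step(T, lam, dt, theta):
    """One step of the theta scheme for dT/dt = -lam*T:
    (T_new - T_old)/dt = -lam*(theta*T_new + (1-theta)*T_old)."""
    return T * (1.0 - (1.0 - theta) * lam * dt) / (1.0 + theta * lam * dt)

def final_error(theta, n_steps, lam=1.0, t_end=1.0):
    """Error at t_end against the exact solution exp(-lam*t_end), starting from T=1."""
    dt = t_end / n_steps
    T = 1.0
    for _ in range(n_steps):
        T = theta_step(T, lam, dt, theta)
    return abs(T - np.exp(-lam * t_end))

for theta in (1.0, 0.5):  # backward Euler vs Crank-Nicolson
    order = np.log2(final_error(theta, 20) / final_error(theta, 40))
    print(f"theta = {theta}: observed convergence order ~ {order:.2f}")
```

Halving the timestep halves the backward-Euler error but quarters the Crank-Nicolson error, matching the theoretical orders of the θ scheme.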

Because F_energy is linear in q, if we expand the test function q as a linear combination of basis functions φ_i of Q, Equation (18) is equivalent to

F(T^{n+1}) = 0,  (20)

where F(T^{n+1}) is the vector with coefficients F_i(T^{n+1}) = F_energy(φ_i; T^{n+1}) (i.e. the energy equation tested with the basis functions φ_i). Thus, to satisfy Equation (18) we need to solve for a temperature T^{n+1} for which the entire vector F(T^{n+1}) is zero.
In the general, nonlinear case, this can be solved using a Newton solver, but here the system of equations F(T^{n+1}) = 0 is also linear in T^{n+1}. Accordingly, if we also expand the temperature with respect to the same basis, T^{n+1} = \sum_j T^{n+1}_j \phi_j, where we store the coefficients T^{n+1}_j in a vector T, we can write it in the usual form as a linear system of equations:

A \, T = b,  (21)

with A the matrix that represents the Jacobian ∂F/∂T with respect to the basis φ_i,

A_{ij} = \int_\Omega \phi_i \, \frac{\phi_j}{\Delta t} \, dx + \theta \int_\Omega \phi_i \, u \cdot \nabla \phi_j \, dx + \theta \int_\Omega \nabla \phi_i \cdot (\kappa \nabla \phi_j) \, dx,  (22)

and the right-hand side vector b containing all terms in (18) that do not depend on T^{n+1}, specifically:

b_i = \int_\Omega \phi_i \, \frac{T^n}{\Delta t} \, dx - (1 - \theta) \int_\Omega \phi_i \, u \cdot \nabla T^n \, dx - (1 - \theta) \int_\Omega \nabla \phi_i \cdot (\kappa \nabla T^n) \, dx.  (23)

In the nonlinear case, every Newton iteration requires the solution of such a linear system, with a Jacobian matrix A_{ij} = ∂F_energy(φ_i; T^{n+1})/∂T^{n+1}_j and a right-hand side vector based on the residual F_energy(φ_i; T^{n+1}), both of which need to be reassembled every iteration as T^{n+1} is iteratively improved. For the 2-D cases presented in this paper, this asymmetric linear system is solved with a direct solver, and in 3-D using a combination of the GMRES Krylov subspace method with a symmetric SOR (SSOR) preconditioner.
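The structure of this linear system can be made concrete in one dimension. The stand-alone NumPy sketch below (mesh size, timestep and diffusivity are illustrative assumptions) takes one θ-scheme step of pure diffusion with P1 elements, where A = M/Δt + θS and b = (M/Δt - (1-θ)S) Tⁿ, with M the mass matrix and S the diffusion (stiffness) matrix:

```python
import numpy as np

# One theta-scheme step for 1-D diffusion with P1 elements and homogeneous
# Dirichlet conditions (illustrative parameters).
n = 50                        # number of elements on (0, 1)
h = 1.0 / n
kappa, dt, theta = 1.0, 1e-4, 0.5
m = n - 1                     # interior nodes (T = 0 at both ends)

# Standard P1 mass matrix M and diffusion (stiffness) matrix S
M = (h / 6.0) * (4.0 * np.eye(m) + np.diag(np.ones(m - 1), 1) + np.diag(np.ones(m - 1), -1))
S = (kappa / h) * (2.0 * np.eye(m) - np.diag(np.ones(m - 1), 1) - np.diag(np.ones(m - 1), -1))

x = np.linspace(h, 1.0 - h, m)
T_old = np.sin(np.pi * x)     # initial condition: an eigenmode of the diffusion operator

# A T_new = b with A = M/dt + theta*S and b = (M/dt - (1-theta)*S) T_old
A = M / dt + theta * S
b = (M / dt - (1.0 - theta) * S) @ T_old
T_new = np.linalg.solve(A, b)

exact = np.exp(-np.pi ** 2 * kappa * dt) * np.sin(np.pi * x)
print(np.abs(T_new - exact).max())  # small: the sine mode decays at nearly the exact rate
```

With an advection term present, as in (22), the matrix additionally picks up the asymmetric θ ∫ φ_i u·∇φ_j contribution, which is why the full system in the text is solved with GMRES rather than a symmetric Krylov method.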

Solving for velocity and pressure
In a separate step, we solve Equations (8) and (9) for velocity and pressure. Since these weak equations need to hold for all test functions v ∈ V and w ∈ W, we can equivalently write them, using a single residual functional F_Stokes:

F_{\text{Stokes}}(v, w; u, p) := \int_\Omega \nabla v : 2 \mu \dot{\epsilon}(u) \, dx - \int_\Omega (\nabla \cdot v) \, p \, dx - \int_\Omega Ra_0 \, T \, v \cdot \hat{k} \, dx - \int_\Omega w \, \nabla \cdot u \, dx = 0,  (24)

for all v ∈ V and w ∈ W, where we have multiplied the continuity equation with -1 to ensure symmetry between the ∇p and ∇·u terms. This combined weak form that we simultaneously solve for a velocity u ∈ V and pressure p ∈ W is referred to as a mixed problem, and the combined solution (u, p) is said to be found in the mixed function space V ⊕ W.
As before, we expand the discrete solutions u and p, and the test functions v and w, in terms of basis functions ψ_i of V and χ_i of W:

u = \sum_j u_j \, \psi_j,  (25)

p = \sum_j p_j \, \chi_j.  (26)

For isoviscous cases, where F_Stokes is linear in u and p, we then derive a linear system of the following form:

\begin{pmatrix} K & G \\ G^T & 0 \end{pmatrix} \begin{pmatrix} u \\ p \end{pmatrix} = \begin{pmatrix} f \\ 0 \end{pmatrix},  (27)

where

K_{ij} = \int_\Omega \nabla \psi_i : 2 \mu \dot{\epsilon}(\psi_j) \, dx,  (28)

G_{ij} = -\int_\Omega (\nabla \cdot \psi_i) \, \chi_j \, dx,  (29)

f_i = \int_\Omega Ra_0 \, T \, \psi_i \cdot \hat{k} \, dx,  (30)

and the vectors u and p collect the expansion coefficients u_j and p_j.

For cases with more general rheologies, in particular those with a strain-rate dependent viscosity, the system F_Stokes(u, p) = 0 is nonlinear and can be solved using Newton's method. This requires, in every Newton iteration, the solution of a linear system of the same form as in Equation (27), but with an additional term in K associated with ∂µ/∂u.
There is a wide literature on iterative methods for solving saddle point systems of the form in Equation (27). For an overview of the methods commonly used in geodynamics, see May and Moresi (2008). Here we employ the Schur complement approach,

where pressure p is determined by solving

G^T K^{-1} G \, p = G^T K^{-1} f.  (31)

It should be noted that K^{-1} is not assembled explicitly. Rather, in a first step we obtain y = K^{-1} f by solving K y = f, so that we can construct the right-hand side G^T y of the equation. We subsequently apply an iterative method to the linear system as a whole, in which each iteration requires a matrix-vector multiplication with the matrix G^T K^{-1} G that again involves the solution of a linear system with matrix K. In addition to this matrix-vector multiplication we also need a suitable preconditioner. Here we follow the inverse scaled-mass matrix approach, which uses the following approximation:

(G^T K^{-1} G)^{-1} \approx M_p^{-1}, \quad \text{with } (M_p)_{ij} = \int_\Omega \frac{1}{\mu} \, \chi_i \, \chi_j \, dx,  (32)

where M_p is the pressure mass matrix scaled by the inverse viscosity and the χ_i are the pressure basis functions. Finally, after solving Equation (31) for p, we obtain u in a final solve K u = f - G p.
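The sequence of operations above, solve with K to build the right-hand side, solve the Schur complement system for p, then recover u, can be mimicked on a small dense saddle-point system (a stand-alone NumPy sketch with toy, assumed matrix sizes; in practice K is never inverted and the Schur complement is applied matrix-free within a preconditioned Krylov method):

```python
import numpy as np

rng = np.random.default_rng(0)
nu, np_ = 12, 4                      # toy velocity / pressure DOF counts (assumed)
B = rng.standard_normal((nu, nu))
K = B @ B.T + nu * np.eye(nu)        # SPD stand-in for the viscous block K
G = rng.standard_normal((nu, np_))   # stand-in discrete gradient G
f = rng.standard_normal(nu)

def solve_K(r):
    """Every application of K^{-1} is a solve with K; K^{-1} is never formed."""
    return np.linalg.solve(K, r)

rhs = G.T @ solve_K(f)               # right-hand side G^T K^{-1} f
S = G.T @ np.linalg.solve(K, G)      # dense Schur complement: affordable only at toy size
p = np.linalg.solve(S, rhs)          # in practice: a preconditioned Krylov iteration
u = solve_K(f - G @ p)               # final velocity solve K u = f - G p

# Verify against the monolithic saddle-point solve [[K, G], [G^T, 0]] (u, p) = (f, 0)
Z = np.block([[K, G], [G.T, np.zeros((np_, np_))]])
up = np.linalg.solve(Z, np.concatenate([f, np.zeros(np_)]))
print(np.allclose(np.concatenate([u, p]), up))
```

The recovered (u, p) matches the monolithic solve, and u is discretely divergence-free (Gᵀu = 0), which is exactly the property the Schur complement reduction preserves.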
Since this solution process involves multiple solves with the matrix K, we also need an efficient algorithm for that system. For this, we combine the conjugate gradient method with an algebraic multigrid approach, specifically the Geometric Algebraic Multigrid (GAMG) method implemented in PETSc (Balay et al., 1997, 2021a).
Depending on boundary conditions, the linearised Stokes system admits a number of null modes. In the absence of open boundaries, which is the case for all cases examined here, the pressure admits a constant null mode, where any arbitrary constant can be added to the pressure solution and remain a valid solution to the equations. In addition, the cylindrical and spherical cases with free-slip boundary conditions at both boundaries, examined in Section 5, admit, respectively, one and three independent rotational null modes in velocity. These null modes result in indefinite matrices, and preconditioned iterative methods typically require the null vectors to be provided so that they can be projected out during iteration.
In the absence of any Dirichlet conditions on velocity, the nullspace of the velocity block K also consists of a further two independent translational modes in 2-D, and three in 3-D. Even if, as for the cases here, the boundary conditions do not admit all rotational and translational modes, these solutions are still associated with low-energy modes of the matrix, and some multigrid methods use this information to improve their performance by ensuring that these near-nullspace modes are accurately represented at the coarser levels (Vanek et al., 1996). We make use of this in several of the examples considered below.
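The role of a supplied null vector can be illustrated on a small dense system (a stand-alone NumPy sketch, not the PETSc implementation): projecting the constant mode out of the right-hand side makes the singular system consistent, and any constant shift of the resulting pressure-like solution remains a valid solution.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
c = np.ones(n) / np.sqrt(n)          # normalised constant-pressure null vector
P = np.eye(n) - np.outer(c, c)       # projector onto the complement of the nullspace

M = rng.standard_normal((n, n))
A = P @ (M @ M.T + np.eye(n)) @ P    # symmetric singular operator with A c = 0

b = P @ rng.standard_normal(n)       # project the right-hand side for consistency
x = np.linalg.lstsq(A, b, rcond=None)[0]   # minimum-norm solution, orthogonal to c

# The solution is only determined up to a constant: shifting x by any multiple
# of the null vector leaves the residual unchanged.
print(np.allclose(A @ (x + 5.0 * c), b))
```

This is the dense analogue of handing PETSc a nullspace object: the Krylov iteration then seeks the unique solution orthogonal to the provided null vectors.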

Examples
Firedrake provides a complete framework for solving finite element problems, highlighted in this section through a series of examples. We start in Section 5.1 with the most basic problem - isoviscous, incompressible convection in an enclosed 2-D Cartesian box - and systematically build complexity, initially moving to more realistic physical approximations (Section 5.2) and, subsequently, geometries that are more representative of Earth's mantle (Section 5.3).

A simple 2-D square convection problem, from Blankenbach et al. (1989), for execution in Firedrake, is displayed in Listing 1. The problem is incompressible, isoviscous, heated from below and cooled from above, with closed, free-slip boundaries, on a unit square mesh. Solutions are obtained by solving the Stokes equations for velocity and pressure, alongside the energy equation for temperature.

[Listing 1. Firedrake code required to reproduce 2-D Cartesian incompressible isoviscous benchmark cases from Blankenbach et al. (1989).]

The initial temperature distribution is prescribed as follows:

T(x, y) = (1 - y) + A \cos(\pi x) \sin(\pi y),  (33)

where A = 0.05 is the amplitude of the initial perturbation.
We have set up the problem using a quadrilateral Q2-Q1 element pair for velocity and pressure, with Q2 elements for temperature. Firedrake user code is written in Python, so the first step, illustrated on line 1 of Listing 1, is to import the Firedrake module. We next need a mesh: for simple domains such as the unit square, Firedrake provides built-in meshing functions. As such, line 4 defines the mesh, with 40 quadrilateral elements in the x and y directions. We also need function spaces, which is achieved by associating the mesh with the relevant finite element on lines 7-9: V, W and Q are symbolic variables representing function spaces. They also contain the function space's computational implementation, recording the association of degrees of freedom with the mesh and pointing to the finite element basis. The user does not usually need to pay any attention to this: the function space just behaves as a mathematical object (Rathgeber et al., 2016). Function spaces can be combined in the natural way to create mixed function spaces, as we do on line 10, combining the velocity and pressure function spaces to form a function space for the mixed Stokes problem, Z. Note that although we use continuous Lagrange elements (CG) in all examples presented herein, Firedrake offers a range of different options, including discontinuous elements (DG). Test functions v, w and q are subsequently defined (lines 13-14), and we also specify functions to hold our solutions (lines 15-18): z in the mixed function space, noting that a symbolic representation of the two parts - velocity and pressure - is obtained with split on line 16, and T_old and T_new (line 17), required for the Crank-Nicolson scheme used for temporal discretisation of our energy equation (see Equations 18 and 19 in Section 4.2), where T^θ is defined on line 18.
We obtain symbolic expressions for coordinates in the physical mesh (line 21) and subsequently use these to initialize the old temperature field, via Equation (33), on line 22. This is where Firedrake transforms a symbolic operation into a numerical computation for the first time: the interpolate method generates C code that evaluates this expression at the nodes of T_old, and immediately executes it to populate the values of T_old. We initialize T_new with the values of T_old, on line 23, via the assign function. Important constants in this problem (Rayleigh number, Ra; viscosity, μ; thermal diffusivity, κ), in addition to the constant timestep (Δt) and unit vector (k), are defined on lines 26-30. We note that viscosity could also be a Function, if we wanted spatial variation.
We are now in a position to define the variational problems expressed in Equations (24) and (18). Although in this test case the problems are linear, we maintain the more general nonlinear residual form F_Stokes(v, u) = 0 and F_energy(q, T) = 0, to allow the straightforward extension to the nonlinear cases considered below.

[Figure 1. (a) Nusselt number and (b) RMS velocity versus number of pressure and velocity DOF, at Ra = 1 × 10^4 (Case 1a - Blankenbach et al., 1989); (c, d) as in panels a and b, but at Ra = 1 × 10^5 (Case 1b - Blankenbach et al., 1989); (e, f) at Ra = 1 × 10^6 (Case 1c - Blankenbach et al., 1989). Benchmark values are denoted by dashed red lines. In panels e and f, we also display results from simulations where temperature is represented through a Q1 discretisation (Q2Q1_Q1), for comparison with our standard Q2 temperature discretisation (Q2Q1_Q2).]
Strong Dirichlet boundary conditions for velocity are associated with the velocity part of the mixed function space, Z.sub(0), in order to apply boundary conditions to the x and y components of the velocity field only. To apply conditions to the pressure space, we would use Z.sub(1). This problem has a constant pressure nullspace and we must ensure that our solver removes this space. To do so, we build a nullspace object on line 42, which will subsequently be passed to the solver, and PETSc will seek a solution in the space orthogonal to the provided nullspace. We finally come to solving the variational problems, with problem and solver objects created on lines 59-62. We pass in the residual functions F_Stokes and F_energy, solution fields (z, T_new), boundary conditions and, for the Stokes system, the nullspace object. Solution of the two variational problems is undertaken by the PETSc library (Balay et al., 1997), guided by the solver parameters specified on lines 50-56 (see Balay et al., 2021a, b, for available options). Each solve entails the assembly of a Jacobian matrix and a right-hand side vector based on the residual, as indicated in Equations (21), (22) and (23) for the energy equation, and Equations (27), (28), (29) and (30) for the Stokes equation. Note, however, that the symbolic expression for the Jacobian is derived automatically in UFL. Firedrake's TSFC (Homolya et al., 2018) subsequently converts the UFL into highly optimised assembly code, which is then executed to create the matrix and vectors, with the resulting system passed to PETSc for solution. Finally, we note that output is written on line 66, to a .pvd file, initialised on line 45, for visualisation in software such as ParaView (e.g. Ahrens et al., 2005).
In < 70 lines of Python, we are able to produce a model that can be executed and quantitatively compared with benchmark results from Blankenbach et al. (1989). To do so, we have computed the RMS velocity and surface Nusselt number at a range of different mesh resolutions and Rayleigh numbers, with results presented in Figure 1. Results converge towards the benchmark solutions with increasing resolution. The final steady-state temperature field, at Ra = 1 × 10^6, is illustrated in Figure 2(a).

To further highlight the flexibility of Firedrake, we have also simulated these cases using a Q1 discretisation for the temperature field. The modifications necessary are minimal: on line 9, the polynomial order is changed from 2 to 1. Results, at Ra = 1 × 10^6, are presented in Figure 1(e, f), again converging towards benchmark values with increasing resolution. We find that, as expected, a Q2 temperature discretisation leads to more accurate results, although results converge towards the benchmark solutions from different directions. For the remainder of the examples considered herein, we use a Q2 discretisation for temperature.

Extension: more realistic physics
We next highlight the ease with which simulations can be updated to incorporate more realistic physical approximations. All cases are set up in an enclosed 2-D Cartesian box with free-slip boundary conditions, with the required changes discussed relative to the base case presented in Section 5.1.

Compressibility
The governing equations applicable for compressible mantle convection, under the Anelastic Liquid Approximation (ALA), are presented in Appendix A (based on, for example, Schubert et al., 2001). Their weak forms are derived by multiplying these equations by appropriate test functions and integrating over the domain, as we did with their incompressible counterparts in Section 4. They differ appreciably from the incompressible approximations that have been utilised thus far, with important updates to all three governing equations. Despite this, the changes required to incorporate these equations within UFL and Firedrake are minimal.

Although King et al. (2009) examined a number of cases, we focus on one illustrative example here, at Ra = 10⁵ and a dissipation number Di = 0.5. This allows us to demonstrate the ease with which these cases can be configured within Firedrake.
The required changes, relative to the base case, are displayed in Listing 2. They can be summarised as follows:

17 stress = 2 * mu * sym(grad(u)) - 2./3. * I * mu * div(u)
18 F_stokes = inner(grad(v), stress) * dx + dot(v, grad(p)) * dx - (dot(v, k) * (Ra * Ttheta * rhobar * alphabar - (Di/gruneisen) * (cpr/cvr) * rhobar * chibar * p) * dx)
19 F_stokes += dot(grad(w), rhobar * u) * dx  # Mass conservation
20 F_energy = q * rhobar * cpbar * ((Tnew - Told) / delta_t) * dx + q * rhobar * cpbar * dot(u, grad(Ttheta)) * dx + dot(grad(q), tcond * grad(Tbar + Ttheta)) * dx + q * (alphabar * rhobar * Di * u[1] * Ttheta) * dx - q * ((Di/Ra) * inner(stress, grad(u))) * dx

2. The UFL for the momentum, mass conservation and energy equations is updated, emphasising once again the resemblance to the mathematical formulation (lines 16-20). The key changes are as follows: (i) the stress tensor is updated to account for a non-zero velocity divergence (line 17), where Identity represents a unit matrix of a given size.
3. Temperature boundary conditions are updated, noting that we are solving for deviatoric temperature rather than the full temperature, which also includes the reference state.
4. In our Stokes solver, we only specify the transpose_nullspace option (as opposed to both the nullspace and transpose_nullspace options for our base case): the incorporation of dynamic pressure's impact on buoyancy implies that the (right-hand side) pressure nullspace is no longer the same as the (left-hand side) transpose nullspace. The transpose nullspace remains the same space of constant pressure solutions, and is used to project out these modes from the initial residual vector to ensure that the linear system is well-posed. The right-hand side nullspace now consists of different modes, which can be found through integration.
However, this nullspace is only required for iterative linear solvers in which the modes are projected out from the solution vector at each iteration to prevent its unbounded growth.
We note that in setting up the Stokes solver as we have, we incorporate the pressure effect on buoyancy implicitly, as advocated by Leng and Zhong (2008). As this term depends on the pressure that we are solving for, an extra term is required in addition to the pressure gradient matrix G in the Jacobian matrix in Equation (27). The inclusion of the reference density, ρ̄, in the continuity constraint also means that this term is no longer simply represented by the transpose of G. Such changes are automatically incorporated by Firedrake, highlighting a major benefit of the automatic assembly approach that is utilised. To ensure the validity of our approach, we have computed the RMS velocity and Nusselt number at a range of different mesh resolutions, for direct comparison with King et al. (2009), with results presented in Figure 3, alongside the final steady-state (full) temperature field. As expected, results converge towards the benchmark solutions with increasing resolution, demonstrating the applicability and accuracy of Firedrake for compressible simulations of this nature.

Viscoplastic Rheology
To illustrate the changes necessary to incorporate a viscoplastic rheology, which is more representative of deformation within Earth's mantle and lithosphere, we examine a case from Tosi et al. (2015), a benchmark study intended to form a straightforward extension to Blankenbach et al. (1989). Indeed, aside from the viscosity and reference Rayleigh number (Ra₀ = 10²), all other aspects of this case are identical to the case presented in Section 5.1. The viscosity field, µ, is calculated as the harmonic mean between a linear component, µ_lin, and a nonlinear, plastic component, µ_plast, which is dependent on the strain-rate, as follows:

µ(T, z, ε̇) = 2 (1/µ_lin(T, z) + 1/µ_plast(ε̇))⁻¹.

The linear part is given by an Arrhenius law (the so-called Frank-Kamenetskii approximation):

µ_lin(T, z) = exp(−γ_T T + γ_z z),

where γ_T = ln(∆µ_T) and γ_z = ln(∆µ_z) are parameters controlling the total viscosity contrast due to temperature and depth, respectively. The nonlinear component is given by:

µ_plast(ε̇) = µ* + σ_Y / √(ε̇ : ε̇),

where µ* is a constant and σ_Y is the yield stress. We examine a case with µ* = 10⁻³, which allows us to demonstrate how a temperature-, depth- and strain-rate-dependent viscosity is incorporated within Firedrake. The changes required to simulate this case, relative to our base case, are displayed in Listing 3. These are: 1. Linear solver options are no longer applicable, given the dependence of viscosity on the flow field through the strain-rate. Accordingly, the solver dictionary is updated to account for the nonlinear nature of our Stokes system (lines 2-11).
For the first time, we fully exploit the SNES, using a setup based on Newton's method ("snes_type": "newtonls") with a secant line search over the L2-norm of the function ("snes_linesearch_type": "l2"). As we target a steady-state solution, an absolute tolerance is specified for our nonlinear solver ("snes_atol": 1e-10).

2. Solver options differ between the (nonlinear) Stokes and (linear) energy systems. As such, a separate solver dictionary is specified for solution of the energy equation (lines 13-20). Consistent with our base case, we use a direct solver for solution of the energy equation, based on the MUMPS library.
3. Viscosity is calculated as a function of temperature, depth (µ_lin: line 29) and strain-rate (µ_plast: line 30), using constants specified on lines 25-26. Linear and nonlinear components are subsequently combined via a harmonic mean (line 31).

4. Updated solver dictionaries are incorporated into their respective solvers on lines 35 and 36, noting that for this case both the nullspace and transpose_nullspace options are provided for the Stokes system, consistent with the base case.
We note that even though the UFL for the Stokes and energy systems remains identical to our base case, assembly of additional terms in the Jacobian, associated with the nonlinearity in this system, is once again handled automatically by Firedrake.
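As an illustration of the rheological law used in this case, the following plain-NumPy sketch evaluates a harmonic-mean viscoplastic viscosity of the Tosi et al. (2015) form; the parameter values below are illustrative assumptions, not the benchmark's.

```python
import numpy as np

# Illustrative parameters (assumptions for this sketch):
gamma_T = np.log(1e5)    # temperature-induced viscosity contrast
gamma_z = np.log(10.0)   # depth-induced viscosity contrast
mu_star, sigma_y = 1e-3, 1.0

def viscosity(T, z, strain_rate):
    """Harmonic mean of linear (Frank-Kamenetskii) and plastic components."""
    mu_lin = np.exp(-gamma_T * T + gamma_z * z)
    edot_II = np.sqrt(np.tensordot(strain_rate, strain_rate))  # sqrt(edot:edot)
    mu_plast = mu_star + sigma_y / edot_II
    return 2.0 / (1.0 / mu_lin + 1.0 / mu_plast)

edot = np.array([[1.0, 0.0], [0.0, -1.0]])   # trace-free (incompressible) strain-rate
mu_cold = viscosity(0.0, 0.0, edot)          # cold material: mu_lin = 1
mu_hot = viscosity(1.0, 0.0, edot)           # hot material: mu_lin = 1e-5 dominates
assert 0.0 < mu_hot < mu_cold
```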

Taken together, our compressible and viscoplastic rheology results demonstrate the accuracy and applicability of Firedrake for problems incorporating a range of different approximations to the underlying physics. They have allowed us to illustrate Firedrake's flexibility: by leveraging UFL and PETSc, the framework is easily extensible, allowing for straightforward application to scenarios involving different physical approximations, even if they require distinct solution strategies.

In this section, we highlight the ease with which simulations can be examined in different dimensions and geometries, by modifying our basic 2-D case. We primarily simulate benchmark cases that are well-known within the geodynamical community, initially matching the steady-state, isoviscous simulation of Busse et al. (1994), and subsequently those of Zhong et al. (2008). Once again, the changes required to run these cases are discussed relative to our base case (Section 5.1), unless noted otherwise.
3. The inclusion of Python dictionaries that define iterative solver parameters for the Stokes and energy systems (lines ...). Although direct solves provide robust performance in the 2-D cases examined above, in 3-D the computational (CPU and memory) requirements quickly become intractable. PETSc's fieldsplit pc_type provides a class of preconditioners for mixed problems that allows one to apply different preconditioners to different blocks of the system. Here we configure the Schur complement approach as described in Section 4.3.
The fieldsplit_0 entries configure solver options for the first of these blocks, the K matrix. The linear systems associated with this matrix are solved using a combination of the Conjugate Gradient method (cg, line 23) and an algebraic multigrid preconditioner (gamg, line 27). We also specify two options (gamg_threshold and gamg_square_graph) that control the aggregation method (coarsening strategy) in the GAMG preconditioner, balancing multigrid effectiveness (convergence rate) against coarse-grid complexity (cost per iteration) (Balay et al., 2021a).
The fieldsplit_1 entries contain solver options for the Schur complement solve itself. As explained in Section 4.3, we do not have explicit access to the Schur complement matrix, G^T K⁻¹ G, but can compute its action on any vector, at the cost of a fieldsplit_0 solve with the K matrix, which is sufficient to solve the system using a Krylov method. However, for preconditioning, we do need access to the values of the matrix or an approximation thereof. For this purpose, we approximate the Schur complement matrix with a mass matrix scaled by viscosity, which is implemented in MassInvPC (line 35), with the viscosity provided through the optional appctx argument on line 71. This is a simple example of Firedrake's powerful programmable preconditioner interface which, in turn, connects with the Python preconditioner interface of PETSc (line 34). In more complex cases, the user can specify their own linear operator in UFL that approximates the true linear operator but is easier to invert. The MassInvPC preconditioner step itself is performed through a linear solve with the approximate matrix, with options prefixed with Mp_ specifying a Conjugate Gradient solver with symmetric SOR (SSOR) preconditioning (lines 36-38). Note that PETSc's sor preconditioner type, specified on line 38, defaults to the symmetric SOR variant. Since this preconditioner step now involves an iterative solve, the Krylov method used for the Schur complement needs to be of flexible type, and we specify flexible GMRES (fgmres) on line 32.

Specification of the matrix type matfree (line 16) for the combined system ensures that we do not explicitly assemble its associated sparse matrix, instead computing the matrix-vector multiplications required by the Krylov iterations as they arise. Again, for preconditioning in the K-matrix solve, we need access to matrix values, which is achieved using AssembledPC. This explicitly assembles the K-matrix by extracting the relevant terms from the F_stokes form.
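The Schur complement strategy described above can be summarised in a small dense-matrix sketch (plain NumPy, not the Firedrake/PETSc machinery; in the matrix-free setting only the action of S is available):

```python
import numpy as np

# Saddle-point system [[K, G], [G^T, 0]] [u; p] = [f; 0], solved via the
# Schur complement S = G^T K^{-1} G.
rng = np.random.default_rng(0)
n, m = 8, 3
A = rng.standard_normal((n, n))
K = A @ A.T + n * np.eye(n)        # symmetric positive-definite "viscosity" block
G = rng.standard_normal((n, m))    # discrete gradient (full column rank, generically)
f = rng.standard_normal(n)

S = G.T @ np.linalg.solve(K, G)    # dense here; matrix-free in practice
p = np.linalg.solve(S, G.T @ np.linalg.solve(K, f))   # pressure solve
u = np.linalg.solve(K, f - G @ p)                     # velocity back-substitution

# Cross-check against a monolithic solve of the full block system:
Z = np.block([[K, G], [G.T, np.zeros((m, m))]])
up = np.linalg.solve(Z, np.concatenate([f, np.zeros(m)]))
assert np.allclose(np.concatenate([u, p]), up)
```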
Finally, the energy solve is performed through a combination of the GMRES (gmres) Krylov method and SSOR preconditioning (lines 42-47). For all iterative solves, we specify a convergence criterion based on the relative reduction of the preconditioned residual (ksp_rtol: lines 24, 33, 36 and 46). The Stokes solver is updated to include the near-nullspace options defined above, in addition to the optional appctx keyword argument that passes the viscosity through to our MassInvPC Schur complement preconditioner. Energy solver options are also updated relative to our base case (lines 72-73), using the dictionary created on lines 42-47. Our model results can be validated against those of Busse et al. (1994). As with our previous examples, we compute the Nusselt number and RMS velocity at a range of different mesh resolutions, with results presented in Figure 5. We find that results converge towards the benchmark solutions with increasing resolution, as expected. The final steady-state temperature field is illustrated in Figure 2.

2-D Cylindrical Shell Domain
We next examine simulations in a 2-D cylindrical domain. The domain is defined by the radii of the inner (r_min) and outer (r_max) boundaries. These are chosen such that the non-dimensional depth of the mantle, z = r_max − r_min = 1, and the ratio of the inner and outer radii, f = r_min/r_max = 0.55, thus approximating the ratio between the radii of Earth's surface and core-mantle boundary (CMB). Specifically, we set r_min = 1.22 and r_max = 2.22. The initial temperature distribution, chosen to produce 4 equidistant plumes, is prescribed as:

T(x, y) = (r_max − r) + A cos(4θ) sin((r − r_min) π),

where r = √(x² + y²), θ = atan2(y, x), and A = 0.02 is the amplitude of the initial perturbation. Boundary conditions for temperature are T = 0 at the surface (r_max) and T = 1 at the base (r_min). Free-slip velocity boundary conditions are specified on both boundaries, which we incorporate weakly through the Nitsche approximation (see Section 4.1). The Rayleigh number Ra = 1 × 10⁵.
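A quick plain-Python check of an initial condition with these properties (the functional form below is an assumption, chosen to be consistent with the stated amplitude, plume count and boundary values):

```python
import numpy as np

r_min, r_max, A = 1.22, 2.22, 0.02

def T_init(r, theta):
    # Assumed form: conductive-like background plus a 4-fold perturbation
    # that vanishes on both boundaries.
    return (r_max - r) + A * np.cos(4.0 * theta) * np.sin((r - r_min) * np.pi)

theta = np.linspace(0.0, 2.0 * np.pi, 101)
assert np.allclose(T_init(r_min, theta), 1.0)   # T = 1 at the base
assert np.allclose(T_init(r_max, theta), 0.0)   # T = 0 at the surface
# The perturbation has period pi/2 in theta, i.e. 4 equidistant plumes:
assert np.allclose(T_init(1.72, theta), T_init(1.72, theta + np.pi / 2.0))
```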

With a free-slip boundary condition on both boundaries, one can add an arbitrary rotation of the form (−y, x) = r θ̂ to the velocity solution (i.e. this case incorporates a velocity nullspace, as well as a pressure nullspace). As noted in Section 4, these lead to null-modes (eigenvectors) for the linear system, rendering the resulting matrix singular. In preconditioned Krylov methods, these null-modes must be subtracted from the approximate solution at every iteration (Kramer et al., 2021a), which we illustrate through this example. The key changes required to simulate this case, displayed in Listing 5, are: ... (integrating over the top or bottom surface of the mesh, respectively). FacetArea and CellVolume return, respectively, A_f and V_cf, as required by Equation 17. Given that velocity boundary conditions are handled weakly through UFL, they are no longer passed to the Stokes problem as a separate option (line 46).
4. We define the rotational nullspace for velocity and combine this with the pressure nullspace in the mixed finite element space Z (lines 30-35). Constant and rotational near-nullspaces, utilised by our GAMG preconditioner, are also defined on lines 37-42, with this information passed to the solver on line 47. Note that iterative solver parameters identical to those presented in the previous example are used (see Section 5.3.1).
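The subtraction of the rotational null-mode from an approximate solution, as performed at each Krylov iteration, amounts to a simple orthogonal projection; a minimal NumPy sketch (an analogy, not the Firedrake API):

```python
import numpy as np

# Sample points in an annulus, and the discrete rotational mode (-y, x) there.
rng = np.random.default_rng(1)
theta = rng.uniform(0.0, 2.0 * np.pi, 50)
r = rng.uniform(1.22, 2.22, 50)
x, y = r * np.cos(theta), r * np.sin(theta)

rot = np.column_stack([-y, x]).ravel()   # rotational null-mode, flattened
rot /= np.linalg.norm(rot)               # normalise

u = rng.standard_normal(rot.size)        # some approximate velocity solution
u -= rot * (rot @ u)                     # project out the null-mode component
assert abs(rot @ u) < 1e-10
```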
Our predicted Nusselt numbers converge towards those of Fluidity with increasing resolution (Figure 6), demonstrating the accuracy of our approach. Predicted RMS velocities exceed those of Fluidity, albeit only by ∼ 0.1%, but lie within the bounds set by other codes for this case (Wilson, Pers. Comm., using TerraFERMA: Wilson et al., 2017). To further assess the validity of our setup, we have confirmed the accuracy of our solutions to the Stokes system in this 2-D cylindrical geometry, through comparisons with analytical solutions from Kramer et al. (2021a), for both zero-slip and free-slip boundary conditions. These provide a suite of solutions based upon a smooth forcing term, at a range of wave-numbers n, with radial dependence formed by a polynomial of arbitrary order k. We study the convergence of our Q2-Q1 discretisation with respect to these solutions.

https://doi.org/10.5194/gmd-2021-367 Preprint. Discussion started: 13 January 2022 © Author(s) 2022. CC BY 4.0 License.

Listing 6. Difference in Firedrake code required to reproduce 3-D spherical benchmark cases from Zhong et al. (2008).
Convergence plots are illustrated in Figure 7. We observe super-convergence for the Q2-Q1 element pair at fourth- and second-order, for velocity and pressure, respectively, with both zero-slip and free-slip boundary conditions, which is higher than the theoretical (minimum) expected order of convergence of three for velocity and two for pressure (we note that super-convergence was also observed in Zhong et al., 2008; Kramer et al., 2021a). Cases with lower wave-number, n, show smaller relative error than those at higher n, as expected. The same observation holds for lower and higher polynomial order, k = 2 and k = 4, for the radial density profile. Python scripts for these analytical comparisons can be found in the repository accompanying this paper.
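The quoted convergence orders follow from errors on successively refined meshes; a small helper sketch (the error values below are made up, purely for illustration):

```python
import math

def observed_order(e_coarse, e_fine, refinement=2.0):
    """Observed convergence order from errors on two meshes differing by
    the given refinement factor: p = log(e_coarse/e_fine) / log(refinement)."""
    return math.log(e_coarse / e_fine) / math.log(refinement)

# Illustrative errors halving the mesh spacing at fourth order (velocity)
# and second order (pressure):
assert abs(observed_order(1.6e-4, 1.0e-5) - 4.0) < 1e-9
assert abs(observed_order(4.0e-4, 1.0e-4) - 2.0) < 1e-9
```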

3-D Spherical Shell Domain
We next move into a 3-D spherical shell geometry, which is required to simulate global mantle convection. We examine a well-known isoviscous community benchmark case (e.g. Bercovici et al., 1989; Ratcliff et al., 1996; Zhong et al., 2008; Davies et al., 2013), at a Rayleigh number of Ra = 7 × 10³, with free-slip velocity boundary conditions. Temperature boundary conditions are set to 1 at the base of the domain (r_min = 1.22) and 0 at the surface (r_max = 2.22), with the initial temperature distribution approximating a conductive profile with superimposed perturbations triggering tetrahedral symmetry at spherical harmonic degree l = 3 and order m = 2 (see Zhong et al., 2008, for further details).
As illustrated in Listing 6, when compared to the 2-D cylindrical case examined in Section 5.3.2, the most notable change required to simulate this 3-D case is the generation of the underlying mesh. We use Firedrake's built-in CubedSphereMesh and extrude it radially through 16 layers, forming hexahedral elements. As with our cylindrical example, we approximate the curved domain quadratically, using the optional keyword argument degree=2. Further required changes, highlighted in Listing 6, relate to 3-D extensions of the velocity nullspace, and the near-nullspaces required by the GAMG preconditioner, all of which are simple. We do not show the changes associated with extending the radial unit vector to 3-D, or the initial condition for temperature, given that they are straightforward, although, as with all examples, a complete Python script for this case can be found in the repository accompanying this paper.

Figure 8 caption (partial): ... (Bercovici et al., 1989; Ratcliff et al., 1996; Yoshida and Kageyama, 2004; Stemmer et al., 2006; Choblet et al., 2007; Tackley, 2008; Zhong et al., 2008; Davies et al., 2013); (c) final steady-state temperature field highlighted through isosurfaces at temperature anomalies (i.e. away from radial average) of T = −0.15 (blue) and T = 0.15 (orange), with the core-mantle boundary at the base of the spherical shell marked by a red surface; (d-f) as in a-c, but for a temperature-dependent-viscosity case, with thermally induced viscosity contrasts of 10². Fewer codes have published predictions for this case, but results of Zhong et al. (2008) are marked by dashed red lines.

Figure 9 caption: Convergence of velocity and pressure for 3-D spherical cases with zero-slip and free-slip boundary conditions, for perturbations at a range of spherical harmonic degrees l and orders m. Note that all cases with a smooth forcing are run at k = l + 1.
Refinement level 3 corresponds to the level specified for our cubed-sphere mesh, comprising 384 elements in the tangential direction, which is extruded radially through 8 layers. Resolution is doubled in all directions at subsequent refinement levels.

Free-Slip
Despite the simplicity of our setup, the accuracy of our approach is confirmed via comparison of both Nusselt numbers and RMS velocities with those of previous studies (e.g. Bercovici et al., 1989; Ratcliff et al., 1996; Yoshida and Kageyama, 2004; Stemmer et al., 2006; Choblet et al., 2007; Tackley, 2008; Zhong et al., 2008; Davies et al., 2013). For completeness, the final steady-state temperature field is illustrated in Figure 8(c). Furthermore, in line with our 2-D cases, we have confirmed the accuracy of our Stokes solver for both zero-slip and free-slip boundary conditions in a 3-D spherical geometry, through comparisons with analytical solutions from Kramer et al. (2021a), which provide solutions based upon a smooth forcing term at a range of spherical harmonic degrees, l, and orders, m, with radial dependence formed by a polynomial of arbitrary order k. As with our 2-D cases, we observe super-convergence for the Q2-Q1 element pair at fourth- and second-order, for velocity and pressure, respectively, with both zero-slip and free-slip boundary conditions (Figure 9).

This section has allowed us to highlight a number of Firedrake's benefits over other codes: (i) the ease with which simulations can be examined in different geometries, with minimal changes to the Python code, facilitated by Firedrake's built-in mesh generation utilities and extrusion functionality; (ii) the ease with which iterative solver configurations and preconditioners can be updated and tested, including scenarios incorporating multiple nullspaces, facilitated by Firedrake's fully-programmable solver interface, alongside its customisable preconditioner interface, both of which are seamlessly coupled to PETSc; and (iii) the convergence properties of our finite element system, in geometries that are representative of Earth's mantle. Taken together, these confirm Firedrake's suitability for simulations of global mantle dynamics, as will be further highlighted in Section 7.

Parallel Scaling
We assess parallel scalability using a 3-D spherical case similar to that presented in Section 5.3.3, albeit incorporating a temperature-dependent viscosity, following a Frank-Kamenetskii relation of the form:

µ(T) = exp(−E T),

where E is a parameter that controls the temperature dependence of viscosity (the example considered is based on Case A4 of Zhong et al., 2008).

The most challenging aspect of weak parallel scaling is solver performance as the problem size increases. Whilst the amount of computation in equation assembly typically scales linearly with the number of DOFs (before taking parallel aspects such as communication into account), solver scaling is generally worse. In the case of iterative solvers, this is due to a deterioration in the conditioning of the matrix, driving an increase in the number of iterations required for convergence. As a result, even if the cost per iteration scales linearly, the overall cost will not. This implies that for weak scaling, the amount of work per core may increase rapidly, despite the number of DOFs per core remaining consistent.
The deterioration in conditioning is intimately related to the fact that an increase in resolution increases the ratio between the smallest and largest resolvable length-scales. For elliptic operators, like the viscosity matrix K, the condition number scales with the square of that ratio (e.g. Kramer et al., 2010). Multigrid approaches, which separate smaller and larger length scales on a hierarchy of fine to coarse meshes, are commonly used to address this problem, motivating the choice of the algebraic multigrid preconditioner, GAMG, used here. Such approaches aim to maintain a constant, or only slowly increasing, number of iterations and, thus, a near-linear scaling of the overall cost as the problem size increases. This can be a challenge, however: for instance, an increase in resolution will require more multigrid levels, leading to an increased setup time and cost per iteration. In practice, when configuring the multigrid method, a compromise needs to be found between the effectiveness of multigrid in limiting the number of iterations, and not allowing the setup and costs per iteration to grow too rapidly. The two options, gamg_threshold and gamg_square_graph, specified in our solver setup, ensure a balance between multigrid effectiveness and coarse-grid complexity.
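The quadratic growth of the condition number with resolution can be observed directly for a model elliptic operator (a 1-D Laplacian; a minimal NumPy sketch):

```python
import numpy as np

def laplacian_condition(n):
    """Condition number of the n x n Dirichlet Laplacian tridiag(-1, 2, -1)."""
    A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    return np.linalg.cond(A)

c1, c2 = laplacian_condition(20), laplacian_condition(40)
# Doubling resolution roughly quadruples the condition number (~ (1/h)^2):
assert 3.0 < c2 / c1 < 5.0
```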
A breakdown of key parallel scaling results is presented in Figure 10. Panel (a) displays the average number of iterations per solve over the 20 timesteps. We find that the number of pressure (the Schur complement solve: fieldsplit_1) and energy solve iterations remains flat (12 and ∼ 10.5, respectively), whilst the number of velocity solve iterations (inversion of the matrix K, using the GAMG preconditioner: fieldsplit_0) increases only slowly, from ∼ 41 to ∼ 51, over a greater than three-orders-of-magnitude increase in problem size and number of processor cores. This demonstrates algorithmic scalability on up to 12288 cores and ∼ 50 × 10⁶ elements (which corresponds to ∼ 1.25 × 10⁹ velocity and pressure degrees of freedom).

Parallel scalability can also be assessed by analysing the growth in CPU-time of the dominant components of our problem: assembly of the finite element systems (Figure 10b), setup of the algebraic multigrid (GAMG) preconditioner (Figure 10c), and time spent solving the Schur complement system (Figure 10d). We find that the assembly time is a negligible fraction of this problem. The setup time for our GAMG preconditioner grows from ∼ 240 s on 24 cores to ∼ 470 s on 12288 cores. This is understandable, given the large communication costs associated with setting up the various multigrid levels, particularly for problems incorporating nullspaces and near-nullspaces, as is the case here. We note, however, that this is not a concern. We also note that the change in gradient, apparent in panels b-f when moving from 24-192 and 192-12288 cores, arises due to a transition from running simulations on a single compute node to multiple nodes.

In this section, we demonstrate the application of Firedrake to a time-dependent simulation of global mantle convection in a 3-D spherical shell geometry, at a realistic Rayleigh number. As with the examples provided above, calculations are performed using a hexahedral Q2-Q1 element pair for velocity and pressure. We use a Q2 discretisation for temperature and, given the increased importance of advection at higher Rayleigh numbers, incorporate stabilisation through a streamline upwinding scheme, following Donea and Huerta (2003). Our solution strategy for the Stokes and energy equations is otherwise identical to the spherical examples presented above.
For simplicity, we assume an incompressible mantle and a linear temperature- and depth-dependent rheology, following a relation of the form:

µ(T, z) = µ₀(z) exp(−E T).

Here µ₀ is a reference viscosity that increases by a factor of 40 below the mantle transition zone, and E = ln(1000) controls the temperature dependence of viscosity. Prescribed surface velocities drive cold downwellings in regions of plate convergence, and the distribution of hot upwelling material is strongly modulated by these downwellings, with upwelling plumes concentrating in two clusters beneath the African continent and the central Pacific Ocean (i.e. away from regions that have experienced subduction over the past 150 Myr or so). The cluster of plumes in the Pacific is reasonably circular, whilst those beneath Africa extend in a NW-SE trending structure, which to the north curves eastward under Europe and to the south extends into the Indian Ocean.
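Assuming a rheology of the form µ(T, z) = µ₀(z) exp(−E T) (this functional form is our assumption for the relation described here), the stated parameters imply three orders of magnitude of temperature-induced viscosity variation, plus a factor-40 depth jump:

```python
import math

E = math.log(1000.0)  # temperature dependence parameter, as stated

def mu(T, below_transition_zone, mu0=1.0):
    # Assumed relation: mu = mu0(z) * exp(-E*T), with mu0 increasing by a
    # factor of 40 below the mantle transition zone.
    mu0 = 40.0 * mu0 if below_transition_zone else mu0
    return mu0 * math.exp(-E * T)

# Temperature alone yields three orders of magnitude of viscosity variation:
assert abs(mu(0.0, False) / mu(1.0, False) - 1000.0) < 1e-6
# Depth dependence contributes a further factor of 40:
assert abs(mu(0.5, True) / mu(0.5, False) - 40.0) < 1e-9
```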
Further analysis of this proof-of-concept simulation is beyond the scope of this study. However, when combined with the benchmark and parallel scaling analyses presented above, our model predictions, which are consistent with those from a number of previous studies (e.g. Bunge et al., 2002; Davies et al., 2012; Bower et al., 2013; Davies et al., 2015a), confirm Firedrake's applicability for realistic, time-dependent, global mantle dynamics simulations of this nature.

Discussion
Firedrake is a next-generation automated system for solving variational problems using the finite element method (e.g. Rathgeber et al., 2016; Gibson et al., 2019). It has a number of features that are ideally suited to simulating geophysical fluid dynamics problems, as exemplified by its use in application areas such as coastal ocean modelling (Kärnä et al., 2018), numerical weather prediction (Shipton et al., 2018), and glacier flow modelling (Shapero et al., 2021). The focus of this manuscript has been to demonstrate Firedrake's applicability for geodynamical simulation, with an emphasis on global mantle dynamics. To do so, we presented, analysed and validated Firedrake against a number of benchmark and analytical cases, of systematically increasing complexity, building towards a realistic time-dependent global simulation.
In order to introduce the core components, and to illustrate the elegance of setting up and validating a geodynamical model in Firedrake, we started with a simple, incompressible, isoviscous case in an enclosed 2-D Cartesian box. Setting up this problem was straightforward, requiring only a weak formulation of the governing equations for specification in UFL, together with a mesh, initial and boundary conditions, and appropriate discrete function spaces. By utilising Firedrake's built-in meshing functionality and default direct solver options, we were able to demonstrate the framework's accuracy for simulations of this nature: in less than 70 lines of Python, we reproduced results from the well-established benchmark study of Blankenbach et al. (1989).

Representative simulations of mantle and lithosphere dynamics, however, incorporate more complicated physics. To demonstrate Firedrake's applicability in such scenarios, we next set up 2-D simulations that accounted for compressibility, through the Anelastic Liquid Approximation (Schubert et al., 2001), and a nonlinear viscosity that depends upon temperature, depth and strain-rate. Our results were validated through comparison with the benchmark studies of King et al. (2009) and Tosi et al.

(2015), respectively. For compressible cases, despite the governing equations differing appreciably from their incompressible counterparts, the modifications required to our setup were minimal, with the most notable change being the UFL describing the relevant PDEs. For the viscoplastic rheology case, where viscosity varied by several orders of magnitude across the domain, an appropriate solution strategy was required to deal with the nonlinear coupling between strain-rate and viscosity: Firedrake's fully-programmable solver interface and seamless coupling to PETSc facilitated the straightforward use of PETSc's Scalable Nonlinear Equations Solvers (SNES). These cases highlighted one of Firedrake's key benefits: by leveraging UFL (Alnes et al., 2014), associated strategies for automatic assembly of finite element systems, and PETSc (Balay et al., 1997, 2021a), the framework is easily extensible, allowing for straightforward application to problems involving different physical approximations, even when they require distinct solution strategies.

This is further highlighted with the transition from 2-D to 3-D. With modifications to only a few lines of Python, the basic 2-D Cartesian case described above was easily extended to 3-D, allowing for comparison and validation against the well-established benchmark results of Busse et al. (1994). However, the direct solvers used for our 2-D cases quickly become computationally intractable in 3-D, necessitating the use of an iterative approach. Firedrake's programmable solver interface facilitates the straightforward inclusion of Python dictionaries that define iterative solver parameters for the Stokes and energy systems. A number of different schemes have been advocated by the geodynamical modelling community (e.g.
May and Moresi, 2008; Burstedde et al., 2013), but in all 3-D simulations examined herein, the Schur complement approach was utilised for solution of our Stokes system, exploiting the fieldsplit preconditioner type to apply preconditioners, including algebraic multigrid, to different blocks of the system. A Crank-Nicolson scheme was utilised for temporal discretisation of the energy equation, with a standard GMRES Krylov method with SOR preconditioning used for solution. We have demonstrated that such solution strategies are effective and scalable, with algorithmic scalability confirmed on up to 12288 cores.
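The Crank-Nicolson scheme referred to above can be sketched on a model 1-D diffusion problem (plain NumPy, not the Firedrake energy solve), averaging the diffusive term between the old and new time levels:

```python
import numpy as np

# 1-D diffusion dT/dt = d2T/dx2 with homogeneous Dirichlet ends.
n, dt, dx = 20, 1e-3, 1.0 / 21
L = (np.eye(n, k=1) - 2.0 * np.eye(n) + np.eye(n, k=-1)) / dx**2
I = np.eye(n)

x = np.linspace(dx, 1.0 - dx, n)   # interior grid points
T = np.sin(np.pi * x)              # initial condition: lowest diffusive mode
for _ in range(100):
    # Crank-Nicolson step: (I - dt/2 L) T_new = (I + dt/2 L) T_old
    T = np.linalg.solve(I - 0.5 * dt * L, (I + 0.5 * dt * L) @ T)

# The sine mode decays like exp(-pi^2 t); compare at t = 0.1:
exact = np.sin(np.pi * x) * np.exp(-np.pi**2 * 0.1)
assert np.allclose(T, exact, rtol=1e-2)
```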

Cartesian simulations offer a means to better understand the physical mechanisms controlling mantle convection, but a 3-D spherical shell geometry is required to simulate global mantle dynamics. We have demonstrated how Firedrake's built-in meshing and extrusion functionality facilitates the effortless transition to such geometries (in addition to comparable 2-D cylindrical shell geometries), whilst its Python user-interface allows for the simple inclusion of a radial gravity direction and boundary conditions that are not aligned with Cartesian directions. The convergence properties and accuracy of our simulations in a 3-D spherical geometry have been demonstrated through comparison with the extensive set of analytical solutions introduced by Kramer et al. (2021a), and a series of low Rayleigh number isoviscous and temperature-dependent viscosity simulations from Zhong et al. (2008). We observed super-convergence for the Q2-Q1 element pair at fourth- and second-order, for velocity and pressure, respectively.
Having validated Firedrake against this broad suite of cases, we finally applied the framework to a realistic simulation of global mantle convection. For simplicity, we assumed an incompressible mantle and a linear temperature- and depth-dependent rheology, assimilating 230 Myr of plate motion histories (Müller et al., 2016) through a kinematic surface boundary condition.
These prescribed plate velocities organise underlying mantle flow, such that the predicted present-day convective planform is dominated by cold downwellings in regions of plate convergence, with upwellings concentrating elsewhere, particularly beneath the African and Pacific domains. Our model predictions reproduce first-order characteristics of the structure of Earth's mantle imaged through seismology (e.g. Ritsema et al.; French and Romanowicz, 2015) and the geographical distribution of mantle plumes (e.g. Austermann et al., 2014; Davies et al., 2015b), and are consistent with those from a number of previous studies (e.g. Bunge et al., 2002; Davies et al., 2012; Bower et al., 2013; Davies et al., 2015a). They serve as a proof-of-concept, confirming Firedrake's applicability for realistic, time-dependent, global simulations of this nature and, accordingly, its suitability for addressing research problems from the very frontiers of geodynamical research.
1. Finite element discretisations: Firedrake supports a wide range of finite elements, including continuous, discontinuous, H(div) and H(curl) discretisations, and elements with continuous derivatives such as the Argyris and Bell elements (see Kirby and Mitchell, 2019, for an overview). Some of these could offer major advantages for geodynamical simulation. For example, a number of studies now advocate the use of Discontinuous Galerkin (DG) schemes for solution of the energy equation (e.g. Vynnytska et al., 2013; He et al., 2017). Importantly, Firedrake's simple API allows a user to escape the UFL abstraction and implement common operations that fall outside of pure variational formulations, such as flux limiters, which are central to DG schemes.
Firedrake also provides the necessary infrastructure for hybridisation strategies (Gibson et al., 2019), which allow the many extra degrees of freedom introduced by DG schemes into the global system to be reduced to a smaller subset, defined on element interfaces through so-called trace elements. This offers the prospect of more efficient ways of solving the Stokes system (e.g. Cockburn and Shi, 2014). Such possibilities will be explored in future work, noting that Firedrake's existing support for these elements will facilitate rapid and efficient testing and validation.
2. Fully coupled nonlinear systems: in all examples considered herein, we solve for velocity and pressure in a separate step to temperature, largely owing to our familiarity with this approach from previous work (e.g. Davies et al., 2011; Kramer et al., 2021a). However, a number of studies advocate solving for these fields simultaneously (e.g. Wilson et al., 2017), particularly for strongly coupled, highly-nonlinear, multi-physics problems. By leveraging UFL, in combination with PETSc's fieldsplit preconditioning approach, future work to configure and test such coupled schemes within Firedrake will be relatively straightforward.
3. Preconditioners: a major benefit of Firedrake for the problems considered herein is access to the wide variety of solution algorithms and preconditioning strategies provided by the PETSc library, which can be flexibly configured through the solver parameters dictionary, allowing one to test and apply different strategies with ease. The development of preconditioners for the Stokes problem is an active area of research (e.g. May and Moresi, 2008; Burstedde et al., 2013; Shih et al., 2021). As noted above, Firedrake supports a powerful programmable preconditioner interface which, in turn, connects with the Python preconditioner interface of PETSc, and allows users to specify their own linear operator in UFL, thus enabling preconditioning techniques with bespoke operator approximations. We note that, in addition to the complete range of algebraic solvers offered by PETSc, Firedrake also provides access to multilevel solvers with geometric hierarchies, opening up the possibility of exploring geometric multigrid approaches in the future.
We note that the automated approach underpinning Firedrake has the potential to revolutionise the use of adjoints and other inverse schemes in geodynamics. Adjoint models have made an enormous impact in fields such as meteorology and oceanography. However, despite significant progress (e.g. Bunge et al., 2003; Liu et al., 2008; Li et al., 2017; Colli et al., 2018), their use in geodynamics has been hampered by the practical difficulty of their derivation and implementation. In contrast to developing a model directly in Fortran or C++, high-level systems, such as Firedrake, allow the developer to express the variational problems to be solved in near-mathematical notation through UFL. As such, these systems have a key advantage: since the mathematical structure of the problem is preserved, they are more amenable to automated analysis and manipulation, which can be exploited to automate the derivation of adjoints (e.g. Farrell et al., 2013; Mitusch et al., 2019) and the generation of the low-level code for the derived models. Exploring the use of such an approach in geodynamics will be an important avenue for future research.
Finally, given that the importance of reproducibility in the computational geosciences is increasingly being recognised, we note that Firedrake integrates with Zenodo and GitHub to provide users with the ability to generate a set of DOIs corresponding to the exact set of Firedrake components used to conduct a particular set of simulations. In providing our input scripts and a DOI for the version of Firedrake used herein, we ensure traceable provenance of model data, in full compliance with FAIR (Findable, Accessible, Interoperable, Reusable) principles.

Conclusions
Firedrake is a next-generation system for solving variational problems using the finite element method (e.g. Rathgeber et al., 2016; Gibson et al., 2019). It treats finite element problems as a composition of several abstract processes, using separate and open-source software components for each. Firedrake's overarching goal is to save users from manually writing low-level code for assembling the systems of equations that discretise their model physics. It is written completely in Python, and exploits automatic code-generation techniques to apply sophisticated performance optimisations.
In this manuscript, we have confirmed Firedrake's applicability for geodynamical simulation, by configuring and validating model predictions against a series of benchmark and analytical cases of systematically increasing complexity. In all cases, Firedrake has been shown to be accurate and efficient, and we have also demonstrated that it is flexible and easily extensible: by leveraging UFL and PETSc, it can be effortlessly applied to problems involving different physical approximations (e.g. incompressible and compressible flow; isoviscous and more complex nonlinear rheologies), even if they require distinct solution strategies. We have illustrated how Firedrake's built-in mesh generation utilities and extrusion functionality provide a straightforward mechanism for examining problems in different geometries, and how its fully-programmable solver dictionary and customisable preconditioner interface, both of which are seamlessly coupled to PETSc, facilitate straightforward configuration of different solution approaches. Parallel scalability has been demonstrated on up to 12288 compute cores. Finally, using a realistic simulation of global mantle dynamics, where the distribution of heterogeneity is governed by imposed plate motion histories (Müller et al., 2016), we have confirmed Firedrake's suitability for tackling challenges from the very forefront of geodynamical research. We note that all simulation data presented herein have traceable provenance: in providing our input scripts and a DOI for the exact set of Firedrake components employed, Firedrake facilitates transparency and reproducibility, in full compliance with FAIR principles.

reference state. Here, they are defined at the domain's upper surface.
Assuming a linearised equation of state (Eq. A1), the dimensionless form of the conservation of mass equation under the Anelastic Liquid Approximation (ALA) can be expressed as (e.g. Schubert et al., 2001):

∇ · (ρ̄ u) = 0,

where u is the velocity and ρ̄ the reference density. Neglecting inertial terms, the force balance equation (Eq. A8) becomes a Stokes system in which the divergence of the deviatoric stress, μ(∇u + (∇u)ᵀ − (2/3)(∇ · u)I), balances the pressure gradient and buoyancy, where μ denotes the dynamic viscosity, I the identity tensor, and Ra and Di the Rayleigh and dissipation numbers, given respectively by:

Ra = ρ₀ α g ΔT d³ / (μ κ),    Di = α g d / c_p,

with κ denoting the thermal diffusivity, d the length scale and ΔT the temperature scale. Note that the final term in Eq. A8 is expressed in terms of the temperature perturbation, T (sometimes called the potential temperature). Finally, in the absence of internal heating, conservation of energy balances advection and adiabatic heating of the temperature field against thermal diffusion and viscous dissipation, where k is the thermal conductivity and Φ denotes viscous dissipation.
Author contributions. DRD and SCK conceived this study, with all authors having significant input on the design, development and validation of the examples and cases presented. All authors contributed towards writing the manuscript.
Competing interests. Authors declare that they have no conflict of interest.
Acknowledgements. Numerical simulations were undertaken at the NCI National Facility in Canberra, Australia, which is supported by the Australian Commonwealth Government.