Version 2 of the unstructured-mesh Finite-Element Sea ice–Ocean circulation
Model (FESOM) is presented. It builds upon FESOM1.4

Ocean circulation models formulated on unstructured meshes
offer multi-resolution functionality in a seamless way. Although they are
common in coastal ocean modeling, they are only beginning to be used for
global ocean studies. The Finite-Element Sea ice–Ocean circulation Model
(FESOM,

The main reason for switching to a new finite-volume numerical core in FESOM2
is its higher computational efficiency. It stems largely from a more
efficient data structure. FESOM1.4 is based on tetrahedral elements, and
tetrahedra below any surface triangle do not necessarily keep the same
neighborhood connectivity pattern as the depth increases. Three-dimensional
auxiliary and look-up arrays are therefore needed, and accessing them for
each element degrades performance. Another reason for switching to a
finite-volume version is the availability of clearly defined fluxes and a
possibility to choose from a selection of transport algorithms, which was
very limited for the continuous Galerkin discretization of FESOM1.4. A very
useful feature of FESOM1.4 is its ability to combine geopotential and
terrain-following vertical mesh levels; indeed, this was the reason for using
tetrahedral elements rather than triangular prisms. To ensure similar
functionality in the new version, we introduce the
arbitrary Lagrangian Eulerian (ALE)
vertical coordinate (see, e.g.,

Although many details of the finite-volume method used by FESOM2 have already
been presented in

FESOM2 uses a cell–vertex placement of variables in the horizontal
directions. The 3-D mesh structure is defined by the surface triangular mesh
and a system of level surfaces which form a system of prisms. In a horizontal
plane, the horizontal velocities are located at cell (triangle) centroids,
and scalar variables are at mesh (triangle) vertices. The vector control
volumes are the prisms based on mesh surface cells, and the prisms based on
median–dual control volumes are used for scalars (temperature, salinity,
pressure and elevation). The latter are obtained by connecting cell centroids
with edge midpoints, as illustrated in Fig.
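
The median–dual construction can be sketched in a few lines. Connecting the centroid of a triangle with its edge midpoints cuts the triangle into three parts of equal area, so each cell contributes one third of its area to the scalar control area of each of its vertices. The Python sketch below is an illustration only: it works in planar coordinates, and the function and variable names are hypothetical rather than taken from the FESOM2 code.

```python
def triangle_area(p0, p1, p2):
    """Unsigned area of a planar triangle given its three vertex coordinates."""
    return 0.5 * abs((p1[0] - p0[0]) * (p2[1] - p0[1])
                     - (p2[0] - p0[0]) * (p1[1] - p0[1]))


def median_dual_areas(coords, triangles):
    """Horizontal area of the median-dual (scalar) control volume at each vertex.

    coords    -- list of (x, y) vertex coordinates (planar, for simplicity)
    triangles -- list of vertex-index triples, one per cell
    """
    areas = [0.0] * len(coords)
    for tri in triangles:
        # Each cell gives one third of its area to each of its three vertices.
        share = triangle_area(*(coords[v] for v in tri)) / 3.0
        for v in tri:
            areas[v] += share
    return areas


# Unit square split into two triangles along the diagonal 0-2.
coords = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
triangles = [(0, 1, 2), (0, 2, 3)]
dual = median_dual_areas(coords, triangles)
```

By construction the scalar control areas tile the domain, so their sum equals the total area of the cells.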

Schematic of cell–vertex discretization (left) and the edge-based
structure (right). The horizontal velocities are located at cell (triangle)
centers (red circles) and scalar quantities (the elevation, pressure,
temperature and salinity) are at vertices (blue circles). The vertical
velocity and the curl of horizontal velocity (the relative vorticity) are at
the scalar locations too. Scalar control volumes (here the volume associated
with vertex

In the vertical direction, the horizontal velocities and scalars are located
at mid-levels. The velocities of inter-layer exchange (vertical velocities
for flat layer surfaces) are located at full layers and at scalar points.
Figure

The layer thicknesses are defined at scalar locations (to be consistent with the elevation). There are also auxiliary layer thicknesses at the horizontal velocity locations. They are interpolated from the vertex layer thicknesses.
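
As a minimal sketch of this interpolation, the auxiliary thickness of a layer at a cell can be taken as the mean of the thicknesses at the three vertices of that cell. The plain arithmetic mean is an assumption made here for illustration (the text only states that the values are interpolated), and all names are hypothetical.

```python
def cell_layer_thickness(h_vertex, triangles):
    """Auxiliary layer thicknesses at cells, interpolated from vertex values.

    h_vertex  -- h_vertex[v][k] is the thickness of layer k at vertex v
    triangles -- list of vertex-index triples, one per cell
    Assumptions: all three vertices of a cell carry the same number of layers,
    and the interpolation is a plain arithmetic mean.
    """
    h_cell = []
    for tri in triangles:
        n_layers = len(h_vertex[tri[0]])
        h_cell.append([sum(h_vertex[v][k] for v in tri) / 3.0
                       for k in range(n_layers)])
    return h_cell


# Two cells, two layers; only the upper layer varies horizontally.
triangles = [(0, 1, 2), (0, 2, 3)]
h_vertex = [[10.0, 20.0], [10.3, 20.0], [10.6, 20.0], [10.0, 20.0]]
h_cell = cell_layer_thickness(h_vertex, triangles)
```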

The cell–vertex discretization selected for FESOM2 can be viewed as an
analog of an Arakawa B-grid (see also below), while that of FESOM1.4 is an
analog of an A-grid. The cell–vertex discretization is free of pressure
modes, which would be excited in the A-grid FESOM1.4 without its
stabilization. However, the cell–vertex discretization allows spurious
inertial modes because too many degrees of freedom are used to
represent the horizontal velocities. They can be filtered by the horizontal
viscosity. In the quasi-hexagonal C-grid discretization used by
the Model for Prediction Across Scales (MPAS)

For convenience of model description we introduce the following notation.
Quantities defined at cell centroids will be denoted with the lower index

We use a spherical coordinate system with the North Pole displaced to
Greenland (commonly 75

Schematic of vertical discretization. The thick line represents the bottom; the thin lines represent the layer boundaries and vertical faces of prisms. The locations of variables are shown for the left column only. The blue circles correspond to scalar quantities (temperature, salinity, pressure), the red circles to the horizontal velocities and the yellow ones to the vertical exchange velocities. The bottom can be represented with full cells (three left columns) or partial cells (the next two). The mesh levels can also be terrain-following, and the number of layers may vary (the right part of the schematic). The layer thickness in the ALE procedure may vary in prisms above the blue line. The height of prisms in contact with the bottom is fixed.
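
The rule stated in the caption (layers above the blue line may move, prisms in contact with the bottom keep their height) can be illustrated with a short sketch. The proportional stretching of the movable layers used below is an assumption chosen for illustration; the only constraint it is meant to show is that the thickness changes in a column add up to the elevation change. Names and numbers are hypothetical.

```python
def distribute_elevation(h_ref, eta, n_movable):
    """Spread an elevation change over the top n_movable layers of one column.

    h_ref     -- reference (zero-elevation) layer thicknesses, surface first
    eta       -- surface elevation at this scalar location
    n_movable -- number of layers allowed to move (those not touching the bottom)
    Layers below the movable ones keep their reference thickness.
    """
    if not 0 < n_movable <= len(h_ref):
        raise ValueError("n_movable must be between 1 and the number of layers")
    depth_movable = sum(h_ref[:n_movable])
    h = list(h_ref)
    for k in range(n_movable):
        # Each movable layer is stretched in proportion to its reference thickness.
        h[k] = h_ref[k] * (1.0 + eta / depth_movable)
    return h


h_ref = [10.0, 10.0, 20.0, 40.0]
h = distribute_elevation(h_ref, eta=0.4, n_movable=3)
```

Setting `n_movable=1` recovers the case in which only the upper layer is adjusted.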

The bottom topography is commonly specified at scalar points because the
elevation is defined there. However, for discretizations operating with full
velocity vectors, this would imply that velocity points are also at
topographic boundaries. In this case the only safe option is to use the
no-slip boundary conditions, similar to the traditional B-grids. To avoid
this constraint, we use the cellwise representation of bottom topography. In
this case both no-slip and free-slip boundary conditions are possible. Their
implementation relies on the concept of ghost cells which are obtained from
the boundary elements by reflection with respect to the boundary face (edge
in 2-D). The drawback of the elementwise bottom representation is that the
total thickness is undefined at scalar points if the bottom is stepwise
(geopotential vertical coordinate). The motion of level surfaces of the ALE
vertical coordinate at each scalar location is then limited to the layers
that do not touch the bottom topography (above the blue line in
Fig.

Because of cellwise bottom representation, algorithms aiming to closely
follow the bottom topography may create triangular prisms going inland (two lateral faces touch the land) at certain levels on

Partial cells on

We introduce layer thicknesses

The equations of motion, continuity and tracer balance are integrated
vertically over the layers. We will use

Integrating Eq. (

The layer-integrated momentum equation in the flux form is

If the flux form (

We get another familiar option by subtracting

The second term on the lhs of Eq. (

To summarize, the velocity

FESOM1.4 uses asynchronous time stepping, with the horizontal velocities and
scalars shifted by a half time step. We adapt it to FESOM2. This requires
that the elevation and layer thicknesses be introduced at, respectively, full
(integer) and half-integer time levels. We write

The elevation at full time steps and the total thickness on half-steps, given
by the vertical sum of

For

We will continue by providing more detail on the asynchronous time stepping.
We write

The vertical viscosity contribution on the rhs can be conveniently added during the assembly of the operator on the lhs.

The corrector step is written as

The overall solution strategy is as follows.

Compute

Solve Eq. (

Compute

Determine layer thicknesses and

Advance the tracers. The implementation of implicit vertical diffusion will be detailed below.

Linear free surface: if we keep the layer thicknesses fixed, the time derivative
drops out, and the rest gives us the standard equation to compute

If this option is also applied to the first layer, the freshwater flux cannot
be taken into account in the thickness equation. Its contribution to the
salinity equation is then through the virtual salinity flux. In this option,

Full (nonlinear) free surface: we adjust the thickness of the upper layer,
while the thicknesses of all other layers are kept fixed,

We can distribute the total change in height

where

This can be generalized even further. One can use arbitrary distribution
of layer thicknesses provided that their tendencies sum to

Because of varying layer thicknesses, the implementation of implicit vertical
diffusion needs slight adjustment compared to the case of fixed layers. We
write, considering time levels

The semi-implicit implementation of the part related to the surface elevation
(external mode) implies that an iterative solver must be used to solve the
equation on

To obtain the finite-volume discretization, the governing equations are
integrated over the control volumes. The flux divergence terms are then, by
virtue of the Gauss theorem, transformed into the net fluxes leaving the
control volumes. All other terms are estimated as a mean over the volumes. It
is assumed that

Since the horizontal velocity is at centroids, its cell-mean value

where

Here

Here

On the cells touching the lateral walls or bottom topography, we use ghost
cells (mirror reflections with respect to the boundary edge). Their
velocities are computed either as

We stress that matrices

where

In contrast to the scalar gradient operator, the operator of divergence
depends on the layer (because of bottom topography), which is one of the
reasons why it is not stored in advance. Besides, the fluxes

where

It can be verified that the operators introduced above are mimetic. For example, the scalar gradient and divergence are negative adjoints of each other in the energy norm and the curl operator applied to the scalar gradient operator gives identically zero. The latter property allows a PV conserving discretization, but we will not discuss it here.
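
The second property can be checked numerically on a small patch. The sketch below assumes that the scalar gradient on a cell is the gradient of the linear interpolant of the three vertex values, and that the curl at a vertex is the circulation of the cell vectors along the boundary of the median–dual control volume. Both are assumptions made for illustration, since the operator definitions are not reproduced here, and all names are hypothetical.

```python
import random


def triangle_gradient(pts, vals):
    """Gradient of the linear interpolant of vertex values on one triangle."""
    (x0, y0), (x1, y1), (x2, y2) = pts
    t0, t1, t2 = vals
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)  # twice the signed area
    gx = ((t1 - t0) * (y2 - y0) - (t2 - t0) * (y1 - y0)) / det
    gy = ((t2 - t0) * (x1 - x0) - (t1 - t0) * (x2 - x0)) / det
    return gx, gy


def midpoint(a, b):
    return ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)


def circulation_of_gradient(center, ring, t_center, t_ring):
    """Circulation of the cell gradients along the median-dual boundary
    around an interior vertex surrounded by a closed ring of neighbors."""
    n = len(ring)
    circ = 0.0
    for k in range(n):
        a, b = ring[k], ring[(k + 1) % n]
        ta, tb = t_ring[k], t_ring[(k + 1) % n]
        gx, gy = triangle_gradient((center, a, b), (t_center, ta, tb))
        centroid = ((center[0] + a[0] + b[0]) / 3.0,
                    (center[1] + a[1] + b[1]) / 3.0)
        # Boundary pieces inside this cell: edge midpoint -> centroid -> edge midpoint.
        path = (midpoint(center, a), centroid, midpoint(center, b))
        for start, end in zip(path, path[1:]):
            circ += gx * (end[0] - start[0]) + gy * (end[1] - start[1])
    return circ


random.seed(0)
center = (0.1, -0.05)
ring = [(1.0, 0.0), (0.6, 0.9), (-0.4, 1.1), (-1.2, 0.1), (-0.5, -0.8), (0.7, -1.0)]
t_ring = [random.uniform(-1.0, 1.0) for _ in ring]
circ = circulation_of_gradient(center, ring, 0.3, t_ring)
```

Within each cell the two boundary pieces add up to half of the opposite edge, so the contribution of a cell equals half the difference of the scalar values at the two ring vertices. These differences cancel around the closed ring, which is why `circ` vanishes to rounding error for arbitrary scalar values on an irregular patch.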

FESOM2.0 has three options for momentum advection. Two of them use the flux
form and the third one uses the vector-invariant form. In spherical geometry
the flux form takes an additional term

and use them to compute the divergence of horizontal momentum flux:

Here

The fluxes through the top and bottom faces are computed with

with the same rule for the normals as in the computations of the divergence
operator. The contributions from the top and bottom faces of the scalar
control volume are obtained by summing the contributions from the cells,

for the top surface, and similarly for the bottom one. The estimate of

This option is special in the sense that the continuity is treated here in the same way as for the scalar quantities.

The representation with the thicknesses,

is reserved for the future. The gradient of kinetic energy should be computed
in the same way as the pressure gradient, which necessitates computations of

The vertical part follows Eq. (

for the top surface, and similarly for the bottom. Note that the contributions from the curl of horizontal velocity, the gradient of kinetic energy and the vertical part involve the same stencil of horizontal velocities.

The three options above behave similarly in simple tests on triangular
meshes, but their effect on flow–topography interactions or eddy dynamics
remains to be studied. The vector-invariant option is slightly less
dissipative, but may leave some noise in

Formally, the derivatives of horizontal velocity can be estimated and the
components of the viscous stress tensor,

The expression for stresses can be simplified as

This procedure, especially its biharmonic version, proves to be costly, for
it involves computations of velocity derivatives and manipulations with two
types of contributions. On the other hand, we see that the expensive part
involving the general computation of velocity derivatives is only needed on
deformed meshes; it will be small on quasi-equilateral meshes and, even if it
is not small generally, it contributes little to penalizing differences
between the nearest velocities. This leads to the idea to introduce
simplified operators based on the nearest neighbors. Indeed, by writing

The procedure can be simplified even further as

The code contains these options but we are using the last one in the
biharmonic version in most cases – it is efficient both computationally and
in terms of providing stable code performance. We have not encountered any visible
artifacts thus far despite its obvious physical shortcomings. In all other
cases, the coefficient of horizontal viscosity is scaled with mesh size to
provide

We note that the inefficiency of the standard Laplace operator in filtering
grid scales for cell variable placement and measures needed to amend it are
well known (see, e.g.,

High-order transport schemes for vertex variable placement can be realized by
using polynomial reconstruction of scalar fields or the reconstruction of
gradients of scalar fields at mid-edges. We experimented with the quadratic
reconstruction of scalars, which provides a compromise between accuracy and
computational effort (see

Consider edge

We note that the high order of the scheme above is only achieved on uniform
meshes. However, since

The implementation requires preliminary computation of scalar gradients on
cells. An extended halo exchange is needed to make these gradients available
during flux assembly. Edges touching the topography may lack either

For the vertical direction, we provide a set of possibilities which include
the third-/fourth-order option similar to the algorithm described above,
spline interpolation, as well as the piece-wise parabolic method by

The FCT version uses the first-order upwind method as the low-order monotonic method and the method above as the high-order one. The low-order solution and the antidiffusive fluxes (the difference between the high-order and low-order fluxes) are assembled in the same cycle (over edges for the horizontal part and over vertices for the vertical part) and stored. We experimented with separate pre-limiting of horizontal and vertical antidiffusive fluxes and found that commonly this leads to an increased dissipation, for the horizontal admissible bounds are in many cases too tight. For this reason, the computation of admissible bounds and limiting is 3-D. As a result, it will not necessarily fully eliminate non-monotonic behavior in the horizontal direction. The FCT algorithm of FESOM1.4 follows the same logic; however, in that case it is the only possibility. Using the FCT version roughly doubles the cost of the transport algorithm, but adds the stability needed in practice.
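
The logic of the limiter can be illustrated with a one-dimensional sketch. It is not the FESOM2 algorithm: the setting is 1-D and periodic with a constant positive velocity, the high-order flux is a Lax–Wendroff flux standing in for the third-/fourth-order scheme, and the limiter is written in the commonly used Zalesak form. As in the text, the low-order method is first-order upwind and the antidiffusive flux is the difference between the high-order and low-order fluxes. All names are hypothetical.

```python
def fct_step(u, nu):
    """One FCT step of 1-D periodic advection, Courant number 0 < nu <= 1."""
    n = len(u)

    def right(i):
        return (i + 1) % n

    # Fluxes through the face between cells i and i+1, already scaled by dt/dx.
    f_low = [nu * u[i] for i in range(n)]                       # first-order upwind
    f_high = [nu * (0.5 * (u[i] + u[right(i)])
                    - 0.5 * nu * (u[right(i)] - u[i])) for i in range(n)]
    anti = [f_high[i] - f_low[i] for i in range(n)]             # antidiffusive fluxes

    # Low-order (monotone) solution; index -1 wraps around periodically.
    u_low = [u[i] - (f_low[i] - f_low[i - 1]) for i in range(n)]

    # Admissible bounds from old and low-order values in the cell and its neighbors.
    u_max = [max(max(u[j], u_low[j]) for j in (i - 1, i, right(i))) for i in range(n)]
    u_min = [min(min(u[j], u_low[j]) for j in (i - 1, i, right(i))) for i in range(n)]

    r_plus, r_minus = [], []
    for i in range(n):
        p_plus = max(0.0, anti[i - 1]) - min(0.0, anti[i])      # antidiffusive inflow
        p_minus = max(0.0, anti[i]) - min(0.0, anti[i - 1])     # antidiffusive outflow
        r_plus.append(min(1.0, (u_max[i] - u_low[i]) / p_plus) if p_plus > 0.0 else 0.0)
        r_minus.append(min(1.0, (u_low[i] - u_min[i]) / p_minus) if p_minus > 0.0 else 0.0)

    limited = []
    for i in range(n):
        if anti[i] >= 0.0:      # flux raises cell i+1 and lowers cell i
            c = min(r_plus[right(i)], r_minus[i])
        else:                   # flux raises cell i and lowers cell i+1
            c = min(r_plus[i], r_minus[right(i)])
        limited.append(c * anti[i])

    return [u_low[i] - (limited[i] - limited[i - 1]) for i in range(n)]


# Advect a square pulse once around a periodic domain of 40 cells.
u0 = [1.0 if 10 <= i < 20 else 0.0 for i in range(40)]
u = list(u0)
for _ in range(80):
    u = fct_step(u, nu=0.5)
```

The limited solution stays within the range of the initial data and conserves the total tracer content, whereas the unlimited Lax–Wendroff scheme would produce over- and undershoots at the edges of the pulse.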

As demonstrated in

There are several ways to implement the Gent–McWilliams (GM)
parameterization

The bolus velocity

Although the natural placement for

We compute the speed

Assuming that the slope of the isopycnals is small, we can write the
diffusivity tensor as

In the following we evaluate the performance of FESOM2.0 by simulating the realistic ocean state under prescribed atmospheric forcing. The purpose is to illustrate that FESOM2.0 is ready to be run in global configurations, although it may still need some further parameter tuning. Model efficiency is then briefly assessed. Detailed model assessment is the subject of future work.

The evaluation will be done in two steps. In the first step we compare the
performance of FESOM2.0 to that of finite-element FESOM1.4

In the second step we simulate the ocean state under CORE-II forcing with
FESOM2.0 but on an eddy-permitting global mesh with a quasi-uniform
resolution of 15

The departure of simulated potential temperature averaged over
1998–2007 from WOA2005 climatology, averaged over depth ranges. The left and
middle columns correspond to the simulations performed with FESOM1.4 and
FESOM2, respectively, on the coarse-resolution reference mesh. The right
column corresponds to FESOM2.0 on the global mesh with a resolution of
15

The same as in Fig.

Although we try to configure both model versions as closely as possible for
our intercomparison, there are a few differences due to the details of
implementation. First, different transport schemes are used. The
Taylor–Galerkin (TG) algorithm of FESOM1.4 with consistent mass matrices is
expected to be less dissipative than the third-/fourth-order upwind algorithm
used in FESOM2.0. The TG scheme works by default with an FCT limiter in
FESOM1.4, so we apply the FCT limiting in FESOM2 too. Second, the difference
between the two versions of FESOM comes from the implementation of the GM
parameterization of eddy transport. FESOM1.4 uses the GM skew flux
formulation as suggested by

All simulations are run with the linear free-surface and virtual salinity
forcing. The surface salinity is restored to the climatological data with the
piston velocity of 50

We first compare the last 15 years of the simulated hydrography in the two
model runs on the coarse-resolution reference mesh to the World Ocean
Database 2005

At deeper levels of the tropical Atlantic, FESOM2.0 performs better than
FESOM1.4; at the same time, errors become larger in the Southern Ocean and
the eastern North Atlantic. Our experience in running FESOM is that the drift
in the Southern Ocean is substantially affected by the imposed spatial
(horizontal and vertical) pattern of the GM coefficient

The streamfunction of meridional overturning circulation (MOC) shown in
Fig.

The difference in hydrography simulated on Glob15 compared to WOA2005 is
shown for the mean over the last 15 years in Figs.

In order to illustrate the eddy activity, we show the snapshot of subsurface
relative vorticity in the North Atlantic in Fig.

A snapshot of subsurface (40

The pattern of relative vorticity also reveals the existence of zonally
elongated patches corresponding to zonal jets, which are often simulated by
high-resolution ocean models and are confirmed by altimetric
observations

Eulerian-mean meridional overturning streamfunction averaged over
the last 15 years of 60-year simulations for FESOM1.4 on the reference
mesh

The MOC for Glob15 is shown in Fig.

The sea ice thickness simulated on Glob15 is shown in Fig.

In order to quantify the seasonal variability of the sea ice, we plot the
monthly time series of sea ice extents in Fig.

The simulated mean ice thickness distribution (m) in the Northern (top) and Southern (bottom) hemispheres in March (left) and September (right).

FESOM is written in Fortran 90 with some C/C++ code inserts providing bindings to third-party libraries. The code employs distributed-memory parallelization based on MPI (Message Passing Interface). The model experiments have been carried out on a Cray XC40 system equipped with Intel Xeon Haswell processors and 24 cores per node, which was made available through the North-German Supercomputing Alliance (HLRN). Experience shows that the parallel scalability of both versions of FESOM starts to saturate once fewer than 300 vertices of the surface mesh are assigned per computational core. In view of this, the experiments on the reference mesh were conducted using 384 cores (16 nodes).

Disregarding input/output, the throughput of FESOM1.4 is ca. 25 simulated
years per day (SYPD), where 92.5 and 7.5 % of the resources are spent in
the ocean and ice components, respectively. The resources spent in dynamical
(solving for

Using the same computer resources, the throughput of FESOM2.0 is 110 SYPD. In
this version, the resources between the ocean and sea ice components are
split as 67 and 33 %, respectively. The ocean component in FESOM2.0
demonstrates 7 times higher throughput than that of FESOM1.4, giving the
largest speedup in the tracer part, where it is even 9 times faster than in
FESOM1.4. The implementation of GM following

The Glob15 configuration was run on 1728 cores (72 nodes) giving a throughput
of 17 SYPD, with relative costs between model components remaining comparable
to those of the coarser-resolution reference setup. For this mesh the
relative cost of using pARMS decreases compared to the reference mesh despite
the much larger mesh and the number of cores. We suspect that it is partly
linked to a smaller time step which improves the diagonal dominance in the
matrix of the sea surface height operator. Compared to the reference mesh,
which was run in the limit of linear scalability (

The numbers given above serve only to illustrate the computational
performance. Details may depend on the frequency of output, the type of
transport algorithm, the presence of isoneutral diffusion or GM
parameterization and the number of subcycles used in the
elastic-viscous-plastic sea ice
solver of FESIM

The simulated ice extent in the Northern (top) and Southern (bottom) hemispheres.

There are several reasons for developing a new dynamical core based on
finite-volume discretization. The first and main one is the need for enhanced
numerical efficiency. Generally, the codes based on unstructured meshes are
less efficient numerically than their structured-mesh counterparts, partly
because of (i) indirect indexing and the need for numerous auxiliary
(look-up) arrays (neighboring cells, vertices, matrices of horizontal
derivatives) and partly because of (ii) an increased share of floating-point
and memory-access operations needed in the absence of directional splitting
and mesh structure. The overhead related to (i) can be minimized in codes
using prismatic elements defined by unstructured surface meshes. In this case
the same 2-D auxiliary arrays can be used over the entire water column, which
makes the cost of accessing them rather moderate. The overhead of (3-D)
auxiliary arrays is much larger in FESOM1.4 because of its tetrahedral
elements needed to implement arbitrary level surfaces. Using bilinear
prismatic elements

The second reason for switching to a finite-volume discretization is that, as
mentioned in

Finally, the finite-volume discretization operates with clear definition of fluxes, which is much more convenient for post-processing. For example, it makes computations of the meridional overturning streamfunction much more straightforward and free of interpretation inconsistencies intrinsic to the continuous finite-element discretization. In addition, it also allows numerous transport algorithms, whereas the choice available for finite elements of a selected type is much more restrictive.

Among possible finite-volume discretizations, the cell–vertex discretization
used by FESOM2 presents a compromise allowing us to keep general triangular
meshes and use staggering of velocities and pressure. A collocated
vertex–vertex finite-volume discretization, which is the closest analog to
FESOM1.4, was explored by

Because of staggering and keeping the velocity vector, the triangular cell–vertex discretization is an analog of an inverted B-grid (we call it a quasi-B-grid). The inversion (the domain boundary is defined by scalar points) allows us to implement both free- and no-slip boundary conditions. Spurious inertial modes are absent on quadrilateral B-grids. This prompts us to consider hybrid meshes composed of triangles and quads, where the triangles will be used to provide transitions between regions of different resolution. The generalization to hybrid meshes is straightforward in the finite-volume implementation because most of the operations are implemented as a cycle over edges. Furthermore, since the number of edges on quadrilateral meshes is smaller than on triangular meshes for a given number of vertices, this also implies a speedup in the code performance. This strategy is already implemented in the coastal branch of FESOM (to be described elsewhere) and will be made available in FESOM later.
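
The statement about edge counts can be checked on a simple lattice. The sketch below counts the edges of an n-by-n lattice of vertices meshed once with quadrilaterals and once with triangles obtained by splitting every quadrilateral along one diagonal. The lattice and the function name are illustrative assumptions and are not related to any FESOM mesh.

```python
def edge_counts(n):
    """Vertex and edge counts for an n-by-n lattice of vertices.

    Returns (vertices, quad_edges, tri_edges), where the triangular mesh is
    obtained by splitting every quadrilateral along one diagonal.
    """
    vertices = n * n
    quad_edges = 2 * n * (n - 1)            # horizontal plus vertical lattice edges
    tri_edges = quad_edges + (n - 1) ** 2   # one additional diagonal per quadrilateral
    return vertices, quad_edges, tri_edges


vertices, quad_edges, tri_edges = edge_counts(100)
print(quad_edges / vertices, tri_edges / vertices)
```

For large meshes the ratios approach two edges per vertex for quadrilaterals and three for triangles, so a cycle over edges visits about one third fewer edges on a quadrilateral mesh with the same number of vertices.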

Two other variants of finite-volume discretization are used at present in
global ocean circulation models. MPAS

This paper describes version 2 of FESOM. The new numerical core uses a cell–vertex finite-volume discretization. FESOM2.0 compares well with FESOM1.4 in terms of simulated global ocean circulation. It inherits the model framework and the sea ice model of its predecessor, and is conceived so as to allow users familiar with FESOM1.4 to switch the versions easily. FESOM2.0 ensures higher numerical throughput than FESOM1.4, which makes it much closer to the structured-mesh models in terms of numerical efficiency. It offers new functionality through the ALE vertical coordinate. Future development will focus on the generalized vertical coordinates, high-order transport algorithms working on partly terrain-following meshes without excessive diapycnal mixing and on generalization to mixed meshes combining triangles and quads. FESOM2 will gradually replace FESOM1.4, yet the latter will be maintained and user support will be provided over several years to come.

The version of FESOM2.0 used to carry out simulations reported here can be
accessed from

When using the flux form of momentum, the natural choice is

The time stepping algorithm can be formulated as follows:

Do the predictor step and compute

Update for implicit viscosity.

Solve for new elevation. We write first

and similarly for other quantities, getting

and

Eliminating

In reality, everything remains similar to the vector-invariant case, and the matrix to be inverted is the same.

Correct the transport velocities as

Proceed with ALE and determine

The new velocities are estimated as

Here

We discuss modifications needed to solve for the external mode through
subcycling. This option will be added in the future when needed for massively
parallel runs. We use the flux form of momentum advection as an example. We
take

Following common practice, we run subcycles between time levels

For the same reason, the contribution from the elevation

Instead of Eqs. (

On completing the subcycles one is at time level

As an aside, we document another possibility which implements a pseudotime
solver. We want to solve the same pair of equations as Eqs. (

While this option is not cheaper than the commonly used one, it is equivalent
to the solution based on semi-implicit solvers, and guarantees consistency.
Indeed, in this case

Meshes combining

For completeness, we write down the expressions for the horizontal and
vertical components of fluxes:

First, we split each triangular prism of our mesh into subvolumes characterized by unique values of the expansion/contraction coefficients, vertical gradients and horizontal gradients, to form triplets. We obtain six subprisms per prism, formed by sections along the midplane and by vertical planes passing through centroids and mid-edges.

Next, one writes the dissipation functional. We
will use a different but equivalent formulation. Consider the bilinear form

The last step is to compute the contribution to the rhs of the scalar
equation from the diffusion term

Note that since

In summary, the variational formulation originally proposed for quadrilaterals can easily be extended to triangular meshes. All symmetry properties are guaranteed if the computations are local on subprisms.

Substituting

Let us start from the third term and compute its contribution to

In the expressions above, indices

Now, we combine the contributions from the column associated with cell

We continue with the contribution from

Now, performing differentiation with respect to

We return to the horizontal part in the expression for

The authors declare that they have no conflict of interest.

We are indebted to our colleagues N. Rakowski and S. Harig for their support and help with numerous details. We also acknowledge the contribution of K. Korchuk at early stages of FESOM2. The article processing charges for this open-access publication were covered by a Research Centre of the Helmholtz Association.

Edited by: D. Ham
Reviewed by: Y. J. Zhang and S. C. Kramer