the Creative Commons Attribution 4.0 License.
the Creative Commons Attribution 4.0 License.
Splitexplicit external mode solver in the finite volume sea ice–ocean model FESOM2
Patrick Scholz
Sergey Danilov
Knut Klingbeil
Dmitry Sidorenko
A novel splitexplicit (SE) external mode solver for the Finite volumE Sea ice–Ocean Model (FESOM2) and its subversions is presented. It is compared with the semiimplicit (SI) solver currently used in FESOM2. The splitexplicit solver for the external mode utilizes a dissipative asynchronous (forward–backward) timestepping scheme that does not require additional filtering during the barotropic subcycling. Its implementation with arbitrary Lagrangian–Eulerian vertical coordinates like Z star (z^{*}) and Z tilde ($\stackrel{\mathrm{\u0303}}{z}$) is explored. The comparisons are performed through multiple test cases involving idealized and realistic global simulations. The SE solver demonstrates lower phase errors and dissipation, while maintaining a simulated mean ocean state very similar to the SI solver. The SE solver is also shown to possess better runtime performance and parallel scalability across all workloads tested.
 Article
(3224 KB)  Fulltext XML
 BibTeX
 EndNote
All version 2 iterations of the Finite volumE Sea ice–Ocean Model (FESOM2; Danilov et al., 2017), the same as its predecessor FESOM1.4, rely on an implicit algorithm for solution of the external mode. Its computational algorithm maintains elementary options of the arbitrary Lagrangian–Eulerian (ALE) vertical coordinate, such as z^{*} or nonlinear free surface, but needs modifications to incorporate more general options beginning from $\stackrel{\mathrm{\u0303}}{z}$, where information on horizontal divergence in scalar cells is used when making decisions about layer thicknesses on a new time level in the internal (baroclinic) mode. This work aims to present the modified timestepping algorithm and its extension through a splitexplicit option for solution of the external mode. Many modern ocean circulation models rely on the splitexplicit method to solve for their external mode. The primary motivation behind such a choice is expectation of better parallel scalability in massively parallel applications. Indeed, as is well known, the need for global communications to calculate certain global dot products in most iterative solvers is a factor that potentially slows down the overall performance (see e.g. Huang et al., 2016; Koldunov et al., 2019). Although there are solutions minimizing the number of global communications per iteration (see e.g. Cools and Vanroose, 2017), as well as solutions where global communications are avoided (e.g. Huang et al., 2016), splitexplicit methods are an obvious alternative. This method is followed by the Geophysical Fluid Dynamics Laboratory (GFDL) Global Ocean and Sea Ice Model OM4 whose ocean component uses version 6 of the Modular Ocean Model (MOM; Adcroft et al., 2019), the Nucleus for European Modelling of the Ocean (NEMO; Madec et al., 2019), the Regional Oceanic Modelling System (ROMS; Shchepetkin and McWilliams, 2005), and the Model for Prediction Across Scales – Ocean (MPASO; Ringler et al., 2013) to mention some widely used cases. A careful analysis in Shchepetkin and McWilliams (2005) discusses many details of the numerical implementation for a splitexplicit external mode algorithm and proposes the AB3–AM4 (Adams–Bashforth and Adams–Moulton) method, which is at present followed by several models, i.e. ROMS (Shchepetkin and McWilliams, 2005), the Coastal and Regional Ocean COmmunity model (CROCO; Jullien et al., 2022), and FESOMC (Androsov et al., 2019). However, a recent analysis in Demange et al. (2019) suggests a simpler choice of dissipative forward–backward time stepping. The builtin dissipation in this case allows one to avoid filtering the external mode solution (see Shchepetkin and McWilliams, 2005). The simplest commonly used filter requires that an external mode be stepped across two baroclinic time steps to ensure temporal centring. This doubles the computational cost of the external mode solution. In the forward–backward dissipative method by Demange et al. (2019) an external mode is stepped precisely across one baroclinic time step and not beyond. This method was ultimately found to be the best choice around which to build a splitexplicit scheme for our purposes. Demange et al. (2019) also showed how dissipation can be added to the AB3–AM4 method of Shchepetkin and McWilliams (2005). We thus also explore dissipative AB3–AM4 for dissipation and phase errors. The rest of the paper is thus structured as follows. We begin with providing a breakdown for individual steps of splitexplicit schemes as adopted in FESOM2 (Sect. 2). It is then followed by a comparison of the individual temporal discretizations that are characteristic of these timestepping schemes (Sect. 3). We then perform various experiments comparing the new solver against the existing one using different test cases (Sects. 4 and 5). Finally, we summarize the results and argue for the new proposed scheme and its solver being a great choice for FESOM2 moving forward (Sect. 6).
This section provides a detailed description of the proposed asynchronous timestepping scheme for FESOM2 that incorporates a splitexplicit barotropic solver whose schematic can be found in Appendix A. FESOM in its standard version relies on a semiimplicit barotropic solver that already uses an asynchronous time stepping (Danilov et al., 2017). The asynchronous time stepping is a variant of forward–backward time stepping, which is formulated by considering scalar and horizontal velocities as being displaced by half a time step $\mathit{\tau}/\mathrm{2}$. The asynchronous time stepping is taken as the simplest option. Other timestepping options for the baroclinic part, such as a thirdorder Runge–Kutta method, are under consideration for future versions.
2.1 Momentum equation
The standard set of equations under the Boussinesq and standard approximations is solved. The equations are taken in a layerintegrated form, and the placement of the variables on the mesh is explained in Danilov et al. (2017). The layerintegrated momentum equation in the flux form is
with U_{k}=u_{k}h_{k} the horizontal transports, u the horizontal velocity, h_{k} the layer thickness, V_{h} the horizontal viscosity operator, ν_{v} the vertical viscosity coefficient, f the Coriolis parameter, e_{z} a unit vertical vector, and ${\mathrm{\nabla}}_{\mathrm{h}}=({\partial}_{x},{\partial}_{y})$ with respect to a constant model layer. Here, k is the layer index, starting from 1 ar the surface layer and increasing downward to the available number of levels with maximum value N_{l}. We ignore the momentum source due to the added water W at the surface. The term with the pressure gradient gρ∇_{h}Z_{k} accounts for the fact that layers deviate from geopotential surfaces. The quantity Z_{k} appearing in this term is the z coordinate of the midplane of the layer with the thickness h_{k}. The equation for elevation is written as
where $\stackrel{\mathrm{\u203e}}{\mathit{U}}={\sum}_{k}{\mathit{U}}_{k}$, and equations for layer thicknesses h_{k} and tracers will be presented in the following sections. Equations further in this section are for a particular layer k, and the index k will be suppressed. In the implementation described here, the discrete scalar state variables (elevation η, temperature T, salinity S, and layer thicknesses h) are defined at full time steps denoted by the upper index n, whereas the 3D velocities $\mathbf{v}=(\mathit{u},w)$ and horizontal transports U are defined at halfinteger time steps ($n+\mathrm{1}/\mathrm{2},\mathrm{\dots}$). Since thicknesses and horizontal velocities are not synchronous, layer transports U are chosen as prognostic variables. Using them, we avoid the question of ${h}^{n+\mathrm{1}/\mathrm{2}}$ up to the moment of barotropic correction. Note that the flux form of the momentum advection is used in Eq. (1). Adjustments needed for other forms are straightforward and will not be discussed here.
First, we estimate the transport ${\mathit{U}}^{n+\mathrm{1}/\mathrm{2},*}$ assuming that ${h}^{n},{T}^{n},{S}^{n}$, η^{n}, ${\mathit{u}}^{n\mathrm{1}/\mathrm{2}}$, ${\mathit{U}}^{n\mathrm{1}/\mathrm{2}}$, and ${w}^{n\mathrm{1}/\mathrm{2}}$ are known.
The ${\mathit{R}}_{U}^{i}$ terms with $i=A,C,P,\mathrm{hV}$ indicate advective, Coriolis, pressure gradient, and horizontal viscosity components estimated at time step n for $i=A,C,P$, and $n\mathrm{1}/\mathrm{2}$ for the horizontal viscosity. The momentum advection term is
where ${}_{\mathrm{b}}^{\mathrm{t}}$ implies that the difference between the top and bottom interfaces of layer k is taken. The Coriolis term is
Fields entering these advection and Coriolis terms are known at $n\mathrm{1}/\mathrm{2}$, and the second or thirdorder Adams–Bashforth method is used to get an estimate of ${\mathit{R}}_{U}^{A}$ and ${\mathit{R}}_{U}^{C}$ at n. For any quantity f, ${f}^{\mathrm{AB}}=(\mathrm{3}/\mathrm{2}+\mathit{\beta}){f}^{n}(\mathrm{1}/\mathrm{2}+\mathrm{2}\mathit{\beta}){f}^{n\mathrm{1}}+\mathit{\beta}{f}^{n\mathrm{2}}$. For classical thirdorder interpolation (AB3), β is $\mathrm{5}/\mathrm{12}$, while β=0 gives the secondorder result (AB2). The pressure gradient term is,
where ∇_{z} means differencing at constant z. Since pressure and thicknesses are known at the time level n, no interpolation is needed. This is one of the advantages of asynchronous time stepping. Depending on how much layer thicknesses are perturbed, other algorithms than that written above can be applied to minimize pressure gradient errors. As a default, the approach by Shchepetkin and McWilliams (2003) is used in FESOM. The pressure contains contributions from η, density perturbations in the fluid column, and contributions from atmospheric and ice loading. The contribution from horizontal viscosity is of either the harmonic or biharmonic type. For simplicity, we write it here as
The implicit contribution from vertical viscosity is added as
The latter equation is rewritten for increments $\mathrm{\Delta}\mathit{u}={\mathit{u}}^{n+\mathrm{1}/\mathrm{2},**}{\mathit{u}}^{n+\mathrm{1}/\mathrm{2},*}=({\mathit{U}}^{n+\mathrm{1}/\mathrm{2},**}{\mathit{U}}^{n+\mathrm{1}/\mathrm{2},*})/{h}^{*}$ and solved for Δu. At this stage of the scheme, a reliable estimate for h^{*} at $n+\mathrm{1}/\mathrm{2}$ is not available, but, since A_{v} is a parameterization and this step is first order in time, we use ${h}^{*}={h}^{n}$. When Δu is obtained, we update $\mathit{\tau}{\mathit{R}}_{U}^{vV}=\mathrm{\Delta}\mathit{u}{h}^{n}$ and ${\mathit{U}}^{n+\mathrm{1}/\mathrm{2},**}={\mathit{U}}^{n+\mathrm{1}/\mathrm{2},*}+\mathrm{\Delta}\mathit{u}{h}^{n}$. In preparation for the barotropic time step, the vertically integrated forcing from baroclinic dynamics is computed as
Here, ${\stackrel{\mathrm{\u0303}}{R}}_{U}^{\mathrm{P}}$ represents the pressure gradient force excluding the contribution of η, as it will be accounted for explicitly in the barotropic equation. The Coriolis term is also omitted for the same reason. The vertically summed contribution from vertical viscosity is in reality the difference in surface stress and bottom stress. The bottom stress in FESOM is commonly computed as ${C}_{\mathrm{d}}\left{\mathit{u}}_{\mathrm{b}}^{n\mathrm{1}/\mathrm{2}}\right{\mathit{u}}_{\mathrm{b}}^{n\mathrm{1}/\mathrm{2}}$, where u_{b} is the bottom velocity.
2.2 Barotropic time stepping
Next is the barotropic step where η and $\stackrel{\mathrm{\u203e}}{\mathit{U}}={\sum}_{k}{\mathit{U}}_{k}$ are estimated by solving
Here, $H={H}_{\mathrm{0}}+\mathit{\eta}$, W is the freshwater flux (positive out of ocean), and $\stackrel{\mathrm{\u203e}}{\mathit{R}}$ the is the forcing from the 3D part defined above. These equations are solved from time step n to n+1 as detailed below. Note that the baroclinic forcing term is taken at time level n. Centring it at $n+\mathrm{1}/\mathrm{2}$ would have involved many additional computations and is not implemented at present. This set of equations presents a minimum model. It is sufficient for basins with simple geometry. In realistic applications, it has been found that an additional viscous regularization term is needed to suppress oscillations in narrow straits with irregular coastlines. In such cases we add
to the righthand side of momentum Eq. (6) and subtract the initial value of this term from $\stackrel{\mathrm{\u203e}}{\mathit{R}}$ at each baroclinic time step. Here, ${\stackrel{\mathrm{\u203e}}{A}}_{\mathrm{h}}$ is the viscosity coefficient tuned experimentally to ensure stability in narrow, shallow regions. We express it as a combination of some background viscosity and a flowdependent part that is proportional to the differences in barotropic velocity across cell edges. The subtraction mentioned serves to minimize the inconsistency created by adding the new term. Note that in coastal applications, one generally keeps bottom drag acting on the barotropic flow as well as barotropic momentum advection (Klingbeil et al., 2018). We treat them as slow processes here, but modifications might be needed for possible future applications.
As a default time stepping for the barotropic part, the forward–backward dissipative time stepping by Demange et al. (2019) is used. It is abbreviated as SE (for splitexplicit, as it will be the default choice) in subsequent sections.
Here, M is the total number of barotropic substeps per baroclinic step τ, and θ controls dissipation. The value of θ=0.14 is mentioned by Demange et al. (2019) as being sufficient^{1}. This method is firstorder accurate for θ≠0. We also used another version that is based on the AB3–AM4 (Adams–Bashforth – Adams–Moulton) approach of Shchepetkin and McWilliams (2005), with dissipative corrections as proposed in Demange et al. (2019) (abbreviated as SESM below). The specific versions of AB3 and AM4 used are
with appropriate values of $\mathit{\beta},\mathit{\delta},\mathit{\gamma},\mathit{\zeta}$ discussed later. The time stepping takes the form^{2}
The modified AB3–AM4 scheme should also be moved to the firstorder accuracy as concerns amplitudes but remains higherorder with respect to phase errors.
2.3 Reconciliation of barotropic and baroclinic mode
Note that the use of dissipative time stepping in Eq. (8) or Eq. (10) allows one to abandon the filtering of η and $\stackrel{\mathrm{\u203e}}{\mathit{U}}$ during the barotropic step that would be needed if nondissipative forward–backward (θ=0) or the original AB3–AM4 schemes were applied instead (see Shchepetkin and McWilliams, 2005). The most elementary form of filtering involves integration to n+2 with subsequent averaging to n+1, which would double the computational expense for the barotropic solver. To be in agreement with the traditional notation (Shchepetkin and McWilliams, 2005), we write $\langle \mathit{\eta}{\rangle}^{n+\mathrm{1}}={\mathit{\eta}}^{n+m/M}$ and $\langle \stackrel{\mathrm{\u203e}}{\mathit{U}}{\rangle}^{n+\mathrm{1}}={\stackrel{\mathrm{\u203e}}{\mathit{U}}}^{n+m/M}$for m=M (there would be a difference if filtering were needed). By summing the elevation equations over M substeps, one gets for the forward–backward dissipative case (Eq. 8)
where
While η is consistently initialized with 〈η〉^{n} for m=0, there is no good answer for $\stackrel{\mathrm{\u203e}}{\mathit{U}}$. One can use the last available $\langle \stackrel{\mathrm{\u203e}}{\mathit{U}}{\rangle}^{n}$, but because 3D and barotropic velocities are integrated using different methods, this may lead to divergences with time unless some synchronization with 3D velocities is foreseen. We return to this topic below. On time level n+1, the total thickness becomes ${H}^{n+\mathrm{1}}={H}^{\mathrm{0}}+\langle \mathit{\eta}{\rangle}^{n+\mathrm{1}}$. The horizontal transport is finalized by making the vertically integrated transport equal to the value obtained from the barotropic solution.
2.4 Finalization of baroclinic mode
The estimate of the thickness at $n+\mathrm{1}/\mathrm{2}$ depends on the option of the ALE vertical coordinate and will be detailed below. In treating the scalar part, we are relying on the VALE (vertical ALE) approach in the terminology of Griffies et al. (2020). It is assumed that there is some external procedure to predict ${h}_{k}^{n+\mathrm{1}}={h}_{k}^{\mathrm{target}}$ constrained by the condition ${\sum}_{k}{h}_{k}^{n+\mathrm{1}}={H}^{\mathrm{0}}+\langle \mathit{\eta}{\rangle}^{n+\mathrm{1}}$. In the simplest case, this is the z^{*} vertical coordinate with ${h}_{k}^{n+\mathrm{1}}={h}_{k}^{\mathrm{0}}(H/{H}^{\mathrm{0}})$. Here, as well as in other cases when the decision regarding h^{target} does not depend on layer horizontal divergences, ${h}^{n+\mathrm{1}/\mathrm{2}}$ in Eq. (13) is half the sum of the n and n+1 values. In more complicated cases, such as $\stackrel{\mathrm{\u0303}}{z}$ (Leclair and Madec, 2011; Petersen et al., 2015; Megann et al., 2022), the horizontal divergence in layers $\mathrm{\nabla}\cdot {\mathit{U}}_{k}^{n+\mathrm{1}/\mathrm{2}}$ is needed to predict ${h}_{k}^{n+\mathrm{1}}$, and a reliable estimate of ${h}^{n+\mathrm{1}/\mathrm{2}}$ is not immediately available. Requiring that ${h}_{k}^{n+\mathrm{1}}$ is smooth and positive and also satisfies the barotropic constraint ${\sum}_{k}{h}_{k}^{n+\mathrm{1}}={H}^{\mathrm{0}}+\langle \mathit{\eta}{\rangle}^{n+\mathrm{1}}$ could be a nontrivial task and may require a special procedure (see Hallberg and Adcroft, 2009 and Megann et al., 2022) that simultaneously adjusts ${\mathit{U}}_{k}^{n+\mathrm{1}/\mathrm{2}}$ and h^{n+1}. The description of the current implementation of $\stackrel{\mathrm{\u0303}}{z}$ in FESOM is presented in Appendix C. The potential presence of such complications is the reason why the decision regarding h^{n+1} is delayed to the end, and the discretization of momentum equation is performed in terms of U. Once the new thickness is determined, the thickness equation,
is used to estimate the diasurface velocity w. Tracers are then advanced, first taking into account advection and horizontal (isoneutral) diffusion before being trimmed by implicit vertical diffusion.
Here, T_{W} it the value of scalar T in a freshwater flux. In this procedure, if T^{n}=const, the second equation will return this constant in T^{*}. The two equations above could have been combined into a single one. We treat them separately to avoid the loss of some significant digits (and to avoid ensuing errors in constancy preservation). Before solving the last equation in Eq. (15), it is rewritten for the increment $\mathrm{\Delta}T={T}^{n+\mathrm{1}}{T}^{*}$.
While ${\mathit{U}}_{k}^{n+\mathrm{1}/\mathrm{2}}$, trimmed as given by Eq. (13), ensures by virtue of the first equation in Eq. (15) that ${\sum}_{k}{h}_{k}^{n+\mathrm{1}}={h}^{\mathrm{0}}+\langle \mathit{\eta}{\rangle}^{n+\mathrm{1}}$ (as required for perfect volume conservation), its vertical sum ${\sum}_{k}{\mathit{U}}_{k}^{n+\mathrm{1}/\mathrm{2}}=\langle \langle \stackrel{\mathrm{\u203e}}{\mathit{U}}\rangle {\rangle}^{n+\mathrm{1}/\mathrm{2}}$ deviates from the barotropic transport at time level $n+\mathrm{1}/\mathrm{2}$ (i.e for $m=M/\mathrm{2}$). We tried to compensate for this difference by saving ${\stackrel{\mathrm{\u203e}}{\mathit{U}}}^{n+m/M}$ for $m=M/\mathrm{2}$ in the barotropic step and retrimming the 3D transports once tracers are advanced, but this has been found to be redundant in practice^{3} ^{4}. The number of barotropic substeps M depends on the quality of meshes with varying resolutions. It can always be estimated based on mesh cell size and local depth. The presence of particularly small cells in deep water may limit the scheme globally even if the rest of the mesh is regular. Such limitations are absent in the present version of FESOM that is based on an implicit barotropic solver. Appendix B summarizes the changes needed to extend it (Danilov et al., 2017) to more general ALE options.
This section provides a detailed numerical analysis of the new external mode solver when using SE or SESM time stepping vs. the current semiimplicit solver (Danilov et al., 2017) of FESOM2, which uses a firstorder implicit time stepping in global simulations. Although parts of these implementations are already known, we repeat them here for clarity and comparison. A simple prototype system relevant for this analysis is
where ${c}_{\mathrm{p}}=(g{H}_{\mathrm{0}}{)}^{\mathrm{1}/\mathrm{2}}$ is the phase velocity, $\stackrel{\mathrm{\u0303}}{u}$ the dimensionless vertically averaged velocity ($\stackrel{\mathrm{\u203e}}{u}/{c}_{\mathrm{p}}$), and $\stackrel{\mathrm{\u0303}}{\mathit{\eta}}$ the dimensionless surface elevation ($\mathit{\eta}/H$). It will be assumed that $\stackrel{\mathrm{\u0303}}{\mathit{\eta}},\stackrel{\mathrm{\u0303}}{u}\sim {e}^{ikx}$, where k is the wave number.
3.1 Characteristic matrix forms
3.1.1 Semiimplicit (SI) method of Danilov et al. (2017)
We begin with the semiimplicit method of Danilov et al. (2017) used in the current model, FESOM2, which was adapted from FESOM 1.4 (Wang et al., 2014) and can be written as
Here, $\mathrm{1}/\mathrm{2}\le \mathit{\theta},\mathit{\alpha}\le \mathrm{1}$ are the control parameters, and c=c_{p}kτ is the Courant number. The characteristic matrix form for this scheme is
3.1.2 Splitexplicit (SESM) method of Shchepetkin and McWilliams (2005)
The explicit method of Shchepetkin and McWilliams (2005) based on an advanced forward–backward method combining the AB3 and AM4 steps can be expressed as
Here, too, $\mathit{\beta},\mathit{\delta},\mathit{\gamma},\mathit{\zeta}$ are the control parameters. Its characteristic matrix form is then
3.1.3 Splitexplicit (SE) method of Demange et al. (2019)
Finally, the SE method by Demange et al. (2019) can be expressed as
with θ being the control parameter. Its characteristic matrix form is
3.2 Dissipation and phase analysis
Depending on the control parameters, the schemes above may lead to different dissipation and phase errors. Let the characteristic matrices of Eqs. (18), (20), and (22) be denoted by M_{c}. For I as an identity matrix of same rank as M_{c} and λ an eigenvalue of M_{c}, the characteristic polynomials for each scheme, given by det(M_{c}−λI), are
Here, the index i serves to distinguish between the SE, SI, and SESM schemes and their control parameters. The equations were obtained using the Maple symbolic solver. If λ is an eigenvalue, then P(λ)_{i}=0. Given that the physical eigenvalue should closely resemble the continuous solution λ=e^{ic}^{5}, we can expand it for small c as
If the schemes are to be at least secondorder dissipative with respect to c (see Demange et al., 2019), λ must also obey the relationship
where χ is a parameter characterizing dissipation. Similarly, the phase ${\mathrm{tan}}^{\mathrm{1}}\left(\mathrm{\Im}\right({\mathit{\lambda}}_{\mathrm{p}})/\mathfrak{R}({\mathit{\lambda}}_{\mathrm{p}}\left)\right)$ must also closely resemble the ideal phase c. The two conditions (24) and (25) then tie all control parameters together. From the requirement to remain formally secondorder dissipative, one gets the conditions
with the requirement that m=i. Note that here $i=\sqrt{\mathrm{1}}$ and that each equation in Eq. (26) admits its own set of parameters, i.e. θ_{SI}≠θ_{SE}. We also reject the possibility of $m=i$ as it immediately gives the wrong phase.
Note that here too as in Eq. (26) the parameters will be different for each scheme, i.e. θ_{SI}≠θ_{SE}. At this point, only the SE scheme is fully defined. The other schemes still have free parameters in need of optimization – α for SI and $\mathit{\beta},\mathit{\gamma},\mathit{\zeta}$ for the SESM scheme. As in Shchepetkin and McWilliams (2005), β can be set to 0.281105 for the largest stability limit. Given that dissipation is now the same (up to the second order), one can seek to optimize for phase errors. If thirdorder phase accuracy is desirable, then for the SI scheme, it is only possible if $\mathit{\alpha}=\mathit{\chi}+\mathrm{1}/\mathrm{2}\pm (\mathrm{1}/\mathrm{6})\sqrt{\mathrm{6}+\mathrm{72}{\mathit{\chi}}^{\mathrm{2}}}$. For the SESM scheme, this gives $\mathit{\gamma}={\mathit{\chi}}^{\mathrm{2}}\mathrm{3}\mathit{\zeta}+\mathrm{1}/\mathrm{3}\mathit{\beta}$. This still leaves ζ open for optimization. It can be obtained through further optimizing for either phase accuracy or stability limit. If optimizing for stability limit, the limit can be pushed much higher, as in Demange et al. (2019) if one relaxes the thirdorder phase accuracy constraint. The results of both optimizations are as follows,
Here, β,δ retain their earlier description. To demonstrate the benefit in terms of phase accuracy for these splitexplicit schemes, we analyse their net amplitude and phase errors per baroclinic time step, assuming that it consists of M=30 barotropic steps in Fig. 1, together with the errors in the SI scheme, vs. the baroclinic Courant–Friedrichs–Lewy (CFL) number c=c_{p}τk, where τ is the baroclinic time step. In practical cases, where the largest k is defined by the mesh resolution Δx (i.e. $k\to \mathit{\pi}/\mathrm{\Delta}x$), this baroclinic CFL number can become rather large. Because of the subcycling, the barotropic CFL is M times smaller and stays within the stability bounds of the explicit schemes. Figure 1 shows how all tested schemes in reality are able to maintain low dissipation even for high baroclinic CFLs. The choice is then made based on phase accuracy, which is very different between the implicit and explicit schemes. It is seen that both the SE and SESM schemes have ordersofmagnitudelower phase errors compared to the SI scheme. For high c, the SI scheme has to be used with parameters α and θ to ensure strong damping of wavenumbers with large dispersive errors. Also, between the SESM and SE schemes, SESM seems to be the most accurate, even for same dissipation. To conclude the tests on phase accuracy, we report that irrespective of dissipation level, the splitexplicit schemes will always by design provide ordersofmagnitudebetter phase accuracy compared to the semiimplicit implementation, especially in the range of high CFL numbers, i.e. smaller wavelengths.
This section compares measurements from the new external mode solver to the existing one in FESOM2. The tests are done for both an idealized case and a realistic global setup. The idealized case is expected to highlight threshold performance of the new solver compared to the global case, where its impact will also be governed by mesh nonuniformity, the presence of external forcing, complicated boundaries, and bottom topography. The global case will, however, crucially assess the practicality of this new solver.
4.1 Idealized channel
In Sect. 3 (see Fig. 1), the primary characteristics of the new schemes were explored. In this idealized case, the solvers are tested for correct representation of the mean dynamics. We use the zonally reentrant channel described in Soufflet et al. (2016). It is 2000 km long (north–south), 500 km wide (east–west), and 4 km deep. We test 10 km meshes of different types (triangular, quadrilateral) and unequally spaced vertical levels (40, 60). The baroclinic time step is τ=720 s, and surface gravity wave speed ${c}_{\mathrm{p}}=(g{H}_{\mathrm{0}}{)}^{\mathrm{1}/\mathrm{2}}=\mathrm{200}$ m s^{−1} so that a mesh cell is crossed by waves in less than 50 s. We take M=30 for the splitexplicit solver, which means $\mathit{\tau}/M=\mathrm{24}$ s. The initial density stratification due to temperature corresponds to a zonal jet. The zonally mean stratification and velocity are relaxed to their initial distributions. Some initial temperature perturbation leads to an onset of baroclinic instability, which is maintained through the relaxation of the zonal mean profiles. Simulations are run for 20 years, and the last 13 years are used to compute means. Figure 2 shows that mean depth profiles for this case are not affected by implementation of the new solver regardless of mesh configuration, i.e. a different mesh structure and number of vertical layers. This can be attributed to the predominantly baroclinic character of the flow, so the lower dissipation in the new solver is not necessarily seen. This is further supported by the observation that when the bottom drag is reduced, i.e. when the flow is made more barotropic, we see the separation in mean eddy kinetic energy between the two solvers grow, as shown in Fig. A2.
4.2 Surface gravity wave
In this test, we further show how the SE solver for the barotropic mode being proposed is less dissipative and how the differences between it and the current SI solver of FESOM2 become more obvious when the barotropicity of the flow is dominant. For that, we use a simple surface gravity wave (SGW) setup where we simulate a channel (of the same geometry as in Sect. 4.1) with an initial elevation distribution that is meridionally Gaussian, i.e. $\mathrm{ln}(\mathit{\eta}/A)=(y{y}_{\mathrm{mid}}{)}^{\mathrm{2}}/{\mathit{\sigma}}^{\mathrm{2}}$, where A=3 m is the amplitude and σ=200 km is the halfbell width. The temperature is set at T=20 °C, the velocities are initialized to 0, and the simulation is run for 3 d with a baroclinic time step of τ=5 min. As seen in Fig. 3, which shows the final elevation for both cases after 3 d as well as time series of their available potential energy (APE), the APE for the SI solver asymptotes to 0 within the first 10 h, with the SI solver heavily damping all the SGWs, whereas the SE solver still maintains strong SGWs along with most of its APE. In contrast to the results for the baroclinic test case of Sect. 4.1, here we see a clear difference between the two solvers because of the flow being purely barotropic. As such, we see the gains in terms of the better energetic representation that the new SE solver promises.
4.3 Realistic global ocean
For this case, we now test a more complicated case of a global ocean–sea ice simulation similar to the one used by Scholz et al. (2022). We use the standard coarse mesh of FESOM2 with a minimum resolution of 25 km north of 25° N and a coarse resolution of around 1.5° in the interior of the ocean, with further moderate refinements in the equatorial belt and around Antarctica. The mesh configuration consists of 47 vertical levels with a minimum layer thickness of 10 m near the surface, up to 250 m near the abyssal depth. The baroclinic time step is τ=2700 s, and we take M=50 for the splitexplicit solver, which means that $\mathit{\tau}/M=\mathrm{54}$ s. The simulations were forced with the JRA55do v1.4.0 reanalysis data covering the period from 1958 to 2019. To show the differences in the simulations carried out with the SI and the SE barotropic solvers, we only show mean elevation, surface temperature, and kinetic energy over the last 20 years (1999–2019) of the simulation period. Due to the high similarity of SE and SESM results (as seen earlier in Fig. 2 for the idealized case), only SE results are shown in Fig. 4. The differences in sea surface elevation are found to be rather small. The pattern of difference in the sea surface temperature is most likely associated with the transient variability that is different in the two setups. The eddy kinetic energy increases everywhere outside the equatorial belt. This increase could be associated with the reduction in overall dissipation due to use of the SE barotropic solver and the observation that the barotropic kinetic energy contributes most to the overall kinetic energy budget at middle and high latitudes, as shown in Aiki et al. (2011).
In summary, no significant difference in terms of timeaveraged measurements from the new SE external mode solver was observed. For both the idealized and the global test cases, the new SE external mode solver maintained mean dynamics close to those reported by the current SI solver.
This section compares the parallel scalability of the new external mode solver to the existing one of FESOM2. As in Sect. 4, we again utilize the two test cases – idealized and global – described earlier. Additionally, the two cases are also executed on different computer clusters providing for even better estimation of their general performance. Again, because of high similarities between SE and SESM parallel scalability in comparison to SI, the plots of SESM are omitted. In reality, SESM was found to be slightly less scalable than SE. The simulations are run for many model steps, and the mean total time per task the model spends for the barotropic solver is measured. For the idealized case, the simulations were performed using the Ollie HPC at the Alfred Wegener Institute equipped with Intel Xeon E52697 v4 (Broadwell) CPUs (308 nodes with 36 cores). To ensure a sufficient workload, we used a fine 2 km triangular mesh with 60 vertical layers on the same channel setup as in Soufflet et al. (2016). The mesh contains approximately 2.5×10^{5} vertices, so the setup is expected to scale almost linearly to about 10^{3} cores, according to our previous experience (Koldunov et al., 2019). The baroclinic time step has been reduced to τ=144 s, and M was left without changes. For this case, the simulations were run for 600 steps, i.e. 1 simulated day. As seen from Fig. 5 (left panel), the new external mode solver (SE) scales significantly better and is faster than the current SI solver of FESOM across all workloads. In reality, the relative speed of the SE versus SI solver will depend on M and on the efficiency of the preconditioner in SI and may change. Since the barotropic solver takes only a part of the total time step (10 %–20 %), the improved scalability of the SE solver contributes noticeably to the reduction in total computing time only after approximately 400 surface vertices per core, as shown in the left panel of Fig. 5. Its impact becomes significant only when parallelized beyond this limit. For the global case, the measurements were performed using the Albedo HPC at the Alfred Wegener Institute with 2xAMD Epyc7702 CPUs (240 nodes with 128 cores). It uses the same setup and mesh from the global case in Sect. 4. The mesh contains approximately 1.27×10^{5} vertices. The baroclinic time step and M have been left unchanged (τ=2700 s and M=50, respectively). Here, simulations were run for 11 680 steps, i.e. 1 simulated model year. Similar to the findings from the idealized channel case, Fig. 5 (right panel) shows the new SE solver scaling similarly faster and further for the global case also. We again observe a perceivable difference across all workloads. Similar to the idealized case, these performance improvements only become significant for highly parallelized workflows, i.e. for fewer than approximately 400 vertices per core. In summary, the performance of the SE solver shows visible improvement in parallelization and computing time over the SI solver across all tested workloads. The behaviour remained the same over different test cases (idealized and global) and different computer resources (Ollie HPC, Albedo HPC). For lessparallel workloads, the benefits are marginal, but they become significant for highly parallelized workflows, i.e. in cases with fewer than approximately 400 vertices/core.
A preliminary implementation of the new splitexplicit external mode solver within the sea ice model FESOM2 as proposed in this paper, including the test cases, can be found in the public repository at https://doi.org/10.5281/zenodo.10040944 (Banerjee et al., 2023).
The new splitexplicit external mode solver proposed in this paper is more phase accurate, faster, and more scalable than the SI solver used in FESOM (Danilov et al., 2017). The dissipative asynchronous timestepping scheme (SE) of Demange et al. (2019) is able to deliver phase accuracy orders of magnitude higher than the firstorder SI scheme used before. It also provides comparable phase accuracy and dissipation to the dissipatively modified AB3–AM4 scheme (SESM) of Shchepetkin and McWilliams (2005). No filtering of fast dynamics is required due to the dissipative character of the SE solver. It is easier to implement compared to SESM, and leads to very similar results as SESM in practice.
The new SE solver is one part of the adjusted time stepping of FESOM that facilitates the use of the arbitrary Lagrangian–Eulerian vertical coordinate. As a demonstration, we extended z^{*} in FESOM2 to the $\stackrel{\mathrm{\u0303}}{z}$ vertical coordinate in the development version of FESOM2. The implementation of $\stackrel{\mathrm{\u0303}}{z}$ is outlined in Appendix C but still needs to be tested in realistic simulations. Across different test cases using different mesh geometries and computing resources, the new solver is shown to represent mean dynamics similarly to the existing solver with no significant difference. In the case of runtime performance and parallel scalability, it is shown to improve across all workloads. The improvements are shown to be especially significant for highly parallelized workloads. Kang et al. (2021) presents a semiimplicit solver for MPAS, showing that in contrast to the present work, it is more computationally efficient than their splitexplicit solver. While a detailed answer to the question why the opposite conclusion was reached needs a separate study, here we can only mention that by following Demange et al. (2019), we perform fewer external time steps per baroclinic time step than in MPAS (Ringler et al., 2013).
We note that on unstructured meshes, a semiimplicit method can be more forgiving than a splitexplicit one toward the size of mesh elements. A small element in deep water will hardly affect the solution of the semiimplicit solver but may require an increased number of barotropic substeps in a splitexplicit method. This is why the semiimplicit option will be maintained in FESOM alongside the novel splitexplicit option. It will however, be modified to allow more general ALE options as described in Appendix B. To conclude, this work suggests that the new splitexplicit external mode solver is a good alternative to the existing solver of FESOM2.
The main difference from the splitexplicit method is that the elevation η has to be defined at the same time levels as the horizontal velocity. The elevation is therefore detached from the thicknesses, which creates some conceptual difficulty. We consider quantities at $n\mathrm{1}/\mathrm{2}$ and n to be known (to start from velocity).

The predictor step is as follows
$$\begin{array}{rl}{\mathit{U}}^{*}& ={\mathit{U}}^{n\mathrm{1}/\mathrm{2}}+\mathit{\tau}({\mathit{R}}_{U}^{A}+{\mathit{R}}_{U}^{C}+{\stackrel{\mathrm{\u0303}}{\mathit{R}}}_{U}^{\mathrm{PGF}}{)}^{n}+\mathit{\tau}({\mathit{R}}_{U}^{\mathrm{hV}}{)}^{\mathrm{v}}\\ & \mathit{\tau}g{h}^{n}{\mathrm{\nabla}}_{\mathrm{h}}{\mathit{\eta}}^{n\mathrm{1}/\mathrm{2}}\mathit{\tau}.\end{array}$$Here, tilde implies that the contribution from the elevation to the pressure gradient force (PGF) is omitted. It is taken into account explicitly (the last term). However, since ${\mathit{\eta}}^{n+\mathrm{1}/\mathrm{2}}$ is unknown, we take the value from the current time level $n\mathrm{1}/\mathrm{2}$. Our intention is to get the SemiImplicit form $\mathit{\theta}{\mathit{\eta}}^{n+\mathrm{1}/\mathrm{2}}+(\mathrm{1}\mathit{\theta}){\mathit{\eta}}^{n\mathrm{1}/\mathrm{2}}$, $\mathrm{1}/\mathrm{2}\le \mathit{\theta}\le \mathrm{1}$ in the end. The momentum advection and Coriolis terms are AB2 or AB3 interpolated to n. Implicit vertical viscosity is taken into account by solving
$${\mathit{U}}^{**}={\mathit{U}}^{*}+\mathit{\tau}\left({A}_{\mathrm{v}}{\partial}_{z}{\mathit{u}}^{**}\right){}_{\mathrm{b}}^{\mathrm{t}}.$$It is solved similarly to the splitexplicit asynchronous case.

The corrector step is as follows
$${\mathit{U}}^{n+\mathrm{1}/\mathrm{2}}={\mathit{U}}^{**}\mathit{\tau}\mathit{\theta}g{h}^{n}{\mathrm{\nabla}}_{\mathrm{h}}({\mathit{\eta}}^{n+\mathrm{1}/\mathrm{2}}{\mathit{\eta}}^{n\mathrm{1}/\mathrm{2}}).$$This step is only written but is evaluated after ${\mathit{\eta}}^{n+\mathrm{1}/\mathrm{2}}$ is available.

We write the elevation step as
$${\mathit{\eta}}^{n+\mathrm{1}/\mathrm{2}}{\mathit{\eta}}^{n\mathrm{1}/\mathrm{2}}=\mathit{\tau}{\mathrm{\nabla}}_{\mathrm{h}}\cdot \sum _{k}(\mathit{\alpha}{\mathit{U}}_{k}^{n+\mathrm{1}/\mathrm{2}}+(\mathrm{1}\mathit{\alpha}\left){\mathit{U}}_{k}^{n\mathrm{1}/\mathrm{2}}\right).$$Here, $\mathrm{1}/\mathrm{2}\le \mathit{\alpha}\le \mathrm{1}$, which is needed for stability. This equation has to be solved together with the corrector equation. We express ${\mathit{U}}^{n+\mathrm{1}/\mathrm{2}}$ from the corrector equation and insert the corrector step into the elevation equation to get
$$\mathit{\delta}\mathit{\eta}=g\mathit{\theta}\mathit{\alpha}{\mathit{\tau}}^{\mathrm{2}}{\mathrm{\nabla}}_{\mathrm{h}}\cdot {H}^{n}{\mathrm{\nabla}}_{\mathrm{h}}\mathit{\delta}\mathit{\eta}\mathit{\tau}{\mathrm{\nabla}}_{\mathrm{h}}\cdot \sum _{k}(\mathit{\alpha}{\mathit{U}}_{k}^{**}+(\mathrm{1}\mathit{\alpha}\left){\mathit{U}}_{k}^{n\mathrm{1}}\right).$$This equation is solved for $\mathit{\delta}\mathit{\eta}={\mathit{\eta}}^{n+\mathrm{1}/\mathrm{2}}{\mathit{\eta}}^{n\mathrm{1}/\mathrm{2}}$, giving ${\mathit{\eta}}^{n+\mathrm{1}/\mathrm{2}}={\mathit{\eta}}^{n\mathrm{1}/\mathrm{2}}+\mathit{\delta}\mathit{\eta}$.

The corrector step is used to compute ${\mathit{U}}^{n+\mathrm{1}/\mathrm{2}}$.

We write the thickness equation for the ALE step as
$${h}_{k}^{n+\mathrm{1}}{h}_{k}^{n}=\mathit{\tau}[{\mathrm{\nabla}}_{\mathrm{h}}\cdot {\mathit{U}}_{k}^{n+\mathrm{1}/\mathrm{2}}+w{}_{\mathrm{b}}^{\mathrm{t}}].$$These equations are summed vertically to give
$${H}^{n+\mathrm{1}}{H}^{n}=\mathit{\tau}{\mathrm{\nabla}}_{\mathrm{h}}\cdot \sum _{k}{\mathit{U}}_{k}.$$The quantity ${H}^{n+\mathrm{1}}{H}^{\mathrm{0}}$ is the elevation at time step n+1. It is used to define h^{n+1} for the z^{*} vertical coordinate. The extension to $\stackrel{\mathrm{\u0303}}{z}$ follows similarly to the SE case. After h^{n+1} is defined, w is found from the thickness equation.

The tracers are as follows
$$\begin{array}{rl}(\mathrm{Th}{)}_{k}^{n+\mathrm{1}}& (\mathrm{Th}{)}_{k}^{n}=\mathit{\tau}\left[\mathrm{\nabla}\cdot \left({\mathit{U}}_{k}^{n+\mathrm{1}/\mathrm{2}}{T}_{k}^{n+\mathrm{1}/\mathrm{2}}\right)+w{T}_{k}^{n+\mathrm{1}/\mathrm{2}}{}_{\mathrm{b}}^{\mathrm{t}}\right]\\ & +\mathit{\tau}(\mathrm{\nabla}{h}^{n}\mathbf{K}\cdot {\mathrm{\nabla}}_{\mathrm{3}}{T}^{n}{)}_{k}+\mathit{\tau}({K}_{\mathrm{v}}{\partial}_{z}{T}^{n+\mathrm{1}}){}_{\mathrm{b}}^{\mathrm{t}}.\end{array}$$ 
By virtue of the thickness equation above,
$${\mathit{\eta}}^{n+\mathrm{1}/\mathrm{2}}{\mathit{\eta}}^{n\mathrm{1}/\mathrm{2}}=\sum _{k}\left[\mathit{\alpha}\right({h}_{k}^{n+\mathrm{1}}{h}_{k}^{n})+(\mathrm{1}\mathit{\alpha}\left)\right({h}_{k}^{n}{h}_{k}^{n\mathrm{1}}\left)\right].$$We ignore the freshwater flux for simplicity, but it can be added. The solution is
$${\mathit{\eta}}^{n+\mathrm{1}/\mathrm{2}}=\sum _{k}(\mathit{\alpha}{h}_{k}^{n+\mathrm{1}}+(\mathrm{1}\mathit{\alpha}\left){h}_{k}^{n}\right){H}^{\mathrm{0}}.$$If satisfied initially on cold start by formally taking ${\mathit{\eta}}^{\mathrm{1}/\mathrm{2}}=\mathrm{0}$ and ${h}_{k}^{\mathrm{1}}={h}_{k}^{\mathrm{0}}$, this relationship will persist with time. However, to avoid accumulation of roundoff errors, we reset ${\mathit{\eta}}^{n+\mathrm{1}/\mathrm{2}}$ to the righthand side of the last expression after the computations of ${h}_{k}^{n+\mathrm{1}}$. This new ${\mathit{\eta}}^{n+\mathrm{1}/\mathrm{2}}$ will be used only in the next time step. The point here is that the original ${\mathit{\eta}}^{n+\mathrm{1}/\mathrm{2}}$ is computed by an iterative solver, whereby some significant digits are lost. The reset compensates for that. $\mathit{\alpha}=\mathrm{1}/\mathrm{2}$ provides centring in time.
Both splitexplicit and semiimplicit asynchronous schemes are relatively straightforward to implement. The semiimplicit method with $\mathit{\theta}=\mathrm{1}/\mathrm{2}$ and $\mathit{\alpha}=\mathrm{1}/\mathrm{2}$ is nondissipative, and dissipation is added by shifting θ toward 1. As explained above, even though the dissipation can be controlled well by offsetting $\mathit{\theta}=\mathrm{1}/\mathrm{2}$ only slightly, there are dispersive errors. Since the SI method is used with large Courant numbers for surface gravity waves, the contributions from such waves will come with large phase errors and should be damped. To keep the centring of η, we may take $\mathit{\alpha}=\mathrm{1}/\mathrm{2}$ and $\mathit{\theta}>\mathrm{1}/\mathrm{2}$. FESOM in most applications uses θ=1 and α=1, which implies more dissipation.
In the case of the $\stackrel{\mathrm{\u0303}}{z}$ vertical coordinate (Leclair and Madec, 2011), the horizontal divergence in a layer is split into fast and slow contributions. The fast one modifies layer thickness, and the slow one leads to diasurface w. Examples of practical implementation are provided by Petersen et al. (2015) and Megann et al. (2022). Our implementation presents a simplified version of both. The desired layer thickness is computed as
where ${h}_{k}^{*}$ corresponds to the z^{*} coordinate, and ${h}_{k}^{\mathrm{hf}}$ is the highfrequency component that augments z^{*} to $\stackrel{\mathrm{\u0303}}{z}$. They will be defined below. The bottom depth in FESOM is a cellwise constant, whereas elevation and layer thicknesses are defined at vertices. For this reason, for a given vertex v, we modify thicknesses of ${K}^{\prime}={K}^{\prime}\left(v\right)$ layers that do not touch topography (see Danilov et al., 2017). The total number of layers under vertex v will be denoted K=K(v). We take
An alternative definition would be to stretch the layers proportionally to their actual thickness, but Megann et al. (2022) warn that some drift in h^{*} may be present in such a case. Excluding the fixed layers, we split the divergence ${D}_{k}=\mathrm{\nabla}\cdot \left({\mathit{U}}_{k}\right)$ into a quasibarotropic part that corresponds to ${h}_{k}^{*}$ and the remaining quasibaroclinic part (“quasi” because we are limited to K^{′} layers),
where $D={\sum}_{k=\mathrm{1}}^{K}{D}_{k}$ is the vertically integrated divergence (note that all layers contribute to D). We will be interested in ${D}_{k}^{\prime}$, which is computed as the difference between D_{k} and ${D}_{k}^{*}$. We use the available thicknesses ${h}_{k}^{n}$ for ${h}_{k}^{n+\mathrm{1}/\mathrm{2}}$ in Eq. (13) to determine transports ${\mathit{U}}_{k}^{n+\mathrm{1}/\mathrm{2}}$ featuring in D_{k}. After ${h}_{k}^{n+\mathrm{1}}$ is fully specified, we retrim ${\mathit{U}}_{k}^{n+\mathrm{1}/\mathrm{2}}$ using ${h}^{n+\mathrm{1}/\mathrm{2}}$ defined as a half sum of the thicknesses at full steps. Our treatment of the barotropic part is admittedly less accurate than in Petersen et al. (2015) and Megann et al. (2022), and some barotropic waves will contaminate ${h}_{k}^{\mathrm{hf}}$. However, because of the fixed bottom layers, we have already introduced uncertainty from the very beginning. Since ${\partial}_{\mathrm{t}}{h}_{k}^{*}={D}_{k}^{*}$, ${\sum}_{k=\mathrm{1}}^{K}{D}_{k}^{\prime}=\mathrm{0}$. The highfrequency thickness ${h}_{k}^{\mathrm{hf}}$ will be related to ${D}_{k}^{\prime}$ and should sum to zero vertically. D^{′} is split into low and highfrequency parts,
The lowfrequency part is nudged to ${D}_{k}^{\prime}$ as
where τ_{lf} is the timescale (about 5 d in Petersen et al., 2015, but larger values can be of interest according to Megann et al., 2022). The fastfrequency part is obtained by subtracting the lowfrequency part from ${D}_{k}^{\prime}$. The highfrequency contribution to thickness is
The second term on the righthand side damps ${h}_{k}^{\mathrm{hf}}$ to zero over the timescale τ_{hf} (about 30 d). The last term will smooth the thickness, and the diffusivity K_{hf} is determined experimentally. If K_{hf} is vertically constant, ${\sum}_{k=\mathrm{1}}^{K}{h}_{k}^{\mathrm{hf}}=\mathrm{0}$ if it was initially so. A potential difficulty with Eq. (C2) is that ${h}_{k}^{\mathrm{hf}}$ is not bounded. A simple procedure is implemented at present. Equation (C2) is stepped implicitly with respect to the relaxation term, and diffusion is applied in a separate step. If $({h}_{k}^{\mathrm{hf}}{)}^{n+\mathrm{1}}$ is outside the bounds for any k in the column at vertex v, τ_{hf} is adjusted for the entire column at this time step to ensure that $({h}_{k}^{\mathrm{hf}}{)}^{n+\mathrm{1}}$ will be within the bounds, and computations of $({h}_{k}^{\mathrm{hf}}{)}^{n+\mathrm{1}}$ are repeated. While this procedure is sufficient for the simple channel test case, it remains to be seen whether it will be sufficient in more realistic cases or whether the solutions reported by Megann et al. (2022) will be needed. The field h^{hf} is always damped stronger at locations close to topography to eliminate possible inconsistencies with h^{hf}=0 in cells touching bottom topography. After $({h}_{k}^{\mathrm{hf}}{)}^{n+\mathrm{1}}$ is estimated, h^{n+1} is available; the transports ${\mathit{U}}_{k}^{n+\mathrm{1}/\mathrm{2}}$ can be retrimmed, and diasurface velocities can be estimated from the thickness equations.
A preliminary implementation of the new splitexplicit external mode solver within the sea ice model FESOM2 as proposed in this paper, including the test cases, can be found in the public repository at https://doi.org/10.5281/zenodo.10040944 (Banerjee et al., 2023).
TB, SD, and KK developed the algorithm. TB, SD, DS, and PS implemented the FESOM prototype and the main FESOM branch. All authors contributed to the writing and discussions.
The contact author has declared that none of the authors has any competing interests.
Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.
This paper is a contribution to the projects M5 (reducing spurious mixing and energetic inconsistencies in realistic ocean modelling applications) and S2 (improved parameterizations and numerics in climate models) of the Collaborative Research Centre TRR 181 “Energy Transfer in Atmosphere and Ocean”.
This research has been supported by the Deutsche Forschungsgemeinschaft (grant no. 274762653).
The article processing charges for this openaccess publication were covered by the AlfredWegenerInstitut HelmholtzZentrum für Polar und Meeresforschung.
This paper was edited by Christopher Horvat and reviewed by Mark R. Petersen and one anonymous referee.
Adcroft, A., Anderson, W., Balaji, V., Blanton, C., Bushuk, M., Dufour, C. O., Dunne, J. P., Griffies, S. M., Hallberg, R., Harrison, M. J., Held, I. M., Jansen, M. F., John, J. G., Krasting, J. P., Langenhorst, A. R., Legg, S., Liang, Z., McHugh, C., Radhakrishnan, A., Reichl, B. G., Rosati, T., Samuels, B. L., Shao, A., Stouffer, R., Winton, M., Wittenberg, A. T., Xiang, B., Zadeh, N., and Zhang, R.: The GFDL global ocean and sea ice model OM4.0: Model description and simulation features, J. Adv. Model. Earth Sy., 11, 3167–3211, https://doi.org/10.1029/2019MS001726, 2019. a
Aiki, H., Richards, K. J., and Sakuma, H.: Maintenance of the mean kinetic energy in the global ocean by the barotropic and baroclinic energy routes: the roles of JEBAR and Ekman dynamics, Ocean Dynam., 61, 675–700, 2011. a
Androsov, A., Fofonova, V., Kuznetsov, I., Danilov, S., Rakowsky, N., Harig, S., Brix, H., and Wiltshire, K. H.: FESOMC v.2: coastal dynamics on hybrid unstructured meshes, Geosci. Model Dev., 12, 1009–1028, https://doi.org/10.5194/gmd1210092019, 2019. a
Banerjee, T., Danilov, S., Scholz, P., Klingbeil, K., and Sidorenko, D.: FESOM 2.5 with preliminary SplitExplicit Subcycling (2.5), Zenodo [code and data set], https://doi.org/10.5281/zenodo.10040944, 2023. a, b
Cools, S. and Vanroose, W.: The communicationhiding pipelined BiCGStab method for the parallel solution of large unsymmetric linear systems, Parallel Comput., 65, 1–20, https://doi.org/10.1016/j.parco.2017.04.005, 2017. a
Danilov, S., Sidorenko, D., Wang, Q., and Jung, T.: The FinitevolumE Sea ice–Ocean Model (FESOM2), Geosci. Model Dev., 10, 765–789, https://doi.org/10.5194/gmd107652017, 2017. a, b, c, d, e, f, g, h, i
Demange, J., Debreu, L., Marchesiello, P., Lemarié, F., Blayo, E., and Eldred, C.: Stability analysis of splitexplicit free surface ocean models: implication of the depthindependent barotropic mode approximation, J. Comput. Phys., 398, 108875, https://doi.org/10.1016/j.jcp.2019.108875,, 2019. a, b, c, d, e, f, g, h, i, j, k, l, m
Griffies, S. M., Adcroft, A. J., and Hallberg, R. W.: A primer on the vertical Lagrangianremap method in ocean models based on finite volume generalized vertical coordinates, J. Adv. Model. Earth Sy., 12, e2019MS001954, https://doi.org/10.1029/2019MS001954, 2020. a
Hallberg, R. and Adcroft, A.: Reconciling estimates of the free surface height in Lagrangian vertical coordinate ocean models with modesplit time stepping, Ocean Model., 29, 15–26, 2009. a
Huang, X., Tang, Q., Tseng, Y., Hu, Y., Baker, A. H., Bryan, F. O., Dennis, J., Fu, H., and Yang, G.: PCSI v1.0, an accelerated barotropic solver for the highresolution ocean model component in the Community Earth System Model v2.0, Geosci. Model Dev., 9, 4209–4225, https://doi.org/10.5194/gmd942092016, 2016. a, b
Jullien, S., Caillaud, M., Benshila, R., Bordois, L., Cambon, G., Dumas, F., Gentil, S. L., Lemarié, F., Marchesiello, P., Theetten, S., Dufois, F., Le Corre, M., Morvan, G., Le Gac, S., Gula, J., Pianezze, J., and Schreiber, M.: Croco Technical and numerical documentation, Zenodo, https://doi.org/10.5281/zenodo.11036558, 2022. a
Kang, H.G., Evans, K. J., Petersen, M. R., Jones, P. W., and Bishnu, S.: A Scalable SemiImplicit Barotropic Mode Solver for the MPASOcean, J. Adv. Model. Earth Sy., 13, e2020MS002238, https://doi.org/10.1029/2020MS002238, 2021. a
Klingbeil, K., Lemarié, F., Debreu, L., and Burchard, H.: The numerics of hydrostatic structuredgrid coastal ocean models: state of the art and future perspectives, Ocean Model., 125, 80–105, https://doi.org/10.1016/j.ocemod.2018.01.007, 2018. a
Koldunov, N. V., Aizinger, V., Rakowsky, N., Scholz, P., Sidorenko, D., Danilov, S., and Jung, T.: Scalability and some optimization of the FinitevolumE Sea ice–Ocean Model, Version 2.0 (FESOM2), Geosci. Model Dev., 12, 3991–4012, https://doi.org/10.5194/gmd1239912019, 2019. a, b
Leclair, M. and Madec, G.: $\stackrel{\mathrm{\u0303}}{z}$coordinate, an Arbitrary Lagrangian–Eulerian coordinate separating high and low frequency motions, Ocean Model., 37, 139–152, 2011. a, b
Madec, G., BourdalléBadie, R., Chanut, J., Clementi, E., Coward, A., Ethé, C., Iovino, D., Lea, D., Lévy, C., Lovato, T., Martin, N., Masson, S., Mocavero, S., Rousset, C., Storkey, D., Vancoppenolle, M., Müeller, S., Nurser, G., Bell, M., and Samson, G.: NEMO ocean engine, Zenodo [code], https://doi.org/10.5281/zenodo.3878122, 2019. a
Megann, A., Chanut, J., and Storkey, D.: Assessment of the $\stackrel{\mathrm{\u0303}}{z}$ timefiltered Arbitrary LagrangianEulerian coordinate in a global eddypermitting ocean model, J. Adv. Model. Earth Syst., e2022MS003056, https://doi.org/10.1029/2022MS003056, 2022. a, b, c, d, e, f, g
Petersen, M., Jacobsen, D., Ringler, T., Hecht, M., and Maltrud, M.: Evaluation of the arbitrary Lagrangian–Eulerian vertical coordinate method in the MPASOcean model, Ocean Model., 86, 93–113, 2015. a, b, c, d
Ringler, T., Petersen, M., Higdon, R. L., Jacobsen, D., Jones, P. W., and Maltrud, M.: A multiresolution approach to global ocean modeling, Ocean Model., 69, 211–232, 2013. a, b
Scholz, P., Sidorenko, D., Danilov, S., Wang, Q., Koldunov, N., Sein, D., and Jung, T.: Assessment of the FiniteVolumE Sea ice–Ocean Model (FESOM2.0) – Part 2: Partial bottom cells, embedded sea ice and vertical mixing library CVMix, Geosci. Model Dev., 15, 335–363, https://doi.org/10.5194/gmd153352022, 2022. a
Shchepetkin, A. F. and McWilliams, J. C.: A method for computing horizontal pressure gradient force in an oceanic model with a nonaligned vertical coordinate, J. Geophys. Res., 108, 3090–3124, https://doi.org/10.1029/2001JC001047, 2003. a
Shchepetkin, A. F. and McWilliams, J. C.: The regional oceanic modeling system (ROMS): a splitexplicit, freesurface, topographyfollowingcoordinate oceanic model, Ocean Model., 9, 347–404, 2005. a, b, c, d, e, f, g, h, i, j, k, l
Soufflet, Y., Marchesiello, P., Lemarié, F., Jouanno, J., Capet, X., Debreu, L., and Benshila, R.: On effective resolution in ocean models, Ocean Model., 98, 36–50, 2016. a, b
Wang, Q., Danilov, S., Sidorenko, D., Timmermann, R., Wekerle, C., Wang, X., Jung, T., and Schröter, J.: The Finite Element Sea IceOcean Model (FESOM) v.1.4: formulation of an ocean general circulation model, Geosci. Model Dev., 7, 663–693, https://doi.org/10.5194/gmd76632014, 2014. a
Note that in FESOM, θ is implemented as a tunable parameter. Its value of 0.14 was obtained by Demange et al. (2019) under certain assumptions. However, as is shown in this paper, in practice it has been found to work well.
Note that for the SE case, we integrate transport $\stackrel{\mathrm{\u203e}}{U}$ before elevation η in comparison to this case (SESM). The reason is historical. We follow how these two methods were originally proposed, and in principle, the integrations in SE can be interchanged where the η is integrated first, but the gradient is θweighted, similar to the one in SESM. Both, however, should result in identical damping.
Note that alternative formulations of retrimming are also possible, for example, setting ${\sum}_{k}{\mathit{U}}_{k}^{n+\mathrm{1}/\mathrm{2}}=(\langle \stackrel{\mathrm{\u203e}}{\mathit{U}}{\rangle}^{n}+\langle \stackrel{\mathrm{\u203e}}{\mathit{U}}{\rangle}^{n+\mathrm{1}})/\mathrm{2}$. This is also a valid estimate and was tried, but similar to the case mentioned in the paper, this too was seen to be redundant in practice.
Also note that despite being found to be redundant in practice, the retrimming ${\sum}_{k}{\mathit{U}}_{k}^{n+\mathrm{1}/\mathrm{2}}=\langle \langle \stackrel{\mathrm{\u203e}}{\mathit{U}}\rangle {\rangle}^{n+\mathrm{1}/\mathrm{2}}$ was still left in the code.
Here, we are exploring only the physical solution that corresponds to a wave propagating in a negative direction.
 Abstract
 Introduction
 Splitexplicit asynchronous time stepping
 Temporal discretization for barotropic solver
 Numerical experiments
 Runtime performance and parallel scalability
 Conclusions
 Appendix A: Supplementary figures
 Appendix B: Adaptation of the semiimplicit scheme in FESOM2
 Appendix C: Implementation of $\stackrel{\mathrm{\u0303}}{z}$ in FESOM2
 Code and data availability
 Author contributions
 Competing interests
 Disclaimer
 Acknowledgements
 Financial support
 Review statement
 References
 Abstract
 Introduction
 Splitexplicit asynchronous time stepping
 Temporal discretization for barotropic solver
 Numerical experiments
 Runtime performance and parallel scalability
 Conclusions
 Appendix A: Supplementary figures
 Appendix B: Adaptation of the semiimplicit scheme in FESOM2
 Appendix C: Implementation of $\stackrel{\mathrm{\u0303}}{z}$ in FESOM2
 Code and data availability
 Author contributions
 Competing interests
 Disclaimer
 Acknowledgements
 Financial support
 Review statement
 References