the Creative Commons Attribution 4.0 License.
SERGHEI (SERGHEI-SWE) v1.0: a performance-portable high-performance parallel-computing shallow-water solver for hydrology and environmental hydraulics
Daniel Caviedes-Voullième
Mario Morales-Hernández
Matthew R. Norman
Ilhan Özgen-Xian
The Simulation EnviRonment for Geomorphology, Hydrodynamics, and Ecohydrology in Integrated form (SERGHEI) is a multidimensional, multidomain, and multiphysics model framework for environmental and landscape simulation, designed with an outlook towards Earth system modelling. At the core of SERGHEI's innovation is its performance-portable high-performance parallel-computing (HPC) implementation, built from scratch on the Kokkos portability layer, which allows SERGHEI to be deployed, in a performance-portable fashion, on graphics processing unit (GPU)-based heterogeneous systems. In this work, we explore combinations of MPI and Kokkos using the OpenMP and CUDA backends. We introduce the SERGHEI model framework and present in detail its first operational module for solving the shallow-water equations (SERGHEI-SWE) and its HPC implementation. This module is designed to be applicable to hydrological and environmental problems, including flooding and runoff generation. Its applicability is demonstrated on several well-known benchmarks and large-scale problems, for which SERGHEI-SWE achieves excellent results across the different types of shallow-water problems. Finally, the scalability and performance portability of SERGHEI-SWE are demonstrated and evaluated on several TOP500 HPC systems, with very good scaling on over 20 000 CPUs and on up to 256 state-of-the-art GPUs.
The upcoming exascale high-performance parallel-computing (HPC) systems will enable physics-based geoscientific modelling with unprecedented detail (Alexander et al., 2020). Although the need for such HPC systems has traditionally been driven by climate, ocean, and atmospheric modelling, hydrological models are progressively becoming just as physical, sophisticated, and computationally intensive. Physically based, integrated hydrological models such as Parflow (Kuffour et al., 2020), Amanzi/ATS (Coon et al., 2019), and Hydrogeosphere (Brunner and Simmons, 2012) are becoming more prominent in hydrological research and Earth system modelling (ESM) (Fatichi et al., 2016; Paniconi and Putti, 2015), making HPC more and more relevant for computational hydrology (Clark et al., 2017).
Hydrological models, as with many other HPC applications, currently face challenges in exploiting available and future HPC systems. These challenges arise not only from the intrinsic difficulty of maintaining complex codes over long periods of time, but also because HPC hardware is undergoing a major paradigm change (Leiserson et al., 2020; Mann, 2020), strongly driven by the end of Moore's law (Morales-Hernández et al., 2020). In order to gain higher processing capacity, computers will require heterogeneous and specialised hardware (Leiserson et al., 2020), potentially making high-performing code harder to develop and maintain and demanding that developers adapt and optimise code for an evolving hardware landscape. It has become clear that upcoming exascale systems will have heterogeneous architectures embedded in modular and reconfigurable systems (Djemame and Carr, 2020; Suarez et al., 2019) that will consist of different types of CPUs and accelerators, possibly from multiple vendors and requiring different programming models. This puts pressure on domain scientists to write portable code that performs efficiently on a range of existing and future HPC architectures (Bauer et al., 2021; Lawrence et al., 2018; Schulthess, 2015) and to ensure the sustainability of such code (Gan et al., 2020).
Different strategies are currently being developed to cope with this grand challenge. One strategy is to offload the architecture-dependent parallelisation tasks to the compiler; see, for example, Vanderbauwhede and Takemi (2013), Vanderbauwhede and Davidson (2018), and Vanderbauwhede (2021). Another strategy is to use an abstraction layer that provides a unified programming interface to different computational backends, a so-called "performance portability framework", which allows the same code to be compiled for different HPC architectures. Examples of this strategy include RAJA (Beckingsale et al., 2019) and Kokkos (Edwards et al., 2014; Trott et al., 2021), which are very similar in scope and capability. Both RAJA and Kokkos are C++ libraries that implement a shared-memory programming model to maximise the amount of code that can be compiled across different hardware devices with nearly the same parallel performance. They provide access to several computational backends, in particular for multi-GPU (graphics processing unit) and heterogeneous HPC systems.
This paper introduces the Kokkos-based computational (eco)hydrology framework SERGHEI (Simulation EnviRonment for Geomorphology, Hydrodynamics, and Ecohydrology in Integrated form) and its surface hydrology module SERGHEI-SWE. The primary aims of SERGHEI's implementation are scalability and performance portability. In order to achieve this, SERGHEI is written in C++ and built from scratch on the Kokkos abstraction. Kokkos currently supports CUDA, OpenMP, HIP, SYCL, and Pthreads as backends. We chose Kokkos over other alternatives because it is actively engaged in securing the sustainability of its programming model, fostering its partial inclusion into the ISO C++ standard (Trott et al., 2021). Indeed, a growing number of applications in multiple domains leverage Kokkos; see, for example, Bertagna et al. (2019), Demeshko et al. (2018), Grete et al. (2021), Halver et al. (2020), and Watkins et al. (2020). Thus, among similar solutions, Kokkos has been identified as advantageous in terms of performance portability and project sustainability, although it is perhaps somewhat more invasive and makes the resulting code less clear (Artigues et al., 2019). We present the full implementation of the SERGHEI-SWE module, the shallow-water equation (SWE) solver for free-surface hydrodynamics at the heart of SERGHEI.
SERGHEI-SWE enables the simulation of the surface hydrodynamics of overland flow and streamflow seamlessly and across scales. Historically, hydrological models featuring surface flow have relied on kinematic or zero-inertia (diffusive) approximations due to their apparent simplicity (Caviedes-Voullième et al., 2018; Kollet et al., 2017) and because, until the last decade, robust SWE solvers were not available (Caviedes-Voullième et al., 2020a; García-Navarro et al., 2019; Simons et al., 2014; Özgen-Xian et al., 2021). However, the current capabilities of SWE solvers, the increase in computational capabilities, and the need to better exploit parallelism, which is easier to achieve with explicit solvers than with the implicit solvers usually required by diffusive equations (Caviedes-Voullième et al., 2018; Fernández-Pato and García-Navarro, 2016), have been pushing to replace simplified surface flow models (for hydrological purposes) with fully dynamic SWE solvers. A growing number of studies use SWE solvers for rainfall runoff and overland flow simulations from hillslope to catchment scales; see, for example, Bellos and Tsakiris (2016), Bout and Jetten (2018), Caviedes-Voullième et al. (2012, 2020a), Costabile and Costanzo (2021), Costabile et al. (2021), David and Schmalz (2021), Dullo et al. (2021a, b), Fernández-Pato et al. (2020), García-Alén et al. (2022), Simons et al. (2014), and Xia and Liang (2018). This trend contributes to the transition from engineering hydrology towards Earth system science (Sivapalan, 2018), a shift motivated by both necessity and opportunity, as continental (and larger) ESM will progressively require fully dynamic SWE solvers to cope with increased-resolution digital terrain models and the dynamics that respond to them, improved spatiotemporal rainfall data and simulations, and increasingly sophisticated process interactions across scales, from patch to hillslope to catchment (Fan et al., 2019).
SERGHEI-SWE distinguishes itself from other HPC SWE solvers through a number of key novelties. Firstly, SERGHEI-SWE is open sourced under a permissive BSD license. While there are indeed many GPU-enabled SWE codes, many of these are research codes that are not openly available, for example, Aureli et al. (2020), Buttinger-Kreuzhuber et al. (2022), Echeverribar et al. (2020), Hou et al. (2020), Lacasta et al. (2014, 2015), Liang et al. (2016), and Vacondio et al. (2017), or they are commercial codes, such as RiverFlow2D, TUFLOW, and HydroAS_2D; see Jodhani et al. (2021) for a recent non-comprehensive review. Open-source solvers are a fundamental need for the community, ensuring transparency and reproducibility and providing a base for model (software) sustainability. We note that open-source SWE solvers are becoming increasingly available; see Table 1. However, only a handful of these freely available models are enabled for GPUs, mostly through CUDA. Fewer still have multi-GPU capabilities and can fully leverage HPC hardware. All of these multi-GPU-enabled codes currently depend on CUDA and are therefore largely limited to Nvidia hardware. This leads to the second and most relevant novelty of SERGHEI-SWE: it is a performance-portable, highly scalable, and GPU-enabled solver. SERGHEI-SWE generalises hardware support (CPU, GPU, accelerators) into a performance-portability concept through Kokkos.
This gives SERGHEI-SWE the key advantage of having a single code base for the currently fully operational OpenMP and CUDA backends, as well as for HIP, which is currently experimental in SERGHEI. Most importantly, this keeps the code base relevant for other backends, such as SYCL. This is particularly important because the current HPC landscape features not only Nvidia GPUs but also increasing adoption of AMD GPUs, with the most recent leading TOP500 systems (Frontier and LUMI) as well as upcoming systems (e.g. El Capitan) relying on AMD GPUs. In this way, SERGHEI safely avoids the vendor lock-in trap.
SERGHEI-SWE harnesses the past 15 years' worth of numerical advances in the solution of the SWE, ranging from fundamental numerical formulations (Echeverribar et al., 2019; Morales-Hernández et al., 2020) to HPC GPU implementations (Brodtkorb et al., 2012; Hou et al., 2020; Lacasta et al., 2014, 2015; Liang et al., 2016; Vacondio et al., 2017; Sharif et al., 2020). Most of this work was done in the context of developing solvers for flood modelling, with rather engineering-oriented applications demanding high quantitative accuracy and predictive capability. Most of the established models in Table 1 were developed within such contexts, although many are currently also adopted for more hydrological applications. Leveraging this technology, SERGHEI-SWE is designed to cope with the classical shallow-water applications of fluvial and urban flooding, as well as with the emerging rainfall runoff problems in both natural and urban environments (for which coupling to sewer system models is a longer-term objective) and with other flows of broad hydrological and environmental interest that occur on (eco)hydrological timescales, priming it for further uses in ecohydrology and geomorphology. Nevertheless, all shallow-water applications should benefit from the high performance and high scalability of SERGHEI-SWE. With an HPC-ready SWE solver, catchment-scale rainfall runoff applications at around 1 m^{2} resolution become feasible. Similarly, large river and floodplain simulations can be enabled for operational flood forecasting, and flash floods in urban environments can be tackled with extremely high spatial resolution. Moreover, it is noteworthy that SERGHEI-SWE is not confined to HPC environments: users with workstations can also benefit from the improved performance.
1.1 The SERGHEI framework
SERGHEI is envisioned as a modular simulation framework around a physically based hydrodynamic core, which allows a variety of water-driven and water-limited processes to be represented in a flexible manner. In this sense, SERGHEI is based on the idea of water fluxes as a connecting thread among various components and processes within the Earth system (Giardino and Houser, 2015). As illustrated by the conceptual framework in Fig. 1, SERGHEI's hydrodynamic core will consist of mechanistic surface (SERGHEI-SWE, the focus of this paper) and subsurface flow solvers (light and dark blue), around which a generalised transport framework for multi-species transport and reaction will be implemented (grey). The transport framework will further enable the implementation of morphodynamics (gold) and vegetation dynamics (green) models, and will also include a Lagrangian particle-tracking module (currently also under development). At the time of writing, the subsurface flow solver, based on the three-dimensional extension of the Richards solver by Li et al. (2021), is experimentally operative and is in the process of being coupled to the surface flow solver, which will make the hydrodynamic core of SERGHEI applicable to integrated surface–subsurface hydrology. The initial infrastructure for the three other transport-based frameworks is currently under development.
In this section we provide an overview of the underlying mathematical model and the numerical schemes implemented in SERGHEI-SWE. The implementation is based on well-established numerical schemes; consequently, we limit ourselves to a minimal presentation.
SERGHEI-SWE is based on the solution of the two-dimensional (2D) shallow-water equations, which can be expressed in compact differential conservative form as
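The equation referred to here (Eq. 1) is not reproduced in this extract. The following is a sketch of the standard 2D shallow-water system, consistent with the variable definitions that follow; the exact arrangement of the source terms in the original may differ:

```latex
\frac{\partial \mathbf{U}}{\partial t}
+ \frac{\partial \mathbf{F}}{\partial x}
+ \frac{\partial \mathbf{G}}{\partial y}
= \mathbf{S}_r + \mathbf{S}_b + \mathbf{S}_f,
\qquad
\mathbf{U} = \begin{pmatrix} h \\ q_x \\ q_y \end{pmatrix},
\quad
\mathbf{F} = \begin{pmatrix} q_x \\ q_x^2/h + \tfrac{1}{2} g h^2 \\ q_x q_y / h \end{pmatrix},
\quad
\mathbf{G} = \begin{pmatrix} q_y \\ q_x q_y / h \\ q_y^2/h + \tfrac{1}{2} g h^2 \end{pmatrix},

\mathbf{S}_r = \begin{pmatrix} r_o - r_f \\ 0 \\ 0 \end{pmatrix},
\quad
\mathbf{S}_b = \begin{pmatrix} 0 \\ -gh\,\partial z/\partial x \\ -gh\,\partial z/\partial y \end{pmatrix},
\quad
\mathbf{S}_f = \begin{pmatrix} 0 \\ -gh\,\sigma_x \\ -gh\,\sigma_y \end{pmatrix}.
```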
Here, t [T] is time, x [L] and y [L] are Cartesian coordinates, and U is the vector of conserved variables (i.e. the unknowns of the system) containing the water depth h [L] and the unit discharges in the x and y directions, q_{x}=hu [L^{2} T^{−1}] and q_{y}=hv [L^{2} T^{−1}], respectively. F and G are the fluxes of these conserved variables, with g [L T^{−2}] the gravitational acceleration. The mass source terms S_{r} account for rainfall, r_{o} [L T^{−1}], and infiltration or exfiltration, r_{f} [L T^{−1}]. The momentum source terms include gravitational bed slope terms, S_{b}, expressed in terms of the gradient of the bed elevation z [L], and friction terms, S_{f}, as a function of the friction slope σ. The friction slope is often modelled by means of the Gauckler–Manning equation in terms of Manning's roughness coefficient n [T L^{−1/3}], but also frequently with the Chezy and Darcy–Weisbach formulations (Caviedes-Voullième et al., 2020a). In addition, specialised formulations of the friction slope exist to consider the effect of microtopography and vegetation for small water depths, e.g. variable Manning coefficients (Jain and Kothyari, 2004; Mügler et al., 2011) or generalised friction laws (Özgen et al., 2015b). A recent systematic comparison and in-depth discussion of several friction models, with a focus on rainfall runoff simulations, is given in Crompton et al. (2020). Implementing additional friction models is of course possible, and relevant, especially to address the multiscale nature of runoff in catchments, but not essential to the points in this paper. The observant reader will note that in Eq. (1) viscous and turbulent fluxes have been neglected. The focus here is on applications (rainfall runoff, dam breaks) in which their influence can be safely neglected.
Turbulent viscosity may become significant for ecohydraulic simulations of river flow, and turbulent fluxes of course play an important role in mixing in transport simulations. We will address these issues in future implementations of the transport solvers in SERGHEI.
SERGHEI-SWE uses a first-order accurate upwind finite-volume scheme with forward Euler time integration to solve the system of Eq. (1) on uniform Cartesian grids with grid spacing Δx [L]. The numerical scheme, presented in detail in Morales-Hernández et al. (2021), harnesses many solutions reported in the literature in the past decade, ensuring that all desirable properties of the scheme (well-balancing, depth positivity, stability, robustness) are preserved under the complex conditions of realistic environmental problems. In particular, we require the numerical scheme to remain robust and accurate in the presence of arbitrarily rough topography and shallow water depths with wetting and drying.
Well-balancing and water depth positivity are ensured by solving the numerical fluxes at each cell edge k with augmented Riemann solvers (Murillo and García-Navarro, 2010, 2012) based on the Roe linearisation (Roe, 1981). In fluctuation form, the rule for updating the conserved variables in cell i from time step n to time step n+1 reads as follows:
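The update formulae (Eq. 2) are not reproduced in this extract. The following is a hedged sketch of the usual augmented-Roe fluctuation form, consistent with the tilde quantities defined below but not necessarily the exact expression of the original:

```latex
\mathbf{U}_i^{\,n+1} = \mathbf{U}_i^{\,n}
- \frac{\Delta t}{\Delta x} \sum_{k} \sum_{m=1}^{3}
\left( \tilde{\lambda}^{-}\, \tilde{\gamma}\, \tilde{\mathbf{e}} \right)^{m}_{k},
\qquad
\tilde{\gamma} = \tilde{\alpha} - \frac{\tilde{\beta}}{\tilde{\lambda}},
```

where the superscript "−" selects only the incoming (upwind) wave contributions at each edge k.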
followed by
where $\tilde{\lambda}$ and $\tilde{\mathbf{e}}$ are the eigenvalues and eigenvectors of the linearised system of equations, $\tilde{\alpha}$ and $\tilde{\beta}$ are the linearisation coefficients of the fluxes and of the bed slope and friction source terms, respectively, and the minus sign accounts for the upwind discretisation. Note that all tilde variables are defined at each computational edge. The time step Δt is restricted to ensure stability, following the Courant–Friedrichs–Lewy (CFL) condition:
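The CFL condition itself is not reproduced in this extract. A sketch consistent with the surrounding text is given below; the value of the CFL bound is an assumption (1/2 is common for 2D first-order schemes on Cartesian grids):

```latex
\Delta t = \mathrm{CFL}\,
\frac{\Delta x}{\displaystyle \max_{k}\max_{m}\left|\tilde{\lambda}^{m}_{k}\right|},
\qquad \mathrm{CFL} \le \tfrac{1}{2}.
```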
Although the wave speeds are formally defined at the interfaces, the corresponding cell values are used instead for the CFL condition. As pointed out in Morales-Hernández et al. (2021), this approach does not compromise the stability of the scheme but accelerates the computations and simplifies the implementation.
It is relevant to acknowledge that second- (and higher-) order schemes for the SWE are available (e.g. Buttinger-Kreuzhuber et al., 2019; Caviedes-Voullième et al., 2020b; Hou et al., 2015; Navas-Montilla and Murillo, 2018). However, first-order schemes remain a pragmatic choice (Ayog et al., 2021), especially when dealing with the very high resolutions targeted by SERGHEI, which offset their higher discretisation error and numerical diffusivity in comparison to higher-order schemes. Similarly, robust schemes for unstructured triangular meshes are well established, together with their well-known advantages in reducing cell counts and numerical diffusion (Bomers et al., 2019; Caviedes-Voullième et al., 2012, 2020a). As these advantages are less relevant at very high resolutions, we opt for Cartesian grids to avoid issues with memory mapping, coalescence, and cache misses on GPUs (Lacasta et al., 2014), as well as additional memory footprints, while also making domain decomposition simpler. Both higher-order schemes and unstructured (and adaptive) meshes may nevertheless be implemented within SERGHEI.
In this section we describe the key ingredients of the HPC implementation of SERGHEI. Conceptually, this requires, firstly, handling parallelism inside a computational device (multicore CPU or GPU) with shared memory, together with the related portability and corresponding backends (i.e. OpenMP, CUDA, HIP, etc.). At a higher level of parallelism, distributing computations across many devices requires domain decomposition and a distributed-memory approach, implemented via MPI. The complete implementation of SERGHEI encompasses both: parallel computations are distributed over many subdomains, each of which is mapped onto a computational device. Here we start the discussion from the higher level of domain decomposition and highlight that the novelty of SERGHEI lies in combining these multiple levels of parallelism with the performance-portable shared-memory approach via Kokkos.
3.1 Domain decomposition
The surface domain is a two-dimensional plane, discretised by a Cartesian grid with a total cell number of N_{t}=N_{x}N_{y}, where N_{x} and N_{y} are the numbers of cells in the x and y directions, respectively. Operations are usually performed per subdomain, each associated with an MPI rank. During initialisation, each MPI process constructs a local subdomain with n_{x} cells in the x direction and n_{y} cells in the y direction. The user specifies the number of subdomains in each Cartesian direction at runtime, and SERGHEI determines the subdomain size from this information. Subdomains are of equal size, except for corrections due to non-integer-divisible decompositions. In order to communicate information across subdomains, SERGHEI uses so-called "halo cells": non-physical cells on the boundaries of a subdomain that overlap with physical cells of neighbouring subdomains. The halo cells augment the number of cells in the x and y directions by 1 at each boundary. Thus, the subdomain size is n_{t}=(n_{x}+2)(n_{y}+2). The definitions are sketched, without loss of generality, for a square-shaped subdomain in Fig. 2, and the way these subdomains overlap in the global domain is sketched in Fig. 3 (left). Halo cells are not updated as part of the time stepping. Instead, they are updated by receiving data from the neighbouring subdomain, a process which naturally requires MPI communication.
Besides the global cell index, which ranges from 0 to N_{t}, each subdomain uses two sets of local indices to access the data stored in its cells. The first set spans all physical cells inside the subdomain, and the second spans both halo cells and physical cells; see Fig. 2. The second set maps to memory positions. For example, in order to access physical cell 14 in Fig. 2, one has to access memory position 27.
3.2 Data exchange between subdomains
The underlying methods for data exchange between subdomains are centred on the subdomains rather than on the interfaces. Data are exchanged through non-blocking MPI send and receive calls that aggregate data in the halo cells across the subdomains. Note that, by default, SERGHEI implicitly assumes that the MPI library is GPU-aware, allowing GPU-to-GPU communication provided that the MPI library supports this feature. Figure 3 (right) illustrates the concept of sending a halo buffer containing state variables from subdomain 1 to update the halo cells of subdomain 0. The halo buffer contains the state variables of n_{y} cells, grouped as water depth (h), unit discharge in the x direction (hu), and unit discharge in the y direction (hv).
3.3 Performanceportable implementation
Intra-device parallelism is achieved per subdomain through the Kokkos framework, which allows the user to choose between shared-memory parallelism and GPU backends for further acceleration. SERGHEI's implementation makes use of the Kokkos concept of Views, which are memory-space-aware array abstractions. For example, for arrays of real numbers, SERGHEI defines a type realArr, based on View. This takes the form of Listing 1 for the shared (host) memory space and Listing 2 for the unified-virtual-memory (UVM) CUDA GPU device memory space. UVM significantly facilitates development by avoiding explicit host-to-device (and vice versa) memory movements. For a CUDA backend, the use of unified memory (CudaUVMSpace) is shown in Listing 2. Similar definitions can be constructed for integer arrays. These arrays describe spatially distributed fields, such as conserved variables, model parameters, and forcing data. Deriving these arrays from View allows us to operate on them via Kokkos to achieve performance portability.
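Listings 1 and 2 are not reproduced in this extract. The following is a minimal sketch of what they may look like, assuming the Kokkos View API; "real" and "realArr" are illustrative alias names, and in practice one of the two definitions would be selected at compile time (e.g. via the build configuration):

```cpp
// Listing 1 (sketch): realArr in the shared (host) memory space.
// Requires Kokkos; alias names are illustrative, not SERGHEI's actual code.
typedef double real;
typedef Kokkos::View<real*, Kokkos::HostSpace> realArr;

// Listing 2 (sketch): the same alias built on CUDA unified virtual memory,
// so the array is accessible from both host and GPU device code.
typedef Kokkos::View<real*, Kokkos::CudaUVMSpace> realArr;
```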
Conceptually, the SERGHEI-SWE solver consists of two types of computationally intensive kernels: (i) cell-spanning and (ii) edge-spanning kernels. The update of the conserved variables following Eq. (2) results in a kernel around a cell-spanning loop. These cell-spanning loops are the most frequent ones in SERGHEI-SWE and are used for many processes of different computational demand. The standard C++ implementation of such a kernel is illustrated in Listing 3, which spans the indices i and j of a 2D Cartesian grid. Here, the loops may be parallelised using, for example, OpenMP or CUDA. However, such a direct implementation of, for example, an OpenMP parallelisation would not automatically allow leveraging GPUs. That is to say, such an implementation is not portable.
In order to achieve the desired portability, we replace the standard for loop with a Kokkos::parallel_for, which accepts a lambda function, is minimally intrusive, and reformulates the kernel into the code shown in Listing 4. As a result, this implementation can be compiled for both OpenMP applications and GPUs, with Kokkos handling the low-level parallelism on the different backends.
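Listing 4 is likewise not reproduced here. The following is a hedged sketch of the portable counterpart, assuming the Kokkos parallel_for API; identifiers are illustrative, and SERGHEI's actual kernel may use a different range policy (e.g. a flat 1D range):

```cpp
// Listing-4-style portable kernel (requires Kokkos). The nested loops are
// replaced by a Kokkos::parallel_for over a 2D range policy; the body is
// captured as a KOKKOS_LAMBDA so the same code compiles for the OpenMP,
// CUDA, and other backends. h and source are Kokkos Views.
Kokkos::parallel_for("updateCells",
    Kokkos::MDRangePolicy<Kokkos::Rank<2>>({0, 0}, {ny, nx}),
    KOKKOS_LAMBDA(const int j, const int i) {
      h(j, i) += dt * source(j, i);  // same cell update as the serial loop
    });
```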
Edge-spanning loops are conceptually necessary to compute the numerical fluxes (Eq. 2). Although numerical fluxes can be computed in a cell-centred fashion, this would lead to inefficiencies due to duplicated computations. In Listing 5 we illustrate the edge-spanning kernel solving the numerical fluxes in SERGHEI-SWE. Notably, Listing 5 is indexed by cells, and the construction of edge-wise tuples occurs inside the kernel. This bypasses the need for additional memory structures to hold edge-based information, but only for Cartesian meshes. Generalisation to adaptive or unstructured meshes would explicitly require an edge-based loop with an additional View of size equal to the number of edges.
In this section we report evidence supporting the claim that SERGHEI-SWE is an accurate, robust, and efficient shallow-water solver. The formal accuracy-testing strategy is based on several well-known benchmark cases with well-defined reference solutions. Herein, for brevity, we focus only on the results of these tests, providing a minimal presentation of the setups. We refer the interested reader to the original publications (and to the many instances in which these tests have been used) for further details on the geometries, parametrisations, and forcing.
We purposely report an extensive testing exercise, using a wide range of the available benchmark tests, to show the wide applicability of SERGHEI-SWE across hydraulic and hydrological problems. Analytical, experimental, and field-scale tests are included. The analytical tests are aimed at showing formal convergence and accuracy. The experimental cases are meant to validate the capability of the model to reach physically meaningful solutions under a variety of conditions. The field-scale tests showcase the applicability of the solver to real problems and allow for strenuous computational tasks that demonstrate performance, efficiency, and parallel scaling. All solutions reported here were computed using double-precision arithmetic.
4.1 Analytical steady flows
We test SERGHEI's capability to capture moving equilibria in a number of steady-flow test cases compiled in Delestre et al. (2013). Details of the test cases, for reproduction purposes, can be retrieved from Delestre et al. (2013) and the accompanying software, SWASHES; in this work, we use SWASHES version 1.03. In the following test cases, the domain is always discretised using 1000 computational cells. A summary of the L norms for all test cases is given in Table 2. The definition of the L norms is given in Appendix A.
4.1.1 C property
These tests feature a smooth bump in a one-dimensional, frictionless domain, which can be used to validate the C property, well-balancing, and the shock-capturing ability of the numerical solver (Morales-Hernández et al., 2012; Murillo and García-Navarro, 2012). Figure 4 shows that SERGHEI-SWE satisfies the C property by preserving a lake at rest in the presence of an emerged bump (an immersed-bump test is shown in Sect. A1) and matches the analytical solution provided by SWASHES.
4.1.2 Wellbalancing
To show well-balancing under steady flow, we computed two transcritical flows based on the analytical benchmark of a one-dimensional flume with varying geometry proposed by MacDonald et al. (1995). These tests are well known and widely used as benchmark solutions (e.g. Caviedes-Voullième and Kesserwani, 2015; Delestre et al., 2013; Kesserwani et al., 2019; Morales-Hernández et al., 2012; Murillo and García-Navarro, 2012). Additional well-balancing tests can be found in Sect. A2. At steady state, the local acceleration terms and the source terms balance each other out, such that the free-surface water elevation becomes a function of the bed slope and friction source terms. Thus, these test cases can be used to validate the implementation of these source terms and the well-balanced nature of the complete numerical scheme. This is particularly important for subcritical fluvial flows and rainfall runoff problems, since both are usually dominated by these two terms.
Figure 5 shows comparisons between SERGHEI-SWE and the analytical solutions (obtained through SWASHES) for two transcritical steady flows. Very good agreement is obtained. Note that the unit discharge is captured with machine accuracy in the presence of friction and bottom elevation changes, which is mainly due to the upwind friction discretisation used in the SERGHEI-SWE solver. As reported by Burguete et al. (2008) and Murillo et al. (2009), a centred friction discretisation does not ensure a perfect balance between fluxes and source terms at steady state, even when using the improved discretisation of Xia et al. (2017).
4.2 Analytical dam break
We verify SERGHEI-SWE's capability to capture transient flow based on analytical dam breaks (Delestre et al., 2013). Dam break problems are defined by an initial discontinuity in the water depth h(x) in the domain, such that
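The initial-condition expression is elided in this extract; from the definitions that follow, it reads:

```latex
h(x) =
\begin{cases}
h_L, & x \le x_0,\\
h_R, & x > x_0,
\end{cases}
```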
where h_{L} denotes a specified water depth on the left-hand side of the location of the discontinuity x_{0}, and h_{R} denotes the specified water depth on the right-hand side of x_{0}. The domain is 10 m long, the discontinuity is located at x_{0} = 5 m, and the total run time is 6 s. Initial velocities are nil in the entire domain. In the following, we report empirical evidence of the numerical scheme's mesh-convergence property by comparing model predictions for test cases with 100, 1000, 10 000, and 100 000 elements, respectively.
A classical frictionless dam break over a wet bed is reported in Sect. A3. Here we focus on a frictionless dam break over a dry bed. Flow featuring depths close to a dry bed is a special case for the numerical solver because regular wave speed estimations become invalid (Toro, 2001). Initial conditions are set as h_{L} = 0.005 m and h_{R} = 0 m. Model results are plotted against the analytical solution by Ritter for different grid resolutions in Fig. 6. The model results converge to the analytical solution as the grid is refined. This is also seen in Table 3, where errors and convergence rates for this test case are summarised. Note that the definition of the norms can be found in Sect. A2. The observed convergence rate is below the theoretical rate of R=1 because of the increased complexity introduced by the discontinuity in the solution and the presence of a dry bed.
4.3 Analytical oscillation: parabolic bowl
We present transient two-dimensional test cases with moving wet–dry fronts that consider the periodic movement of water in a parabolic bowl, so-called "oscillations", which have been studied by Thacker (1981). We replicate two cases from the SWASHES compilation (Delestre et al., 2013), using a mesh spacing of Δx = 0.01 m; one is reported here and the other in Sect. A4.
The well-established test case by Thacker (1981) for the periodic oscillation of a planar surface in a frictionless paraboloid has been extensively used for the validation of shallow-water solvers (e.g. Aureli et al., 2008; Dazzi et al., 2018; Liang et al., 2015; Murillo and García-Navarro, 2010; Vacondio et al., 2014; Zhao et al., 2019) because of its rather complex 2D nature and the presence of moving wet–dry fronts. The topography is defined as
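The topography expression is elided in this extract. The following is a reconstruction of the standard Thacker (1981) paraboloid, consistent with the symbol definitions that follow; the sign convention is an assumption:

```latex
z(r) = -h_0 \left( 1 - \frac{r^2}{a^2} \right),
\qquad
r = \sqrt{\left(x - \tfrac{L}{2}\right)^2 + \left(y - \tfrac{L}{2}\right)^2},
```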
where r is the radius, h_{0} is the water depth at the centre of the paraboloid, a is the distance from the centre to the zero-elevation shoreline, L is the length of the square-shaped domain, and x and y denote coordinates inside the domain. We use the same values as Delestre et al. (2013), that is, h_{0} = 0.1 m, a = 1 m, and L = 4 m. The simulation is run for three periods (T = 2.242851 s), with a spatial resolution of δx = 0.01 m. The analytical solution is derived in Thacker (1981) and can also be found in Delestre et al. (2013).
Snapshots of the simulation are plotted in Fig. 7 and compared to the analytical solution. The model results agree well with the analytical solution after three periods, with a slight, growing phase error, as is commonly observed for this test case.
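As a quick consistency check of the quoted period, T = 2.242851 s follows directly from the paraboloid parameters. A minimal sketch, assuming the angular frequency ω = √(8 g h_{0})/a for this configuration (Thacker, 1981), which reproduces the quoted value for h_{0} = 0.1 m and a = 1 m:

```python
import math

def thacker_period(h0, a, g=9.81):
    """Oscillation period T = 2 pi / omega of the free surface in a
    frictionless paraboloid, assuming omega = sqrt(8 g h0) / a."""
    omega = math.sqrt(8.0 * g * h0) / a
    return 2.0 * math.pi / omega
```

With h_{0} = 0.1 m and a = 1 m this returns approximately 2.2428 s, matching the period used above.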
4.4 Variable rainfall over a sloping plane
Govindaraju et al. (1990) presented an analytical solution for time-dependent rainfall over a sloping plane, which is commonly used for verification (Caviedes-Voullième et al., 2020a; Gottardi and Venutelli, 2008; Singh et al., 2015). The plane is 21.945 m long, with a slope of 0.04. We select rainfall B from Govindaraju et al. (1990), a piecewise constant rainfall with two periods of alternating low and high intensities (50.8 and 101.6 mm h^{−1}) up until 2400 s. Friction is modelled with Chezy's equation, with a roughness coefficient of 1.767 m^{1/2} s^{−1}. The computational domain was defined by a 200 × 10 grid, with δx = 0.109725 m.
The simulated discharge hydrograph at the outlet is compared against the analytical solution in Fig. 8. The numerical solution matches the analytical one very well. The only relevant difference occurs in the magnitude of the second discharge peak, which is slightly underestimated in the simulation.
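A simple mass balance explains the equilibrium discharge that the hydrograph approaches during each rainfall period: at steady state, the unit discharge at the outlet of an impervious plane equals the rainfall intensity times the plane length. A sketch (the function name is ours):

```python
def equilibrium_unit_discharge(intensity_mm_h, plane_length_m):
    """Steady-state unit discharge (m^2 s^-1) at the outlet of an
    impervious sloping plane under constant rainfall: q = i * L."""
    i = intensity_mm_h / 1000.0 / 3600.0   # convert mm/h to m/s
    return i * plane_length_m

# High-intensity period of rainfall B: 101.6 mm/h over a 21.945 m plane
q_high = equilibrium_unit_discharge(101.6, 21.945)
```

For the 101.6 mm h^{−1} period this gives q ≈ 6.2 × 10^{−4} m^{2} s^{−1}.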
5.1 Experimental steady and dam break flows over complex geometry
Martínez-Aranda et al. (2018) presented experimental results of steady and transient flows over several obstacles, recording the transient 3D water surface elevation in the region of interest. We selected the so-called G3 case and simulated both a dam break and a steady flow. The experiment took place in a double-sloped plexiglass flume, 6 m long and 24 cm wide. The obstacles in this case are a symmetric contraction and a rectangular obstacle on the centreline, downstream of the contraction.
For both cases the flume (including the upstream wider reservoir) was discretised at a 5 mm resolution, resulting in a computational domain with 106 887 cells. Manning's roughness coefficient was set to 0.01 s m^{−1/3}. The steady simulation was run from an initial state with uniform depth h = 5 cm up to t = 300 s. The dam break simulation duration was 40 s.
The steady-flow case had a discharge of 2.5 L s^{−1}. Steady water surface results in the obstacle region are shown in Fig. 9 for a centreline profile (y=0) and a cross-section at the rectangular obstacle, specifically at x = 2.40 m (the coordinate system is set at the centre of the flume inlet gate). The simulation results approximate the experimental results well. The mismatches are similar to those analysed by Martínez-Aranda et al. (2018) and can be attributed to turbulent and 3D phenomena near the obstacles.
The dam break case is triggered by a sudden opening of the gate, followed by a wave advancing along the dry flume. Results for this case at three gauge points are shown in Fig. 10. Again, the simulations approximate the experiments well, capturing both the overall behaviour of the water depths and the arrival of the dam break wave, with local errors attributable to the violent dynamics (Martínez-Aranda et al., 2018).
5.2 Experimental unsteady flow over an island
Briggs et al. (1995) presented an experimental test of an unsteady flow over a conical island. This test has been extensively used for benchmarking (Bradford and Sanders, 2002; Choi et al., 2007; García-Navarro et al., 2019; Hou et al., 2013b; Liu et al., 1995; Lynett et al., 2002; Nikolos and Delis, 2009). A truncated cone with a base diameter of 7.2 m, a top diameter of 2.2 m, and a height of 0.625 m was placed at the centre of a 26 m × 27.6 m smooth and flat domain. An initial hydrostatic water level of h_{0} = 0.32 m was set, and a wave was imposed on the boundary following
where A = 0.032 m is the wave amplitude, and T = 2.84 s is the time at which the peak of the wave enters the domain. Figure 11 shows results for a simulation with a 2.5 cm resolution, resulting in 1.2 million cells. A roughness coefficient of 0.013 s m^{−1/3} was used for the concrete surface. The results are comparable to previous solutions in the literature, in general reproducing the water surface well, with some delay relative to the experimental measurements.
5.3 Experimental rainfall runoff over an idealised urban area
Cea et al. (2010a) presented experimental and numerical results for a range of laboratory-scale rainfall runoff experiments on an impervious surface with different arrangements of buildings, which have been frequently used for model validation (Caviedes-Voullième et al., 2020a; Cea et al., 2010b; Cea and Bladé, 2015; Fernández-Pato et al., 2016; Su et al., 2017; Xia et al., 2017). This laboratory-scale test includes non-trivial topographies, small water layers, and wetting–drying fronts, making it a good benchmark for realistic rainfall runoff conditions.
The dimensions of the experimental flume are 2 m × 2.5 m. Here, we select the building arrangement named A12 by Cea et al. (2010a). The original digital elevation model (DEM) is available (from Cea et al., 2010a) at a resolution of 1 cm. The buildings are 20 cm high and are represented as topographical features in the domain. All boundaries are closed, except for the free outflow at the outlet. The domain was discretised at a δx = 1 cm resolution, resulting in 54 600 cells. The domain was forced by two constant pulses of rain of 85 and 300 mm h^{−1} (the lowest and highest intensities in the experiments), with durations of 60 and 20 s, respectively. The simulation was run up to t = 200 s. Friction was modelled by Manning's equation, with a constant roughness coefficient of 0.010 s m^{−1/3} for steel (Cea et al., 2010a).
Figure 12 shows the experimental and simulated outflow discharge for both rainfall pulses. There is very good qualitative agreement, and the peak flow is quantitatively well reproduced by the simulations. For the 300 mm h^{−1} rainfall, the onset of runoff is earlier than in the experiments, and overall the hydrograph is shifted towards earlier times. Cea et al. (2010a) observed similar behaviour and pointed out that it is likely caused by surface tension during the early wetting of the surface; it was most noticeable in the experiments with higher rainfall intensity.
6.1 Plot-scale field rainfall runoff experiment
Tatard et al. (2008) presented a plot-scale rainfall runoff experiment performed in Thiès, Senegal. This test has often been used for benchmarking rainfall runoff models (Caviedes-Voullième et al., 2020a; Chang et al., 2016; Mügler et al., 2011; Özgen-Xian et al., 2020; Park et al., 2019; Simons et al., 2014; Yu and Duan, 2017; Weill, 2007). The domain is a field plot of 10 m × 4 m, with an average slope of 0.01. A rainfall simulation with an intensity of 70 mm h^{−1} was performed for 180 s. Steady velocity measurements were taken at 62 locations. The Gauckler–Manning roughness coefficient was set to 0.02 s m^{−1/3}, and a constant infiltration rate was set to 0.0041667 mm s^{−1} (Mügler et al., 2011). The domain was discretised with δx = 0.02666 m, resulting in 56 250 cells, with a single free-outflow boundary downslope.
Simulated velocities are compared to experimental velocities at the 62 gauged locations in Fig. 13. Simulated and experimental velocities agree well, especially in the lower-velocity range. The agreement is similar to previously reported results (e.g. Caviedes-Voullième et al., 2020a), and the differences between simulated and observed velocities have been shown to be a limitation of a depth-independent roughness and Manning's model (Mügler et al., 2011).
6.2 Malpasset dam break
The Malpasset dam break event (Hervouet and Petitjean, 1999) is the most commonly used real-scale benchmark test in shallow-water modelling (An et al., 2015; Brodtkorb et al., 2012; Brufau et al., 2004; Caviedes-Voullième et al., 2020b; Duran et al., 2013; George, 2010; Hervouet and Petitjean, 1999; Hou et al., 2013a; Kesserwani and Liang, 2012; Kesserwani and Sharifian, 2020; Kim et al., 2014; Liang et al., 2007; Sætra et al., 2015; Schwanenberg and Harms, 2004; Smith and Liang, 2013; Valiani et al., 2002; Xia et al., 2011; Yu and Duan, 2012; Wang et al., 2011; Zhou et al., 2013; Zhao et al., 2019). Although it may not be particularly challenging for current solvers, it remains an interesting case due to its scale and the available field and experimental data (Aureli et al., 2021). The computational domain was discretised at δx = 25 m and δx = 10 m (resulting in 83 137 and 515 262 cells, respectively). The Gauckler–Manning coefficient was set to a uniform value of 0.033 s m^{−1/3}, which has been shown to be a good approximation in the literature. Figure 14 shows a comparison of the simulated water surface elevation (WSE) and arrival time at the two resolutions against the reference experimental and field data. Figure 15 shows the geospatial distribution of the relative WSE error and the ratio of the simulated arrival time to the observed time. Overall, the WSE shows good agreement, with somewhat smaller scatter at the higher resolution. The arrival time tends to be overestimated, somewhat more so at coarser resolutions.
In this section we report an investigation of the computational performance and parallel scaling of SERGHEI-SWE for selected test cases. To demonstrate performance portability, we show performance metrics for both the OpenMP and CUDA backends enabled by Kokkos, computed on CPU and GPU architectures, respectively. For this, hybrid MPI-OpenMP and MPI-CUDA implementations are used, with one MPI task per node for MPI-OpenMP and one MPI task per GPU for MPI-CUDA. Most of the runs were performed on JUWELS at JSC (Jülich Supercomputing Centre). Additional HPC systems were also used for some cases. The properties of all systems are shown in Table 4. Additionally, we provide performance metrics on non-HPC systems, including some consumer-grade GPUs.
It is important to highlight that no performance tuning or optimisation was carried out for these tests and that no system-specific porting efforts were made. All runs relied entirely on Kokkos for portability. The code was simply compiled with the available software stacks on the HPC systems and executed. All results reported here were computed using double-precision arithmetic.
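As context for the runtimes reported below: explicit shallow-water solvers such as SERGHEI-SWE advance with a CFL-limited time step, so runtime depends on the flow state as well as on the cell count. A minimal serial sketch of the global time-step computation (illustrative only; the function name and the CFL value are ours, and in SERGHEI-SWE this is a parallel reduction over cells):

```python
import math

def stable_timestep(h, u, v, dx, cfl=0.5, g=9.81):
    """CFL-limited time step for an explicit 2D shallow-water update:
    dt = cfl * dx / max(|u| + sqrt(g h), |v| + sqrt(g h)) over wet cells.
    h, u, v are sequences of cell depths and velocity components."""
    max_speed = 0.0
    for hi, ui, vi in zip(h, u, v):
        if hi <= 0.0:                 # skip dry cells
            continue
        c = math.sqrt(g * hi)         # gravity-wave celerity
        max_speed = max(max_speed, abs(ui) + c, abs(vi) + c)
    return cfl * dx / max_speed if max_speed > 0.0 else float("inf")
```

For a single wet cell with h = 0.1 m at rest and dx = 1 m, this gives dt ≈ 0.505 s with CFL = 0.5.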
7.1 Single-node scaling – Malpasset dam break
The commonly used Malpasset dam break test (introduced in Sect. 6.2) was also used to assess computational performance, at a resolution of δx = 10 m. Results are shown in Fig. 16. The case was computed on CPUs on a single JUWELS node and on a single JURECA-DC node. Three additional runs with single Nvidia GPUs were carried out: a consumer-grade GeForce RTX 3070 8 GB GPU (in a desktop computer) and two scientific-grade cards, a V100 and an A100 (in JUWELS). As Fig. 16 shows, the CPU runtime quickly approaches asymptotic behaviour (demonstrating that additional nodes are not useful in this case). Notably, all three GPUs outperform a single CPU node, and the performance gradient among the GPUs is evident. The A100 GPU is roughly 6.5 times faster than a full JUWELS CPU node, and even for the consumer-grade RTX 3070 the speedup compared to a single HPC node is 2.2. Although it is possible to scale up this case with significantly higher resolution and test it with multiple GPUs, it is not well suited to such a scaling test. Multiple GPUs (as well as multiple nodes with either CPUs or GPUs) require a domain decomposition. The orientation of the Malpasset domain is roughly NW–SE, which makes both 1D decompositions (along x or y) and 2D decompositions (x and y) inefficient, as many regions have no computational load. Moreover, the dam break nature of the case implies that a large part of the valley is dry for long periods of time; therefore, load balancing among the different nodes and/or GPUs will be poor.
7.2 HPC scaling – 2D circular dam break case
This is a simple analytical verification test from the shallow-water literature, which generalises the 1D dam break solution. We purposely select this case (instead of one of the many other verification problems) for its convenience for scaling studies. Firstly, the resolution can be increased at will. Additionally, the square domain allows for trivial domain decomposition, which, together with the fully wet domain and the radially symmetric flow field, minimises load-balancing issues. Essentially, it allows for a very clean scalability test with minimal interference from the problem topology (in contrast to the limitations of the Malpasset domain discussed in Sect. 7.1). We take a 400 m × 400 m flat domain with the centre at (0,0) and initial conditions given by
We generated three computational grids, with δx = 0.05, 0.025, and 0.0175 m, corresponding to 64, 256, and 552 million cells, respectively. Figure 17 shows the strong-scaling results for the 64- and 256-million-cell cases, computed on the JUWELS-Booster system on A100 Nvidia GPUs. The 64-million-cell problem does not scale well beyond 4 GPUs. However, the 256-million-cell problem scales well up to 64 GPUs (efficiency starts to decrease with 128), showing that the first case is simply too small for significant gains.
For the 552-million-cell grid, only two runs were computed, with 128 and 160 GPUs (corresponding to 32 and 40 nodes on JUWELS-Booster, respectively). The runtimes were 95.4 and 84.7 s, respectively, implying a very good 89 % scaling efficiency for this large number of GPUs. For this problem and these resources, the time required for inter-GPU communication is comparable to that used by the kernels computing fluxes and updating cells, signalling scalability limits for this case with the current implementation.
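The quoted efficiency follows from the standard strong-scaling definition; a sketch (the function name is ours) applied to the two runs above:

```python
def strong_scaling_efficiency(t_ref, n_ref, t, n):
    """Strong-scaling efficiency relative to a reference run: ideally,
    runtime shrinks in proportion to resources, so
    efficiency = (t_ref * n_ref) / (t * n)."""
    return (t_ref * n_ref) / (t * n)

# 552-million-cell case: 95.4 s on 128 GPUs vs 84.7 s on 160 GPUs
eff = strong_scaling_efficiency(95.4, 128, 84.7, 160)
```

This yields approximately 0.90, consistent with the value reported above given the rounding of the reported runtimes.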
7.3 HPC scaling of rainfall runoff in a large catchment
To demonstrate scaling under the production conditions of real scenarios, we use an idealised rainfall runoff simulation over the Lower Triangle region in the East River Watershed (Colorado, USA) (Carroll et al., 2018; Hubbard et al., 2018; Özgen-Xian et al., 2020). The domain has an area of 14.82 km^{2} and elevations ranging from 2759 to 3787 m. The computational problem is defined with a resolution of δx = 0.5 m (matching the highest-resolution DEM available), resulting in 122 × 10^{6} computational cells. Although this is not a particularly large catchment, the very-high-resolution DEM makes it an interesting performance benchmark, which is our sole interest in this paper.
For practical purposes, two configurations were used for this test: a short rainfall of T = 870 s, which was computed on Cori and JUWELS to assess CPU performance and scalability (results shown in Fig. 18), and a long rainfall event lasting T = 12 000 s, which was simulated on Summit and JUWELS to assess GPU performance and scalability, with results shown in Fig. 19. The CPU results (Fig. 18) show that the strong-scaling behaviour on Cori and JUWELS is very similar. Absolute runtimes are longer for Cori, since its scaling study was carried out starting from a single core, whereas on JUWELS it started from a full node (i.e. 48 cores). Most importantly, the GPU strong-scaling behaviour overlaps almost completely between JUWELS and Summit, although computations on Summit were somewhat faster. CPU and GPU scaling are clearly highly efficient, with similar behaviour. These results demonstrate the performance portability delivered to SERGHEI via Kokkos.
In this paper we present the SERGHEI framework and, in particular, the SERGHEI-SWE module. SERGHEI-SWE implements a 2D fully dynamic shallow-water solver, harnessing state-of-the-art numerics and leveraging Kokkos to facilitate portability across architectures. We show, through empirical evidence from a large set of well-established benchmarks, that SERGHEI-SWE is accurate, numerically stable, and robust. Importantly, we show that SERGHEI-SWE's parallel scaling is very good on CPU-based HPC systems, consumer-grade GPUs, and GPU-based HPC systems. Consequently, we claim that SERGHEI is indeed performance portable and approaching exascale readiness. These features make SERGHEI-SWE a plausible community code for shallow-water modelling for a wide range of applications requiring large-scale, very-high-resolution simulations.
Exploiting increasingly better-resolved geospatial information (DEMs, land use, vector data of structures) prompts the need for high-resolution solvers. At the same time, the push towards the study of multiscale systems and integrated management warrants increasingly larger domains. Together, these trends result in larger computational problems, motivating the need for exascale-ready shallow-water solvers. Additionally, HPC technology is ever more available, not only via (inter)national research facilities but also through cloud-computing facilities. It is arguably timely to enable such an HPC-ready solver.
HPC allows not only for large simulations but also for large ensembles of simulations, allowing uncertainty issues to be addressed and enabling scenario analysis for engineering problems, parameter-space exploration, and hypothesis testing. Furthermore, although the benefits of high resolution may be marginal for runoff hydrograph estimations, high resolution allows the local dynamics in the domain to be better resolved. Flow paths, transit times, wetting–drying dynamics, and connectivity play important roles in transport and ecohydrological processes. For these purposes, enabling very-high-resolution simulations will prove highly beneficial. We also envision that, provided with sufficient computational resources, SERGHEI-SWE could be used for operational flood forecasting and probabilistic flash-flood modelling. Altogether, this paves the way for the uptake of shallow-water solvers by the broader ESM community and their coupling to Earth system models, as well as for their many applications, from process and system understanding to hydrometeorological risk and impact assessment. Finally, for users not requiring HPC capabilities, the benefit of SERGHEI-SWE is access to transparent, open-source, performance-portable software that allows workstation GPUs to be exploited efficiently.
As additional SERGHEI modules become operational, the HPC capabilities will further enable simulations that are unfeasible with the current generation of available solvers. For example, with a fully operational transport and morphology module, it will be possible to run decade-long morphological simulations relevant to river management applications; to better capture sediment connectivity and sediment cascades across the landscape, a relevant topic for erosion and catchment management; or to perform catchment-scale hydro-biogeochemical simulations with unprecedentedly high spatial resolutions for a better understanding of ecohydrological and biogeochemical processes.
Finally, SERGHEI is conceptualised and designed with extendibility and software interoperability in mind, with design choices made to facilitate foreseeable future developments on a wide range of topics, such as

numerics, e.g. discontinuous Galerkin discretisation strategies (Caviedes-Voullième and Kesserwani, 2015; Shaw et al., 2021) and multiresolution adaptive meshing (Caviedes-Voullième et al., 2020b; Kesserwani and Sharifian, 2020, 2022; Özgen-Xian et al., 2020);

interfaces to mature geochemistry engines, e.g. CrunchFlow (Steefel, 2009) and PFLOTRAN (Lichtner et al., 2015);

vegetation models with varying degrees of complexity, for example, Ecosys (e.g. Grant et al., 2007; Grant and the Ecosys development team, 2022) and EcH2O (Maneta and Silverman, 2013).
This appendix contains an extended set of relevant test cases that are commonly used as validation cases in the literature. It complements and extends the verification evidence in Sect. 4.
A1 C property: immersed bump
Using the same setup as in Sect. 4.1.1, but with a higher water surface elevation, Fig. A1 demonstrates that SERGHEI-SWE also preserves the C property for an immersed bump.
A2 Well-balancing
To further show that SERGHEI-SWE is well-balanced, we computed three steady flows over a bump: a transcritical flow with a shock wave, a fully subcritical flow, and a smooth transcritical flow, as shown in Fig. A2. All SERGHEI-SWE predictions show excellent agreement with the analytical solutions. The constant unit discharge is captured with machine accuracy, without oscillations at the shock, which is an inherent feature of the augmented Roe solver (Murillo and García-Navarro, 2010).
We also include two additional cases from MacDonald et al. (1995), for fully supercritical and subcritical flows, in Fig. A3. These results and their L norms in Table A1 further confirm well-balancing.
Additionally, MacDonald-type solutions can be constructed for frictionless flumes to study the bed-slope source-term implementation in isolation. We present a frictionless test case with SERGHEI-SWE that is not part of the SWASHES benchmark compilation. We discretise the bed elevation of the flume as
where C_{0} is an arbitrary integration constant and q_{0} is a specified unit discharge. The water depth for this topography is
Using C_{0} = 1.0 m and q_{0} = 1.0 m^{2} s^{−1}, we obtain the solution plotted in Fig. A4. SERGHEI-SWE's prediction and the analytical solution show good agreement.
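For a frictionless steady flow, the construction above corresponds to constancy of the energy head, z + h + q_{0}^{2}/(2 g h^{2}) = C_{0}, so the depth can also be recovered numerically from the bed elevation. A sketch (illustrative only; the function name is ours), on the subcritical branch and assuming C_{0} − z is large enough for a subcritical solution to exist, as for the values above:

```python
def depth_from_bed(z, q0=1.0, C0=1.0, g=9.81):
    """Water depth h satisfying z + h + q0^2 / (2 g h^2) = C0
    (steady, frictionless energy balance), on the subcritical branch.
    Bisection between the critical depth and h = C0 - z; requires
    C0 - z >= 1.5 * critical depth for a subcritical root to exist."""
    hc = (q0 * q0 / g) ** (1.0 / 3.0)   # critical depth
    lo, hi = hc, C0 - z                 # subcritical roots lie above hc
    f = lambda h: z + h + q0 * q0 / (2.0 * g * h * h) - C0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:                # f is increasing for h > hc
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

For z = 0 with the values above, this returns h ≈ 0.94 m, satisfying the energy balance to round-off.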
L norms for errors in water depth are summarised in Table A1 for the sake of completeness. L norms of a vector x with length N and entries x_{i}, where $i\in [\mathrm{0},N)\subset {\mathbb{Z}}^{+}$ is the index of the entries, are calculated as
with $n\in {\mathbb{Z}}^{+}$ being the order of the L norm. The L_{∞} norm is calculated as
The L norms for errors in unit discharge are in the range of machine accuracy for all cases and are omitted here.
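For completeness, the norm and convergence-rate computations used throughout the verification tables can be sketched as follows (function names are ours; the L_{n} norm is written here in the common mean-normalised convention, and the observed rate R compares errors on two grids whose spacing differs by a factor r):

```python
import math

def l_norm(x, n):
    """Discrete L_n norm of an error vector x, mean-normalised:
    ( (1/N) * sum |x_i|^n )^(1/n)."""
    return (sum(abs(v) ** n for v in x) / len(x)) ** (1.0 / n)

def l_inf(x):
    """L_infinity norm: the largest absolute entry."""
    return max(abs(v) for v in x)

def convergence_rate(e_coarse, e_fine, ratio):
    """Observed order R from errors on two grids whose spacing differs
    by `ratio` (e.g. 10 when going from 100 to 1000 cells):
    R = log(e_coarse / e_fine) / log(ratio)."""
    return math.log(e_coarse / e_fine) / math.log(ratio)
```

A first-order scheme on a smooth solution would give R ≈ 1, the theoretical rate referenced in Tables 3 and A2.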
A3 Dam break over a wet bed without friction
The dam break over a wet bed without friction is configured by setting the water depths in the domain as h_{L} = 0.005 m and h_{R} = 0.001 m. The domain is 10 m long, and the discontinuity is located at x_{0} = 5 m. The total run time is 6 s. Figure A5 shows the model results obtained on successively refined grids, compared against the analytical solution by Stoker (1957). Errors for this test case are reported in Table A2. We also report the observed convergence rate R, calculated on the basis of the L_{1} norm. As the grid is refined, the model results converge to the analytical solution. Due to the discontinuities in the solution, the observed convergence rate is below the theoretical convergence rate of R=1.
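Stoker's wet-bed solution requires the depth h_{m} of the constant middle state, which solves a nonlinear matching condition between the left rarefaction and the right-moving shock. A sketch (illustrative only; the function name is ours) using the initial depths of this test:

```python
import math

def stoker_middle_depth(h_left, h_right, g=9.81, iters=200):
    """Middle-state depth h_m of Stoker's wet-bed dam-break solution,
    found by bisection: the velocity behind the rarefaction,
    u = 2 (sqrt(g h_L) - sqrt(g h_m)), must equal the velocity behind
    the shock moving into still water of depth h_R,
    u = (h_m - h_R) sqrt(g (h_m + h_R) / (2 h_m h_R))."""
    def mismatch(hm):
        u_rar = 2.0 * (math.sqrt(g * h_left) - math.sqrt(g * hm))
        u_shk = (hm - h_right) * math.sqrt(
            g * (hm + h_right) / (2.0 * hm * h_right))
        return u_rar - u_shk            # strictly decreasing in hm
    lo, hi = h_right, h_left            # h_m lies between the two depths
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mismatch(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For h_{L} = 0.005 m and h_{R} = 0.001 m, the middle depth falls between the two initial depths and makes the two velocity expressions agree to round-off.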
A4 Radially symmetrical paraboloid
Using the same computational domain and bed topography as the case in Sect. 4.3, results for the radially symmetrical oscillation in a frictionless paraboloid (Thacker, 1981) are presented here. Details of the initial condition and the analytical solution for the water depth and velocities can be found in Delestre et al. (2013). In particular, the analytical solution at t = 0 s is set as the initial condition, and three periods are simulated using a grid resolution of δx = 0.01 m. Figure A6 shows the numerical and analytical solutions at four different times. Although the analytical solution is periodic without damping, the numerical results show a diffusive behaviour attributed to the numerical diffusion introduced by the first-order scheme. Otherwise, the model results show good agreement with the analytical solution.
A5 Experimental laboratory-scale tsunami
A 1:400-scale experiment of a tsunami run-up over the Monai valley (Japan) was reported by Matsuyama and Tanaka (2001) and The Third International Workshop on Long-Wave Runup Models (2004), providing experimental data on the temporal evolution of the water surface at three locations and on the maximum run-up. A laboratory basin of 2.05 m × 3.4 m was used to create a physical scale model of the Monai coastline. A tsunami was simulated by appropriate forcing of the boundary conditions. This experiment has been extensively used to benchmark SWE solvers (Arpaia and Ricchiuto, 2018; Caviedes-Voullième et al., 2020b; Hou et al., 2015, 2018; Kesserwani and Liang, 2012; Kesserwani and Sharifian, 2020; Morales-Hernández et al., 2014; Murillo et al., 2009; Murillo and García-Navarro, 2012; Nikolos and Delis, 2009; Serrano-Pacheco et al., 2009; Vater et al., 2019). The domain was discretised at a resolution of 1.4 cm, producing 95 892 elements. Simulated water surface elevations are shown together with the experimental measurements at three gauge locations in Fig. A7. The results agree well with the experimental measurements, both in the water surface elevations and in the arrival times of the waves.
A6 Experimental dam break over a triangular sill
Hiver (2000) presented a large-flume experiment of a dam break over a triangular sill, which is a standard benchmark for dam break problems (Caviedes-Voullième and Kesserwani, 2015; Bruwier et al., 2016; Kesserwani and Liang, 2010; Loukili and Soulaïmani, 2007; Murillo and García-Navarro, 2012; Yu and Duan, 2017; Zhou et al., 2013), together with its reduced-scale version (Soares-Frazão, 2007; Hou et al., 2013a, b; Yu and Duan, 2017).
The computational domain was discretised with a 380 × 5 grid, at a δx = 0.1 m resolution. Figure A8 shows the simulated and experimental results for the triangular sill case. A very good agreement can be observed, both in terms of the peak depths occurring when the shock wave passes through a gauge and in the timing of the shock wave movement. The simulations tend to slightly overestimate the peaks of the shock wave, as well as the waves downstream of the sill (see the plot for the gauge at x = 35.5 m). Both behaviours are well documented in the literature.
A7 Experimental idealised urban dam break
A laboratory-scale experiment of a dam break over an idealised urban area was reported by Soares-Frazão and Zech (2008), in a concrete channel including 25 obstacles representing buildings separated by 10 cm. It is widely used in the shallow-water community (Abderrezzak et al., 2008; Caviedes-Voullième et al., 2020b; Ginting, 2019; Hartanto et al., 2011; Jeong et al., 2012; Özgen et al., 2015a; Petaccia et al., 2010; Wang et al., 2017) because of its fundamental phenomenological interest and because it is demanding in terms of numerical stability and model performance. The small buildings and streets in the geometry require sufficiently high resolutions, both to capture the geometry and to capture the complex flow phenomena triggered in the streets. Experimental measurements of transient water depth exist at different locations, including between the buildings. A resolution of 2 cm was used for the simulated results in Fig. A9, shown together with the experimental data. The results agree with the experimental observations to a degree similar to what has been reported in the literature.
A8 Experimental rainfall runoff over a dense idealised urban area
Cea et al. (2010b) presented a laboratory-scale experiment in a flume with a dense idealised urban area. The case elaborates on the setup of Cea et al. (2010a) (Sect. 5.3), including 180 buildings (case L180), in contrast to the 12 buildings in Sect. 5.3, which potentially requires a higher resolution to resolve the buildings (6.2 cm sides), the street width (∼ 2 cm), and the flow in the streets. We keep a 1 cm resolution. The rainfall is a single pulse of constant intensity. Two setups were used, with intensities of 180 and 300 mm h^{−1} and durations of 60 and 20 s, respectively. As Fig. A10 shows, the hydrographs are well captured by the simulation, albeit with a delay. Analogously to Sect. 5.3, this can be attributed to surface tension in the early wetting phase.
CFL  Courant–Friedrichs–Lewy 
Cori  Cori supercomputer at the National Energy Research Scientific Computing Center (USA) 
CPU  Central processing unit 
CUDA  Compute Unified Device Architecture, programming interface for Nvidia GPUs 
El Capitan  El Capitan supercomputer at the Lawrence Livermore National Laboratory (USA) 
ESM  Earth system modelling 
Frontier  Frontier supercomputer at the Oak Ridge Leadership Computing Facility (USA) 
GPU  Graphics processing unit 
HIP  Heterogeneous-compute Interface for Portability, programming interface for AMD GPUs 
HPC  High-performance computing 
JURECA-DC  Data Centric module of the Jülich Research on Exascale Cluster Architectures supercomputer at the Jülich Supercomputing Centre (Germany) 
JUWELS  Jülich Wizard for European Leadership Science, supercomputer at the Jülich Supercomputing Centre (Germany) 
JUWELS-Booster  Booster module of the JUWELS supercomputer (Germany) 
Kokkos  A C++ performance portability layer 
LUMI  LUMI supercomputer at CSC (Finland) 
OpenMP  Open Multi-Processing, shared-memory programming interface for parallel computing 
MPI  Message Passing Interface for parallel computing 
SERGHEI  Simulation EnviRonment for Geomorphology, Hydrodynamics, and Ecohydrology in Integrated form 
SERGHEI-SWE  SERGHEI's shallow-water equations solver 
Summit  Summit supercomputer at the Oak Ridge Leadership Computing Facility (USA) 
SWE  Shallow-water equations 
SYCL  A programming model for hardware accelerators 
UVM  Unified Virtual Memory 
WSE  Water surface elevation 
SERGHEI is available through GitLab at https://gitlab.com/sergheimodel/serghei (last access: 6 February 2023), under a 3-clause BSD license. SERGHEI v1.0 was tagged as the first release at the time of submission of this paper. A static version of SERGHEI v1.0 is archived on Zenodo: https://doi.org/10.5281/zenodo.7041423 (Caviedes-Voullième et al., 2022a).
A repository containing test cases is available at https://gitlab.com/sergheimodel/serghei_testcases. This repository contains many of the cases reported here, except those for which we cannot publicly release data but which can be obtained from the original authors of the datasets. A static version of these datasets is archived on Zenodo: https://doi.org/10.5281/zenodo.7041392 (Caviedes-Voullième et al., 2022b).
Additional convenient pre- and post-processing tools are also available at https://gitlab.com/sergheimodel/sergheir (last access: 6 February 2023).
DCV contributed to conceptualisation, investigation, software development, model validation, visualisation, and writing. MMH contributed to conceptualisation, methodology design, software development, formal analysis, model validation, and writing. MRN contributed to software development. IÖX contributed to formal analysis, software development, model validation, visualisation, and writing.
The contact author has declared that none of the authors has any competing interests.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The authors gratefully acknowledge the Earth System Modelling Project (ESM) for supporting this work by providing computing time on the ESM partition of the JUWELS supercomputer at the Jülich Supercomputing Centre (JSC) through the compute time project Runoff Generation and Surface Hydrodynamics across Scales with the SERGHEI model (RUGSHAS), project no. 22686. This work used resources of the National Energy Research Scientific Computing Center (NERSC), a US Department of Energy, Office of Science, user facility operated under contract no. DEAC0205CH11231. This research was also supported by the US Air Force Numerical Weather Modelling programme and used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is a US Department of Energy (DOE) Office of Science User Facility.
The article processing charges for this openaccess publication were covered by the Forschungszentrum Jülich.
This paper was edited by Charles Onyutha and reviewed by Reinhard Hinkelmann and Kenichiro Kobayashi.
Abderrezzak, K. E. K., Paquier, A., and Mignot, E.: Modelling flash flood propagation in urban areas using a two-dimensional numerical model, Nat. Hazards, 50, 433–460, https://doi.org/10.1007/s11069-008-9300-0, 2008.
Alexander, F., Almgren, A., Bell, J., Bhattacharjee, A., Chen, J., Colella, P., Daniel, D., DeSlippe, J., Diachin, L., Draeger, E., Dubey, A., Dunning, T., Evans, T., Foster, I., Francois, M., Germann, T., Gordon, M., Habib, S., Halappanavar, M., Hamilton, S., Hart, W., Huang, Z. H., Hungerford, A., Kasen, D., Kent, P. R. C., Kolev, T., Kothe, D. B., Kronfeld, A., Luo, Y., Mackenzie, P., McCallen, D., Messer, B., Mniszewski, S., Oehmen, C., Perazzo, A., Perez, D., Richards, D., Rider, W. J., Rieben, R., Roche, K., Siegel, A., Sprague, M., Steefel, C., Stevens, R., Syamlal, M., Taylor, M., Turner, J., Vay, J.-L., Voter, A. F., Windus, T. L., and Yelick, K.: Exascale applications: skin in the game, Philos. T. R. Soc. A, 378, 20190056, https://doi.org/10.1098/rsta.2019.0056, 2020.
An, H., Yu, S., Lee, G., and Kim, Y.: Analysis of an open source quadtree grid shallow water flow solver for flood simulation, Quatern. Int., 384, 118–128, https://doi.org/10.1016/j.quaint.2015.01.032, 2015.
Arpaia, L. and Ricchiuto, M.: r-adaptation for Shallow Water flows: conservation, well-balancedness, efficiency, Comput. Fluids, 160, 175–203, https://doi.org/10.1016/j.compfluid.2017.10.026, 2018.
Artigues, V., Kormann, K., Rampp, M., and Reuter, K.: Evaluation of performance portability frameworks for the implementation of a particle-in-cell code, Concurr. Comput.-Pract. E., 32, https://doi.org/10.1002/cpe.5640, 2019.
Aureli, F., Maranzoni, A., Mignosa, P., and Ziveri, C.: A weighted surface-depth gradient method for the numerical integration of the 2D shallow water equations with topography, Adv. Water Resour., 31, 962–974, https://doi.org/10.1016/j.advwatres.2008.03.005, 2008.
Aureli, F., Prost, F., Vacondio, R., Dazzi, S., and Ferrari, A.: A GPU-Accelerated Shallow-Water Scheme for Surface Runoff Simulations, Water, 12, 637, https://doi.org/10.3390/w12030637, 2020.
Aureli, F., Maranzoni, A., and Petaccia, G.: Review of Historical Dam-Break Events and Laboratory Tests on Real Topography for the Validation of Numerical Models, Water, 13, 1968, https://doi.org/10.3390/w13141968, 2021.
Ayog, J. L., Kesserwani, G., Shaw, J., Sharifian, M. K., and Bau, D.: Second-order discontinuous Galerkin flood model: Comparison with industry-standard finite volume models, J. Hydrol., 594, 125924, https://doi.org/10.1016/j.jhydrol.2020.125924, 2021.
Bates, P. and Roo, A. D.: A simple raster-based model for flood inundation simulation, J. Hydrol., 236, 54–77, https://doi.org/10.1016/S0022-1694(00)00278-X, 2000.
Bauer, P., Dueben, P. D., Hoefler, T., Quintino, T., Schulthess, T. C., and Wedi, N. P.: The digital revolution of Earth-system science, Nature Computational Science, 1, 104–113, https://doi.org/10.1038/s43588-021-00023-0, 2021.
Beckingsale, D. A., Burmark, J., Hornung, R., Jones, H., Killian, W., Kunen, A. J., Pearce, O., Robinson, P., Ryujin, B. S., and Scogland, T. R.: RAJA: Portable Performance for Large-Scale Scientific Applications, in: 2019 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), 71–81, https://doi.org/10.1109/p3hpc49587.2019.00012, 2019.
Bellos, V. and Tsakiris, G.: A hybrid method for flood simulation in small catchments combining hydrodynamic and hydrological techniques, J. Hydrol., 540, 331–339, https://doi.org/10.1016/j.jhydrol.2016.06.040, 2016.
Berger, M. J., George, D. L., LeVeque, R. J., and Mandli, K. T.: The GeoClaw software for depth-averaged flows with adaptive refinement, Adv. Water Resour., 34, 1195–1206, https://doi.org/10.1016/j.advwatres.2011.02.016, 2011.
Bertagna, L., Deakin, M., Guba, O., Sunderland, D., Bradley, A. M., Tezaur, I. K., Taylor, M. A., and Salinger, A. G.: HOMMEXX 1.0: a performance-portable atmospheric dynamical core for the Energy Exascale Earth System Model, Geosci. Model Dev., 12, 1423–1441, https://doi.org/10.5194/gmd-12-1423-2019, 2019.
Bomers, A., Schielen, R. M. J., and Hulscher, S. J. M. H.: The influence of grid shape and grid size on hydraulic river modelling performance, Environ. Fluid Mech., 19, 1273–1294, https://doi.org/10.1007/s10652-019-09670-4, 2019.
Bout, B. and Jetten, V.: The validity of flow approximations when simulating catchment-integrated flash floods, J. Hydrol., 556, 674–688, https://doi.org/10.1016/j.jhydrol.2017.11.033, 2018.
Bradford, S. F. and Sanders, B. F.: Finite-Volume Model for Shallow-Water Flooding of Arbitrary Topography, J. Hydraul. Eng., 128, 289–298, https://doi.org/10.1061/(asce)0733-9429(2002)128:3(289), 2002.
Briggs, M. J., Synolakis, C. E., Harkins, G. S., and Green, D. R.: Laboratory experiments of tsunami runup on a circular island, Pure Appl. Geophys., 144, 569–593, https://doi.org/10.1007/bf00874384, 1995.
Brodtkorb, A. R., Sætra, M. L., and Altinakar, M.: Efficient shallow water simulations on GPUs: Implementation, visualization, verification, and validation, Comput. Fluids, 55, 1–12, https://doi.org/10.1016/j.compfluid.2011.10.012, 2012.
Brufau, P., García-Navarro, P., and Vázquez-Cendón, M. E.: Zero mass error using unsteady wetting-drying conditions in shallow flows over dry irregular topography, Int. J. Numer. Meth. Fl., 45, 1047–1082, https://doi.org/10.1002/fld.729, 2004.
Brunner, G.: HEC-RAS 2D User's Manual Version 6.0, Hydrologic Engineering Center, Davis, CA, USA, https://www.hec.usace.army.mil/confluence/rasdocs/r2dum/latest (last access: 22 August 2022), 2021.
Brunner, P. and Simmons, C. T.: HydroGeoSphere: A Fully Integrated, Physically Based Hydrological Model, Ground Water, 50, 170–176, https://doi.org/10.1111/j.1745-6584.2011.00882.x, 2012.
Bruwier, M., Archambeau, P., Erpicum, S., Pirotton, M., and Dewals, B.: Discretization of the divergence formulation of the bed slope term in the shallow-water equations and consequences in terms of energy balance, Appl. Math. Model., 40, 7532–7544, https://doi.org/10.1016/j.apm.2016.01.041, 2016.
Burguete, J., García-Navarro, P., and Murillo, J.: Friction term discretization and limitation to preserve stability and conservation in the 1D shallow-water model: Application to unsteady irrigation and river flow, Int. J. Numer. Meth. Fl., 58, 403–425, https://doi.org/10.1002/fld.1727, 2008.
Buttinger-Kreuzhuber, A., Horváth, Z., Noelle, S., Blöschl, G., and Waser, J.: A fast second-order shallow water scheme on two-dimensional structured grids over abrupt topography, Adv. Water Resour., 127, 89–108, https://doi.org/10.1016/j.advwatres.2019.03.010, 2019.
Buttinger-Kreuzhuber, A., Konev, A., Horváth, Z., Cornel, D., Schwerdorf, I., Blöschl, G., and Waser, J.: An integrated GPU-accelerated modeling framework for high-resolution simulations of rural and urban flash floods, Environ. Modell. Softw., 156, 105480, https://doi.org/10.1016/j.envsoft.2022.105480, 2022.
Caldas Steinstraesser, J. G., Delenne, C., Finaud-Guyot, P., Guinot, V., Kahn Casapia, J. L., and Rousseau, A.: SW2D-LEMON: a new software for upscaled shallow water modeling, in: Simhydro 2021 – 6th International Conference Models for complex and global water issues – Practices and expectations, Sophia Antipolis, France, https://hal.inria.fr/hal-03224050 (last access: 22 August 2022), 2021.
Carlotto, T., Chaffe, P. L. B., dos Santos, C. I., and Lee, S.: SW2D-GPU: A two-dimensional shallow water model accelerated by GPGPU, Environ. Modell. Softw., 145, 105205, https://doi.org/10.1016/j.envsoft.2021.105205, 2021.
Carroll, R. W. H., Bearup, L. A., Brown, W., Dong, W., Bill, M., and Williams, K. H.: Factors controlling seasonal groundwater and solute flux from snow-dominated basins, Hydrol. Process., 32, 2187–2202, https://doi.org/10.1002/hyp.13151, 2018.
Caviedes-Voullième, D. and Kesserwani, G.: Benchmarking a multiresolution discontinuous Galerkin shallow water model: Implications for computational hydraulics, Adv. Water Resour., 86, 14–31, https://doi.org/10.1016/j.advwatres.2015.09.016, 2015.
Caviedes-Voullième, D., García-Navarro, P., and Murillo, J.: Influence of mesh structure on 2D full shallow water equations and SCS Curve Number simulation of rainfall/runoff events, J. Hydrol., 448–449, 39–59, https://doi.org/10.1016/j.jhydrol.2012.04.006, 2012.
Caviedes-Voullième, D., Fernández-Pato, J., and Hinz, C.: Cellular Automata and Finite Volume solvers converge for 2D shallow flow modelling for hydrological modelling, J. Hydrol., 563, 411–417, https://doi.org/10.1016/j.jhydrol.2018.06.021, 2018.
Caviedes-Voullième, D., Fernández-Pato, J., and Hinz, C.: Performance assessment of 2D Zero-Inertia and Shallow Water models for simulating rainfall-runoff processes, J. Hydrol., 584, 124663, https://doi.org/10.1016/j.jhydrol.2020.124663, 2020a.
Caviedes-Voullième, D., Gerhard, N., Sikstel, A., and Müller, S.: Multiwavelet-based mesh adaptivity with Discontinuous Galerkin schemes: Exploring 2D shallow water problems, Adv. Water Resour., 138, 103559, https://doi.org/10.1016/j.advwatres.2020.103559, 2020b.
Caviedes-Voullième, D., Morales-Hernández, M., and Özgen-Xian, I.: SERGHEI (1.0), Zenodo [code], https://doi.org/10.5281/zenodo.7041423, 2022a.
Caviedes-Voullième, D., Morales-Hernández, M., and Özgen-Xian, I.: Test cases for SERGHEI v1.0, Zenodo [data set], https://doi.org/10.5281/zenodo.7041392, 2022b.
Cea, L. and Bladé, E.: A simple and efficient unstructured finite volume scheme for solving the shallow water equations in overland flow applications, Water Resour. Res., 51, 5464–5486, https://doi.org/10.1002/2014WR016547, 2015.
Cea, L., Garrido, M., and Puertas, J.: Experimental validation of two-dimensional depth-averaged models for forecasting rainfall–runoff from precipitation data in urban areas, J. Hydrol., 382, 88–102, https://doi.org/10.1016/j.jhydrol.2009.12.020, 2010a.
Cea, L., Garrido, M., Puertas, J., Jácome, A., Río, H. D., and Suárez, J.: Overland flow computations in urban and industrial catchments from direct precipitation data using a two-dimensional shallow water model, Water Sci. Technol., 62, 1998–2008, https://doi.org/10.2166/wst.2010.746, 2010b.
Chang, T.-J., Chang, Y.-S., and Chang, K.-H.: Modeling rainfall-runoff processes using smoothed particle hydrodynamics with mass-varied particles, J. Hydrol., 543, 749–758, https://doi.org/10.1016/j.jhydrol.2016.10.045, 2016.
Choi, B. H., Kim, D. C., Pelinovsky, E., and Woo, S. B.: Three-dimensional simulation of tsunami run-up around conical island, Coast. Eng., 54, 618–629, https://doi.org/10.1016/j.coastaleng.2007.02.001, 2007.
Clark, M. P., Bierkens, M. F. P., Samaniego, L., Woods, R. A., Uijlenhoet, R., Bennett, K. E., Pauwels, V. R. N., Cai, X., Wood, A. W., and Peters-Lidard, C. D.: The evolution of process-based hydrologic models: historical challenges and the collective quest for physical realism, Hydrol. Earth Syst. Sci., 21, 3427–3440, https://doi.org/10.5194/hess-21-3427-2017, 2017.
Coon, E., Svyatsky, D., Jan, A., Kikinzon, E., Berndt, M., Atchley, A., Harp, D., Manzini, G., Shelef, E., Lipnikov, K., Garimella, R., Xu, C., Moulton, D., Karra, S., Painter, S., Jafarov, E., and Molins, S.: Advanced Terrestrial Simulator, Computer Software, USDOE Office of Science (SC), Biological and Environmental Research (BER) (SC-23), https://doi.org/10.11578/DC.20190911.1, 2019.
Costabile, P. and Costanzo, C.: A 2D SWEs framework for efficient catchment-scale simulations: hydrodynamic scaling properties of river networks and implications for non-uniform grids generation, J. Hydrol., 599, 126306, https://doi.org/10.1016/j.jhydrol.2021.126306, 2021.
Costabile, P., Costanzo, C., Ferraro, D., and Barca, P.: Is HEC-RAS 2D accurate enough for storm-event hazard assessment? Lessons learnt from a benchmarking study based on rain-on-grid modelling, J. Hydrol., 603, 126962, https://doi.org/10.1016/j.jhydrol.2021.126962, 2021.
Crompton, O., Katul, G. G., and Thompson, S.: Resistance formulations in shallow overland flow along a hillslope covered with patchy vegetation, Water Resour. Res., 56, e2020WR027194, https://doi.org/10.1029/2020wr027194, 2020.
David, A. and Schmalz, B.: A Systematic Analysis of the Interaction between Rain-on-Grid Simulations and Spatial Resolution in 2D Hydrodynamic Modeling, Water, 13, 2346, https://doi.org/10.3390/w13172346, 2021.
Dazzi, S., Vacondio, R., Palù, A. D., and Mignosa, P.: A local time stepping algorithm for GPU-accelerated 2D shallow water models, Adv. Water Resour., 111, 274–288, https://doi.org/10.1016/j.advwatres.2017.11.023, 2018.
Delestre, O., Lucas, C., Ksinant, P., Darboux, F., Laguerre, C., Vo, T., James, F., and Cordier, S.: SWASHES: a compilation of shallow water analytic solutions for hydraulic and environmental studies, Int. J. Numer. Meth. Fl., 72, 269–300, https://doi.org/10.1002/fld.3741, 2013.
Delestre, O., Darboux, F., James, F., Lucas, C., Laguerre, C., and Cordier, S.: FullSWOF: Full Shallow-Water equations for Overland Flow, Journal of Open Source Software, 2, 448, https://doi.org/10.21105/joss.00448, 2017.
Demeshko, I., Watkins, J., Tezaur, I. K., Guba, O., Spotz, W. F., Salinger, A. G., Pawlowski, R. P., and Heroux, M. A.: Toward performance portability of the Albany finite element analysis code using the Kokkos library, Int. J. High Perform. C., 33, 332–352, https://doi.org/10.1177/1094342017749957, 2018.
Djemame, K. and Carr, H.: Exascale Computing Deployment Challenges, in: Economics of Grids, Clouds, Systems, and Services, Springer International Publishing, 211–216, https://doi.org/10.1007/978-3-030-63058-4_19, 2020.
Dullo, T. T., Darkwah, G. K., Gangrade, S., Morales-Hernández, M., Sharif, M. B., Kalyanapu, A. J., Kao, S.-C., Ghafoor, S., and Ashfaq, M.: Assessing climate-change-induced flood risk in the Conasauga River watershed: an application of ensemble hydrodynamic inundation modeling, Nat. Hazards Earth Syst. Sci., 21, 1739–1757, https://doi.org/10.5194/nhess-21-1739-2021, 2021a.
Dullo, T. T., Gangrade, S., Morales-Hernández, M., Sharif, M. B., Kao, S.-C., Kalyanapu, A. J., Ghafoor, S., and Evans, K. J.: Simulation of Hurricane Harvey flood event through coupled hydrologic-hydraulic models: Challenges and next steps, J. Flood Risk Manag., 14, https://doi.org/10.1111/jfr3.12716, 2021b.
Duran, A., Liang, Q., and Marche, F.: On the well-balanced numerical discretization of shallow water equations on unstructured meshes, J. Comput. Phys., 235, 565–586, https://doi.org/10.1016/j.jcp.2012.10.033, 2013.
Echeverribar, I., Morales-Hernández, M., Brufau, P., and García-Navarro, P.: 2D numerical simulation of unsteady flows for large scale floods prediction in real time, Adv. Water Resour., 134, 103444, https://doi.org/10.1016/j.advwatres.2019.103444, 2019.
Echeverribar, I., Morales-Hernández, M., Brufau, P., and García-Navarro, P.: Analysis of the performance of a hybrid CPU/GPU 1D-2D coupled model for real flood cases, J. Hydroinform., 22, 1198–1216, https://doi.org/10.2166/hydro.2020.032, 2020.
Edwards, H. C., Trott, C. R., and Sunderland, D.: Kokkos: Enabling manycore performance portability through polymorphic memory access patterns, J. Parallel Distr. Com., 74, 3202–3216, https://doi.org/10.1016/j.jpdc.2014.07.003, Domain-Specific Languages and High-Level Frameworks for High-Performance Computing, 2014.
Fan, Y., Clark, M., Lawrence, D. M., Swenson, S., Band, L. E., Brantley, S. L., Brooks, P. D., Dietrich, W. E., Flores, A., Grant, G., Kirchner, J. W., Mackay, D. S., McDonnell, J. J., Milly, P. C. D., Sullivan, P. L., Tague, C., Ajami, H., Chaney, N., Hartmann, A., Hazenberg, P., McNamara, J., Pelletier, J., Perket, J., Rouholahnejad-Freund, E., Wagener, T., Zeng, X., Beighley, E., Buzan, J., Huang, M., Livneh, B., Mohanty, B. P., Nijssen, B., Safeeq, M., Shen, C., van Verseveld, W., Volk, J., and Yamazaki, D.: Hillslope Hydrology in Global Change Research and Earth System Modeling, Water Resour. Res., 55, 1737–1772, https://doi.org/10.1029/2018wr023903, 2019.
Fatichi, S., Vivoni, E. R., Ogden, F. L., Ivanov, V. Y., Mirus, B., Gochis, D., Downer, C. W., Camporese, M., Davison, J. H., Ebel, B., Jones, N., Kim, J., Mascaro, G., Niswonger, R., Restrepo, P., Rigon, R., Shen, C., Sulis, M., and Tarboton, D.: An overview of current applications, challenges, and future trends in distributed process-based models in hydrology, J. Hydrol., 537, 45–60, https://doi.org/10.1016/j.jhydrol.2016.03.026, 2016.
Fernández-Pato, J. and García-Navarro, P.: A 2D zero-inertia model for the solution of overland flow problems in flexible meshes, J. Hydrol. Eng., 21, https://doi.org/10.1061/(asce)he.1943-5584.0001428, 2016.
Fernández-Pato, J., Caviedes-Voullième, D., and García-Navarro, P.: Rainfall/runoff simulation with 2D full shallow water equations: sensitivity analysis and calibration of infiltration parameters, J. Hydrol., 536, 496–513, https://doi.org/10.1016/j.jhydrol.2016.03.021, 2016.
Fernández-Pato, J., Martínez-Aranda, S., and García-Navarro, P.: A 2D finite volume simulation tool to enable the assessment of combined hydrological and morphodynamical processes in mountain catchments, Adv. Water Resour., 141, 103617, https://doi.org/10.1016/j.advwatres.2020.103617, 2020.
Gan, L., Fu, H., and Yang, G.: Translating novel HPC techniques into efficient geoscience solutions, J. Comput. Sci.-Neth., 52, 101212, https://doi.org/10.1016/j.jocs.2020.101212, 2020.
García-Alén, G., González-Cao, J., Fernández-Nóvoa, D., Gómez-Gesteira, M., Cea, L., and Puertas, J.: Analysis of two sources of variability of basin outflow hydrographs computed with the 2D shallow water model Iber: Digital Terrain Model and unstructured mesh size, J. Hydrol., 612, 128182, https://doi.org/10.1016/j.jhydrol.2022.128182, 2022.
García-Feal, O., González-Cao, J., Gómez-Gesteira, M., Cea, L., Domínguez, J., and Formella, A.: An Accelerated Tool for Flood Modelling Based on Iber, Water, 10, 1459, https://doi.org/10.3390/w10101459, 2018.
García-Navarro, P., Murillo, J., Fernández-Pato, J., Echeverribar, I., and Morales-Hernández, M.: The shallow water equations and their application to realistic cases, Environ. Fluid Mech., 19, 1235–1252, https://doi.org/10.1007/s10652-018-09657-7, 2019.
George, D. L.: Adaptive finite volume methods with well-balanced Riemann solvers for modeling floods in rugged terrain: Application to the Malpasset dam-break flood (France, 1959), Int. J. Numer. Meth. Fl., 66, 1000–1018, https://doi.org/10.1002/fld.2298, 2010.
Giardino, J. R. and Houser, C.: Introduction to the critical zone, in: Developments in Earth Surface Processes, vol. 19, chap. 1, edited by: Giardino, J. R. and Houser, C., Elsevier B. V., Amsterdam, the Netherlands, https://doi.org/10.1016/b978-0-444-63369-9.00001-x, 2015.
Ginting, B. M.: Central-upwind scheme for 2D turbulent shallow flows using high-resolution meshes with scalable wall functions, Comput. Fluids, 179, 394–421, https://doi.org/10.1016/j.compfluid.2018.11.014, 2019.
Gottardi, G. and Venutelli, M.: An accurate time integration method for simplified overland flow models, Adv. Water Resour., 31, 173–180, https://doi.org/10.1016/j.advwatres.2007.08.004, 2008.
Govindaraju, R. S., Kavvas, M. L., and Jones, S. E.: Approximate Analytical Solutions for Overland Flows, Water Resour. Res., 26, 2903–2912, https://doi.org/10.1029/WR026i012p02903, 1990.
Grant, R. and the Ecosys development team: The Ecosys Modelling Project, https://ecosys.ualberta.ca/, last access: 22 August 2022.
Grant, R. F., Barr, A. G., Black, T. A., Gaumont-Guay, D., Iwashita, H., Kidson, J., McCaughey, H., Morgenstern, K., Murayama, S., Nesic, Z., Saigusa, N., Shashkov, A., and Zha, T.: Net ecosystem productivity of boreal jack pine stands regenerating from clearcutting under current and future climates, Glob. Change Biol., 13, 1423–1440, https://doi.org/10.1111/j.1365-2486.2007.01363.x, 2007.
Grete, P., Glines, F. W., and O'Shea, B. W.: K-Athena: A Performance Portable Structured Grid Finite Volume Magnetohydrodynamics Code, IEEE T. Parall. Distr., 32, 85–97, https://doi.org/10.1109/tpds.2020.3010016, 2021.
Halver, R., Meinke, J. H., and Sutmann, G.: Kokkos implementation of an Ewald Coulomb solver and analysis of performance portability, J. Parallel Distr. Com., 138, 48–54, https://doi.org/10.1016/j.jpdc.2019.12.003, 2020.
Hartanto, I., Beevers, L., Popescu, I., and Wright, N.: Application of a coastal modelling code in fluvial environments, Environ. Modell. Softw., 26, 1685–1695, https://doi.org/10.1016/j.envsoft.2011.05.014, 2011.
Hervouet, J.-M. and Petitjean, A.: Malpasset dam-break revisited with two-dimensional computations, J. Hydraul. Res., 37, 777–788, https://doi.org/10.1080/00221689909498511, 1999.
Hiver, J.: Adverse-Slope and Slope (bump), in: Concerted Action on Dam Break Modelling: Objectives, Project Report, Test Cases, Meeting Proceedings, edited by: Soares-Frazão, S., Morris, M., and Zech, Y., vol. CD-ROM, Université Catholique de Louvain, Civil Engineering Department, Hydraulics Division, Louvain-la-Neuve, Belgium, 2000.
Hou, J., Liang, Q., Simons, F., and Hinkelmann, R.: A stable 2D unstructured shallow flow model for simulations of wetting and drying over rough terrains, Comput. Fluids, 82, 132–147, https://doi.org/10.1016/j.compfluid.2013.04.015, 2013a.
Hou, J., Simons, F., Mahgoub, M., and Hinkelmann, R.: A robust well-balanced model on unstructured grids for shallow water flows with wetting and drying over complex topography, Comput. Method. Appl. M., 257, 126–149, https://doi.org/10.1016/j.cma.2013.01.015, 2013b.
Hou, J., Liang, Q., Zhang, H., and Hinkelmann, R.: An efficient unstructured MUSCL scheme for solving the 2D shallow water equations, Environ. Modell. Softw., 66, 131–152, https://doi.org/10.1016/j.envsoft.2014.12.007, 2015.
Hou, J., Wang, R., Liang, Q., Li, Z., Huang, M. S., and Hinkelmann, R.: Efficient surface water flow simulation on static Cartesian grid with local refinement according to key topographic features, Comput. Fluids, 176, 117–134, https://doi.org/10.1016/j.compfluid.2018.03.024, 2018.
Hou, J., Kang, Y., Hu, C., Tong, Y., Pan, B., and Xia, J.: A GPU-based numerical model coupling hydrodynamical and morphological processes, Int. J. Sediment Res., 35, 386–394, https://doi.org/10.1016/j.ijsrc.2020.02.005, 2020.
Hubbard, S. S., Williams, K. H., Agarwal, D., Banfield, J., Beller, H., Bouskill, N., Brodie, E., Carroll, R., Dafflon, B., Dwivedi, D., Falco, N., Faybishenko, B., Maxwell, R., Nico, P., Steefel, C., Steltzer, H., Tokunaga, T., Tran, P. A., Wainwright, H., and Varadharajan, C.: The East River, Colorado, Watershed: A Mountainous Community Testbed for Improving Predictive Understanding of Multiscale Hydrological-Biogeochemical Dynamics, Vadose Zone J., 17, 180061, https://doi.org/10.2136/vzj2018.03.0061, 2018.
Jain, M. K. and Kothyari, U. C.: A GIS based distributed rainfall-runoff model, J. Hydrol., 299, 107–135, 2004.
Jeong, W., Yoon, J.-S., and Cho, Y.-S.: Numerical study on effects of building groups on dam-break flow in urban areas, J. Hydro-Environ. Res., 6, 91–99, https://doi.org/10.1016/j.jher.2012.01.001, 2012.
Jodhani, K. H., Patel, D., and Madhavan, N.: A review on analysis of flood modelling using different numerical models, Mater. Today-Proc., https://doi.org/10.1016/j.matpr.2021.07.405, 2021.
Kesserwani, G. and Liang, Q.: Well-balanced RKDG2 solutions to the shallow water equations over irregular domains with wetting and drying, Comput. Fluids, 39, 2040–2050, https://doi.org/10.1016/j.compfluid.2010.07.008, 2010.
Kesserwani, G. and Liang, Q.: Dynamically adaptive grid based discontinuous Galerkin shallow water model, Adv. Water Resour., 37, 23–39, https://doi.org/10.1016/j.advwatres.2011.11.006, 2012.
Kesserwani, G. and Sharifian, M. K.: (Multi)wavelets increase both accuracy and efficiency of standard Godunov-type hydrodynamic models: Robust 2D approaches, Adv. Water Resour., 144, 103693, https://doi.org/10.1016/j.advwatres.2020.103693, 2020.
Kesserwani, G. and Sharifian, M. K.: (Multi)wavelet-based Godunov-type simulators of flood inundation: static versus dynamic adaptivity, Adv. Water Resour., 171, 104357, https://doi.org/10.1016/j.advwatres.2022.104357, 2022.
Kesserwani, G., Shaw, J., Sharifian, M. K., Bau, D., Keylock, C. J., Bates, P. D., and Ryan, J. K.: (Multi)wavelets increase both accuracy and efficiency of standard Godunov-type hydrodynamic models, Adv. Water Resour., 129, 31–55, https://doi.org/10.1016/j.advwatres.2019.04.019, 2019.
Kim, B., Sanders, B. F., Schubert, J. E., and Famiglietti, J. S.: Mesh type tradeoffs in 2D hydrodynamic modeling of flooding with a Godunov-based flow solver, Adv. Water Resour., 68, 42–61, https://doi.org/10.1016/j.advwatres.2014.02.013, 2014.
Kirstetter, G., Delestre, O., Lagrée, P.-Y., Popinet, S., and Josserand, C.: B-flood 1.0: an open-source Saint-Venant model for flash-flood simulation using adaptive refinement, Geosci. Model Dev., 14, 7117–7132, https://doi.org/10.5194/gmd-14-7117-2021, 2021.
Kobayashi, K., Kitamura, D., Ando, K., and Ohi, N.: Parallel computing for high-resolution/large-scale flood simulation using the K supercomputer, Hydrological Research Letters, 9, 61–68, https://doi.org/10.3178/hrl.9.61, 2015.
Kollet, S., Sulis, M., Maxwell, R. M., Paniconi, C., Putti, M., Bertoldi, G., Coon, E. T., Cordano, E., Endrizzi, S., Kikinzon, E., Mouche, E., Mügler, C., Park, Y.-J., Refsgaard, J. C., Stisen, S., and Sudicky, E.: The integrated hydrologic model intercomparison project, IH-MIP2: A second set of benchmark results to diagnose integrated hydrology and feedbacks, Water Resour. Res., 53, 867–890, https://doi.org/10.1002/2016wr019191, 2017.
Kuffour, B. N. O., Engdahl, N. B., Woodward, C. S., Condon, L. E., Kollet, S., and Maxwell, R. M.: Simulating coupled surface–subsurface flows with ParFlow v3.5.0: capabilities, applications, and ongoing development of an open-source, massively parallel, integrated hydrologic model, Geosci. Model Dev., 13, 1373–1397, https://doi.org/10.5194/gmd-13-1373-2020, 2020.
Lacasta, A., Morales-Hernández, M., Murillo, J., and García-Navarro, P.: An optimized GPU implementation of a 2D free surface simulation model on unstructured meshes, Adv. Eng. Softw., 78, 1–15, https://doi.org/10.1016/j.advengsoft.2014.08.007, 2014.
Lacasta, A., Morales-Hernández, M., Murillo, J., and García-Navarro, P.: GPU implementation of the 2D shallow water equations for the simulation of rainfall/runoff events, Environ. Earth. Sci., 74, 7295–7305, https://doi.org/10.1007/s12665-015-4215-z, 2015.
Lawrence, B. N., Rezny, M., Budich, R., Bauer, P., Behrens, J., Carter, M., Deconinck, W., Ford, R., Maynard, C., Mullerworth, S., Osuna, C., Porter, A., Serradell, K., Valcke, S., Wedi, N., and Wilson, S.: Crossing the chasm: how to develop weather and climate models for next generation computers?, Geosci. Model Dev., 11, 1799–1821, https://doi.org/10.5194/gmd-11-1799-2018, 2018.
Leiserson, C. E., Thompson, N. C., Emer, J. S., Kuszmaul, B. C., Lampson, B. W., Sanchez, D., and Schardl, T. B.: There's plenty of room at the Top: What will drive computer performance after Moore's law?, Science, 368, 6495, https://doi.org/10.1126/science.aam9744, 2020.
Li, Z., Özgen-Xian, I., and Maina, F. Z.: A mass-conservative predictor-corrector solution to the 1D Richards equation with adaptive time control, J. Hydrol., 592, 125809, https://doi.org/10.1016/j.jhydrol.2020.125809, 2021.
Liang, D., Lin, B., and Falconer, R. A.: A boundary-fitted numerical model for flood routing with shock-capturing capability, J. Hydrol., 332, 477–486, https://doi.org/10.1016/j.jhydrol.2006.08.002, 2007.
Liang, Q., Hou, J., and Xia, X.: Contradiction between the C-property and mass conservation in adaptive grid based shallow flow models: cause and solution, Int. J. Numer. Meth. Fl., 78, 17–36, https://doi.org/10.1002/fld.4005, 2015.
Liang, Q., Smith, L., and Xia, X.: New prospects for computational hydraulics by leveraging high-performance heterogeneous computing techniques, J. Hydrodyn Ser. B, 28, 977–985, https://doi.org/10.1016/S1001-6058(16)60699-6, 2016.
Lichtner, P. C., Hammond, G. E., Lu, C., Karra, S., Bisht, G., Andre, B., Mills, R., and Kumar, J.: PFLOTRAN user manual: A massively parallel reactive flow and transport model for describing surface and subsurface processes, Tech. rep., Los Alamos National Laboratory, New Mexico, USA, 2015.
Liu, P. L. F., Cho, Y.-S., Briggs, M. J., Kanoglu, U., and Synolakis, C. E.: Runup of solitary waves on a circular island, J. Fluid Mech., 302, 259–285, https://doi.org/10.1017/s0022112095004095, 1995.
Loukili, Y. and Soulaïmani, A.: Numerical Tracking of Shallow Water Waves by the Unstructured Finite Volume WAF Approximation, International Journal for Computational Methods in Engineering Science and Mechanics, 8, 75–88, https://doi.org/10.1080/15502280601149577, 2007.
Lynett, P. J., Wu, T.-R., and Liu, P. L.-F.: Modeling wave runup with depth-integrated equations, Coast. Eng., 46, 89–107, https://doi.org/10.1016/s0378-3839(02)00043-1, 2002.
MacDonald, I., Baines, M., Nichols, N., and Samuels, P. G.: Comparison of some Steady State Saint-Venant Solvers for some Test Problems with Analytic Solutions, Tech. rep., University of Reading, 1995.
Maneta, M. P. and Silverman, N. L.: A spatially distributed model to simulate water, energy, and vegetation dynamics using information from regional climate models, Earth Interact., 17, 11.1–11.44, 2013.
Mann, A.: Core Concept: Nascent exascale supercomputers offer promise, present challenges, P. Natl. Acad. Sci. USA, 117, 22623–22625, https://doi.org/10.1073/pnas.2015968117, 2020.
Martínez-Aranda, S., Fernández-Pato, J., Caviedes-Voullième, D., García-Palacín, I., and García-Navarro, P.: Towards transient experimental water surfaces: A new benchmark dataset for 2D shallow water solvers, Adv. Water Resour., 121, 130–149, https://doi.org/10.1016/j.advwatres.2018.08.013, 2018.
Matsuyama, M. and Tanaka, H.: An experimental study on the highest runup height in the 1993 Hokkaido Nansei-oki earthquake tsunami, ITS Proceedings, 879–889, 2001.
Morales-Hernández, M., García-Navarro, P., and Murillo, J.: A large time step 1D upwind explicit scheme (CFL > 1): Application to shallow water equations, J. Comput. Phys., 231, 6532–6557, https://doi.org/10.1016/j.jcp.2012.06.017, 2012.
Morales-Hernández, M., Hubbard, M., and García-Navarro, P.: A 2D extension of a Large Time Step explicit scheme (CFL > 1) for unsteady problems with wet/dry boundaries, J. Comput. Phys., 263, 303–327, https://doi.org/10.1016/j.jcp.2014.01.019, 2014.
Morales-Hernández, M., Sharif, M. B., Gangrade, S., Dullo, T. T., Kao, S.-C., Kalyanapu, A., Ghafoor, S. K., Evans, K. J., Madadi-Kandjani, E., and Hodges, B. R.: High-performance computing in water resources hydrodynamics, J. Hydroinform., https://doi.org/10.2166/hydro.2020.163, 2020.
Morales-Hernández, M., Sharif, M. B., Kalyanapu, A., Ghafoor, S., Dullo, T., Gangrade, S., Kao, S.-C., Norman, M., and Evans, K.: TRITON: A Multi-GPU open source 2D hydrodynamic flood model, Environ. Modell. Softw., 141, 105034, https://doi.org/10.1016/j.envsoft.2021.105034, 2021.
Moulinec, C., Denis, C., Pham, C.-T., Rougé, D., Hervouet, J.-M., Razafindrakoto, E., Barber, R., Emerson, D., and Gu, X.-J.: TELEMAC: An efficient hydrodynamics suite for massively parallel architectures, Comput. Fluids, 51, 30–34, https://doi.org/10.1016/j.compfluid.2011.07.003, 2011.
Mügler, C., Planchon, O., Patin, J., Weill, S., Silvera, N., Richard, P., and Mouche, E.: Comparison of roughness models to simulate overland flow and tracer transport experiments under simulated rainfall at plot scale, J. Hydrol., 402, 25–40, https://doi.org/10.1016/j.jhydrol.2011.02.032, 2011.
Murillo, J. and García-Navarro, P.: Weak solutions for partial differential equations with source terms: Application to the shallow water equations, J. Comput. Phys., 229, 4327–4368, https://doi.org/10.1016/j.jcp.2010.02.016, 2010.
Murillo, J. and García-Navarro, P.: Augmented versions of the HLL and HLLC Riemann solvers including source terms in one and two dimensions for shallow flow applications, J. Comput. Phys, 231, 6861–6906, https://doi.org/10.1016/j.jcp.2012.06.031, 2012.
Murillo, J., García-Navarro, P., and Burguete, J.: Time step restrictions for well-balanced shallow water solutions in nonzero velocity steady states, Int. J. Numer. Meth. Fl., 60, 1351–1377, https://doi.org/10.1002/fld.1939, 2009.
Navas-Montilla, A. and Murillo, J.: 2D well-balanced augmented ADER schemes for the Shallow Water Equations with bed elevation and extension to the rotating frame, J. Comput. Phys., 372, 316–348, https://doi.org/10.1016/j.jcp.2018.06.039, 2018.
Nikolos, I. and Delis, A.: An unstructured nodecentered finite volume scheme for shallow water flows with wet/dry fronts over complex topography, Comput. Method. Appl. M., 198, 3723–3750, https://doi.org/10.1016/j.cma.2009.08.006, 2009. a, b
Özgen, I., Liang, D., and Hinkelmann, R.: Shallow water equations with depth-dependent anisotropic porosity for subgrid-scale topography, Appl. Math. Model., 40, 7447–7473, https://doi.org/10.1016/j.apm.2015.12.012, 2015a.
Özgen, I., Teuber, K., Simons, F., Liang, D., and Hinkelmann, R.: Upscaling the shallow water model with a novel roughness formulation, Environ. Earth. Sci., 74, 7371–7386, https://doi.org/10.1007/s12665-015-4726-7, 2015b.
Özgen-Xian, I., Kesserwani, G., Caviedes-Voullième, D., Molins, S., Xu, Z., Dwivedi, D., Moulton, J. D., and Steefel, C. I.: Wavelet-based local mesh refinement for rainfall–runoff simulations, J. Hydroinform., 22, 1059–1077, https://doi.org/10.2166/hydro.2020.198, 2020.
Özgen-Xian, I., Xia, X., Liang, Q., Hinkelmann, R., Liang, D., and Hou, J.: Innovations Towards the Next Generation of Shallow Flow Models, Adv. Water Resour., 149, 103867, https://doi.org/10.1016/j.advwatres.2021.103867, 2021.
Paniconi, C. and Putti, M.: Physically based modeling in catchment hydrology at 50: Survey and outlook, Water Resour. Res., 51, 7090–7129, https://doi.org/10.1002/2015WR017780, 2015.
Park, S., Kim, B., and Kim, D. H.: 2D GPU-Accelerated High Resolution Numerical Scheme for Solving Diffusive Wave Equations, Water, 11, 1447, https://doi.org/10.3390/w11071447, 2019.
Petaccia, G., Soares-Frazão, S., Savi, F., Natale, L., and Zech, Y.: Simplified versus Detailed Two-Dimensional Approaches to Transient Flow Modeling in Urban Areas, J. Hydraul. Eng., 136, 262–266, https://doi.org/10.1061/(ASCE)HY.1943-7900.0000154, 2010.
Roe, P.: Approximate Riemann solvers, parameter vectors, and difference schemes, J. Comput. Phys., 43, 357–372, https://doi.org/10.1016/0021-9991(81)90128-5, 1981.
Schulthess, T. C.: Programming revisited, Nat. Phys., 11, 369–373, https://doi.org/10.1038/nphys3294, 2015.
Schwanenberg, D. and Harms, M.: Discontinuous Galerkin Finite-Element Method for Transcritical Two-Dimensional Shallow Water Flows, J. Hydraul. Eng., 130, 412–421, https://doi.org/10.1061/(ASCE)0733-9429(2004)130:5(412), 2004.
Serrano-Pacheco, A., Murillo, J., and García-Navarro, P.: A finite volume method for the simulation of the waves generated by landslides, J. Hydrol., 373, 273–289, https://doi.org/10.1016/j.jhydrol.2009.05.003, 2009.
Sharif, M. B., Ghafoor, S. K., Hines, T. M., Morales-Hernández, M., Evans, K. J., Kao, S.-C., Kalyanapu, A. J., Dullo, T. T., and Gangrade, S.: Performance Evaluation of a Two-Dimensional Flood Model on Heterogeneous High-Performance Computing Architectures, in: Proceedings of the Platform for Advanced Scientific Computing Conference, ACM, https://doi.org/10.1145/3394277.3401852, 2020.
Shaw, J., Kesserwani, G., Neal, J., Bates, P., and Sharifian, M. K.: LISFLOOD-FP 8.0: the new discontinuous Galerkin shallow-water solver for multicore CPUs and GPUs, Geosci. Model Dev., 14, 3577–3602, https://doi.org/10.5194/gmd-14-3577-2021, 2021.
Simons, F., Busse, T., Hou, J., Özgen, I., and Hinkelmann, R.: A model for overland flow and associated processes within the Hydroinformatics Modelling System, J. Hydroinform., 16, 375–391, https://doi.org/10.2166/hydro.2013.173, 2014.
Singh, J., Altinakar, M. S., and Ding, Y.: Numerical Modeling of Rainfall-Generated Overland Flow Using Nonlinear Shallow-Water Equations, J. Hydrol. Eng., 20, 04014089, https://doi.org/10.1061/(ASCE)HE.1943-5584.0001124, 2015.
Sivapalan, M.: From engineering hydrology to Earth system science: milestones in the transformation of hydrologic science, Hydrol. Earth Syst. Sci., 22, 1665–1693, https://doi.org/10.5194/hess-22-1665-2018, 2018.
Sætra, M. L., Brodtkorb, A. R., and Lie, K.-A.: Efficient GPU-Implementation of Adaptive Mesh Refinement for the Shallow-Water Equations, J. Sci. Comput., 63, 23–48, https://doi.org/10.1007/s10915-014-9883-4, 2015.
Smith, L. S. and Liang, Q.: Towards a generalised GPU/CPU shallow-flow modelling tool, Comput. Fluids, 88, 334–343, https://doi.org/10.1016/j.compfluid.2013.09.018, 2013.
Soares-Frazão, S.: Experiments of dam-break wave over a triangular bottom sill, J. Hydraul. Res., 45, 19–26, https://doi.org/10.1080/00221686.2007.9521829, 2007.
Soares-Frazão, S. and Zech, Y.: Dam-break flow through an idealised city, J. Hydraul. Res., 46, 648–658, https://doi.org/10.3826/jhr.2008.3164, 2008.
Steefel, C. I.: CrunchFlow: Software for modeling multicomponent reactive flow and transport, Tech. rep., Lawrence Berkeley National Laboratory, California, USA, 2009.
Steffen, L., Amann, F., and Hinkelmann, R.: Concepts for performance improvements of shallow water flow simulations, in: Proceedings of the 1st IAHR Young Professionals Congress, online, ISBN 9789082484663, 2020.
Stoker, J.: Water Waves: The Mathematical Theory with Applications, New York Interscience Publishers, Wiley, ISBN 9780471570349, 1957.
Su, B., Huang, H., and Zhu, W.: An urban pluvial flood simulation model based on diffusive wave approximation of shallow water equations, Hydrol. Res., 50, 138–154, https://doi.org/10.2166/nh.2017.233, 2017.
Suarez, E., Eicker, N., and Lippert, T.: Modular Supercomputing Architecture: From Idea to Production, in: Contemporary High Performance Computing, CRC Press, 223–255, https://doi.org/10.1201/9781351036863-9, 2019.
Tatard, L., Planchon, O., Wainwright, J., Nord, G., Favis-Mortlock, D., Silvera, N., Ribolzi, O., Esteves, M., and Huang, C. H.: Measurement and modelling of high-resolution flow-velocity data under simulated rainfall on a low-slope sandy soil, J. Hydrol., 348, 1–12, https://doi.org/10.1016/j.jhydrol.2007.07.016, 2008.
Thacker, W.: Some exact solutions to the nonlinear shallow-water wave equations, J. Fluid Mech., 107, 499–508, https://doi.org/10.1017/S0022112081001882, 1981.
The third international workshop on long-wave runup models: http://isec.nacse.org/workshop/2004_cornell/bmark2.html (last access: 22 August 2022), 2004.
Toro, E.: Shock-Capturing Methods for Free-Surface Shallow Flows, Wiley, ISBN 9780471987666, 2001.
Trott, C., Berger-Vergiat, L., Poliakoff, D., Rajamanickam, S., Lebrun-Grandie, D., Madsen, J., Awar, N. A., Gligoric, M., Shipman, G., and Womeldorff, G.: The Kokkos EcoSystem: Comprehensive Performance Portability for High Performance Computing, Comput. Sci. Eng., 23, 10–18, https://doi.org/10.1109/mcse.2021.3098509, 2021.
Turchetto, M., Palu, A. D., and Vacondio, R.: A general design for a scalable MPI-GPU multi-resolution 2D numerical solver, IEEE T. Parall. Distr., 31, https://doi.org/10.1109/tpds.2019.2961909, 2019.
Vacondio, R., Palù, A. D., and Mignosa, P.: GPU-enhanced Finite Volume Shallow Water solver for fast flood simulations, Environ. Modell. Softw., 57, 60–75, https://doi.org/10.1016/j.envsoft.2014.02.003, 2014.
Vacondio, R., Palù, A. D., Ferrari, A., Mignosa, P., Aureli, F., and Dazzi, S.: A non-uniform efficient grid type for GPU-parallel Shallow Water Equations models, Environ. Modell. Softw., 88, 119–137, https://doi.org/10.1016/j.envsoft.2016.11.012, 2017.
Valiani, A., Caleffi, V., and Zanni, A.: Case Study: Malpasset Dam-Break Simulation using a Two-Dimensional Finite Volume Method, J. Hydraul. Eng., 128, 460–472, https://doi.org/10.1061/(ASCE)0733-9429(2002)128:5(460), 2002.
Vanderbauwhede, W.: Making legacy Fortran code type safe through automated program transformation, J. Supercomput., 78, 2988–3028, 2021.
Vanderbauwhede, W. and Davidson, G.: Domain-specific acceleration and auto-parallelization of legacy scientific code in FORTRAN 77 using source-to-source compilation, Comput. Fluids, 173, 1–5, 2018.
Vanderbauwhede, W. and Takemi, T.: An investigation into the feasibility and benefits of GPU/multicore acceleration of the weather research and forecasting model, in: 2013 International Conference on High Performance Computing and Simulation (HPCS), Helsinki, Finland, IEEE, https://doi.org/10.1109/hpcsim.2013.6641457, 2013.
Vater, S., Beisiegel, N., and Behrens, J.: A limiter-based well-balanced discontinuous Galerkin method for shallow-water flows with wetting and drying: Triangular grids, Int. J. Numer. Meth. Fl., 91, 395–418, https://doi.org/10.1002/fld.4762, 2019.
Wang, Y., Liang, Q., Kesserwani, G., and Hall, J. W.: A 2D shallow flow model for practical dam-break simulations, J. Hydraul. Res., 49, 307–316, https://doi.org/10.1080/00221686.2011.566248, 2011.
Wang, Z., Walsh, K., and Verma, B.: On-Tree Mango Fruit Size Estimation Using RGB-D Images, Sensors, 17, 2738, https://doi.org/10.3390/s17122738, 2017.
Watkins, J., Tezaur, I., and Demeshko, I.: A Study on the Performance Portability of the Finite Element Assembly Process Within the Albany Land Ice Solver, Springer International Publishing, Cham, 177–188, https://doi.org/10.1007/978-3-030-30705-9_16, 2020.
Weill, S.: Modélisation des échanges surface/subsurface à l'échelle de la parcelle par une approche darcéenne multidomaine, PhD thesis, Ecole des Mines de Paris, 2007.
Wittmann, R., Bungartz, H.-J., and Neumann, P.: High performance shallow water kernels for parallel overland flow simulations based on FullSWOF2D, Comput. Math. Appl., 74, 110–125, https://doi.org/10.1016/j.camwa.2017.01.005, 2017.
Xia, J., Falconer, R. A., Lin, B., and Tan, G.: Numerical assessment of flood hazard risk to people and vehicles in flash floods, Environ. Modell. Softw., 26, 987–998, https://doi.org/10.1016/j.envsoft.2011.02.017, 2011.
Xia, X. and Liang, Q.: A new efficient implicit scheme for discretising the stiff friction terms in the shallow water equations, Adv. Water Resour., 117, 87–97, https://doi.org/10.1016/j.advwatres.2018.05.004, 2018.
Xia, X., Liang, Q., Ming, X., and Hou, J.: An efficient and stable hydrodynamic model with novel source term discretization schemes for overland flow and flood simulations, Water Resour. Res., 53, 3730–3759, https://doi.org/10.1002/2016WR020055, 2017.
Xia, X., Liang, Q., and Ming, X.: A full-scale fluvial flood modelling framework based on a High-Performance Integrated hydrodynamic Modelling System (HiPIMS), Adv. Water Resour., 132, 103392, https://doi.org/10.1016/j.advwatres.2019.103392, 2019.
Yu, C. and Duan, J.: Two-dimensional depth-averaged finite volume model for unsteady turbulent flow, J. Hydraul. Res., 50, 599–611, https://doi.org/10.1080/00221686.2012.730556, 2012.
Yu, C. and Duan, J.: Simulation of Surface Runoff Using Hydrodynamic Model, J. Hydrol. Eng., 22, 04017006, https://doi.org/10.1061/(ASCE)HE.1943-5584.0001497, 2017.
Zhao, J., Özgen-Xian, I., Liang, D., Wang, T., and Hinkelmann, R.: An improved multi-slope MUSCL scheme for solving shallow water equations on unstructured grids, Comput. Math. Appl., 77, 576–596, https://doi.org/10.1016/j.camwa.2018.09.059, 2019.
Zhou, F., Chen, G., Huang, Y., Yang, J. Z., and Feng, H.: An adaptive moving finite volume scheme for modeling flood inundation over dry and complex topography, Water Resour. Res., 49, 1914–1928, https://doi.org/10.1002/wrcr.20179, 2013.
 Abstract
 Introduction
 Mathematical and numerical model of SERGHEISWE
 HPC implementation of the SERGHEI framework
 Verification and validation
 Laboratoryscale experiments
 Plotscale to catchmentscale experiments
 Performance and scaling
 Conclusions and outlook
 Appendix A: Additional validation test cases
 Appendix B: Glossary
 Code and data availability
 Author contributions
 Competing interests
 Disclaimer
 Acknowledgements
 Financial support
 Review statement
 References