Development and technical paper 11 Jun 2021
Development and technical paper  11 Jun 2021
LISFLOODFP 8.0: the new discontinuous Galerkin shallowwater solver for multicore CPUs and GPUs
 ^{1}Department of Civil and Structural Engineering, The University of Sheffield, Western Bank, Sheffield, UK
 ^{2}School of Geographical Sciences, University of Bristol, Bristol, UK
 ^{1}Department of Civil and Structural Engineering, The University of Sheffield, Western Bank, Sheffield, UK
 ^{2}School of Geographical Sciences, University of Bristol, Bristol, UK
Correspondence: James Shaw (js102@zepler.net) and Georges Kesserwani (g.kesserwani@sheffield.ac.uk)
Hide author detailsCorrespondence: James Shaw (js102@zepler.net) and Georges Kesserwani (g.kesserwani@sheffield.ac.uk)
LISFLOODFP 8.0 includes secondorder discontinuous Galerkin (DG2) and firstorder finitevolume (FV1) solvers of the twodimensional shallowwater equations for modelling a wide range of flows, including rapidly propagating, supercritical flows, shock waves or flows over very smooth surfaces. The solvers are parallelised on multicore CPU and Nvidia GPU architectures and run existing LISFLOODFP modelling scenarios without modification. These new, fully twodimensional solvers are available alongside the existing local inertia solver (called ACC), which is optimised for multicore CPUs and integrates with the LISFLOODFP subgrid channel model. The predictive capabilities and computational scalability of the new DG2 and FV1 solvers are studied for two Environment Agency benchmark tests and a realworld fluvial flood simulation driven by rainfall across a 2500 km^{2} catchment. DG2's secondorderaccurate, piecewiseplanar representation of topography and flow variables enables predictions on coarse grids that are competitive with FV1 and ACC predictions on 2–4 times finer grids, particularly where river channels are wider than half the grid spacing. Despite the simplified formulation of the local inertia solver, ACC is shown to be spatially secondorderaccurate and yields predictions that are close to DG2. The DG2CPU and FV1CPU solvers achieve nearoptimal scalability up to 16 CPU cores and achieve greater efficiency on grids with fewer than 0.1 million elements. The DG2GPU and FV1GPU solvers are most efficient on grids with more than 1 million elements, where the GPU solvers are 2.5–4 times faster than the corresponding 16core CPU solvers. LISFLOODFP 8.0 therefore marks a new step towards operational DG2 flood inundation modelling at the catchment scale. LISFLOODFP 8.0 is freely available under the GPL v3 license, with additional documentation and case studies at https://www.seamlesswave.com/LISFLOOD8.0 (last access: 2 June 2021).
LISFLOODFP is a freely available rasterbased hydrodynamic model that has been applied in numerous studies from smallscale (Sampson et al., 2012) and reachscale (Liu et al., 2019; Shustikova et al., 2019; O'Loughlin et al., 2020) to continental and global flood forecasting applications (Wing et al., 2020; Sampson et al., 2015). LISFLOODFP has been coupled to several hydrological models (Hoch et al., 2019; Rajib et al., 2020; Li et al., 2020), and it offers simple text file configuration and commandline tools to facilitate DEM preprocessing and sensitivity analyses (Sosa et al., 2020). LISFLOODFP includes extension modules to provide efficient rainfall routing (Sampson et al., 2013), modelling of hydraulic structures (Wing et al., 2019; Shustikova et al., 2020), and coupling between twodimensional floodplain solvers and a onedimensional subgrid channel model (Neal et al., 2012a).
LISFLOODFP already includes a local inertia (or “gravity wave”) solver, LISFLOODACC, and a diffusive wave (or “zeroinertia”) solver, LISFLOODATS. The LISFLOODACC solver simplifies the full shallowwater equations by neglecting convective acceleration, while LISFLOODATS neglects both convective and inertial acceleration. The LISFLOODACC solver is recommended for simulating fluvial, pluvial and coastal flooding, involving gradually varying, subcritical flow over sufficiently rough surfaces with Manning's coefficient of at least 0.03 ${\mathrm{sm}}^{\mathrm{1}/\mathrm{3}}$ (Neal et al., 2012b; de Almeida and Bates, 2013). For such flows, LISFLOODACC was reported to be up to 67 times faster than LISFLOODATS, which has a stricter, quadratic CFL constraint (Neal et al., 2011; Hunter et al., 2006), and about 3 times faster than a full shallowwater solver (Neal et al., 2012b). However, given the theoretical limitations of the local inertia equations (de Almeida and Bates, 2013; Martins et al., 2016; Cozzolino et al., 2019), a full shallowwater solver is still required for simulating dam breaks (Neal et al., 2012b) and flash floods in steep catchments (Kvočka et al., 2017), involving rapidly varying, supercritical flows, shock waves or flows over very smooth surfaces.
The potential benefits of a secondorder discontinuous Galerkin (DG2) shallowwater solver for flood inundation modelling have recently been demonstrated by Ayog et al. (2021): DG2 alleviates numerical diffusion errors associated with firstorder finitevolume (FV1) methods, meaning DG2 can capture finescale transients in flood hydrographs on relatively coarse grids over longduration simulations thanks to its piecewiseplanar representation of topography and flow variables. Within a computational element on a raster grid, each locally planar variable is represented by three coefficients – the elementaverage, xslope and yslope coefficients – which are updated by a twostage Runge–Kutta timestepping scheme. Due to its secondorder formulation, DG2 can be 4–12 times slower per element than an FV1 solver depending on the test case (Kesserwani and Sharifian, 2020), though substantial speedups have already been achieved: switching from a standard tensorproduct stencil to a simplified, slopedecoupled stencil of Kesserwani et al. (2018) achieved a 2.6fold speedup, and avoiding unnecessary local slope limiting achieved an additional 2fold speedup (Ayog et al., 2021), while preserving accuracy, conservation and robustness properties for shockless flows.
Secondorder finitevolume (FV2) methods offer an alternative approach to obtain secondorder accuracy, with many FV2 models adopting the Monotonic Upstreamcentred Scheme for Conservation Laws (MUSCL) method. While FV2MUSCL solvers can achieve secondorder convergence (Kesserwani and Wang, 2014), the MUSCL method relies on global slope limiting and nonlocal, linear reconstructions across neighbouring elements that can affect energy conservation properties (Ayog et al., 2021) and affect wave arrival times when the grid is too coarse (Kesserwani and Wang, 2014). Hence, although FV2MUSCL is typically 2–10 times faster than DG2 per element (Ayog et al., 2021), DG2 can improve accuracy and conservation properties on coarse grids, which is particularly desirable for efficient, longduration continental or globalscale simulations that rely on DEM products derived from satellite data (Bates, 2012; Yamazaki et al., 2019).
Parallelisation is the next step towards making DG2 flood modelling operational on largescale, highresolution domains. Existing LISFLOODFP solvers are parallelised using OpenMP for multicore CPUs, which have been tested on domains with up to 23 million elements on a 16core CPU (Neal et al., 2009, 2018). But as flood models are applied to increasingly large domains at increasingly fine resolutions, a greater degree of parallelism can be achieved using GPU accelerators (Brodtkorb et al., 2013). For example, GarcíaFeal et al. (2018) compared Iber+ hydrodynamic model runs on a GPU against a 16core CPU and obtained a 4–15fold speedup depending on the test case. Running in a multiGPU configuration, the TRITON model has been applied on a 6800 km^{2} domain with 68 million elements to simulate a 10 d storm event in under 30 min (MoralesHernández et al., 2021), and the HiPIMS model was applied on a 2500 km^{2} domain with 100 million elements to simulate a 4 d storm event in 1.5 d (Xia et al., 2019).
This paper presents a new LISFLOODDG2 solver of the full shallowwater equations, which is integrated into LISFLOODFP 8.0 and freely available under the GNU GPL v3 license (LISFLOODFP developers, 2020). LISFLOODFP 8.0 also includes an updated FV1 solver obtained by simplifying the DG2 formulation. Both solvers support standard LISFLOODFP configuration parameters and model outputs, meaning that many existing LISFLOODFP modelling scenarios can run without modification. Since the new DG2 and FV1 solvers are purely twodimensional and parallelised for multicore CPU and GPU architectures, the new solvers do not currently integrate with the LISFLOODFP subgrid channel model (Neal et al., 2012a) or incorporate the CPUspecific optimisations available to the ACC solver (Neal et al., 2018).
The paper is structured as follows: Sect. 2 presents the LISFLOODDG2 and FV1 formulations and the parallelisation strategies using OpenMP for multicore CPU architectures and CUDA for Nvidia GPU architectures. Section 3 evaluates the DG2, FV1 and ACC solvers across three flood inundation test cases. The first two cases reproduce Environment Agency benchmark tests (Néelz and Pender, 2013): the first case simulates a slowly propagating wave over a flat floodplain, measuring computational scalability on multicore CPU and GPU architectures and comparing the spatial grid convergence of DG2, FV1 and ACC predictions; the second case simulates a rapidly propagating wave along a narrow valley with irregular topography, assessing the solver capabilities for modelling supercritical flow. The final case reproduces fluvial flooding over the 2500 km^{2} Eden catchment in northwest England, caused by Storm Desmond in December 2015 (Xia et al., 2019). This is the first assessment of a DG2 hydrodynamic model in simulating a realworld storm event at catchment scale, with overland flow driven entirely by spatially and temporally varying rainfall data. Concluding remarks are made in Sect. 4. Additional LISFLOODFP 8.0 documentation and further test cases are available at https://www.seamlesswave.com/LISFLOOD8.0 (last access: 2 June 2021).
LISFLOODFP 8.0 includes a new secondorder discontinuous Galerkin (DG2) solver and an updated firstorder finitevolume (FV1) solver that simulate twodimensional shallowwater flows. The new DG2 and FV1 formulations and the existing LISFLOODACC formulation are described in the following subsections.
2.1 The new LISFLOODDG2 solver
The LISFLOODDG2 solver implements the DG2 formulation of Kesserwani et al. (2018) that adopts a simplified “slopedecoupled” stencil compatible with rasterbased Godunovtype finitevolume solvers. Piecewiseplanar topography, water depth and discharge fields are modelled by an elementaverage coefficient and dimensionally independent xslope and yslope coefficients. This DG2 formulation achieves wellbalancedness for all discharge coefficients in the presence of irregular, piecewiseplanar topography with wetting and drying (Kesserwani et al., 2018). A piecewiseplanar treatment of the friction term is applied to all discharge coefficients prior to each time step, based on the split implicit friction scheme of Liang and Marche (2009). Informed by the findings of Ayog et al. (2021), the automatic local slope limiter option in LISFLOODDG2 is deactivated for the floodlike test cases presented in Sect. 3. This slopedecoupled, nolimiter approach can achieve a 5fold speedup over a standard tensorproduct stencil with local slope limiting (Kesserwani et al., 2018; Ayog et al., 2021), meaning this DG2 formulation is expected to be particularly efficient for flood modelling applications.
The DG2 formulation discretises the twodimensional shallowwater equations, written in conservative vectorial form as
where ∂_{t}, ∂_{x} and ∂_{t} denote partial derivatives in the horizontal spatial dimensions x and y and temporal dimension t. In Eq. (1), U is the vector of flow variables, F(U) and G(U) are flux vectors in the x and y directions, and S_{b}, S_{f} and R are source terms representing the topographic slope, frictional force and rainfall:
with water depth h [L], unitwidth discharges q_{x}=hu and q_{y}=hv [${\mathrm{L}}^{\mathrm{3}}/\mathrm{T}$], and depthaveraged horizontal velocities u and v [$\mathrm{L}/\mathrm{T}$] in the x and y directions respectively. Units are notated in square brackets [⋅], where L denotes unit length and T denotes unit time. The twodimensional topographic elevation data are denoted z [L], and g is the gravitational acceleration [$\mathrm{L}/{\mathrm{T}}^{\mathrm{2}}$]. The frictional forces in the x and y directions are ${S}_{\mathrm{f}x}={C}_{\mathrm{f}}u\sqrt{{u}^{\mathrm{2}}+{v}^{\mathrm{2}}}$ and ${S}_{\mathrm{f}y}={C}_{\mathrm{f}}v\sqrt{{u}^{\mathrm{2}}+{v}^{\mathrm{2}}}$, where the friction function is ${C}_{\mathrm{f}}=g{n}_{M}^{\mathrm{2}}/{h}^{\mathrm{1}/\mathrm{3}}$ and n_{M}(x,y) is Manning's coefficient [$\mathrm{T}/{\mathrm{L}}^{\mathrm{1}/\mathrm{3}}$]. The prescribed rainfall rate is given by $R(x,y,t)$ [$\mathrm{L}/\mathrm{T}$].
The DG2 discretisation of Eq. (1) is compatible with existing LISFLOODFP data structures, being formulated on a raster grid of uniform rectangular elements. A rectangular element is shown in Fig. 1, centred at $({x}_{i,j},{y}_{i,j})$ with horizontal dimensions (Δx,Δy). Within the element the discrete flow vector U_{h}(x,y) and topography z_{h}(x,y) are represented by locally planar fields. Expressed as a scaled Legendre basis expansion (Kesserwani and Sharifian, 2020), the flow vector U_{h}(x,y) is written as
where U_{i,j} is the matrix of flow coefficients:
in which subscript 0 denotes the elementaverage coefficients and subscripts 1x and 1y denote the linear slope coefficients in the x and y directions. The topography coefficients are
which are initialised from a DEM raster file as described later in Sect. 2.1.1. Assembling all elements onto a raster grid yields piecewiseplanar representations of topography and flow variables that intrinsically capture smooth, linear variations within each element, while simultaneously allowing flow discontinuities – such as hydraulic jumps and shock waves – to be captured at element interfaces.
By adopting the slopedecoupled form of Kesserwani et al. (2018) that uses the local stencil shown in Fig. 1, the locally planar solution is easily evaluated at the four face centres (denoted N, S, E, W),
and at the four Gaussian quadrature points (denoted Gx1, Gx2, Gy1 and Gy2),
A standard splitting approach is adopted such that the friction source term S_{f} and rainfall source term R in Eq. (1) are applied separately at the beginning of each time step. By adopting a splitting approach, friction or rainfall source terms are only applied as required by the particular test case, for better runtime efficiency. The discretisation of the friction source term is described later in Sect. 2.1.2 and the rainfall source term in Sect. 2.1.3. The remaining terms are the spatial fluxes and topographic slope terms, which are discretised by an explicit secondorder twostage Runge–Kutta scheme (Kesserwani et al., 2010) to evolve the flow coefficients U_{i,j} from time level n to n+1:
where element indices (i,j) are omitted for clarity of presentation. The initial time step Δt is a fixed value specified by the user, and the time step is updated thereafter according to the CFL condition using the maximum stable Courant number of 0.33 (Cockburn and Shu, 2001). The spatial operator $\mathit{L}=[{\mathbf{L}}_{\mathrm{0}},{\mathit{L}}_{\mathrm{1}x},{\mathit{L}}_{\mathrm{1}y}]$ is
in which variables with an overline denote temporary modifications to the original variables that ensure wellbalancedness and nonnegative water depths (Kesserwani et al., 2018; Liang and Marche, 2009) and ${\stackrel{\mathrm{\u0303}}{\mathit{F}}}_{\mathrm{W}}$, ${\stackrel{\mathrm{\u0303}}{\mathit{F}}}_{\mathrm{E}}$, ${\stackrel{\mathrm{\u0303}}{\mathit{G}}}_{\mathrm{S}}$, ${\stackrel{\mathrm{\u0303}}{\mathit{G}}}_{\mathrm{N}}$ denote HLL approximate Riemann fluxes across western, eastern, northern and southern interfaces. Each Riemann solution resolves the discontinuity between the flow variables evaluated at the limits of the locally planar solutions adjacent to the interface. Because of the locally planar nature of the DG2 solutions, such a discontinuity is likely to be very small when the flow is smooth – as is often the case for flood inundation events – and will not be significantly enlarged by grid coarsening.
While LISFLOODDG2 is equipped with a generalised minmod slope limiter (Cockburn and Shu, 2001) localised by the Krivodonova shock detector (Krivodonova et al., 2004), the automatic local slope limiter was deactivated for the sake of efficiency: none of the test cases presented in Sect. 3 involve shock wave propagation since all waves propagate over an initially dry bed and are rapidly retarded by frictional forces (Néelz and Pender, 2013; Xia et al., 2019). The lack of shock wave propagation means that all LISFLOODFP solvers – DG2, FV1 and ACC – are capable of realistically simulating all test cases presented in Sect. 3.
2.1.1 Initialisation of piecewiseplanar topography coefficients from a DEM raster file
The topography coefficients $[{z}_{i,j,\mathrm{0}},{z}_{i,j,\mathrm{1}x},{z}_{i,j,\mathrm{1}y}]$ are initialised to ensure the resulting piecewiseplanar topography is continuous at face centres, where Riemann fluxes are calculated and the wettinganddrying treatment is applied under the wellbalancedness property (Kesserwani et al., 2018). The topographic elevations at the N, S, E and W face centres are calculated by averaging the DEM raster values taken at the NW, NE, SW and SE vertices (Fig. 1) such that ${z}_{i,j}^{\mathrm{N}}=({z}_{i,j}^{\mathrm{NW}}+{z}_{i,j}^{\mathrm{NE}})/\mathrm{2}$ and similarly for ${z}_{i,j,}^{\mathrm{E}}$, ${z}_{i,j,}^{\mathrm{S}}$ and ${z}_{i,j,}^{\mathrm{W}}$. The elementaverage coefficient ${z}_{i,j,\mathrm{0}}$ is then calculated as
while the slope coefficients ${z}_{i,j,\mathrm{1}x}$ and ${z}_{i,j,\mathrm{1}y}$ are calculated as the gradients across opposing face centres:
LISFLOODFP 8.0 includes a utility application, generateDG2DEM, that loads an existing DEM raster file and outputs new raster files containing the elementaverage, xslope and yslope topography coefficients, ready to be loaded by the LISFLOODDG2 solver.
2.1.2 Discretisation of the friction source term
The discretisation of the friction source term is based on the split implicit scheme of Liang and Marche (2009). Without numerical stabilisation, the friction function ${C}_{\mathrm{f}}=g{n}_{M}^{\mathrm{2}}/{h}^{\mathrm{1}/\mathrm{3}}$ can grow exponentially as the water depth vanishes at a wet–dry front, but the scheme adopted here is designed to ensure numerical stability by limiting the frictional force to prevent unphysical flow reversal.
The implicit friction scheme is solved directly (see Liang and Marche, 2009, Sect. 3.4) such that frictional forces are applied to the xdirectional discharge component q_{x} over a time step Δt, yielding a retarded discharge component q_{fx}:
where the denominator 𝒟_{x} is
To update the elementaverage discharge coefficient ${{q}_{x}}_{i,j,\mathrm{0}}$, Eq. (11) is evaluated at the element centre:
while the slope coefficients ${{q}_{x}}_{i,j,\mathrm{1}x}$ and ${{q}_{x}}_{i,j,\mathrm{1}y}$ are updated by calculating the x and y gradients using evaluations of Eq. (11) at Gaussian quadrature points Gx1, Gx2, and Gy1, Gy2 (Fig. 1):
Similarly, frictional forces are applied to the ydirectional discharge component q_{y} yielding a retarded discharge q_{fy}:
While this friction scheme has been successfully adopted in finitevolume and discontinuous Galerkin settings for modelling dam break flows and urban flood events (Wang et al., 2011; Kesserwani and Wang, 2014), it can exhibit spuriously large velocities and correspondingly small time steps for largescale, rainfallinduced overland flows, involving widespread, very thin water layers flowing down hill slopes and over steep riverbanks, as demonstrated by Xia et al. (2017). Due to the involvement of the slope coefficients, water depths at Gaussian quadrature points can be much smaller (and velocities much larger) than the elementaverage values. Therefore, for overland flow simulations, the LISFLOODDG2 time step size is expected to be substantially reduced compared to LISFLOODFV1, which only involves elementaverage values.
2.1.3 Discretisation of the rainfall source term
The discretisation of the rainfall source term evolves the water depth elementaverage coefficients ${h}_{i,j,\mathrm{0}}$:
where ${R}_{i,j}^{n}$ denotes the prescribed rainfall rate at element (i,j) and time level n. Equation (14) is firstorderaccurate in space and time, which is deemed sufficient since rainfall data are typically available at far coarser spatial and temporal resolutions than the computation grid, leading to zero elementwise slope coefficients for the rainfall source term.
Recall that the rainfall source term, friction source term, and remaining flux and bed slope terms are treated separately such that, at each time step, the flow variables updated by Eq. (14) are subsequently updated by Eq. (12), and finally by Eqs. (8)–(9). The complete DG2 model workflow is summarised by the flow chart in Fig. 2, wherein each operation is parallelised using the CPU and GPU parallelisation strategies discussed next.
2.1.4 OpenMP parallelisation for multicore CPUs
The LISFLOODDG2CPU solver adopts OpenMP to process rows of the computational grid in parallel using the nested loop structure in Fig. 3a, which is applied to each operation in the flow chart in Fig. 2. The global time step Δt is found by calculating the minimum value across all elements using an OpenMP reduction. The same parallelisation strategy is already adopted in existing LISFLOODFP solvers (Neal et al., 2009) because it is straightforward to implement with minimal code changes for any explicit numerical scheme involving local, elementwise operations. While some LISFLOODFP solvers implement more sophisticated OpenMP parallelisation and dry cell optimisation (Neal et al., 2018), this can introduce additional code complexity and runtime overhead (MoralesHernández et al., 2020), so it has not been adopted for the new LISFLOODDG2CPU solver.
2.1.5 CUDA parallelisation for Nvidia GPUs
The LISFLOODDG2GPU solver adopts a different parallelisation strategy using nested CUDA gridstride loops (Fig. 3b), which is a recommended technique for parallel processing of raster data on GPUs (Harris, 2013). Using this strategy, a 16 × 16element region of the computational grid is mapped to a CUDA block of 16 × 16 threads. Threads within each block execute in parallel, and multiple blocks also execute in parallel, thanks to the twolayer parallelism in the CUDA programming model. Nested gridstride loops are applied to each operation in Fig. 2. Thanks to the localisation of DG2, almost all operations are evaluated elementwise and only require data available locally within the element. The only nonlocal operations are (i) the global time step, which is calculated using a min() reduction operator from the CUB library (Merrill, 2015), and (ii) the Riemann fluxes that connect flow discontinuities across interfaces between neighbouring elements, which are discussed next.
To process Riemann fluxes efficiently, the LISFLOODDG2GPU solver adopts a new dimensionally split form that allows expensive Riemann flux evaluations to be stored temporarily in lowlatency shared memory on the GPU device (Qin et al., 2019). The new dimensionally split form is derived by decomposing the spatial operator (Eq. 9) and the twostage Runge–Kutta scheme (Eq. 8) into separate x and ydirectional updates. The slopedecoupled form allows a straightforward splitting of the spatial operator L in Eq. (9) into an xdirectional operator ${\mathbf{L}}_{x}=[{\mathit{L}}_{\mathrm{0}x},{\mathit{L}}_{\mathrm{1}x},\mathbf{0}]$ and a ydirectional operator ${\mathbf{L}}_{y}=[{\mathit{L}}_{\mathrm{0}y},\mathbf{0},{\mathit{L}}_{\mathrm{1}y}]$ such that $\mathbf{L}={\mathbf{L}}_{x}+{\mathbf{L}}_{y}$. The L_{1x} and L_{1y} operators are given in Eqs. (9b) and (9c), and L_{0x} and L_{0y} are defined as
Similarly, each of the two Runge–Kutta stages in Eq. (8) is split into two substages: the first updates the flow in the x direction by applying L_{x}; the second updates the flow in the y direction by applying L_{y}:
Each substage of Eq. (17) is evaluated elementwise within a nested gridstride loop. Within the xdirectional spatial operator L_{x}, the xdirectional Riemann fluxes, ${\stackrel{\mathrm{\u0303}}{\mathit{F}}}_{\mathrm{E}}$ and ${\stackrel{\mathrm{\u0303}}{\mathit{F}}}_{\mathrm{W}}$, are calculated as follows:

thread (i,j) calculates the Riemann flux across the eastern face of element (i,j), ${\stackrel{\mathrm{\u0303}}{\mathit{F}}}_{\mathrm{E}}$, storing the result in a local variable and in a shared memory array;

a synchronisation barrier waits for all threads in the CUDA block to complete;

thread (i,j) then loads ${\stackrel{\mathrm{\u0303}}{\mathit{F}}}_{\mathrm{W}}$ from shared memory, which is the same as ${\stackrel{\mathrm{\u0303}}{\mathit{F}}}_{\mathrm{E}}$ already calculated by thread $(i\mathrm{1},j)$; and

finally, with ${\stackrel{\mathrm{\u0303}}{\mathit{F}}}_{\mathrm{E}}$ already stored as a local variable and ${\stackrel{\mathrm{\u0303}}{\mathit{F}}}_{\mathrm{W}}$ loaded from shared memory, thread (i,j) can evaluate the xdirection operator L_{x}.
The ydirectional Riemann fluxes ${\stackrel{\mathrm{\u0303}}{\mathit{G}}}_{\mathrm{S}}$ and ${\stackrel{\mathrm{\u0303}}{\mathit{G}}}_{\mathrm{N}}$, within the ydirectional operator L_{y} are calculated in the same way. By caching flux evaluations in lowlatency shared memory, this dimensionally split approach minimises the number of expensive Riemann flux evaluations and only requires a single synchronisation barrier within each CUDA block.
2.2 The new FV1 solver
While LISFLOODFP already includes a firstorder finitevolume solver called LISFLOODRoe (Villanueva and Wright, 2006; Neal et al., 2012b), LISFLOODFP 8.0 includes an updated FV1 solver that is parallelised for multicore CPU and GPU architectures. The new FV1 formulation is obtained by simplifying the DG2 formulation (Sect. 2.1) to remove the slope coefficients and associated L_{1x} and L_{1y} spatial operators, yielding piecewiseconstant representations of topography and flow variables. Like DG2, flow discontinuities at element interfaces are captured by FV1's piecewiseconstant representation but, unlike DG2, smooth solutions cannot be captured without introducing artificial discontinuities, due to the lack of slope information within each element. Hence, FV1 is more vulnerable to grid coarsening since artificial discontinuities between elements tend to be enlarged as the grid becomes coarser, leading to increased numerical diffusion errors.
The LISFLOODFV1 formulation uses a standard firstorder forward Euler timestepping scheme (Eq. 8a with ${\mathbf{U}}^{n+\mathrm{1}}={\mathbf{U}}^{\mathrm{int}})$. The wellbalanced wetting and drying treatment necessitates a maximum stable Courant number of 0.5 (Kesserwani and Liang, 2012).
2.3 The existing LISFLOODACC local inertia solver
The LISFLOODACC solver (Bates et al., 2010) adopts a hybrid finitevolume and finitedifference discretisation of the local inertia equations, which simplify the full shallowwater equations by neglecting convective acceleration. Like LISFLOODFV1, LISFLOODACC adopts the finitevolume method to provide a piecewiseconstant representation of water depth, evolved elementwise via the discrete mass conservation equation:
where the time step Δt is calculated using the default Courant number of 0.7.
Unlike LISFLOODFV1, LISFLOODACC adopts a finitedifference method to simplify the representation of interelemental fluxes by storing a single, continuous discharge component at each interface, leading to the socalled Arakawa Cgrid staggering (Arakawa and Lamb, 1977) shown in Fig. 4. The discharge components are evolved via a simplified form of the momentum conservation equation coupled to the Manning friction formula: the q_{x} discharge component at interface $(i\mathrm{1}/\mathrm{2},j)$ is evolved as (Bates et al., 2010; de Almeida et al., 2012):
where the numerical flow depth at the interface is ${h}_{\mathrm{f}}=max({\mathit{\eta}}_{i,j}^{n},{\mathit{\eta}}_{i\mathrm{1},j}^{n})max({z}_{i,j},{z}_{i\mathrm{1},j})$. The q_{y} discharge component is evolved in the same way.
As seen in Eq. (19), the evolution of the continuous q_{x} value at the interface only relies on a local reconstruction of the water surface gradient, $({\mathit{\eta}}_{i,j}^{n}{\mathit{\eta}}_{i\mathrm{1},j}^{n})/\mathrm{\Delta}x$. This formulation could make LISFLOODACC less sensitive than LISFLOODFV1 to grid coarsening for modelling flood inundation events, when the water surface varies smoothly. The Arakawa Cgrid staggering adopted by LISFLOODACC is commonly used in numerical weather prediction models (Collins et al., 2013) because it yields secondorder accuracy in space on a compact, local stencil. The secondorder spatial accuracy of LISFLOODACC is confirmed based on the numerical analysis of de Almeida et al. (2012), as presented in Appendix B.
Three simulations are performed to assess the computational scalability and predictive capability of LISFLOODDG2 compared with LISFLOODFV1 and LISFLOODACC. The optimised LISFLOODACC solver specified by Neal et al. (2018) implements a subgrid channel model (Neal et al., 2012a) and CPUspecific optimisations that do not translate naturally to GPU architectures. Additionally, at the time that model runs were performed, the optimised ACC solver did not yet support the rainongrid features used later in Sect. 3.3.^{1} To facilitate a likeforlike intercomparison between solvers, the LISFLOODACC solver used here is the version specified by Neal et al. (2012b), which already supports the necessary rainongrid features and shares the same algorithmic approach as the FV1 and DG2 solvers.
The CPU solvers were run on a 2GHz Intel Xeon Gold 6138 using up to 16 CPU cores (with hyperthreading disabled), which is the maximum number of cores used in the LISFLOODFP parallelisation study of Neal et al. (2018). The GPU solvers were run on an Nvidia Tesla V100. LISFLOODFP is configured with double precision for all calculations. Simulation results are openly available on Zenodo (Shaw et al., 2021).
3.1 Slowly propagating wave over a flat floodplain
This synthetic test, known as test 4 in Néelz and Pender (2013), is widely used to assess flood model predictions of slowly propagating flow over a flat floodplain with high roughness (Neal et al., 2012b; Jamieson et al., 2012; Martins et al., 2015; Guidolin et al., 2016; Huxley et al., 2017). Since the floodplain is flat, the test setup is independent of grid resolution, which can be successively refined or coarsened to study the spatial convergence and computational scalability of the DG2, FV1 and ACC solvers on multicore CPU and GPU architectures.
As specified by Néelz and Pender (2013), the test is initialised on a rectangular 1000 m × 2000 m flat, dry floodplain with a standard grid spacing of Δx=5 m. A semicircular flood wave emanates from a narrow, 20 m breach at the centre of the western boundary as given by the inflow discharge hydrograph shown in Fig. 5b. The test is ended after 5 h. Manning's coefficient n_{M} is fixed at 0.05 ${\mathrm{sm}}^{\mathrm{1}/\mathrm{3}}$ leading to Froude numbers below 0.25, making the test suitable for all solvers including LISFLOODACC. For each solver, water depth and velocity hydrographs are measured at four standard gauge point locations marked in Fig. 5a, and the water depth cross section is measured after 1 h along the centre of the domain at y=1000 m.
3.1.1 Water depth and velocity hydrographs
Predicted hydrographs are obtained for the ACC, FV1CPU, FV1GPU, DG2CPU and DG2GPU solvers (Fig. 6). FV1CPU and FV1GPU solutions are identical and are named collectively as FV1 (similarly, DG2CPU and DG2GPU are named collectively as DG2). While no exact solution is available, DG2, FV1 and ACC predictions of water depth and velocity agree closely with existing industrial model results (Fig. 4.10 and 4.11 in Néelz and Pender, 2013). ACC and DG2 water depth predictions are almost identical at all gauge points (Fig. 6a–d). FV1 predictions are nearly identical, except that the wave front is slightly smoother and arrives several minutes earlier than ACC or DG2, as seen at point 5 (Fig. 6c) and point 6 (Fig. 6d).
Differences in velocity predictions are more pronounced (Fig. 6e–h). The biggest differences are seen at point 1 (Fig. 6e), located only 50 m from the breach, since the flow at this point is dominated by strong inflow discharge with negligible retardation by frictional forces. At point 1, ACC and DG2 velocity predictions agree closely with the majority of industrial models (Fig. 4.11 in Néelz and Pender, 2013). LISFLOODFV1 predicts faster velocities up to 0.5 ms^{−1}, which is close to the prediction of TUFLOW FV1 (Huxley et al., 2017, Table 11). Further away from the breach at point 3 (Fig. 6f), point 5 (Fig. 6g) and point 6 (Fig. 6h), velocity predictions agree more closely, except at the time of wave arrival. At this time, DG2 predicts the sharpest velocity variations while ACC velocity predictions are slightly smoother. FV1 predicts even smoother velocity variations with slightly lower peak velocities.
3.1.2 Spatial grid convergence
Spatial grid convergence is studied by modelling at grid resolutions of Δx=5, 1 and 0.5 m. Since the floodplain is flat, no topographic resampling is required. On each grid, the water depth cross section is measured along the centre of the domain (Fig. 7). DG2, FV1 and ACC crosssectional profiles at the standard grid spacing of Δx=5 m agree well with industrial model results (Fig. 4.13 in Néelz and Pender, 2013). Differences are most apparent in the vicinity of the wave front, near x=400 m. At the standard resolution of Δx=5 m, FV1 predicts a wave front about 50 m ahead of ACC or DG2, and the FV1 solution is much smoother. A TUFLOW modelling study reported similar findings, with TUFLOW FV1 predicting a smoother wave front about 50 m ahead of other TUFLOW solvers (Huxley et al., 2017, Table 12). At a 5 times finer resolution of Δx=1 m, all solvers predict a steeper wave front, although the FV1 wavefront prediction at Δx=1 m is still relatively smooth, being closer to the ACC prediction at Δx=5 m. A 10 times finer resolution of Δx=0.5 m is required for FV1 to predict a steep wave front in agreement with DG2 at Δx=5 m, while ACC only requires a resolution of Δx=2 m to obtain similar agreement.
These differences can be attributed to the order of accuracy of the solvers: DG2 is formally secondorderaccurate and exhibits the least sensitivity to grid resolution; FV1 is formally firstorderaccurate and exhibits the greatest sensitivity, with numerical diffusion errors leading to a spuriously smooth wave. Despite its simplified formulation, ACC predictions are close to DG2 because ACC is secondorderaccurate in space (Sect. 2.3).
3.1.3 Solver runtimes for a varying number of elements
To assess the relative runtime cost of the solvers, the test is run for a range of grid resolutions from Δx=10 m (yielding 2×10^{4} elements) to Δx=0.5 m (8×10^{6} elements). Each of the ACC, FV1CPU and DG2CPU solvers are run using 16 CPU cores, and FV1GPU and DG2GPU are run on a single GPU. To ensure reliable measurements, each solver is run twice on each grid, and the fastest runtime is recorded. Runs that have not completed within 24 h are aborted and excluded from the results. Solver runtimes are shown in Fig. 8a on a log–log scale.
On the coarsest grid with 2×10^{4} elements, FV1CPU and FV1GPU both take 5 s to complete – just 2 s more than ACC. As the grid is refined and the number of elements increases, FV1CPU remains slightly slower than ACC, while FV1GPU becomes faster than ACC when Δx<5 m and the number of elements exceeds 10^{5}. The runtime cost relative to ACC is shown in Fig. 8b: FV1CPU is about 1.5–2.5 times slower than ACC, gradually becoming less efficient as the number of elements increases. In contrast, FV1GPU becomes about 2 times faster than ACC (relative runtime ≈ 0.5) once the number of elements exceeds 10^{6} (Δx∼1 m), when the high degree of GPU parallelisation is exploited most effectively.
Similar trends are found with DG2CPU and DG2GPU: on the coarsest grid DG2CPU is about twice as fast as DG2GPU, but DG2GPU becomes increasingly efficient as the number of elements increases, being twice as fast as DG2CPU at Δx=2 m with 5×10^{5} elements (Fig. 8c). At Δx=1 m with 2×10^{6} total elements, DG2GPU completes in about 3.5 h while the DG2CPU run is aborted, having failed to complete within 24 h (Fig. 8a).
As seen earlier in the inset panel of Fig. 7, similar wave fronts were predicted by DG2 at Δx=5 m, ACC at Δx=2 m and FV1 at Δx=0.5 m. At these resolutions, DG2CPU, DG2GPU and ACC achieved a similar solution quality for a similar runtime cost, with all solvers completing in about 4 min (Fig. 8a). Meanwhile, the DG2 solvers on a 10 times coarser grid were 140 times faster than FV1CPU (10 h 42 min) and 28 times faster than FV1GPU (1 h 47 min).
3.1.4 Multicore CPU scalability
To assess the computational scalability of the multicore CPU solvers, the test is run using 1–16 CPU cores, while FV1GPU and DG2GPU are run on a single GPU device. A grid spacing of Δx=2 m (5×10^{5} elements) is chosen so that the grid has sufficient elements for effective GPU parallelisation (informed by the GPU runtimes in Fig. 8b–c) but has few enough elements so that all model runs complete within the 24 h cutoff. Measured runtimes for ACC, FV1CPU and DG2CPU are shown in Fig. 9 on a log–log scale, with all solver runtimes decreasing as the number of CPU cores increases. To facilitate a likeforlike comparison with FV1 and DG2, ACC solver runtimes were obtained for the ACC implementation of Neal et al. (2012a). Theoretical “perfect scaling” lines are marked by thin dotted lines for each solver: perfect scaling means that doubling the number of CPU cores would halve the runtime. ACC solver scalability falls somewhat below perfect scaling, with a 16fold increase in CPU cores only yielding a 7fold decrease in runtime (Fig. 9). In contrast, the DG2CPU and FV1CPU solvers achieve closetoperfect scaling up to 4 CPU cores, with synchronisation overheads causing only a small decrease in scalability thereafter. It is expected that additional performance can be gained by using the alternative, CPUoptimised ACC implementation (Neal et al., 2018), and these CPUspecific optimisations are also under consideration for future enhancement of the DG2 and FV1 solvers. For intercomparison with the CPU solvers, FV1GPU and DG2GPU runtimes are also marked by dashed horizontal lines (since the number of GPU cores is not configurable). Both GPU solvers are substantially faster than their counterparts on a 16core CPU.
Overall, the FV1, ACC and DG2 solvers converged on similar water depth solutions with successive grid refinement. Owing to its firstorder accuracy, FV1 requires a very fineresolution grid to match the solution quality of DG2 or ACC, though FV1GPU enables runtimes up to 6 times faster than the 16core FV1CPU solver. Thanks to its secondorder accuracy, DG2 water depth predictions are spatially converged at coarser resolutions (Fig. 7). Hence, DG2 is able to replicate the modelling quality of FV1 at a much coarser resolution, and the multicore DG2CPU solver is a competitive choice for grids with fewer than 100 000 elements.
3.2 Rapidly propagating wave along a valley
This test, known as test 5 in Néelz and Pender (2013), is employed to assess the capabilities of the DG2, FV1 and ACC solvers for modelling rapidly propagating flow over realistic terrain. As specified by Néelz and Pender (2013), the narrow valley (Fig. 10a) is initially dry, and Manning's coefficient n_{M} is fixed at 0.04 ${\mathrm{sm}}^{\mathrm{1}/\mathrm{3}}$. A synthetic dam break event near the southern boundary is modelled by prescribing a short inflow discharge hydrograph along a 260 m long line near the southern edge of the domain, with a peak flow of 3000 m^{3}s^{−1} (Fig. 10b). The test is ended after 30 h once the water has ponded near the closed boundary at the eastern edge of the domain.
LISFLOODFP is run using the ACC, FV1 (CPU and GPU) and DG2 (CPU and GPU) solvers at the standard DEM resolution of Δx=50 m used in most existing studies (Cohen et al., 2016; Huxley et al., 2017; Neal et al., 2018) and at the finest available DEM resolution of Δx=10 m. Water level and velocity hydrographs are measured at the five standard gauge point locations marked in Fig. 10a.
3.2.1 Water level and velocity hydrographs
Predicted water level and velocity hydrographs are shown in Fig. 11. The water level hydrographs show that water ponds in small topographic depressions at point 1 (Fig. 11a), point 3 (Fig. 11b) and point 5 (Fig. 11c). Point 7 is positioned near the steep valley slope and is only inundated between t=1 h and t=8 h (Fig. 11d). At both resolutions, water levels predicted by all solvers agree closely with existing industrial model results at points 1, 3 and 7 (Fig. 4.16 in Néelz and Pender, 2013). Small water level differences accumulate as water flows downstream, and at point 5, positioned farthest downstream of the dam break, differences of about 0.5 m are found depending on the choice of resolution and solver (Fig. 11c). Similar water level differences have been found amongst the suite of TUFLOW solvers (Huxley et al., 2017) and amongst other industrial models (Néelz and Pender, 2013). Bigger differences are found in velocity predictions, particularly at locations farther downstream at point 3 (Fig. 11f), point 4 (Fig. 11g) and point 7 (Fig. 11h). At point 3, DG2 predicts small, transient velocity variations at Δx=50 m starting at t=1 h; these variations are not captured by the FV1 or ACC solvers but have been captured by a FV2MUSCL solver at the finest resolution of Δx=10 m, as reported by Ayog et al. (2021). At point 7, ACC overpredicts peak velocities by about 0.5 ms^{−1} compared to FV1 and DG2 (Fig. 11h), and compared to other industrial models (Fig. 4.17 in Néelz and Pender, 2013). Otherwise, ACC, FV1 and DG2 velocity predictions are within the range of existing industrial model predictions.
3.2.2 Flood inundation and Froude number maps
While hydrograph predictions are often studied for this test case (Néelz and Pender, 2013; Cohen et al., 2016; Huxley et al., 2017; Neal et al., 2018), flood inundation maps taken at time instants provide a more detailed picture of flood wave propagation. Accordingly, two sets of flood inundation maps are obtained: one set at t=15 min during the short period of peak inflow and another set at t=3 h once the flood water has almost filled the valley. Flood maps are obtained at the finest resolution of Δx=10 m and with DG2 at Δx=50 m.
After 15 min, water has travelled about 1.5 km northeast along the valley away from the inflow region near the southern edge of the domain, with ACC, FV1 and DG2 water depth predictions shown in Fig. 12a–d. Behind the wave front, an abrupt change in water depth is predicted by FV1 (Fig. 12b) and DG2 (Fig. 12c, d), but this discontinuity induces spurious, smallscale oscillations in the ACC solver that propagate downstream (Fig. 12a). This numerical instability is understood by studying the Froude number, as shown in Fig. 12e–h. The rapidly propagating flow becomes supercritical across the region of shallower water, with a maximum Froude number of around 1.5. The local inertia equations are not physically valid for modelling abrupt changes in water depth or supercritical flows (de Almeida and Bates, 2013), leading to the observed numerical instability in the ACC solver.
After 3 h, the flood water has filled most of the valley and the wave front has almost reached point 5. As shown in Fig. 12i–l, ACC, FV1 and DG2 water depth predictions are in close agreement. The flow is now predominantly subcritical (Fig. 12m–p), although a small region of supercritical flow is found upstream of point 3 with a maximum Froude number of about 1.2 and a corresponding jump in water depth at the same location. Nevertheless, numerical instabilities in the ACC prediction at t=15 min are no longer evident at t=3 h (Fig. 12m), and ACC predictions remain stable at all gauge points for the duration of the simulation (Fig. 11). As seen in the fourth column of Fig. 12, DG2 flood maps at Δx=50 m are in close agreement with the ACC, FV1 and DG2 flood maps at Δx=10 m.
3.2.3 Runtime cost
Simulation runtimes are summarised in Table 1, with FV1CPU and DG2CPU solvers run on a 16core CPU and FV1GPU and DG2GPU solvers run on a single GPU. Similar to runtime measurements presented earlier in Sect. 3.1.3, the GPU solvers become more efficient on grids with a larger number of elements: in this test, DG2GPU is 1.8 times faster than DG2CPU at the standard resolution of Δx=50 m, becoming 2.5 times faster at the finest resolution of Δx=10 m; similarly, FV1GPU is between 1.2 times and 5.1 times faster than FV1CPU.
DG2CPU and DG2GPU at Δx=50 m outperform ACC, FV1CPU and FV1GPU at Δx=10 m, while still achieving similarly accurate flood map predictions at a 5 times coarser resolution (Fig. 12). DG2CPU at Δx=50 m is 2 times faster than ACC at Δx=10 m, while DG2GPU is twice as fast again. DG2GPU flood maps at an improved resolution of Δx=20 m are obtained at a runtime cost of 38 min, which is still competitive with ACC at Δx=10 m (with a runtime cost of 30 min).
In summary, all solvers predicted similar water depth and velocity hydrographs, though ACC experienced a short period of numerical instability in a localised region where the Froude number exceeded the limit of the local inertia equations. The shockcapturing FV1 and DG2 shallowwater solvers yield robust predictions throughout the entire simulation. with FV1GPU being consistently faster than ACC on a 16core CPU. As found earlier in Sect. 3.1.3, DG2 at a 2–5 times coarser resolution is a competitive alternative to ACC and FV1, with the GPU implementation being preferable when running DG2 on a grid with more than 100 000 elements.
3.3 Catchmentscale rainongrid simulation
In December 2015, Storm Desmond caused extensive fluvial flooding across the Eden catchment in northwest England (Szönyi et al., 2016). This storm event has previously been simulated using a firstorder finitevolume hydrodynamic model (Xia et al., 2019), with overland flow and fluvial flooding driven entirely by spatially and temporally varying rainfall data over the 2500 km^{2} catchment. As such, this simulation is ideally suited to assess the new rainongrid capabilities in LISFLOODFP 8.0 and represents one of the first DG2 hydrodynamic modelling studies of rainfallinduced overland flow across a large catchment. At this large scale, grid coarsening is often desirable to ensure model runtimes remain feasible (Falter et al., 2013), but coarsening the DEM can affect the quality of water depth predictions (Savage et al., 2016). Therefore, the three LISFLOODFP solvers were run at a range of grid resolutions, and their predictions were analysed with respect to observed river levels and flood extent survey data.
The Eden catchment and its four major rivers are shown in Fig. 13a. The DEM is available at a finest resolution of Δx=5 m covering the entire catchment. The largest city in the Eden catchment is Carlisle, situated at the confluence of the rivers Irthing, Petteril, Caldew and Eden (Fig. 13b). In the Carlisle area, the 5 m DEM incorporates channel cross section and flood defence data (Xia et al., 2019) and manual hydroconditioning to remove bridge decks that would otherwise block river flows.
As specified by Xia et al. (2019), the simulation comprises a spinup phase and subsequent analysis phase. The spinup phase starts at 00:00, 3 December 2015, from an initially dry domain. Water is introduced into the domain via the rainfall source term (Eq. 14), using Met Office rainfall radar data at a 1 km resolution updated every 5 min (Met Office, 2013). The spinup phase ends and the analysis phase begins at 12:00, 4 December 2015, once the memory of the dry initial condition has disappeared and water depths and discharges in all river channels have reached a physically realistic initial state (Xia et al., 2019). The simulation ends at 12:00, 8 December 2015, after a total of 5.5 simulated days. Manning's coefficient n_{M} is 0.035 ${\mathrm{sm}}^{\mathrm{1}/\mathrm{3}}$ for river channels and 0.075 ${\mathrm{sm}}^{\mathrm{1}/\mathrm{3}}$ elsewhere.
The following modelling assumptions are made as specified by Xia et al. (2019). Zero infiltration is assumed due to fully saturated antecedent soil moisture. An open boundary condition is imposed along the irregularshaped catchment perimeter by adjusting the terrain elevation of elements lying outside the catchment such that their elevation is below mean sea level, thereby allowing water to drain out of the river Eden into the Solway Firth. At each time step, water flowing out of the Solway Firth is removed by zeroing the water depth in elements lying outside the catchment. While rainfall data errors can influence model outputs, Ming et al. (2020) found that a prescribed 10 % rainfall error lead to only a 5 % relative mean error in predicted water depth hydrographs. As such, modelling uncertainties due to rainfall errors are not quantified in these deterministic model runs.
Model input data are prepared by upscaling the finest 5 m DEM to resolutions of Δx=40, 20 and 10 m. In previous studies, a grid spacing of Δx=10 m was sufficient to simulate observed flood extent and river levels (Xia et al., 2019; Ming et al., 2020), so LISFLOODFP runs are not performed on the finest 5 m DEM. Given the large number of elements (25 million elements at Δx=10 m) and informed by the computational scalability results in Sect. 3.1.3 and 3.2.3, DG2 and FV1 runs are only performed on a GPU, while ACC is run on a 16core CPU. Due to its relatively high runtime cost, DG2GPU is only run at Δx=40 m.
For each model run, hydrographs of freesurface elevation above mean sea level are measured in river channels at 16 gauging stations as marked in Fig. 13. Approximate gauging station coordinates are provided by the Environment Agency (2020), but these are often positioned near the riverbank and not in the channel itself. Hence, gauging station coordinates must be adjusted to ensure model results are measured in the channel. Here, a simple approach is adopted to manually reposition each gauging station based on the finestresolution DEM, with the amended positions given in Table 2. It is also important to measure hydrographs of freesurface elevation, since variation in freesurface elevation is minimal across the river channel. DG2, FV1 and ACC solver predictions are compared in the following three subsections: first, predicted freesurface elevation hydrographs are compared against gauging station measurements; second, predicted maximum flood extents are compared against a postevent flood extent survey (McCall, 2016); finally, predicted flood inundation maps are intercompared.
3.3.1 River channel freesurface elevation hydrographs
Freesurface elevation hydrographs at the 16 river gauging stations are shown in Fig. 14. Observed freesurface elevation hydrographs are calculated from Environment Agency measurements of water depth and riverbed elevation above mean sea level (Environment Agency, 2020). While water depths can be measured to an accuracy of ∼0.01 m (Bates et al., 2014), discrepancies between in situ, pointwise riverbed elevation measurements and the remotely sensed, twodimensional DEM can result in systematically biased freesurface elevations, as reported by Xia et al. (2019).
As seen in the observed hydrographs, river levels begin to rise across the catchment around 00:00, 5 December. A flashy response is seen in the headwaters of the river Eden, at Temple Sowerby, Great Musgrave Bridge and Kirkby Stephen, with water levels rising rapidly by 2–3 m and returning almost as rapidly to base flow conditions around 00:00, 7 December. Similar responses are found at the other gauging stations located further downstream, where river levels vary more gradually. The largest river level changes are found in the Carlisle area, particularly at Sheepmount and Sands Centre, which are located farthest downstream.
Timings of the rising and falling limbs are wellpredicted by all three solvers for the majority of hydrographs. At coarser grid resolutions, river levels are overpredicted and the difference between base flow and peak flow levels is underpredicted.^{2} These findings are consistent with those of Xia et al. (2019). Hydrograph inaccuracies are primarily due to DEM coarsening, which artificially smooths river channel geometries, reducing the elevation difference between riverbed and riverbank. Consequently, the terrain elevation at gauging points on the 40 m DEM is between 1.12 and 6.06 m higher than the same points on the 10 m DEM, depending on the local river channel geometry. These terrain elevation errors are shown in Table 2, which are calculated as the difference in local elementaverage topography elevations between the 40 m DEM and 10 m DEM.
The impact of DEM coarsening is most evident at Sebergham gauging station where the largest terrain elevation error of 6.06 m is found. At Δx=40 m, the DEM diverts the flow away from the true location of the river, and the FV1 and ACC Sebergham hydrographs remain flat at 99.4 m. At Δx=20 m, the terrain is only 1.4 m higher than at Δx=10 m and the FV1 and ACC hydrographs are closer to observations, though the difference between base flow and peak flow levels is still underpredicted. At Δx=10 m, predicted hydrographs accurately capture observed base flow and peak flow levels. The same behaviour is evident at Skew Bridge (with a terrain elevation error of 5.01 m) and, to a lesser extent, at other locations including Cummersdale (2.30 m) and Greenholme (2.23 m). In general, the greater the terrain elevation error at a given point, the greater the discrepancy between observed hydrographs and model predictions.
Next, the predictive capability of DG2 on the coarsest grid is benchmarked against hydrograph observations and against FV1 and ACC predictions on the 4 times finer grid. To measure the average discrepancy between predictions and observations, the RMSE is calculated as
where t_{start}= 12:00, 4 December, t_{end}= 00:00, 8 December, ${\mathit{\eta}}_{\mathrm{obs}}^{n}$ is the freesurface elevation calculated from observation data and N is the total number of observations. At most gauging stations, predictions converge towards observations, with RMSEs becoming smaller as the grid is refined. But at some gauging stations, including Linstock and Great Corby, the falling limb is underpredicted on finer grids, so RMSEs increase as predictions diverge from observations. Similar behaviour was also found at some gauging stations in the original study of Xia et al. (2019).
At most gauging stations, DG2 alleviates the freesurface elevation overprediction found in FV1 and ACC hydrographs at the same resolution, leading to better agreement between DG2 predictions and observations at Sheepmount, Sands Centre, Skew Bridge, Denton Holme, Greenholme and Sebergham, as indicated by the RMSEs in Fig. 15. The reduced overprediction is attributable to DG2's locally planar representation of terrain within each computational element, which enables DG2 to better capture terrain gradients between the riverbed and riverbank on a coarse grid.
DG2 predictions are also closer to FV1 and ACC hydrographs on 2 times and 4 times finer grids, depending on the river width at each gauging station, which ranges between 8 and 71 m (Table 2). The widest locations are at Sheepmount, Sands Centre, Linstock, Great Corby, Temple Sowerby and Great Musgrave Bridge; locations with moderate river widths of 13–21 m are found at Denton Holme, Greenholme, Skew Bridge and Sebergham; at most other locations rivers are narrower. At the widest locations, DG2 predictions are close to FV1 and ACC hydrographs on the 4 times finer grid; at locations with moderate river widths, DG2 predictions are closer to FV1 and ACC hydrographs on the 2 times finer grid. At other locations, DG2 predictions are closer to FV1 and ACC hydrographs at the coarsest grid resolution. Overall, when river channel geometries are larger than $\mathrm{\Delta}x/\mathrm{2}$, then the predictive capability of DG2 is substantially enhanced thanks to its secondorderaccurate, piecewiseplanar representation of terrain and flow variables.
Where river channel widths are close to or smaller than the grid spacing Δx, hydrograph predictions are especially sensitive to the channel geometry as resolved on the computational grid. At such locations, hydrograph predictions can be improved by running the model with an ensemble of possible sampling positions within a 100 m radius of each gauging station and then choosing the best fit between predictions and observations. However, this approach relies on the availability of observation data, and, due to modelling sensitivities at the scale of the grid, optimal positions can vary depending on the choice of solver and grid resolution. Spatially adaptive solvers (Kesserwani and Sharifian, 2020; ÖzgenXian et al., 2020) and nonuniform meshing techniques (Kolega and Syme, 2019) offer another alternative to improve flow predictions by selectively capturing finescale channel geometries, and such methods are under development for inclusion in a future LISFLOODFP release. Subgrid channel modelling can also improve hydrograph and flood inundation predictions, and LISFLOODFP already provides a subgrid channel model (Neal et al., 2012a) that could be integrated with the DG2 and FV1 solvers in a future release.
3.3.2 Maximum flood extent over Carlisle
Maximum flood extents are obtained for ACC and FV1 runs at resolutions of Δx=40, 20, and 10 m; due to its relatively high runtime cost, DG2GPU is only run at Δx=40 m only with flood maps at 20 and 10 m being inferred by downscaling the 40 m solution. The downscaling procedure adopted here exploits the full, DG2 piecewiseplanar solution by constructing the piecewiseplanar maximum water depth on the Δx=40 m grid and then sampling at the element centres on the higherresolution grid.
As shown in Fig. 16, the postevent survey outlined in pink marks the maximum extent of flooding across Carlisle. The surveyed flood extent is wellpredicted by all solvers. Predicted flood extents are largely insensitive to grid resolution, except for the region around Denton Holme gauging station on the river Caldew, which is protected by flood defence walls. Xia et al. (2019) added these flood defence walls by hand in the original 5 m DEM, but the coarsened DEMs were upscaled with no further handediting. As such, the steep, narrow walls become smeared out at coarse resolutions, with all solvers overpredicting flood extents at Δx=40 m in the Denton Holme region. The representation of these flood defences could be improved by adopting the recently developed LISFLOODFP levee module (Wing et al., 2019; Shustikova et al., 2020)^{3} or by implementing a spatially adaptive multiresolution method that selectively refines the grid resolution around river channels and other finescale features (Kesserwani and Sharifian, 2020).
Further qualitative differences are apparent in predicted water depths in regions south of Linstock and north of Botcherby Bridge, as indicated by arrows in Fig. 16. At Δx=40 m, DG2 and ACC yield almost identical predictions with regions of 0.1–2 m water depth south of Linstock and depths of 2–4 m north of Botcherby Bridge. In contrast, FV1 predicts wider areas of water depths of 2–4 m south of Linstock and depths of 4–6 m north of Botcherby Bridge. These regions of deep water become smaller as the grid is refined, but FV1 flood inundation predictions remain wider and deeper than ACC even at Δx=10 m.
The DG2 and FV1 predictions of maximum flood extent can be quantified against the ACC prediction at Δx=10 m, which is treated as the reference solution. The hit rate measures flood extent underprediction as the proportion of wet elements in the reference solution that were also predicted as wet. The false alarm ratio measures flood extent overprediction as the proportion of predicted wet elements that were dry in the reference solution. The critical success index measures both over and underprediction. All three metrics range between 0 and 1, and further details are provided by Wing et al. (2017).
The hit rate (H), false alarm ratio (F) and critical success index (C) are given in Table 3. At Δx=40 m, the critical success index is 0.59 for both DG2 and FV1, but DG2 has a higher hit rate and false alarm ratio, suggesting that DG2 predicts a wider flood extent than ACC or FV1. At Δx=20 m and Δx=10 m, the false alarm ratio and critical success index for DG2 deteriorate, but a hit rate of 0.83–0.86 is maintained, which is acceptable given that highresolution predictions are downscaled from the DG2 piecewiseplanar solution at Δx=40 m. FV1 predictions at Δx=20 m and Δx=10 m are obtained directly without downscaling, and FV1 predictions converge towards ACC predictions with successive grid refinement. This convergence is evidenced in all three metrics, with FV1 at Δx=10 m achieving a high hit rate (0.93), low false alarm ratio (0.03) and high critical success index (0.90).
3.3.3 Flood inundation maps at 12:00, 5 December
While some differences between solver predictions were evident in maximum flood depths, these differences become clearer in flood inundation maps taken at a single time instant. Accordingly, flood inundation maps shown in Fig. 17 are taken at 12:00, 5 December, over Carlisle city centre, during the rising limb of the Sheepmount and Sands Centre hydrographs, where river level rises were largest (Fig. 14). At the coarsest resolution of Δx=40 m, DG2 and ACC predictions are almost identical (Fig. 17a and g). Both solvers accurately capture the flood extent and water depths predicted by FV1 and ACC at a 4 times finer resolution of Δx=10 m (Fig. 17f and i). In contrast, FV1 predicts greater water depths and a slightly wider flood extent, particularly at coarser resolutions of Δx=40 m (Fig. 17d) and Δx=20 m (Fig. 17e). But once the grid is refined to a resolution of Δx=10 m, FV1 and ACC solutions are almost converged (Fig. 17f and i). DG2 predictions at Δx=20 m (Fig. 17b) and 10 m (Fig. 17c) are downscaled from the DG2 prediction at Δx=40 m. The downscaled DG2 predictions are not expected to resolve all finescale features visible in the FV1 and ACC predictions. Nevertheless, compared to the DG2 Δx=40 m flood map, the downscaled DG2 flood maps better represent the deeper waters in the river Eden (flowing east to west) and in the river Caldew (flowing south to north).
To quantify the spatial convergence of the three solvers, water depth RMSEs are calculated at 12:00, 5 December, over the entire catchment (Table 4). Since water depth observations are unavailable, the FV1 prediction at Δx=10 m is taken as the reference solution. At Δx=40 m, DG2 and ACC RMSEs are almost identical, while the FV1 error is about 10 % larger. At Δx=20 m, FV1 errors are again about 10 % larger than ACC, with the ACC solver converging more rapidly towards the FV1 reference solution than FV1 itself, despite ACC's simplified numerical formulation (Sect. 2.3).
In this catchmentscale simulation and earlier in the simulation of a slowly propagating wave over a flat floodplain (Sect. 3.1.2), FV1 was seen to converge more slowly and predict a flood extent wider than DG2 or ACC. Once again, these differences can be attributed to the order of accuracy of the solvers: FV1 is formally firstorderaccurate and exhibits the greatest sensitivity to grid resolution, while DG2 and ACC are both secondorderaccurate in space.
3.3.4 Runtime cost
Solver runtimes for the entire 5.5 d simulation are shown in Table 5. On the same grid, FV1GPU is about 2 times faster than ACC on a 16core CPU, which is consistent with earlier findings in Sect. 3.1 and 3.2. FV1GPU and ACC runtimes scale as expected: halving the grid spacing quadruples the total number of elements and halves the time step due to the CFL constraint. Hence, halving the grid spacing multiplies the runtime by a factor of about 8. At all grid spacings between 40 and 10 m, FV1GPU and ACC simulations run faster than realtime and complete in less than 5.5 d, indicating that these solvers are suitable for realtime flood forecasting applications.
The DG2GPU solver runtime is substantially slower than other solvers on the same, coarse grid. Unlike the tests presented earlier in Sect. 3.1 and 3.2, this test involves widespread overland flow driven by continual rainfall. Overland flow is characterised by thin layers of water only centimetres deep, which continually flow downhill, driven by gravity and balanced by frictional forces. These frictional forces become strongly nonlinear for such thin water layers, and the DG2 friction scheme imposes an additional restriction on the time step size to maintain stability in the discharge slope coefficients. This challenge has recently been addressed in finitevolume hydrodynamic modelling using an improved friction scheme that calculates the physically correct equilibrium state between gravitational and frictional forces (Xia and Liang, 2018). Extending this friction scheme into a discontinuous Galerkin formulation is expected to alleviate the time step restriction and reduce DG2 solver runtimes for overland flow simulations. For simulations without widespread overland flow, including those presented earlier in Sect. 3.1 and 3.2, the current DG2 formulation imposes no additional time step restriction and DG2 solver runtimes are substantially faster.
This paper presented new secondorder discontinuous Galerkin (LISFLOODDG2) and firstorder finitevolume (LISFLOODFV1) solvers that are parallelised for multicore CPU and Nvidia GPU architectures. The new solvers are compatible with existing LISFLOODFP test cases and available in LISFLOODFP 8.0, alongside the existing local inertia solver, LISFLOODACC. LISFLOODFP 8.0 also supports spatially and temporally varying rainfall data to enable realworld rainongrid simulations.
The predictive capabilities and computational scalability of the new solvers was studied across two Environment Agency (EA) benchmark tests, and for a realworld fluvial flood simulation driven by rainfall across a 2500 km^{2} catchment. The secondorder spatial convergence of LISFLOODDG2 on coarse grids was demonstrated by its ability to sharply resolve moving wet–dry fronts in EA benchmark tests, and in the catchmentscale flood simulation, DG2 alleviated the impact of DEM coarsening on river level hydrograph predictions due to its secondorder, piecewiseplanar representation of river channel geometries.
By analysing the LISFLOODACC local inertia solver, its hybrid finitedifference and finitevolume scheme was found to be spatially secondorderaccurate thanks to its grid staggering of water depth and discharge variables. As a result, ACC predictions in all tests were close to those of DG2, despite ACC's simplified governing equations and simplified numerical scheme. The ACC solver also exhibited less numerical diffusion at wet–dry fronts and predicted more accurate hydrographs and flood inundation maps than FV1 on coarse grids. Meanwhile, the FV1 and DG2 solvers provided the most robust predictions of a rapidly propagating wave in an EA benchmark test involving supercritical flow and abrupt water depth changes.
The multicore FV1CPU and DG2CPU demonstrated nearoptimal computational scalability up to 16 CPU cores. Multicore CPU runtimes were most efficient on grids with fewer than 0.1 million elements, while FV1GPU and DG2GPU solvers were most efficient on grids with more than 1 million elements, where the high degree of GPU parallelisation was best exploited. On such grids, GPU solvers were 2.5–4 times faster than the corresponding 16core CPU solvers, and FV1GPU runtimes were highly competitive with those of ACC. DG2GPU was also found to be more efficient than FV1GPU and ACC: DG2GPU delivered the same level of accuracy on 2–4 times coarser grids while remaining faster to run.
For the catchmentscale flood simulation, the DG2GPU runtime was less competitive due to widespread overland flow, involving frictional forces acting on thin water layers, which imposed an additional time step restriction in the current DG2 formulation. It is expected that this restriction could be lifted by formulating an improved DG2 friction scheme based on the finitevolume friction scheme of Xia and Liang (2018). Overland flow does not feature in the EA benchmark tests, where DG2GPU runtimes remain competitive, being only 5–8 times slower than ACC on the same grid. However, FV1 and DG2 are the first solvers in LISFLOODFP to gain a dynamic rainongrid capability, with this capability being added to the optimised ACC solver in a future release. To further improve efficiency and accuracy at coarse resolutions over large catchments, one future direction would be to port the subgrid channel model – currently integrated with the CPUoptimised ACC solver – to GPU architectures. Another useful direction would be to enable a multiresolution solver based on Kesserwani and Sharifian (2020), and introduce a hybrid DG2/FV1 solver that downgraded the DG2 formulation to FV1 in regions of very thin water layer, or in regions of finest grid resolution, to further reduce the computational cost. Both directions are being investigated for inclusion in future LISFLOODFP releases.
Overall, the LISFLOODDG2, FV1 and ACC solvers all demonstrated reliable predictions in good agreement with existing industrial model results and realworld observation data. Despite its simplified numerical formulation, ACC predictions were close to those of DG2 since both solvers are spatially secondorderaccurate. DG2 achieved the best spatial convergence, and its piecewiseplanar representation of river channels wider than $\mathrm{\Delta}x/\mathrm{2}$ facilitated improved river level hydrograph and flood inundation predictions that were typically close to those of FV1 and ACC on 2–4 times finer grids. Hence, for simulations where highresolution DEM data are unavailable or largescale highresolution modelling is infeasible, LISFLOODDG2GPU is a promising choice for flood inundation modelling.
To run a simulation, specify the LISFLOODFP parameter file, ea4.par
, ea5.par
or eden.par
along with the appropriate solver parameters.
For example, to run test 4 at Δx=50 m with the FV1GPU solver:
lisflood DEMfile ea450m.dem \ dirroot ea450mfv1gpu \ fv1 cuda ea4.par
Model outputs are written in ESRI ASCII format to the specified dirroot
directory: .wd
is a water depth field, and .Vx
and .Vy
denote u and v components of velocity. Water depth and velocity hydrographs are written to .stage
and .velocity
files respectively.
Model output ESRI ASCII files can be postprocessed using the Python 3 scripts in the postprocess
directory in the LISFLOODFP 8.0 software package (LISFLOODFP developers, 2020):
downsample.py
andupsample.py

Downsample or upsample a given ESRI ASCII file by power of 2.

speed.py

Calculate the magnitude of velocity from u and v components.

froude.py

Calculate the Froude number from given water depth and speed files.

sampleline.py

Extract a horizontal or vertical cross section at a given i or j index.

mask.py

Mask a model output by imposing `NoData` values from the DEM onto the model output file.

diff.py

Calculate the difference between two model outputs.

stats.py

Calculate global statistics including min and max values and root mean square error.
To convert a raw DEM (.dem.raw
file) to a DG2 DEM (comprising .dem
, .dem1x
and .dem1y
files), run the generateDG2DEM application provided with the LISFLOODFP 8.0 software package. For further details on configuring and running the model, consult the user manual (LISFLOODFP developers, 2020).
The formal order of accuracy of LISFLOODACC is determined by a numerical analysis of the discrete local inertia equations (de Almeida et al., 2012). To begin, the local inertia equations are linearised by assuming small perturbations in freesurface elevation η about a constant reference depth H [L], leading to the linearised frictionless onedimensional local inertia equations:
This linear assumption is valid for gradually varying, quasisteady flows (de Almeida et al., 2012) and ensures that the remainder of the analysis is tractable. Equation (B1) is then discretised using the same staggeredgrid finitedifference approximation as Eq. (19), before performing a Taylor series expansion of the discrete equations to obtain (de Almeida et al., 2012)
where the first and secondorder discretisation error terms appear on the righthand side and higherorder terms are neglected. Considering only the leadingorder discretisation errors, Eq. (B2) simplifies to
where 𝒪(⋅) denotes the leadingorder discretisation errors. Therefore, the LISFLOODACC formulation is formally firstorderaccurate in time but secondorderaccurate in space.
LISFLOODFP 8.0 source code (LISFLOODFP developers, 2020, https://doi.org/10.5281/zenodo.4073011) and simulation results (Shaw et al., 2021, https://doi.org/10.5281/zenodo.4066823) are available on Zenodo. Instructions for running the simulations are provided in Appendix A. Due to access restrictions, readers are invited to contact the Environment Agency for access to the DEM used in Sect. 3.2 and to refer to Xia et al. (2019) for access to the Eden catchment model data used in Sect. 3.3.
JS coded the numerical solvers in collaboration with JN and conducted simulations and drafted the initial paper with assistance from MKS. GK secured project funding and provided supervision. All authors contributed to conceptualisation, paper review and editing.
The authors declare that they have no conflict of interest.
The authors are most grateful to Xiaodong Ming (Newcastle University), Qiuhua Liang and Xilin Xia (Loughborough University) for providing model data and valuable expertise for the Storm Desmond Eden catchment simulation. The authors thank colleagues at The University of Sheffield: Peter Heywood and Robert Chisholm for sharing their CUDA expertise, Janice Ayog for her expertise with Environment Agency benchmark tests, and Alovya Chowdhury for recording accompanying LISFLOODFP online video tutorials. This work is part of the SEAMLESSWAVE project (SoftwarE infrAstructure for Multipurpose fLood modElling at variouS scaleS based on WAVElets). For information about the project visit https://www.seamlesswave.com (last access: 2 June 2021).
James Shaw, Georges Kesserwani and Mohammad Kazem Sharifian were supported by the UK Engineering and Physical Sciences Research Council (EPSRC) grant EP/R007349/1. Paul Bates was supported by a Royal Society Wolfson Research Merit award.
This paper was edited by Bethanna Jackson and reviewed by two anonymous referees.
Arakawa, A. and Lamb, V. R.: Computational design of the basic dynamical processes of the UCLA general circulation model, Methods in Computational Physics: Advances in Research and Applications, 17, 173–265, https://doi.org/10.1016/B9780124608177.500094, 1977. a
Ayog, J. L., Kesserwani, G., Shaw, J., Sharifian, M. K., and Bau, D.: Secondorder discontinuous Galerkin flood model: comparison with industrystandard finite volume models, J. Hydrol., 594, 125924, https://doi.org/10.1016/j.jhydrol.2020.125924, 2021. a, b, c, d, e, f, g
Bates, P. D.: Integrating remote sensing data with flood inundation models: how far have we got?, Hydrol. Process., 26, 2515–2521, https://doi.org/10.1002/hyp.9374, 2012. a
Bates, P. D., Horritt, M. S., and Fewtrell, T. J.: A simple inertial formulation of the shallow water equations for efficient twodimensional flood inundation modelling, J. Hydrol., 387, 33–45, https://doi.org/10.1016/j.jhydrol.2010.03.027, 2010. a, b
Bates, P. D., Pappenberger, F., and Romanowicz, R. J.: Uncertainty in flood inundation modelling, in: Applied uncertainty analysis for flood risk management, 232–269, https://doi.org/10.1142/9781848162716_0010, 2014. a
Brodtkorb, A. R., Hagen, T. R., and Sætra, M. L.: Graphics processing unit (GPU) programming strategies and trends in GPU computing, J. Parallel Distr. Com., 73, 4–13, https://doi.org/10.1016/j.jpdc.2012.04.003, 2013. a
Cockburn, B. and Shu, C.W.: Runge–Kutta discontinuous Galerkin methods for convectiondominated problems, J. Sci. Comput., 16, 173–261, https://doi.org/10.1023/A:1012873910884, 2001. a, b
Cohen, R., Hilton, J., and Prakash, M.: Benchmark testing the Swift flood modelling solver: Version I, Tech. Rep. EP151977, CSIRO, available at: https://publications.csiro.au/rpr/download?pid=csiro:EP151977&dsid=DS2 (last access: 2 June 2021), 2016. a, b
Collins, S. N., James, R. S., Ray, P., Chen, K., Lassman, A., and Brownlee, J.: Grids in numerical weather and climate models, in: Climate change and regional/local responses, IntechOpen, 256, https://doi.org/10.5772/55922, 2013. a
Cozzolino, L., Cimorelli, L., Della Morte, R., Pugliano, G., Piscopo, V., and Pianese, D.: Flood propagation modeling with the Local Inertia Approximation: Theoretical and numerical analysis of its physical limitations, Adv. Water Resour., 133, 103422, https://doi.org/10.1016/j.advwatres.2019.103422, 2019. a
de Almeida, G. A. and Bates, P.: Applicability of the local inertial approximation of the shallow water equations to flood modeling, Water Resour. Res., 49, 4833–4844, https://doi.org/10.1002/wrcr.20366, 2013. a, b, c
de Almeida, G. A., Bates, P., Freer, J. E., and Souvignet, M.: Improving the stability of a simple formulation of the shallow water equations for 2D flood modeling, Water Resour. Res., 48, W05528, https://doi.org/10.1029/2011WR011570, 2012. a, b, c, d, e
Environment Agency: Realtime and Nearrealtime river level data, available at: https://data.gov.uk/dataset/0cbf22516eb24c4eaf7cd318da9a58be/realtimeandnearrealtimeriverleveldata (last access: 2 June 2021), 2020. a, b
Falter, D., Vorogushyn, S., Lhomme, J., Apel, H., Gouldby, B., and Merz, B.: Hydraulic model evaluation for largescale flood risk assessments, Hydrol. Process., 27, 1331–1340, https://doi.org/10.1002/hyp.9553, 2013. a
GarcíaFeal, O., GonzálezCao, J., GómezGesteira, M., Cea, L., Domínguez, J. M., and Formella, A.: An accelerated tool for flood modelling based on Iber, Water, 10, 1459, https://doi.org/10.3390/w10101459, 2018. a
Guidolin, M., Chen, A. S., Ghimire, B., Keedwell, E. C., Djordjević, S., and Savić, D. A.: A weighted cellular automata 2D inundation model for rapid flood analysis, Environ. Modell. Softw., 84, 378–394, https://doi.org/10.1016/j.envsoft.2016.07.008, 2016. a
Harris, M.: CUDA pro tip: write flexible kernels with gridstride loops, available at: https://developer.nvidia.com/blog/cudaprotipwriteflexiblekernelsgridstrideloops/ (last access: 2~June~2021), 2013. a
Hoch, J. M., Eilander, D., Ikeuchi, H., Baart, F., and Winsemius, H. C.: Evaluating the impact of model complexity on flood wave propagation and inundation extent with a hydrologic–hydrodynamic model coupling framework, Nat. Hazards Earth Syst. Sci., 19, 1723–1735, https://doi.org/10.5194/nhess1917232019, 2019. a
Hunter, N., Bates, P., Horritt, M., and Wilson, M.: Improved simulation of flood flows using storage cell models, P. I. Civil Eng. Wat. M., 159, 9–18, https://doi.org/10.1680/wama.2006.159.1.9, 2006. a
Huxley, C., Syme, B., and Symons, E.: UK Environment Agency 2D Hydraulic Model Benchmark Tests, 201709 TUFLOW release update, Tech. rep., BMT WBM Pty Ltd., Level 8, 200 Creek Street, Brisbane Qld 4000, Australia, PO Box 203, Spring Hill 400, available at: https://downloads.tuflow.com/_archive/Publications/UK%20EA%202D%20Benchmarking%20Results.TUFLOW%20Products%20201709.pdf (last access: 2 June 2021), 2017. a, b, c, d, e, f
Jamieson, S. R., Lhomme, J., Wright, G., and Gouldby, B.: A highly efficient 2D flood model with subelement topography, P. I. Civil Eng. Wat. M., 165, 581–595, https://doi.org/10.1680/wama.12.00021, 2012. a
Kesserwani, G. and Liang, Q.: Locally limited and fully conserved RKDG2 shallow water solutions with wetting and drying, J. Sci. Comput., 50, 120–144, https://doi.org/10.1007/s1091501194764, 2012. a
Kesserwani, G. and Sharifian, M. K.: (Multi)wavelets increase both accuracy and efficiency of standard Godunovtype hydrodynamic models: Robust 2D approaches, Adv. Water Resour., 144, 103693, https://doi.org/10.1016/j.advwatres.2020.103693, 2020. a, b, c, d, e
Kesserwani, G. and Wang, Y.: Discontinuous Galerkin flood model formulation: Luxury or necessity?, Water Resour. Res., 50, 6522–6541, https://doi.org/10.1002/2013WR014906, 2014. a, b, c
Kesserwani, G., Liang, Q., Vazquez, J., and Mosé, R.: Wellbalancing issues related to the RKDG2 scheme for the shallow water equations, Int. J. Numer. Meth. Fl., 62, 428–448, https://doi.org/10.1002/fld.2027, 2010. a
Kesserwani, G., Ayog, J. L., and Bau, D.: Discontinuous Galerkin formulation for 2D hydrodynamic modelling: Tradeoffs between theoretical complexity and practical convenience, Comput. Method. Appl. M., 342, 710–741, https://doi.org/10.1016/j.cma.2018.08.003, 2018. a, b, c, d, e, f, g
Kolega, A. and Syme, B.: Evolution in flood modelling based on the example of the Eudlo Creek crossing over the Bruce Highway, Institute of Public Works Engineering Australasia Queensland, available at: http://ipweaq.intersearch.com.au/ipweaqjspui/handle/1/5386 (last access: 2 June 2021), 2019. a
Krivodonova, L., Xin, J., Remacle, J.F., Chevaugeon, N., and Flaherty, J. E.: Shock detection and limiting with discontinuous Galerkin methods for hyperbolic conservation laws, Appl. Numer. Math., 48, 323–338, https://doi.org/10.1016/j.apnum.2003.11.002, 2004. a
Kvočka, D., Ahmadian, R., and Falconer, R. A.: Flood inundation modelling of flash floods in steep river basins and catchments, Water, 9, 705, https://doi.org/10.3390/w9090705, 2017. a
Li, D., Andreadis, K. M., Margulis, S. A., and Lettenmaier, D. P.: A data assimilation framework for generating spacetime continuous daily SWOT river discharge data products, Water Resour. Res., 56, e2019WR026999, https://doi.org/10.1029/2019WR026999, 2020. a
Liang, Q. and Marche, F.: Numerical resolution of wellbalanced shallow water equations with complex source terms, Adv. Water Resour., 32, 873–884, https://doi.org/10.1016/j.advwatres.2009.02.010, 2009. a, b, c, d
LISFLOODFP developers: LISFLOODFP 8.0 hydrodynamic model, Zenodo, https://doi.org/10.5281/zenodo.4073011, 2020. a, b, c, d
Liu, Z., Merwade, V., and Jafarzadegan, K.: Investigating the role of model structure and surface roughness in generating flood inundation extents using oneand twodimensional hydraulic models, J. Flood Risk Manag., 12, e12347, https://doi.org/10.1111/jfr3.12347, 2019. a
Martins, R., Leandro, J., and Djordjević, S.: A well balanced Roe scheme for the local inertial equations with an unstructured mesh, Adv. Water Resour., 83, 351–363, https://doi.org/10.1016/j.advwatres.2015.07.007, 2015. a
Martins, R., Leandro, J., and Djordjević, S.: Analytical solution of the classical dambreak problem for the gravity wave–model equations, ASCE J. Hydraul. Eng., 142, 06016003, https://doi.org/10.1061/(ASCE)HY.19437900.0001121, 2016. a
McCall, I.: Carlisle Flood Investigation Report, Flood Event 5–6th December 2015, Tech. rep., Environment Agency, available at: https://www.cumbria.gov.uk/eLibrary/Content/Internet/536/6181/42494151257.pdf (last access: 2 June 2021), 2016. a
Merrill, D.: CUB software package, available at: https://nvlabs.github.io/cub/ (last access: 2 June 2021), 2015. a
Met Office: Met Office Rain Radar Data from the NIMROD System, available at: https://catalogue.ceda.ac.uk/uuid/82adec1f896af6169112d09cc1174499 (last access: 2 June 2021), 2013. a
Ming, X., Liang, Q., Xia, X., Li, D., and Fowler, H. J.: Realtime flood forecasting based on a highperformance 2D hydrodynamic model and numerical weather predictions, Water Resour. Res., 56, e2019WR025583, https://doi.org/10.1029/2019WR025583, 2020. a, b
MoralesHernández, M., Sharif, M. B., Gangrade, S., Dullo, T. T., Kao, S.C., Kalyanapu, A., Ghafoor, S., Evans, K., MadadiKandjani, E., and Hodges, B. R.: Highperformance computing in water resources hydrodynamics, J. Hydroinform., 22, 1217–1235, https://doi.org/10.2166/hydro.2020.163, 2020. a
MoralesHernández, M., Sharif, M. B., Kalyanapu, A., Ghafoor, S. K., Dullo, T. T., Gangrade, S., Kao, S.C., Norman, M. R., and Evans, K. J.: TRITON: A MultiGPU open source 2D hydrodynamic flood model, Environ. Modell. Softw., 141, 105034, https://doi.org/10.1016/j.envsoft.2021.105034, 2021. a
Neal, J., Fewtrell, T., and Trigg, M.: Parallelisation of storage cell flood models using OpenMP, Environ. Modell. Softw., 24, 872–877, https://doi.org/10.1016/j.envsoft.2008.12.004, 2009. a, b
Neal, J., Schumann, G., Fewtrell, T., Budimir, M., Bates, P., and Mason, D.: Evaluating a new LISFLOODFP formulation with data from the summer 2007 floods in Tewkesbury, UK, J. Flood Risk Manag., 4, 88–95, https://doi.org/10.1111/j.1753318X.2011.01093.x, 2011. a
Neal, J., Schumann, G., and Bates, P.: A subgrid channel model for simulating river hydraulics and floodplain inundation over large and data sparse areas, Water Resour. Res., 48, W11506, https://doi.org/10.1029/2012WR012514, 2012a. a, b, c, d, e, f, g, h
Neal, J., Villanueva, I., Wright, N., Willis, T., Fewtrell, T., and Bates, P.: How much physical complexity is needed to model flood inundation?, Hydrol. Process., 26, 2264–2282, https://doi.org/10.1002/hyp.8339, 2012b. a, b, c, d, e, f
Neal, J., Dunne, T., Sampson, C., Smith, A., and Bates, P.: Optimisation of the twodimensional hydraulic model LISFOODFP for CPU architecture, Environ. Modell. Softw., 107, 148–157, https://doi.org/10.1016/j.envsoft.2018.05.011, 2018. a, b, c, d, e, f, g, h
Néelz, S. and Pender, G.: Benchmarking the latest generation of 2D hydraulic modelling packages, Tech. Rep. SC120002, Environment Agency, Horizon House, Deanery Road, Bristol, BS1 9AH, available at: https://www.gov.uk/government/publications/benchmarkingthelatestgenerationof2dhydraulicfloodmodellingpackages (last access: 2 June 2021), 2013. a, b, c, d, e, f, g, h, i, j, k, l, m
O'Loughlin, F., Neal, J., Schumann, G., Beighley, E., and Bates, P.: A LISFLOODFP hydraulic model of the middle reach of the Congo, J. Hydrol., 580, 124203, https://doi.org/10.1016/j.jhydrol.2019.124203, 2020. a
ÖzgenXian, I., Kesserwani, G., CaviedesVoullième, D., Molins, S., Xu, Z., Dwivedi, D., Moulton, J. D., and Steefel, C. I.: Waveletbased local mesh refinement for rainfall–runoff simulations, J. Hydroinform., 22, 1059–1077, https://doi.org/10.2166/hydro.2020.198, 2020. a
Qin, X., LeVeque, R. J., and Motley, M. R.: Accelerating an Adaptive Mesh Refinement Code for DepthAveraged Flows Using GPUs, J. Adv. Model. Earth Sy., 11, 2606–2628, https://doi.org/10.1029/2019MS001635, 2019. a
Rajib, A., Liu, Z., Merwade, V., Tavakoly, A. A., and Follum, M. L.: Towards a largescale locally relevant flood inundation modeling framework using SWAT and LISFLOODFP, J. Hydrol., 581, 124406, https://doi.org/10.1016/j.jhydrol.2019.124406, 2020. a
Sampson, C. C., Fewtrell, T. J., Duncan, A., Shaad, K., Horritt, M. S., and Bates, P. D.: Use of terrestrial laser scanning data to drive decimetric resolution urban inundation models, Adv. Water Resour., 41, 1–17, https://doi.org/10.1016/j.advwatres.2012.02.010, 2012. a
Sampson, C. C., Bates, P. D., Neal, J. C., and Horritt, M. S.: An automated routing methodology to enable direct rainfall in high resolution shallow water models, Hydrol. Process., 27, 467–476, https://doi.org/10.1002/hyp.9515, 2013. a
Sampson, C. C., Smith, A. M., Bates, P. D., Neal, J. C., Alfieri, L., and Freer, J. E.: A highresolution global flood hazard model, Water Resour. Res., 51, 7358–7381, https://doi.org/10.1002/2015WR016954, 2015. a
Savage, J. T. S., Pianosi, F., Bates, P., Freer, J., and Wagener, T.: Quantifying the importance of spatial resolution and other factors through global sensitivity analysis of a flood inundation model, Water Resour. Res., 52, 9146–9163, https://doi.org/10.1002/2015WR018198, 2016. a
Shaw, J., Kesserwani, G., Neal, J., Bates, P., and Sharifian, M. K.: LISFLOODFP 8.0 results of Environment Agency and Storm Desmond simulations, Zenodo, https://doi.org/10.5281/zenodo.4066823, 2021. a, b
Shustikova, I., Domeneghetti, A., Neal, J. C., Bates, P., and Castellarin, A.: Comparing 2D capabilities of HECRAS and LISFLOODFP on complex topography, Hydrolog. Sci. J., 64, 1769–1782, https://doi.org/10.1080/02626667.2019.1671982, 2019. a
Shustikova, I., Neal, J. C., Domeneghetti, A., Bates, P. D., Vorogushyn, S., and Castellarin, A.: Levee Breaching: A New Extension to the LISFLOODFP Model, Water, 12, 942, https://doi.org/10.3390/w12040942, 2020. a, b
Sosa, J., Sampson, C., Smith, A., Neal, J., and Bates, P.: A toolbox to quickly prepare flood inundation models for LISFLOODFP simulations, Environ. Modell. Softw., 123, 104561, https://doi.org/10.1016/j.envsoft.2019.104561, 2020. a
Szönyi, M., May, P., and Lamb, R.: Flooding after Storm Desmond, Tech. rep., Zurich Insurance Group Ltd, available at: http://repo.floodalliance.net/jspui/handle/44111/2252 (last access: 2 June 2021), 2016. a
Villanueva, I. and Wright, N.: Linking Riemann and storage cell models for flood prediction, P. I. Civil Eng. Wat. M., 159, 27–33, https://doi.org/10.1680/wama.2006.159.1.27, 2006. a
Wang, Y., Liang, Q., Kesserwani, G., and Hall, J. W.: A 2D shallow flow model for practical dambreak simulations, J. Hydraul. Res., 49, 307–316, https://doi.org/10.1080/00221686.2011.566248, 2011. a
Wing, O. E., Bates, P. D., Sampson, C. C., Smith, A. M., Johnson, K. A., and Erickson, T. A.: Validation of a 30 m resolution flood hazard model of the conterminous United States, Water Resour. Res., 53, 7968–7986, https://doi.org/10.1002/2017WR020917, 2017. a
Wing, O. E., Bates, P. D., Neal, J. C., Sampson, C. C., Smith, A. M., Quinn, N., Shustikova, I., Domeneghetti, A., Gilles, D. W., Goska, R., and Krajewski, W. F.: A new automated method for improved flood defense representation in largescale hydraulic models, Water Resour. Res., 55, 11007–11034, https://doi.org/10.1029/2019WR025957, 2019. a, b
Wing, O. E., Quinn, N., Bates, P. D., Neal, J. C., Smith, A. M., Sampson, C. C., Coxon, G., Yamazaki, D., Sutanudjaja, E. H., and Alfieri, L.: Toward Global Stochastic River Flood Modeling, Water Resour. Res., 56, e2020WR027 692, https://doi.org/10.1029/2020WR027692, 2020. a
Xia, X. and Liang, Q.: A new efficient implicit scheme for discretising the stiff friction terms in the shallow water equations, Adv. Water Resour., 117, 87–97, https://doi.org/10.1016/j.advwatres.2018.05.004, 2018. a, b
Xia, X., Liang, Q., Ming, X., and Hou, J.: An efficient and stable hydrodynamic model with novel source term discretization schemes for overland flow and flood simulations, Water Resour. Res., 53, 3730–3759, https://doi.org/10.1002/2016WR020055, 2017. a
Xia, X., Liang, Q., and Ming, X.: A fullscale fluvial flood modelling framework based on a highperformance integrated hydrodynamic modelling system (HiPIMS), Adv. Water Resour., 132, 103392, https://doi.org/10.1016/j.advwatres.2019.103392, 2019. a, b, c, d, e, f, g, h, i, j, k, l, m, n, o
Yamazaki, D., Ikeshima, D., Sosa, J., Bates, P. D., Allen, G. H., and Pavelsky, T. M.: MERIT Hydro: a highresolution global hydrography map based on latest topography dataset, Water Resour. Res., 55, 5053–5073, https://doi.org/10.1029/2019WR024873, 2019. a
Rainongrid features have since been added to the optimised ACC solver and will be available in a future LISFLOODFP release.
This is the case except for anomalous predictions found at Great Musgrave Bridge, where the observed hydrograph shape is generally wellcaptured but freesurface elevations are consistently underpredicted. This anomaly is due to localised terrain elevation differences between the finestresolution DEM and Environment Agency riverbed elevation measurements, as documented by Xia et al. (2019).
Not yet available with the FV1 or DG2 solvers.
 Abstract
 Introduction
 The LISFLOODFP model
 Numerical results
 Summary and conclusions
 Appendix A: Running the LISFLOODFP simulations
 Appendix B: LISFLOODACC order of accuracy
 Code and data availability
 Author contributions
 Competing interests
 Acknowledgements
 Financial support
 Review statement
 References
 Abstract
 Introduction
 The LISFLOODFP model
 Numerical results
 Summary and conclusions
 Appendix A: Running the LISFLOODFP simulations
 Appendix B: LISFLOODACC order of accuracy
 Code and data availability
 Author contributions
 Competing interests
 Acknowledgements
 Financial support
 Review statement
 References