GLOFRIM v1.0 – A globally applicable computational framework for integrated hydrological–hydrodynamic modelling

We here present GLOFRIM, a globally applicable computational framework for integrated hydrological– hydrodynamic modelling. GLOFRIM facilitates spatially explicit coupling of hydrodynamic and hydrologic models and caters for an ensemble of models to be coupled. It currently encompasses the global hydrological model PCR-GLOBWB as well as the hydrodynamic models Delft3D Flexible Mesh (DFM; solving the full shallow-water equations and allowing for spatially flexible meshing) and LISFLOOD-FP (LFP; solving the local inertia equations and running on regular grids). The main advantages of the framework are its open and free access, its global applicability, its versatility, and its extensibility with other hydrological or hydrodynamic models. Before applying GLOFRIM to an actual test case, we benchmarked both DFM and LFP for a synthetic test case. Results show that for sub-critical flow conditions, discharge response to the same input signal is near-identical for both models, which agrees with previous studies. We subsequently applied the framework to the Amazon River basin to not only test the framework thoroughly, but also to perform a first-ever benchmark of flexible and regular grids on a large-scale. Both DFM and LFP produce comparable results in terms of simulated discharge with LFP exhibiting slightly higher accuracy as expressed by a Kling–Gupta efficiency of 0.82 compared to 0.76 for DFM. However, benchmarking inundation extent between DFM and LFP over the entire study area, a critical success index of 0.46 was obtained, indicating that the models disagree as often as they agree. Differences between models in both simulated discharge and inundation extent are to a large extent attributable to the gridding techniques employed. In fact, the results show that both the numerical scheme of the inundation model and the gridding technique can contribute to deviations in simulated inundation extent as we control for model forcing and boundary conditions. This study shows that the presented computational framework is robust and widely applicable. GLOFRIM is designed as open access and easily extendable, and thus we hope that other large-scale hydrological and hydrodynamic models will be added. Eventually, more locally relevant processes would be captured and more robust model inter-comparison, benchmarking, and ensemble simulations of flood hazard on a large scale would be allowed for.

Abstract. We here present GLOFRIM, a globally applicable computational framework for integrated hydrologicalhydrodynamic modelling. GLOFRIM facilitates spatially explicit coupling of hydrodynamic and hydrologic models and caters for an ensemble of models to be coupled. It currently encompasses the global hydrological model PCR-GLOBWB as well as the hydrodynamic models Delft3D Flexible Mesh (DFM; solving the full shallow-water equations and allowing for spatially flexible meshing) and LISFLOOD-FP (LFP; solving the local inertia equations and running on regular grids). The main advantages of the framework are its open and free access, its global applicability, its versatility, and its extensibility with other hydrological or hydrodynamic models. Before applying GLOFRIM to an actual test case, we benchmarked both DFM and LFP for a synthetic test case. Results show that for sub-critical flow conditions, discharge response to the same input signal is near-identical for both models, which agrees with previous studies. We subsequently applied the framework to the Amazon River basin to not only test the framework thoroughly, but also to perform a first-ever benchmark of flexible and regular grids on a large-scale. Both DFM and LFP produce comparable results in terms of simulated discharge with LFP exhibiting slightly higher accuracy as expressed by a Kling-Gupta efficiency of 0.82 compared to 0.76 for DFM. However, benchmarking inundation extent between DFM and LFP over the entire study area, a critical success index of 0.46 was obtained, indicating that the models disagree as often as they agree. Differences between models in both simulated discharge and inundation extent are to a large extent attributable to the gridding techniques employed. In fact, the results show that both the numerical scheme of the inundation model and the gridding technique can contribute to deviations in simulated inundation extent as we control for model forcing and boundary conditions. This study shows that the presented computational framework is robust and widely applicable. GLOFRIM is designed as open access and easily extendable, and thus we hope that other large-scale hydrological and hydrodynamic models will be added. Eventually, more locally relevant processes would be captured and more robust model inter-comparison, benchmarking, and ensemble simulations of flood hazard on a large scale would be allowed for.

Introduction
In the latter half of the last century, losses due to riverine floods increased greatly, leading to economic losses of more than USD 1 billion and 220 000 casualties since 1980 (Munich Re, 2013;Visser et al., 2012). Much of this increase is thought to be due to continued settlement along rivers and shifts in climate patterns, meaning that this tendency will most likely be exacerbated in the future (Ceola et al., 2014;Hirabayashi et al., 2013;Winsemius et al., 2016). Robust inundation estimates are therefore paramount to enhance our process understanding and to provide better flood hazard estimates for risk models. Since recent research showed that flood inundation can easily affect large areas, in particular neighbouring river basins (Jongman et al., 2014), it is vital that flood hazard models can simulate the relevant processes over large domains. Applying such large-scale models has the additional advantage of facilitating the identification of risk hotspots and providing critical insight into data-scarce areas (Ward et al., 2015). In fact, there are already a number of global-scale inundation models available Pappenberger et al., 2012;Sampson et al., 2015;Winsemius et al., 2013;Yamazaki et al., 2011), differing in their process descriptions and computational engine. While some approaches derive flood hazard from a coarse-scale hydrological model and subsequent downscaling, others force fine-scale hydrodynamic models with globally regionalized discharge data. A first inter-comparison of global flood hazard models by Trigg et al. (2016) for the African continent, however, revealed that they agree for only 30-40 % of aggregated flood extent, thus indicating that the representativeness of local flood risk estimates may depend strongly on the computational engine opted for as well as on the model forcing applied. Identifying the exact reasons for model disagreement was impossible due to the diversity of methods and lack of a systematic approach to the inter-comparison where individual aspects of the modelling frameworks could be isolated.
Employing a global hydrological model such as PCR-GLOBWB (van Beek et al., 2011;van Beek and Bierkens, 2008), WaterGAP (Alcamo et al., 1997;Döll et al., 2003) or VIC (Liang et al., 1994;Wood et al., 1992) has the benefit of providing spatially distributed surface runoff and routed discharge simulations, thereby facilitating direct forcing for spatially distributed inundation models. In addition, these models are usually forced by global meteorological data, hence diminishing the dependency on observed data as well as allowing for easier implementation of future climate scenarios. However, the routing schemes currently implemented in large-scale hydrological models can generally be described as simplistic as they are based on gridded drainage networks at a coarse spatial resolution, with the currently finest spatial resolution of global hydrological models being 5 arcmin or around 10 km × 10 km at the Equator (Bierkens, 2015). Furthermore, discharge accuracy may be reduced in lowgradient catchments since topography at this scale is generally parameterized in distribution functions and river routing is often represented by a simple scheme, such as the kinematic wave approximation.
Hydrodynamic models, on the other hand, can be built in numerous ways for inundation modelling, typically in 1-D, 2-D or combined 1-D/2-D, and are mostly forced with gauged discharge data or synthesized flood waves. While such approaches do not require rainfall-runoff conversion, they are problematic for studies concerning large-scale climate change impacts or the seamless simulation of flood events and their spatial correlation (Jongman et al., 2014). Some models like CaMa-Flood (Yamazaki et al., 2011) route a priori computed hydrology-based surface runoff with 1-D hydrodynamics and parameterized 2-D floodplain storage. Applying such a 1-D/2-D approach, however, does not allow for explicit modelling of floodplain flow pathways as well as channel-floodplain interactions. Explicitly representing these processes would be beneficial as they are known to greatly influence inundation dynamics and patterns (Neal et al., 2012a;Trigg et al., 2009). Compared to hydrological models, hydrodynamic models solving the full shallowwater equation (SWE) or at least a more advanced approximation such as the local inertia equations (LIEs) have the advantage of providing a better representation of backwater effects, which are important flood-triggering processes (Meade et al., 1991;Moussa and Bocquillon, 1996;Paiva et al., 2013). Another difference to global hydrological models is that current applications of hydrodynamic models on the large to global scale can run at spatial resolutions of up to 1 km (Sampson et al., 2015), greatly facilitating the representation of both relevant channel-floodplain interactions (Rudorff et al., 2014a, b) and flow pathways on floodplains (Rudorff et al., 2014a;Tayefi et al., 2007) as well as enhancing the usability for decision-making processes (Beven et al., 2015;Trigg et al., 2016). Notwithstanding these advantages, most hydrodynamic models applied for large-scale inundation modelling lack an advanced implementation of hydrological processes and thus may overpredict both inundation extent and depth as, for instance, groundwater infiltration and evaporation from inundated floodplains are currently not fully accounted for.
Large-scale flood hazard estimates may thus benefit from increased integration of hydrology and hydrodynamics in inundation models to allow for physically more integrated assessments and to compensate for their respective shortcomings. In fact, hydrological-hydrodynamic coupling was already applied in a number of studies (Biancamaria et al., 2009;Kim et al., 2012;Lian et al., 2007;. For example, output from hydrological or landsurface models was used as input to the 1-D/2-D hydrodynamic model LISFLOOD-FP (Bates et al., 2010;Bates and de Roo, 2000) at a number of locations. While such approaches reduce the dependency on gauged data or synthesized flood waves, they cannot fully account for important and spatially distributed hydrological flood-triggering processes within the model domain. This would, however, be advantageous to support the assessment of spatial correlations of flood waves in adjacent river basins, which are shown to increase transnational flood risk (Jongman et al., 2014). A further valuable contribution for promoting the coupling of models from different disciplines was realized by the Community Surface Dynamics Modelling Systems group (CS-DMS) with their development of the Web Modelling Tool (WMT; CSDMS, 2017). This tool enables the user to create a coupled model from a list of readily available models and run it on a server of the CSDMS. Whilst this is an important step towards integrated modelling between disciplines, applicability is hampered by the fact that model code is not openly accessible and that the number of available models is limited and predefined.
Recently, Hoch et al. (2017a) coupled PCR-GLOBWB (hereafter PCR) with the hydrodynamic model Delft3D Flexible Mesh (hereafter DFM; Kernkamp et al., 2011) for the Amazon River basin to integrate the hydrological and hydrodynamic processes occurring over the entire study area. Results indicate that spatially explicit coupling of hydrological and hydrodynamic models can improve the representation of inundation for all river reaches, not only those that are connected to upstream boundary conditions. Findings also corroborate that spatially distributed forcing retrieved from a hydrological model in combination with a sophisticated river routing scheme outperforms results obtained with both models run in stand-alone mode.
Even though these results are promising, it has to be acknowledged that the accuracy of a hydrological and hydrodynamic model can vary strongly, depending on the chosen study area, model parameterization, model structure, numerical scheme or the use of different input data (Li et al., 2015;Trigg et al., 2016). It would hence be advantageous to base the choice of the coupled models on their local performance, potentially outperforming predefined set-ups, or simply on the model schematization at hand. To facilitate such model selection and to further promote the coupling of large-scale hydrological and hydrodynamic models, we developed GLOFRIM, a GLObally applicable computational FRamework for Integrated hydrologicalhydrodynamic Modelling. In addition to the work of Hoch et al. (2017a), it includes the widely used hydrodynamic model LISFLOOD-FP (hereafter LFP; Bates and de Roo, 2000) and an improved as well as extended coupling algorithm, thus catering for a wider range of model schematizations and applications. As we believe that by combining the locally bestperforming hydrological and hydrodynamic models relevant processes can be captured better, GLOFRIM is designed in an expandable way to eventually incorporate more models. Furthermore, the framework is openly available under the GNU 3.0 license 1 to stimulate collaboration and idea exchange within the scientific community. Key assets of the framework are its free and open accessibility, its global applicability, its versatility, and its potential to be further developed to a full two-dimensional coupling scheme between hydrology and hydrodynamics, which would play a particularly crucial role in basins in semi-arid climates as for instance the Niger (Dadson et al., 2010;Mahe et al., 2009).
In the remainder of this paper, we first describe the model components of the framework and thereafter the framework and its functionalities in detail. Subsequently, we compare the two hydrodynamic models in a simple synthetic test case to obtain a first understanding of possible differences, in particular in terms of their numerical schemes. As a means of 1 The code and user manual of GLOFRIM is downloadable at https://doi.org/10.5281/zenodo.597107. benchmarking, we assess simulated discharge along the flow paths as well as run times for a 1-D and 2-D set-up individually. We then apply GLOFRIM to one-directionally couple PCR with both DFM and LFP and benchmark the set-ups for an actual test case in the Amazon River basin, hence also constituting a first comparison of flexible and regular grids for large-scale applications. For model benchmarking, we assess simulated discharge, water levels, run times, and inundation extent. Pearson's correlation r, the root mean square error (RMSE), and the Kling-Gupta efficiency (KGE; Gupta et al., 2009) are determined by comparison to observed discharge data from the Global Runoff Data Centre (GRDC) at Óbidos, Brazil. We opt for GRDC data as the presented approach is merely based on input data sets with global coverage. Simulated water levels are compared at an upstream, midstream, and downstream station to assess (a) whether water-level dynamics are correctly represented and (b) to what extent DFM and LFP differ or agree in their water-level computations. Computational efficiency is assessed by comparing the run times of the coupled set-ups. To benchmark inundation extent from DFM with LFP, we determine the hit rate H, false alarm ratio F, and the critical success index C based on inundation maps of both models at the end of the simulation. No validation of simulated inundation extent was performed as Hoch et al. (2017a) already showed good agreement of results obtained with DFM for the same study domain.
This openly available computational framework makes a valuable contribution to current inundation modelling on the large scale by enhancing the integration of hydrological and hydrodynamic model processes, which eventually may lead to improved decision-making and planning of adaption and mitigation measures.

Models
Currently, GLOFRIM includes the hydrological model PCR-GLOBWB as well as the hydrodynamic models Delft3D Flexible Mesh and LISFLOOD-FP. Hereafter, an overview of the main features of the models is provided. For further details regarding model development and model set-up, we refer to the specific manuals or websites.

PCR-GLOBWB
To generate hydrological input, the global hydrological model PCR-GLOBWB (PCR) is currently incorporated in the framework. It can be applied at 30 arcmin resolution (approximately 55 km × 55 km at the Equator) and at 5 arcmin resolution (approximately 10 km × 10 km at the Equator), which may increase accuracy but also runtime. PCR is entirely coded in PCRaster Python (Karssenberg et al., 2010) and distinguishes between two vertically stacked soil layers, an underlying groundwater layer, and a surface canopy layer. Water can be exchanged vertically, and excess surface water can be routed horizontally along a local drainage direction network employing the kinematic wave approximation. The model is forced with Climate Research Unit (CRU) precipitation and temperature data (Harris et al., 2014) at a 30 arcmin spatial resolution, and evaporation is computed using the Penman-Monteith equation. Data sets are downscaled to daily fields for the period from 1957 to 2010 using ERA40/ERAI Uppala et al., 2005). Besides, PCR is able to account for domestic and industrial water consumption by accounting for water demand data (FAO, 2017). For more detailed information on CRUforcing, its processing, and PCR in general, we refer to van Beek (2008), van Beek and Bierkens (2008), and van Beek et al. (2011). PCR was already applied for a wide range of studies such as flood and drought forecasting (Yossef et al., 2012), human impact on droughts (Wanders and Wada, 2015), global water stress (van Beek et al., 2011), and global groundwater simulations (de Graaf et al., 2015). More relevant to this study, PCR constitutes the computational backbone of the "GLObal Flood Risk with IMAGE Scenarios" framework (GLOFRIS; Winsemius et al., 2013), which is also used as the basis for the Aqueduct Global Flood Analyzer of the World Resources Institute (World Resources Institute, 2017).

Delft3D Flexible Mesh (DFM)
DFM allows the user to schematize the model domain with a flexible mesh in 1-D/2-D/3-D, and therefore supports the computationally efficient schematization of topographically challenging areas such as river bends or irregular slopes. The model solves the full Saint-Venant equations, or SWEs. The main partial differential equations solved by DFM are ζ being the water level, h the water depth, u is the velocity vector, g the gravitational acceleration, v the viscosity, ρ the water mass density, and τ the bottom friction. For 1-D flow, the equations remain the same except that the viscosity v does not contain horizontal eddy viscosity. For further technical details and derivation, we refer to the technical manual (Deltares, 2017a). DFM is an openly accessible model and can be obtained by contacting Deltares (https://www.deltares.nl/en/software/ delft3d-flexible-mesh-suite/). Besides riverine flood hazard modelling, it also caters for a wider range of applications, for instance groundwater flow, sediment transport, and water quality simulations in 1-D, 2-D, and 3-D. For more information regarding the application of DFM, we refer to the user manual (Deltares, 2017b). Due to its very recent publication, only a limited number of published studies using DFM are available. It was, for instance, applied in a global-scale reanalysis for extreme sea levels (Muis et al., 2016). In another study, Castro Gama et al. (2013) applied DFM to model flood hazard at the Yellow River, and concluded that applying a flexible mesh reduces computation time by a factor of 10 compared to square grids with equal quality of model output.

LISFLOOD-FP (LFP)
LFP is a widely used, raster-based model to compute floodplain inundation. Since its first version (Bates and de Roo, 2000), it has regularly been adapted and improved (Bates et al., 2010), for instance by adding a sub-gridding scheme to account for channel flow within cells (Neal et al., 2012a). It is possible to run LFP with different set-ups: a 2-D only, a 1-D, a 1-D/2-D or a sub-grid model, with the latter being the most accurate for large-scale inundation modelling approaches as it greatly increases floodplain connectivity (Neal et al., 2012a).
When using the sub-grid scheme, LFP solves the subsequent equation for channel flow that is based on a simplification of the SWE ignoring advection (Bates et al., 2010;Neal et al., 2012a). Here q denotes the flow per unit width, g the gravitational acceleration, ζ the water level, R the hydraulic radius, n Manning's surface roughness, and ∇ the gradients in xand y-directions as described in Eq. (3): Mass conservation is implemented as whereby t denotes the time step, x the cell size and i, j the cell indices. For further information about model development, derivation of numerical solutions, assumptions, and validations, we refer you to the above-mentioned papers. LFP is specifically developed to model floodplain inundation and has been used in a wide range of studies. Most notable in the context of large-scale flood hazard modelling is the work by Sampson et al. (2015), who applied LFP to compute global estimates of flood hazard and risk and by Schumann et al. (2013) and Biancamaria et al. (2009), who used LFP to simulate inundation in the Zambezi and Ob rivers, respectively, forced with lateral input from a land surface model.
The basic model interface (BMI) adapter (see subsequent section) was implemented for LFP version 5.9, which pro-vides all relevant features, in particular the sub-gridding scheme, to model large-scale inundation.

Basic model interface (BMI)
Generally, the BMI has several functions that can be called from external applications like, as in this case, a Python script. To make these functions available for a model, a BMI adapter needs to be developed for each model with respect to the specific internal model structure and programming language. Whilst PCR is already written in Python and its BMI implementation is hence straightforward, DFM offers a native C-compliant BMI implementation. For LFP, which is written in C++, the code and file structure had to be slightly adapted to agree with the requirements for the BMI. Once a BMI adapter is developed, it is possible to execute a set of functions: first, the user can initialize the models by using the BMI adapter. Second, the BMI adapter allows for retrieving a set of variables from memory. The variables exposed through the BMI adapter can be defined during the development of the BMI adapter and is thus not limited to a preset range. Third, the manipulated variables can be set back to the original model or can be used to overwrite variables in one or multiple other models, given that they agree to the internal data structure of those models. Fourth, models connected to a BMI adapter can be updated at a userspecified time step, hence enabling online coupling of models. In this way it is possible to get, change, and set variables during the execution of the models in use on a time step basis. Last, models can be finalized to end the computations. It is noteworthy that implementing the BMI functions does not alter any functionality or routines in the models. Both DFM and LFP, although not being coded in Python, can be called from within Python using the BMI-python package (see https://github.com/openearth/bmi-python). For further information regarding the BMI, we refer to Peckham et al. (2013) and the related website (CSDMS, 2016).

The computational framework GLOFRIM
The computational framework presented here consists of two key elements, (a) the actual code and (b) a settings-file. Here, a brief overview is given of their main properties. More detailed information and an outline is provided in the files themselves.
The computational backbone of GLOFRIM is entirely written in Python 2.7 and was developed and tested on Ubuntu systems. By means of a python file ("couplingFrame-work_v1.py" in the downloadable data), the steps for model coupling are executed (see Fig. 1 for a flow chart). The models are first initialized: the model configuration files of each model are read and the internal steps required to obtain an initial state of the models are prompted by the BMI adapter. Thereafter, the BMI adapter is used to retrieve all required model variables, especially geometry information. This information is subsequently used to construct the grids of the models and to spatially couple them by overlay and grid-togrid assignment. A many-to-one assignment based on raster indices is performed and the routing computations in PCR are turned off for all cells signalled as coupled. In case no 1-D or 2-D hydrodynamic cells are located within a PCR cell, this cell is therefore not considered to be coupled and the routing scheme as implemented in PCR prevails. Further information about the spatial coupling can be found in Hoch et al. (2017a). Once the models are spatially coupled, the update loop commences. During execution of this loop, PCR will be updated at each time step -typically 1 dayand surface runoff and discharge output will be retrieved and adapted to agree with the data structure of the chosen hydrodynamic model. Subsequently, either the water depth or a flux variable in the hydrodynamic model will be overwritten, and finally the hydrodynamic model will be updated until it reaches the same simulation time as PCR. The loop is exited once a user-specified number of time steps is reached. It should be noted that in the current version of the framework, only one-directional coupling from hydrology to hydrodynamics is supported, possibly leading to local overprediction of simulated discharge as there is, for instance, no re-infiltration of water going overbank. Future research will thus focus on extending this to a full two-directional coupling scheme with feedback loops from hydrodynamics to hydrology. Such two-way coupling would, for instance, contain explicit modelling of hydrological processes over inundated areas in the hydrodynamic model.
To specify all relevant information about the coupling run to be performed, a configuration file is needed ("default.ini" in the downloadable data). Besides all critical paths to model data, other model settings can be defined in the configuration file, for example the number of model time steps. In general, settings defined in the INI file overrule those specified for the individual models. In the current version of GLOFRIM, three options need to be specified to realize model coupling: by activating the so-called "River-Floodplain-Scheme" (RFS), by specifying the variables to be updated, and by choosing for hydrodynamic models in either spherical or projected coordinate systems.
First, the RFS defines where output from PCR is coupled to. If RFS is activated, water volume of one PCR cell is directly coupled to the 1-D channels of the hydrodynamic model within the corresponding PCR cell. If RFS is inactive, water is distributed over all 2-D grid cells within the corresponding PCR cell. Applying the RFS has two major advantages: first, it reduces run times as data exchange and computations need to be performed for a smaller number of cells; second, using RFS in large-scale applications with sufficient channel information reduces the dependency on the accuracy of remotely sensed 2-D elevation data such as Shuttle Rader Topographic Mission (SRTM) data (Farr et al., 2007). Recent research showed that such global data sets contain strong vertical bias as well as systematic and random noise (Yamazaki et al., 2017). In particular, simulating flow over vertically irregular terrain resulting in supercritical regimes is contraindicated for LFP because of its use of the LIE. In case overland flow needs to be modelled by LFP, we advise to take measures accordingly, for instance by limiting flow velocities. For DFM we found that runs are more stable, yet slower, when deactivating the RFS.
Second, it is possible to force the hydrodynamic models by updating the water depth variable (in metres) or by updating fluxes, which are expressed in LFP as discharge (in m 3 s −1 ) and in DFM as precipitation (in mm d −1 ). For DFM, added daily water depth is divided into a number of user-specified time steps hence reducing the computational load, while fluxes are daily constants. We found that updating fluxes reduces run times compared to states, and hence advise opting for this option. While it is also possible to perform state-updating in LFP, test runs showed that this option should be used carefully as it easily increases run times. This is because it is currently not possible to update LFP at a user-specified time step due to the Courant-Friedrichs-Lewy condition. It may hence happen that gradients between added daily water depths are too steep, increasing the risk of model instability. We therefore recommend applying flux-updating in LFP instead.
Third, it is possible to use the hydrodynamic models with Cartesian coordinates, although PCR runs in non-Cartesian coordinates. By providing the projected coordinate system the model is based on, the computational framework can translate the grid into spherical coordinates and perform the grid overlay and cell assignment, thus guaranteeing the applicability of all already existing hydrodynamic schematizations. All other computations remain unaffected by the coordinate system in use as the coordinate information is solely required for spatially coupling the grids.
As expressed before, GLOFRIM employs the BMI's functionalities to couple hydrological to hydrodynamic processes. Even though the current version of GLOFRIM only supports one-directional coupling, basing it upon the BMI yields strong advantages for future two-directional coupling as coupled models do not get unnecessarily entangled, socalled "integronsters" (Voinov and Shugart, 2013). Such two-directional coupling is currently not yet available for GLOFRIM due to on-going testing as well as concept development and will be provided in a future version of the framework.
Besides being openly accessible and thus adaptable and extendable to the user's preferences or individual modelling requirements, GLOFRIM contains a number of additional advantages: first, by having PCR-GLOBWB, or any other global hydrological model, as the hydrological output creator, the framework can easily be applied anywhere on the globe given a hydrodynamic schematization; second, models to be coupled may be selected depending on their local performance, thus possibly capturing more relevant processes; third, the spatially explicit coupling scheme can be extended to a full feedback loop between hydrology and hydrodynamic steps, also incorporating important groundwater infiltration and evaporation processes; fourth, by guaranteeing identical hydrological forcing, applying the computational framework facilitates benchmarking of hydrodynamic models by eliminating sources of difference, potentially supporting hydrodynamic ensemble modelling approaches.

Set-up
To gain insight into possible differences in model behaviour between LFP and DFM, we created two synthetic test cases, one being set-up as 1-D only (STC 1-D) and the other as 2-D only (STC 2-D). For the latter, both models were schematized such that they cover a domain of 11 cells by 500 cells, with the cell resolution being 1 km. For the 1-D only design, the channel had a length of 500 cells with a 1 km resolution, a uniform channel width of 500 m, and a uniform channel depth of 3 m. As default settings, we applied Manning's surface roughness coefficients of 0.04 s m −1/3 for the 1-D only run and 0.07 s m −1/3 for the 2-D only run. Both synthetic test cases were forced with an artificial upstream discharge boundary spanning 1 year and consisting of 2 peak flow moments to introduce variability in model dynamics, thus not employing GLOFRIM for those test cases. As downstream boundary condition, a constant water level of 0 m was set. The entire simulation period was 3 years to ensure it exceeds the time of concentration. To assess model output, 7 cross sections were defined, hence capturing the downstream propagation of the artificial flood waves and facilitating the assessment of possible attenuation and dampening effects. For benchmarking the models, we then compared discharge along the cross sections and run times to obtain a first indication of how the different computational schemes might vary (Fig. 2).

Results and discussion
Assessing the results for both 2-D and 1-D, we find that both models simulate the same responses to the input signal applied (Fig. 3). Due to the higher friction coefficient and the wider flow area, it takes the 2-D schematization almost the entire simulation period to entirely convey the water volumes to the downstream boundary. In the 1-D schematization, however, all water is already drained after around 30 % of the entire simulation period. The similarity of simulated discharge between LFP and DFM is, despite the models' differences in complexity and design, in line with the findings made by Neal et al. (2012b) and De Almeida and . In the latter study, differences in governing equations were assessed analytically for various flow regimes ranging from sub-to supercritical flow. It was concluded that for applications with low Froude numbers (Fr 0.5), such as the synthetic test case used here, no significant differences occur between models solving the LIEs and those solving the full dynamics of the SWEs. Also, Neal et al. (2012b) showed that it seems unnecessary to employ models solving the SWEs for flow gradually varying in time and for subcritical flow regimes. In addition, the study showed that for those applications, run times of local inertia models are shorter than those of models solving the full SWEs. The run times measured for the various synthetic test cases used here underpin this finding as LFP exhibits shorter run times, especially for the 2-D schematization (Table 1). To facilitate comparability, we a priori set the maximum solver time step in DFM to the average of the time steps required by LFP. It is noteworthy that the differences in run times may not merely be attributable to varying solver complexity but partially also to the programming language and compiler used as well as to general model complexity and level of code optimization applied.
5 Test case: the Amazon River basin

Set-up
To test GLOFRIM in an actual test case and to benchmark the flexible and regular grids, the framework was applied in the Amazon River basin with DFM and LFP being schematized as a flexible mesh and regular grid, respectively. The methods applied to derive the hydrodynamic schematization of the Amazon River basin for DFM are explained in detail in Hoch et al. (2017a). First, a regular 2-D grid at 10 km × 10 km resolution refined until a grid size of 2 km × 2 km was locally obtained, based on the Height Above Nearest Drainage algorithm (HAND; Rennó et al., 2008). Thereby areas with low HAND values were more strongly refined than those with higher values, resulting in a finer mesh along and next to river channels. This implies a major difference to the synthetic test case above, as we now employ a flexible mesh instead of a regular grid for DFM. As input elevation, canopyfree elevation data at 15 arcsec spatial resolution was applied (Baugh et al., 2013;O'Loughlin et al., 2016) and subsequently smoothed to eliminate local depressions and other residues due to vertical errors of SRTM data (Yamazaki et al., 2012). Elevation data were then assigned to the flexible mesh by spatial averaging. For the 1-D channel network and bathymetry, river width data of the Global Width Database for Large Rivers (GWD-LR; Yamazaki et al., 2014) was employed which was combined with the equations from Paiva To obtain a LFP schematization equivalent to the DFM schematization, elevation data as well as both river width and river depth information were processed to agree with the requirements of LFP. For river channel properties, the depth and width information stored in the vector data used for DFM were rasterized, and for the elevation data the smoothed canopy-free elevation data were upscaled to a 2 km spatial resolution, employing the nearest neighbour technique, to match the finest spatial resolution of the DFM schematization (Fig. 4). From Fig. 4 it is visible that LFP contains a greater level of detail in areas farther upstream due to the finer spatial resolution uniformly applied. Consequently, the total number of cells in LFP exceeds the number of 2-D cells in DFM by a factor of 4 (Table 2). Furthermore, only around 10 % of the entire schematization represents 1-D channels in LFP, while the channel network of DFM was based on around 30 % of all DFM cells. For both DFM and LFP, Manning's surface roughness coefficient was uniformly set to 0.03 s m −1/3 for channel and floodplains, which is consistent with other case studies in the Amazon (Paiva et al., 2013;Rudorff et al., 2014a, b;Trigg et al., 2009;Yamazaki et al., 2011). As downstream boundary, we imposed a constant water level of 0 m at the river's delta. It is noteworthy that GLOFRIM supports the coupling of any hydrodynamic schematization, not only those bordering at a delta but also midstream applications, for instance, if the internal hydrodynamic model requirements are satisfied. Additionally, it should be mentioned that the 1-D channels of both schematizations, even with the GWD-LR accounting for islands and thus providing an effective width, do not capture the impact of both braiding and river bifurcation, which may potentially impact model results, especially at the river mouth. This is, however, not due to the inability of the hydrodynamic models to account for them, but merely Table 2. Overview of key properties of hydrodynamic schematizations coupled to PCR-GLOBWB in this study.

2-D cells 1-D cells
Smallest Largest cell size cell size DFM 41 207 12 185 2 km× 2 km 10 km× 10 km LFP 174 982 17 119 2 km× 2 km 2 km× 2 km because the chosen algorithm to derive 1-D network properties does not allow for it. For the hydrological model PCR-GLOBWB, the kinematic wave approach was used for routing outside of the coupled domain. This is required as the hydrodynamic schematizations in this test case do not cover the entire extent of the Amazon River basin, even with the kinematic wave approximation potentially introducing an error to the upstream boundary inflow applied. Since simulated discharge from PCR for the Amazon substantially under-predicts observations, we decided to apply an optional regionalized optimization technique facilitating comparison between simulated and measured discharge values (Hoch et al., 2017a). As such an optimization technique is optional and only advisable for catchment studies, a global application is thereby not constrained. In analogy to the hydrodynamic models, the surface roughness coefficient of PCR was uniformly set to 0.03 s m −1/3 . Model output of both set-ups was validated against observed GRDC discharge at Óbidos, the most downstream station of the GRDC network in the Amazon River basin (Fig. 4). To that end, Pearson's r, the RMSE, and the KGE (Gupta et al., 2009) were computed. Possible uncertainties in observed discharge (Clarke et al., 2000) were thereby omitted. Besides, simulated discharge was qualitatively compared at two locations further upstream (Loc1 and Loc2). The model time covers the period from January 1984 until December 1990 with the first year being used for spin-up of the coupled settings. This period had to be chosen due to the limitation of available GRDC data for model validation. As with the synthetic test case, run times were compared. To be able to understand water-level dynamics as simulated by both models, we compared them at three locations throughout the basin (Fig. 4). The locations were chosen such that they represent the downstream (Loc3), midstream (Loc4), and upstream dynamics in the basin (Loc5). Besides, inundation extent was benchmarked by applying three evaluation functions, using the LFP inundation results as the benchmark data set. First, the hit rate H was computed based on the subsequent equation: N LFP and N DFM indicate thereby the number of inundated cells in LFP and DFM at the same moment in time, respectively. To perform consistent benchmarking, the flexible cells of DFM were resampled to the resolution of LFP. The hit rate can vary between 0, signalling that DFM and LFP have no inundated cells in common and 1, indicating that all cells in LFP are also inundated by DFM. In addition, we determined the false alarm ratio F to also consider false positive alarms. The false alarm ratio can be obtained with In the optimal situation, F would be 0 showing that no cells are incorrectly marked as flooded in DFM, whereas a value of 1 indicates that all cells are classified as false alarms. Last, we assessed the critical success index C, which combines both hit rate and false alarm ratios into one parameter which can vary between 0 in the worst and 1 in the best scenario, indicating perfect match between both inundation For both set-ups, the River-Floodplain-Scheme was activated and flux-updating was opted for. All simulations were performed in a Linux environment with an Intel i7-4790 core at 3.90 GHz and 16 GB memory.

Results and discussion
Benchmarking discharge results against observation from GRDC at Óbidos shows that both models behave similarly. However, LFP tends to compute earlier peak flow and earlier and lower low flow (Fig. 5). Therefore, obtained coefficients of correlation are lower for LFP, while the model's skill as expressed by KGE is higher for LFP and the RMSEs are comparable (Table 3). The general deviation of simulated results to observations can be due to a range of factors, for example the lack of channel bifurcations in the schematization, the already less-accurate upstream inflow as simulated with the kinematic wave approximation or the general overprediction of discharge by PCR-GLOBWB (Hoch et al., 2017a), but have not been further explored as this would exceed the scope of this study. Even though the discrepancies in simulated discharge between the two models are not remarkable, they require further investigation as they cannot be exhaustively explained with our current process understanding. Based on the results obtained in the synthetic test case and since the hydrological forcing of both models is equal in terms of water volumes, spatial distribution, and timing, we decided to evaluate the impact of the following parameters: the actual river length and dimension in LFP compared to DFM and the sensitivity of LFP to Manning's surface roughness coefficient over large areas.
Since the routing scheme of LFP is based on a D4 system where water can flow in a southerly, northerly, easterly or westerly direction, channel length and dimension in LFP tend to differ from other hydrodynamic models that are not based on such a system, for example DFM. Reducing or increasing the unitless meandering coefficient in LFP to river-scale dimensions, however, did not show any significant impact on simulated discharge (Fig. 6a). After investigating how changes in surface roughness values in LFP may close the gap to DFM, we indeed found a more pronounced response, yet it cannot satisfactorily explain the difference in simulated discharge either (Fig. 6b). Since in the synthetic example both models can produce near-identical results if using the same friction coefficient and, because the flow regime in the Amazon basin can be described as sub-critical, different sensitivities to surface roughness over large areas can thus also be disregarded as the cause for discharge discrepancies. For the remaining gap in simulated discharge, we can at this point only make assumptions about the cause. Possible reasons include differences in internal processing of 1-D channel bathymetry, channel-floodplain interaction, and input elevation assignment due to the different grid schematizations of a flexible mesh and regular grid.
For a further first-order assessment of a possible impact of spatial resolution, we compared simulated discharge at two stations further upstream, Loc1 and Loc2 (Fig. 4). Results indeed suggest that the differences in upstream spatial resolution result in different flood wave propagations (Fig. 6c); Table 4. Local properties of water level observation stations; input elevation refers to values obtained after hydraulic conditioning of canopy-free SRTM elevation data at 15 arcsec spatial resolution.

Loc3
Loc4 Loc5 Input elevation ( Cell area DFM (m 2 ) 7.7 × 10 6 7.7 × 10 6 30.9 × 10 6 covered flow distance and peak discharge in LFP is increasingly delayed compared with DFM, presumably due to the larger floodplain cells in DFM. Besides, the timing of the rising and falling limb is affected. Higher simulated discharge by LFP than DFM at Loc1 does not only indicate that the impact of cell resolution is reduced with downstream distance and additional tributaries contributing to the flood wave but also that discharge computations in upstream areas can be easily affected as the discrepancy in cell size is largest there. Assessing differences in simulated water-level dynamics at the observation locations, we cannot find any particularly prevailing difference between the models' response to hydrological forcing (Fig. 6d). In general, we observe that modelled water levels are comparable, yet with locally differing patterns. While at the most upstream station, Loc5, DFM simulates lower water levels than LFP, this is opposite at the most downstream station Loc3, and at Loc4 both models provide comparable results. Besides differences in actual water levels, both models show a comparable response to model input, yet LFP tends to yield earlier peak water levels than DFM, which concurs with the discharge dynamics observable. The reason for differences in simulated water levels and Figure 6. Results of the sensitivity analysis of (a) the meandering coefficient and (b) both 1-D and 2-D surface roughness coefficients in LFP. Since the D4 system in LFP can both decrease and increase effective river dimension, the dimensionless meandering coefficient was not only reduced from default (1.0) to 0.09, 0.08, and 0.07, but also increased to 1.1, 1.2, and 1.3. As default Manning's surface roughness is already low (0.03 s m −1/3 ), coefficients were increased to 0.05, 0.07, and 0.09; (c) Comparison of simulated discharge across basin to assess impact of spatial resolution on simulated discharge; to that end two additional observations upstream of Óbidos were introduced, Loc1 (most upstream) and Loc2 (intermediate upstream); (d) Comparison of simulated water depth at three different locations (Loc3, Loc4, and Loc5) randomly picked within the domain. their dynamics could not be fully attributed to one specific cause. For example, the more pronounced difference in water levels at Loc1 may be a local effect due to spatial feedback dynamics between neighbouring cells of an observation station (Hardy et al., 1999), may be related to slight differences in model schematization at the downstream boundary or to backwater effects in the delta regions as a result of different influences of the downstream water level boundary. Furthermore, discrepancies are likely to be related to differences in surface elevation simulated at the observation stations due to the differences in gridding between DFM and LFP. Indeed, assessing the local properties of the observation stations revealed that the surface elevation in DFM is higher than in LFP (Table 4). Last, results indicate that differences in gridding and therefore cell size may thus have locally impacted the overall water levels too since the above-discussed discharge simulations in upstream areas exhibited clear deviations between both models (Fig. 6c).
Regarding the run times of the two coupled set-ups, we find that it takes LFP around 6 h to simulate the entire simula- tion period of 7 years, that is model time plus spin-up, while performing the same simulation with DFM takes around 7 h ( Table 3). The difference in run times is less pronounced than for the synthetic test case, which can be related to the lower number of cells in DFM compared to LFP due to use of a flexible mesh. In addition, a more computationally expensive interaction between the 1-D and 2-D domain in DFM could also affect run times. As DFM is in general a multi-purpose tool whose application is not limited to inundation modelling, it is not unexpected that it may be slightly slower than programmes specifically tailored for efficient large-scale inundation modelling such as LFP.
We find that inundation extents obtained at the end of the simulation runs with DFM and LFP are comparable, yet far from identical (Fig. 7). Due to the larger inundation extent of DFM, a hit rate of 0.85 is obtained, indicating that 85 % of extent as simulated by LFP is also simulated by DFM. Especially differences in inundated extent in upstream areas and along small reaches can explain the obtained false alarm ratio of 0.50 (Table 5). These differences are also responsible for the critical success index of 0.46 corroborating that in bit less than half of the cells inundation extent is simulated by both models. A model agreement of 46 % is slightly higher than the 30-40 % found by Trigg et al. (2016) for a benchmarking study of global flood hazard models. This, in fact, suggests that the choice of numerical scheme and model schematization alone can greatly impact upon inundation, confirming that differences in model forcing and boundary conditions do not act alone as a cause of modelled inundation difference, which could have been the case in the results obtained by Trigg et al. (2016).
A main cause for the differences observed for regions further upstream is that DFM tends to compute larger flood extent than LFP: with DFM having larger cells in upstream areas due to the flexible meshing, a larger 2-D area is instantly marked as inundated for DFM once overbank flow occurs. This loss of level of detail in DFM is the concession to be made for a reduced number of grid cells and hence po- tentially faster computations in the 2-D domain. For more downstream regions, differences in inundation extent are primarily present at small river channels while floodplain inundation is comparable. This, however, can to some extent be attributed to differences in how the 1-D domain is implemented in the models, with DFM using grid-size independent vectors and LFP using grids at the overall spatial resolution of the schematization. Given the overall larger inundation extent simulated by DFM, the above-discussed deviations in simulated discharge and in particular the more pronounced wave attenuation in DFM may be explained as return flows from the floodplain to the channel as they seem to be faster in LFP than in DFM.

Conclusion and recommendations
In this study, we presented GLOFRIM, a GLObally applicable computational FRamework for Integrated hydrologicalhydrodynamic Modelling. In its current version, it provides an environment to one-directionally couple the global hydrological model PCR-GLOBWB with two hydrodynamic models: Delft3D Flexible Mesh solving the full SWEs, and LISFLOOD-FP solving the LIEs. By linking hydrology to hydrodynamics, it is possible to take advantage of the strengths of both while at the same time compensating their weaknesses. We define five main assets of GLOFRIM: (i) it is openly accessible and hence can be directly applied and adapted to specific purposes, and extended with other models; (ii) by employing a global hydrological model to obtain model forcing, the framework can easily be applied globally; (iii) models to be coupled may be selected depending on their local performance and thus more relevant processes can be captured; (iv) the spatially explicit coupling scheme can be extended to a full feedback loop between hydrology and hydrodynamics; (v) thorough benchmarking and ensemble modelling of hydrodynamic models is supported by providing identical hydrological forcing for experiments.
GLOFRIM at present provides a range of options for model coupling. Users can choose between coupling PCR to either the 1-D or 2-D domain, can specify whether to update hydrodynamics through states or fluxes, and can run hydrodynamic models in both non-Cartesian spherical and projected coordinate systems. It is generically written and does not require any a priori knowledge of the code as all important settings are specified in a separate settings file.
Besides PCR as well as DFM and LFP, there are many other global hydrological and hydrodynamic models available which have their individual advantages. As the framework is freely and openly available, its design can easily be extended and adapted to cater for the coupling of other hydrological or hydrodynamic models, merely requiring the implementation of the BMI into each model to be added. Eventually, adding a 1-D continental hydrodynamic model such as CaMa-Flood (Yamazaki et al., 2011) would allow for replacing the kinematic wave approximation of PCR to provide more accurate upstream boundary inflow to the domain with explicit high-resolution 2-D floodplain computations. Employing a BMI does not change the model functionality but it does provide a range of added functions. Furthermore, not all model variables need to be exposed, only those to reproduce model geometry, distinguish between 1-D and 2-D cells, and to update model states. We therefore recommend considering this option for future model developments and will also aim to incorporate other models ourselves. To our knowledge, spatially explicit model coupling on a global scale by means of such a framework is unprecedented. Consequently, user experiences and lessons learnt are still sparse and any initiatives regarding framework extension are therefore kindly received by the authors, as well as feedback and experiences. We also recommend the testing and application of it in other study areas and under different boundary conditions to further evaluate the code, process flow, and applicability.
Before applying GLOFRIM in an actual test case, we performed a simple synthetic test case to obtain a first-order insight into how both models may differ regarding their computational complexity. Thereby both the 1-D and 2-D domain were forced by a synthetic inflow signal and simulated discharge was evaluated along the flow path. Results show that both models produce the same response to the signal despite the difference in solver complexity. The results obtained are in line with previous studies showing that for sub-critical flow regimes discharge results should be similar (De Almeida and Neal et al., 2012b).
Both hydrodynamic models were then applied within GLOFRIM for the Amazon River basin and evaluated regarding simulated discharge, water levels, run time, and inundation extent, also constituting a first comparison of largescale flexible mesh and regular grid applications. Assessing simulated discharge shows that both models exhibit comparable results with LFP tending to compute earlier and slightly increased peak discharge estimates. As thorough testing of plausible causes did not show significant improvements, we speculate that differences in processing of 1-D channel bathymetry, interaction between 1-D channels and 2-D floodplains or assignment of input surface elevation data to the different grids may impact discharge results. The latter is supported by discharge observations made in farther upstream areas where differences in grids are largest. A more in-depth analysis of these differences was, however, outside the scope of this study and thus needs to be performed in a follow-up study. As the general overprediction of observed discharge at Óbidos can partly be attributed to the absence of hydrological processes on inundated floodplains, it is envisaged to extend the current code such that it also caters for a full feedback loop between hydrodynamics and hydrology.
Water levels simulated by both models differ locally, yet only slightly. These discrepancies between both models are most likely due different grid schematizations in DFM and LFP, which results in locally differing elevation values and cell areas and thus influences simulated water levels. Due to differences in model structure and design, downstream boundary conditions had to be implemented slightly differently, possibly also impacting water level results in particular for more downstream stations. As it was the aim of this paper to introduce the computational framework applied, a more elaborated evaluation of causes for water level deviations is future work.
A key parameter for large-scale modelling is run time. In the current study, the schematization of LFP contains more than 4 times the number of 2-D cells than DFM while the number of 1-D cells is 40 % higher in LFP than in DFM. Despite the greater number of cells, LFP has a slightly shorter run time. This is in line with the results obtained in the synthetic test case, yet the relative difference is reduced due to the application of flexible meshes for the 2-D domain and the nature of the coupling algorithm applied: because water was coupled directly into the 1-D channels, flow over the 2-D domain was limited and, as a result, so was the impact of differences in computational efficiency of the models. Differences in run times may also be related to more fundamental factors, such as the degree of code optimization applied. Additionally, DFM was, in contrast to LFP, not explicitly developed for efficient inundation modelling, but as a multipurpose tool including several additional physical processes, such as the potential to simulate 3-D flow, estuarine processes or hydrogeomorphologic dynamics, which could also result in longer run times. To better understand causes of run time discrepancies, further model development, testing, and evaluation is therefore recommended.
To benchmark LFP and DFM in terms of simulated inundation extent in the Amazon River basin, the hit rate H, the false alarm ratio F, and the critical success index C were determined. In general, both models agree about as often as they disagree, indicating that both DFM and LFP predict simulation extent for around half of all cells. This level of agreement is slightly higher than the one obtained by Trigg et al. (2016) and is a strong indication that the model geometry and numerical scheme play a similarly strong role in influencing model accuracy as the boundary conditions and model forcing applied in global flood hazard models. Moreover, a higher value could not be obtained due to the impact of the flexible mesh, especially for upstream areas where DFM runs at cells that are a factor of 25 larger than in LFP. While such large cells contribute strongly to shorter run times, they may also have implications for detailed flood hazard estimates which can be strongly hampered. In the case of employing a flexible mesh, it seems as if an a priori decision has to be made where and to which extent such models are supposed to provide fine-scale results or whether computational efficiency is the main aim -both at the same time does not seem to be feasible from our results. We hence recommend testing the application of flexible meshes for largescale riverine inundation modelling in more detail to obtain a better understanding of the trade-off to be made between grid refinement and model accuracy.
With the presented computational framework GLOFRIM and the satisfactory results obtained, we trust to have contributed to the current development of model coupling and integration, and to have provided an openly accessible tool that facilitates more accurate large-scale flood hazard estimates. We hope that, eventually, the integration of hydrological and hydrodynamic models will lead to improved flood risk assessments and planning of climate change impact mitigation and adaption measures.
Code and data availability. The code of GLOFRIM and the BMI versions of LISFLOOD-FP and PCR-GLOBWB are openly accessible and freely downloadable at https://doi.org/10.5281/zenodo.597107 (Hoch et al., 2017b).
Author contributions. FB and JMH developed the BMI adapter for LISFLOOD-FP. JMH, RvB, and JCN developed the code of the computational framework. JCN, RvB, HCW, PDB, and MFPB supervised the research and provided important advice. JMH designed and executed the research, and also prepared the manuscript as well as code, with contribution from all co-authors.
Competing interests. The authors declare that they have no conflict of interest. Author Jeffrey C. Neal is a topical editor of the journal.
Acknowledgements. This study was financed by the EIT Climate-KIC programme under project title "Global high-resolution database of current and future river flood hazard to support planning, adaption and re-insurance". We also want to acknowledge the contributions of Climate-KIC and University of Bristol to realize a research stay at the University of Bristol. Special thanks are reserved for Arthur van Dam and Herman Kernkamp from Deltares for their support in applying Delft3D Flexible Mesh and Edwin Sutanudjaja for PCR-GLOBWB advice. Last, we want to express our gratitude to an anonymous reviewer and Dai Yamazaki for evaluating a previous version of the manuscript and providing invaluable feedback.
Edited by: Bethanna Jackson Reviewed by: Dai Yamazaki and one anonymous referee