Towards improved Euro-Mediterranean discharge simulations in regional coupled climate models: a comparative assessment of hydrologic performance

Hamitouche, Mohamed; Fosser, Giorgia; RafieeiNasab, Arezoo; Anav, Alessandro

doi:10.5194/gmd-19-2881-2026

Articles | Volume 19, issue 7

Model evaluation paper

16 Apr 2026

Abstract

River discharge into the Mediterranean Sea is a vital component of the regional water cycle, influencing ecological and climatic dynamics. Although some regional coupled models that include a river routing component exist for the Mediterranean region, their performance in reproducing river discharge is poor. This study compares the hydrological routing models CaMa-Flood and WRF-Hydro for discharge simulations into the Mediterranean Sea, with the prospect of future coupling into regional and Earth system models such as ENEA-REG. Evaluating their performance across key basins, this study highlights CaMa-Flood's computational efficiency but underperformance in flow variability and high-flow extremes, contrasted by WRF-Hydro's superior timing and bias reduction, especially after calibration. In fact, results indicate that the calibration improved WRF-Hydro's metrics, including Kling–Gupta Efficiency (KGE) and lag times, underscoring its potential for precise discharge predictions at higher computational costs. These findings offer critical insights for advancing regional coupled Earth system models, enhancing hydrological forecasting, and addressing basin-specific hydrological challenges.

Received: 14 Jun 2025 – Discussion started: 28 Jul 2025 – Revised: 15 Mar 2026 – Accepted: 26 Mar 2026 – Published: 16 Apr 2026

1 Introduction

River discharge into the Mediterranean Sea is a critical component of its water budget, alongside the net inflow from both the Atlantic through the Strait of Gibraltar and the Black Sea via the Dardanelles Strait, as well as evaporation and precipitation (Pinardi et al., 2015; Pinardi and Masetti, 2000). As one of the primary freshwater sources, river discharge plays a pivotal role in shaping the basin's hydrological and ecological dynamics. Not only does it provide essential freshwater input, particularly during spring when abundant precipitation and snowmelt enhance the discharge, but it also transports nutrients and minerals that influence coastal and sub-basin ecosystems (Struglia et al., 2004). Moreover, variability in river discharge, whether natural or anthropogenic, can modulate the Mediterranean's thermohaline circulation on decadal scales, affecting salinity, dense water formation, and oxygenation rates across the basin (Zavatarelli et al., 1998).

Timely and accurate river discharge estimates into the Mediterranean are critical for managing water resources and related risks in the region (Cisterna-García et al., 2025). In a broader context, understanding the interplay between regional climate change and hydrological processes is essential, and underscores the need to study the coupling of Earth system components, including atmospheric, hydrological, and oceanic processes. Since the early 21st century, major international research programs such as WCRP (World Climate Research Programme), IGBP (International Geosphere-Biosphere Program), GEWEX (The Global Energy and Water Cycle Experiment) and CORDEX (COordinated Regional climate Downscaling EXperiment) have emphasized the importance of integrating regional climate models (RCMs) with hydrology and ocean models to address these challenges. Coupled atmospheric–hydrological–ocean models provide a framework to simulate the full water cycle and its feedback mechanisms between land, atmosphere and ocean. Such models reveal how terrestrial water flows influence the broader water cycle and regional climate dynamics over seasonal to decadal scales. For example, terrestrial water flow can alter atmospheric boundary layers and modulate convective precipitation during shorter time scales (Amelia, 2022). By enhancing the representation of water cycle processes, these coupled models aim to improve simulation accuracy and forecasting capabilities. This evolution from traditional hydrological models to coupled models reflects the growing need for tools that capture the intricate interactions between land, ocean, and atmosphere, offering enhanced weather forecasts and improved predictions of river flow and extreme events.

Regional coupled models and climate change studies in the Mediterranean require a high-resolution discharge component to correctly reproduce the complex orography and land-sea distribution of the Mediterranean region and thus effectively close the coupling among the Earth system components (Hagemann et al., 2020). However, the first coupled models did not consider the hydrological component and the coupling was limited to atmosphere-ocean-land only (e.g. RegIPSL, Shahi et al., 2022; EBU-POM, Djurdjevic and Rajkovic, 2010; and MORCE, Drobinski et al., 2012), so that rivers are inadequately represented. This restricts their ability to make accurate predictions of river flow and forecasts for floods and droughts, necessitating downstream hydrological modelling, independently of land-atmosphere feedback or the advantages of data assimilation. Additionally, while some models incorporate atmosphere–ocean–land–river coupling, they struggle to meet the high-resolution requirements essential for precise discharge simulations.

Recent advances have led to the development of several complex coupled models aimed at achieving fully integrated hydrological predictions for the Mediterranean region. Despite their sophistication in simulating complex Earth system interactions, these models still face significant challenges. In particular, some models systematically underestimate freshwater input from river runoff, leading to inaccuracies in discharge predictions and contributing to surface water salinification in the Mediterranean (Anav et al., 2021; Reale et al., 2020; Storto et al., 2023). This underestimation seems to be linked to the Hydrological Discharge (HD) model (Hagemann and Dümenil, 1997), a river routing model used to simulate river discharge at different horizontal resolutions (e.g. 5 min in MESMAR and 0.5° in ENEA–REG coupled models). The HD model uses a pre-parametrization based on a linear reservoir routing concept with pre-defined reservoir numbers and temporal (LSM) (Stacke and Hagemann, 2021), which neglects the energy budget and overestimates runoff. Consequently, replacing the biased routing model in these coupled systems has become essential to ensure a more accurate representation of the water cycle.

At the same time, enhancing discharge simulations would also benefit from selecting the most appropriate land surface runoff model based on its runoff generation mechanism. A recent study by Hamitouche et al. (2025a) analysed the impact of seven different runoff schemes within the Noah-MP LSM on global discharge simulations across diverse climate regions and found that the Schaake runoff scheme performed best in warm temperate regions, including the Mediterranean basins. This study utilized the CaMa-Flood hydrodynamic model (Yamazaki et al., 2011) for discharge simulation, demonstrating overall good performance against observational discharges. However, the analysis also revealed certain limitations, such as delays in capturing seasonal peak flows due to inherent constraints in CaMa-Flood, which were accurately resolved by the WRF-Hydro model (Gochis et al., 2021), tested in the same study. Additionally, the study identified significant biases, particularly in high-flow extremes, emphasizing the need for ongoing calibration of tuneable parameters to improve the accuracy of hydrological predictions.

This highlights the importance of thoroughly evaluating standalone routing models for their performance and limitations, before integrating them into regional coupled climate or Earth system models – such as the ENEA-REG atmosphere–land–river–ocean (ALRO) coupled model, developed within the framework of the Mediterranean CORDEX (Med-CORDEX) initiative. Med-CORDEX aims to advance fully coupled regional climate simulations over the Euro-Mediterranean domain, which encompasses the Mediterranean and Black Seas and their contributing catchments (excluding the Nile), by improving the representation of key Earth system components, including atmospheric processes, land surface, hydrology and ocean dynamics (Ruti et al., 2016). The Med-CORDEX Phase 3 protocol outlines common requirements for domain extent, spatial resolution, and coupling strategies – mandating, for example, a minimum resolution of 12 km for the atmosphere/land and 10 km for the ocean, and requiring river-to-ocean coupling. Further details on the protocol are available at https://doi.org/10.5281/zenodo.11659642 (Somot et al., 2024).

Figure 1. Med-CORDEX simulation domain with its drainage network and the gauge stations (red dots) used in this study. The selected basins for evaluation and calibration are given distinct colours. The dashed contours refer to calibration sub-domains.

Considering the poor performance of the existing regional coupled models for the Mediterranean region in reproducing discharge, as well as some of their limitations, the aim of this study is to compare the performance of two process-based hydrological routing models, i.e. CaMa-Flood and WRF-Hydro, driven by a Med-CORDEX regional coupled model (ENEA–REG), in reproducing the discharge of the most important Mediterranean rivers. This evaluation provides valuable information, as the analysed models could be regarded as alternatives to the river component used in current regional coupled models. CaMa-Flood, a global river routing model, is widely recognized for its computational efficiency and ability to simulate river discharge at large scales. However, within the Med-CORDEX domain (Fig. 1), it has been utilized only outside the context of regional coupled models. On the other hand, WRF-Hydro, the hydrological extension of the WRF atmospheric model, is designed for high-resolution hydrological predictions, with multi-scale capabilities, enabling it to represent processes on various spatial scales (Gochis et al., 2021). Over the Mediterranean region, its application has primarily been limited to small, isolated and relatively undisturbed basins, and to short time periods (Galanaki et al., 2021; Senatore et al., 2015; Sofokleous et al., 2023, 2024). In contrast, in this study, simulations were conducted at a daily time scale over a long-term period (1990–2014) for the entire Med-CORDEX domain. The evaluation focused on several Mediterranean basins as well as the Danube River, which drains into the Black Sea, allowing for robust regional generalizations and comparative analyses across diverse hydrological regimes. Additionally, the study analysed the role of parameter calibration in improving discharge simulations, with a particular focus on WRF-Hydro, leveraging the capabilities of the NCAR WRF-Hydro calibration package.

This study addresses the following key questions:

Can WRF-Hydro or CaMa-Flood serve as effective alternatives to improve hydrological simulations within Euro-Mediterranean regional coupled models?
Can calibration enhance WRF-Hydro performance, and to what extent?

The paper is structured as follows: after the presentation of the models and methods used (Sect. 2), the first part of the results (Sect. 3.1) compares the two hydrological models in their default configurations, offering insights into their baseline performance. This approach establishes a basis for comparison and lays the groundwork for future studies to refine and adapt these models for new applications, particularly in Mediterranean hydrological and climatic contexts. The second part of the results (Sect. 3.2) evaluates the impact of calibration on WRF-Hydro discharge simulations, performed with the NCAR WRF-Hydro calibration package, and highlights the potential of parameter optimization to enhance hydrological model performance.

2 Materials and methods

2.1 Study area and river discharge observations

The river basins draining into the Mediterranean Sea encompass over 5 million km², including the Nile basin, but excluding rivers flowing into the Atlantic Ocean from Portugal and Spain (Lionello et al., 2012; Ludwig et al., 2009). Most of these catchments are medium to small-scale, with only a few major basins exceeding 80 000 km² (Lionello et al., 2012). The ten largest rivers contributing to Mediterranean discharge include Rhone, Po, Drin-Buna, Nile, Neretva, Ebro, Tiber, Adige, Seyhan and Ceyhan rivers (Ludwig et al., 2009), with 71 % of the total discharge originating from northern Mediterranean countries, 12 % from eastern regions (Turkey), and 17 % from southern areas, primarily the Nile. Notably, the Rhone and the Po alone contribute 25 % of the northern discharge. Annual freshwater input to the Mediterranean and Black Sea is estimated at 305–737 km³ yr⁻¹ (Struglia et al., 2004).

The ENEA-REG model (Anav et al., 2021), developed within the Med-CORDEX framework, incorporates into its ocean component river discharge from 18 major Mediterranean rivers simulated by the river routing component. Among these are several of the largest Mediterranean rivers – Rhone, Po, Ebro, Adige, Tiber, and Ceyhan – along with additional basins such as Drin, Maritsa, Goeksu, Vjosa, Jucar, Buyuk Menderes, Arno, Kopru, and Struma. For the Nile, a climatological monthly mean is prescribed, as suggested by the Med-CORDEX protocol (https://doi.org/10.5281/zenodo.11659642, Somot et al., 2024), due to the atmospheric model's limited domain coverage of the basin and the significant anthropogenic modifications to its natural discharge.

In this study, the validation focuses on 10 of the ENEA-REG routed rivers, in addition to the Danube River (Fig. 1), chosen based on the availability of at least five consecutive years of daily observations after 1990. These rivers include: Maritsa, Goeksu, Arno, Kopru, Rhone, Po, Ebro, Ceyhan, Adige, and Tiber. Notably, six of these (Rhone, Po, Ebro, Tiber, Adige, and Ceyhan) are also among the ten largest Mediterranean rivers listed earlier, providing a representative mix of major and medium-sized basins. The selected rivers span different climatic and morphologic conditions, ranging from mountainous alpine regions with pluvio-nival hydrological regimes (e.g., Rhone, Po, Adige) to the semi-arid climate of southern Turkey's Ceyhan River. Other river basins were excluded due to insufficient daily observational data or records shorter than five years.

The Júcar and Nile rivers were specifically excluded because their flow is heavily influenced by human interventions, such as reservoirs and water diversions, which are not explicitly represented in the modelling setup. In the case of the Júcar, its relatively small drainage area (22 200 km²) combined with an exceptionally dense and complex regulation system – including major reservoirs for flood control and water supply (e.g., Alarcón, Contreras, Tous, Bellús, Forata), hydropower reservoirs (e.g., Molinar, Cortes, Naranjero, La Muela), additional smaller reservoirs, river–aquifer connections, and inter-basin water transfers (Momblanch et al., 2014; Suárez-Almiñana et al., 2017) – strongly alters both the magnitude and timing of discharge. These extensive modifications suppress the natural hydrological signal, making it difficult for hydrological models to reproduce even the basic flow variability when compared with observations; meaningful validation is therefore not feasible under the present modelling configuration. For the Nile, in addition to the substantial anthropogenic modifications along its course, the Med-CORDEX atmospheric domain does not cover the full basin, which is why the protocol prescribes a monthly climatological discharge. As a result, the Nile cannot be included in the validation of dynamically simulated river flows.

Although the inclusion of additional basins, especially from the southern and eastern sides, would provide a more spatially complete assessment, their contribution to the total freshwater flux at the Mediterranean land–ocean interface is small and therefore would not modify the main findings of the analysis.

For each selected river, simulated discharge was compared to observations from the nearest available station to the river mouth, covering upstream areas between 1900 and 95 000 km² (807 000 km² for the Danube). Observational data were sourced from the Global Runoff Data Centre (GRDC, 2024), the Hydrographic Studies Center of CEDEX (CEH-CEDEX, 2024), and from Hagemann et al. (2020). To ensure accurate hydrograph construction and validation metrics, simulated discharge values corresponding to missing observational data were removed.
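
The removal of simulated values on days without observations can be sketched as follows. This is a minimal illustration only: the function name `pair_valid` and the convention that gaps are stored as `None` or NaN are assumptions, not details taken from the study.

```python
import math

def pair_valid(sim, obs):
    """Keep only the days on which an observation is available.

    Assumption: missing observations are stored as None or NaN.
    """
    pairs = [(s, o) for s, o in zip(sim, obs)
             if o is not None and not math.isnan(o)]
    sim_valid = [s for s, _ in pairs]
    obs_valid = [o for _, o in pairs]
    return sim_valid, obs_valid

# Hypothetical daily discharge values (m3 s-1) with two gaps in the record
sim = [10.0, 12.0, 15.0, 11.0]
obs = [9.0, float("nan"), 14.0, None]
sim_v, obs_v = pair_valid(sim, obs)
print(sim_v, obs_v)
```

Filtering both series together keeps the hydrographs and all metrics aligned on the same set of days.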

2.2 Model description and experimental setup

This study is based on the use of the CaMa-Flood and WRF-Hydro hydrological models for daily discharge simulations, driven by the ENEA-REG atmosphere–land–ocean coupled model run at 12 km resolution over the Med-CORDEX domain. Within ENEA-REG, the WRF model is used to dynamically downscale ERA5 data (Hersbach et al., 2020) and simulate the atmospheric variables, while the Noah-MP model simulates the runoff. The configurations of both models (WRF and Noah-MP) are provided in Tables S1 and S2 in the Supplement. All the ENEA-REG variables required as input by the two hydrological models are first regridded to 6 km resolution using conservative remapping. In particular, CaMa-Flood uses as input daily runoff (surface + subsurface runoff, Fig. S1), while WRF-Hydro, which couples Noah-MP with the hydrological routing model, requires a set of atmospheric state variables – such as air temperature, surface pressure, specific humidity, horizontal wind components (10 m), downward shortwave and longwave radiation, and rainfall rate – provided at 6-hourly intervals (Fig. S2).

To guide the reader through the experimental design, we structured the modelling experiments from the simplest to the most complex configuration. First, we evaluate the ENEA-REG–driven CaMa-Flood and WRF-Hydro in their default setups, ensuring a fair comparison between the two models. These default configurations, described in Sect. 2.2.1 and 2.2.2, constitute the core model intercomparison of this study. In a second step (Sect. 2.2.3), we introduce an additional level of complexity by calibrating key hydrological parameters of WRF-Hydro. This supplementary analysis assesses how much calibration can further enhance discharge performance, providing insight into potential improvements when computational resources permit.

Both CaMa-Flood and WRF-Hydro hydrological models are run for the period 1990–2014, after five years of spin-up. None of the simulations considers reservoir operations and lakes, due to the lack of consistent and comprehensive information on reservoir management and characteristics across all Mediterranean basins. This choice ensures a fair spatial evaluation across the study domain, but we acknowledge it as a limitation that may contribute to reduced performance in some rivers.

2.2.1 CaMa-Flood processes and configuration

CaMa-Flood (v4.11, Yamazaki et al., 2011, 2013) is a global-scale distributed hydrodynamics model designed to simulate river discharge, water levels, and floodplain inundation by routing runoff from land surface models through a predefined river network map. The model employs a Manning's friction coefficient of 0.03 for main river channels and automatically adjusts its routing time step to satisfy the Courant–Friedrichs–Lewy condition as used in previous studies (e.g., Bates et al., 2010; Yamazaki et al., 2013).
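
The adaptive time step mentioned above can be illustrated with a short sketch of a Courant–Friedrichs–Lewy (CFL) constraint for a shallow-water wave. The function names, the safety coefficient `alpha` and the cap `dt_max` are illustrative assumptions and are not taken from the CaMa-Flood source code.

```python
import math

G = 9.81  # gravitational acceleration (m s-2)

def cfl_time_step(dx_m, depth_m, alpha=0.7):
    """Largest stable step (s) for a wave travelling at sqrt(g * h).

    alpha < 1 is a safety coefficient (assumed value).
    """
    return alpha * dx_m / math.sqrt(G * depth_m)

def adaptive_step(channels, dt_max=3600.0, alpha=0.7):
    """Most restrictive step over all (length, depth) pairs, capped at dt_max."""
    dt = min(cfl_time_step(dx, h, alpha) for dx, h in channels)
    return min(dt, dt_max)

# Hypothetical channel segments: (length in m, flow depth in m)
segments = [(6000.0, 1.0), (6000.0, 10.0)]
print(adaptive_step(segments))
```

Deeper channels carry faster waves, so the deepest segment controls the step for the whole network.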

CaMa-Flood employs the local inertial equation (a simplified form of the Saint-Venant equation that excludes the advection term) to balance computational efficiency with physical accuracy. River basins are represented as unit catchments with subgrid parameters for floodplain topography, allowing floodplain inundation to be modelled as a subgrid-scale process. River discharge is calculated between upstream and downstream catchments based on the grid-vector hybrid river network, while water levels and inundation areas are diagnosed from water storage in each catchment, assuming uniform water surface elevation. Water storage is updated by conserving mass, incorporating upstream inflows, downstream outflows, and local runoff inputs. Floodplains of neighbouring unit catchments can exchange flows through so-called bifurcation channels, making it a quasi-2D model.
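
As an illustration of the local inertial approach, the sketch below advances the discharge per unit width with the semi-implicit friction form commonly associated with Bates et al. (2010). It is a simplified, single-link example under stated assumptions (unit-width flow, constant depth and slope during the step) and is not the CaMa-Flood implementation; the function name and argument names are hypothetical.

```python
G = 9.81  # gravitational acceleration (m s-2)

def local_inertial_update(q, depth, slope, dt, manning_n=0.03):
    """Advance the unit-width discharge q (m2 s-1) by one time step.

    slope is the water-surface drop per metre in the flow direction,
    i.e. -(d(h + z)/dx). The advection term is omitted and friction is
    treated semi-implicitly, which keeps the update stable.
    """
    if depth <= 0.0:
        return 0.0
    friction = G * dt * manning_n ** 2 * abs(q) / depth ** (7.0 / 3.0)
    return (q + G * depth * dt * slope) / (1.0 + friction)

# Starting from rest, repeated updates approach the Manning steady state
q = 0.0
for _ in range(2000):
    q = local_inertial_update(q, depth=2.0, slope=1.0e-4, dt=10.0)
print(q)
```

With constant depth and slope, the fixed point of this update is the Manning discharge h^(5/3) S^(1/2) / n, which is a convenient consistency check for the scheme.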

The river network map and subgrid topography parameters were extracted using the FLOW upscaling algorithm (Yamazaki et al., 2009), applied to MERIT Hydro hydrography maps (Yamazaki et al., 2019) and MERIT DEM data at 3 arcsec (90 m) resolution. Errors and obstacles in the DEMs, such as vegetation canopy, levees, water surface contamination, and radar speckles, were corrected to ensure consistent downhill flow along streamlines (Yamazaki et al., 2012).

2.2.2 WRF-Hydro processes and configuration

WRF-Hydro (v5.2, Gochis et al., 2021) is a hydrological model designed to simulate surface and subsurface runoff processes, either as a standalone model as in our case or coupled with the WRF atmosphere model. It integrates overland flow, subsurface flow, baseflow, and channel routing to represent river discharge and hydrodynamics. WRF-Hydro solves the Boussinesq equation for saturated subsurface lateral flow, accounting for hydraulic gradients driven by topography, saturated soil depth, and hydraulic conductivity. It incorporates exfiltration from fully saturated cells and infiltration excess from the integrated Noah-MP land surface model (LSM), ensuring a dynamic interaction between surface and subsurface processes that together contribute to the 2D overland flow and the 1D channel routing.

Overland and channel flows are computed using the diffusive wave equation, an efficient approximation of the Saint-Venant equations that captures backwater effects and shallow water dynamics. The current surface water representation assumes one-way water flow into channels, without simulating overbank flow. Additionally, baseflow is represented using a conceptual storage-discharge bucket model, which gradually returns water to downstream channels.
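
A conceptual storage-discharge bucket of this kind can be sketched as below. The exponential outflow law and the names `coeff`, `expon` and `z_max` are assumptions chosen for illustration; the values are hypothetical and the code does not reproduce the WRF-Hydro source.

```python
import math

def bucket_step(storage, recharge, coeff, expon, z_max):
    """One time step of an exponential storage-discharge bucket.

    All quantities are in consistent depth units per step.
    Returns (new_storage, outflow_to_channel).
    """
    storage += recharge
    spill = max(storage - z_max, 0.0)      # overflow once the bucket is full
    storage -= spill
    outflow = coeff * (math.exp(expon * storage / z_max) - 1.0)
    outflow = min(outflow, storage)        # cannot release more than is stored
    storage -= outflow
    return storage, outflow + spill

# Hypothetical recession: no recharge after an initial filling
s = 20.0
for day in range(5):
    s, q_base = bucket_step(s, 0.0, coeff=1.0, expon=3.0, z_max=50.0)
    print(day, round(s, 3), round(q_base, 3))
```

Because the outflow grows with storage, the release is largest just after recharge and then decays, which is the gradual return of water to the channels described above.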

Both the land surface model (LSM) and the hydrological routing model components of WRF-Hydro were run on the same 6 km spatial resolution grid. The land surface simulations were conducted at hourly intervals, while discharge was routed using a conservative terrain and channel routing timestep of 6 min. Streamflow outputs were recorded at a daily frequency.

For a fair comparison, WRF-Hydro was run using the same topography generated for CaMa-Flood. However, because WRF-Hydro requires high-resolution topography to generate the drainage network through its GIS preprocessing tool, the upscaled DEM was reconditioned both automatically and manually. This reconditioning was based on an available validated drainage network at a close resolution of 1/16° (Wu et al., 2012) and involved minimal intervention to avoid significantly altering flow velocity.

2.2.3 WRF-Hydro calibration

Hydrological models, whether physical or conceptual, require calibration to achieve reliable streamflow simulations. This necessity arises from two key challenges: (i) the inability to measure all model parameters at the scale of application, and (ii) the simplification and spatiotemporal discretization of the complex and highly variable rainfall-runoff processes (Beck et al., 2020).

To maintain a fair and balanced comparison between the two models evaluated in this study, the WRF-Hydro calibration is presented as a distinct and complementary analysis rather than as part of the core intercomparison. This decision reflects several methodological and practical constraints. Unlike WRF-Hydro, neither CaMa-Flood nor the Noah-MP LSM currently allows basin-specific or spatially variable parameter calibration, and Noah-MP is not two-way coupled with CaMa-Flood in our framework. Additionally, calibration of WRF-Hydro was feasible due to the availability of the PyWrfHydroCalib package, whereas no equivalent calibration tool exists for CaMa-Flood. Given the substantial computational cost of regional-scale calibration, we therefore present the WRF-Hydro calibration results as an “add-on” analysis, illustrating the potential benefits where resources permit while keeping the main model comparison consistent and unbiased.

The WRF-Hydro modelling system includes various predefined hydrological parameters, some of which are influenced by land use (e.g., vegetation type and density) and soil type (e.g., silt, clay, loam, sand) (Cerbelaud et al., 2022; Verma and J., 2023). These parameters can be adjusted or calibrated to account for the specific regional orographic and climatic characteristics (Lahmers et al., 2019; Yu et al., 2023), such as those of the Mediterranean basin, where default parameter values are often insufficient. Calibration involves determining optimal tuneable coefficients by proportionally adjusting these spatial parameters relative to their default values, ultimately improving simulation accuracy (Cerbelaud et al., 2022; Verma and J., 2023). The WRF-Hydro calibration parameters are categorized into six groups related to: soil properties, runoff processes, groundwater dynamics, vegetation behaviour, snowpack melting, and channel characteristics. In this study, calibration focused on the first four groups, covering 16 parameters (Table S3), while snowmelt parameters were left at their default settings to ensure consistency across basins and fairness in the intercomparison, and for potential future regionalization over other snow-free basins. We acknowledge that this choice may limit performance in snow-influenced basins and that future work could further refine results by including these parameters. Similarly, channel parameters, such as channel roughness, were not calibrated because the channel routing resolution of ∼6 km was considered too coarse for a reliable hydrological calibration.

For the calibration of the WRF-Hydro model, the Dynamically Dimensioned Search (DDS) algorithm, a heuristic single-solution-based global optimization method developed by Tolson and Shoemaker (2007), was employed by running the PyWrfHydroCalib package of NCAR (National Center for Atmospheric Research) (https://github.com/NCAR/PyWrfHydroCalib, last access: 28 March 2023). This algorithm is particularly suited for complex, high-dimensional hydrological models, enabling efficient exploration of parameter space while minimizing computational demands. DDS begins with a global search that transitions to a more localized search as iterations progress, guided by a scalar neighbourhood size perturbation parameter, which is typically set to 0.2. The dynamic adjustment of search size and probabilistic selection of parameters ensure rapid convergence toward good local or regional global optima, even without explicitly targeting the true global optimum (Tolson and Shoemaker, 2007). The stopping criterion is the user-defined number of iterations, which was set to 350 in this study. The calibration objective function, the Kling–Gupta Efficiency (KGE) (Gupta et al., 2009), is maximized to obtain the best agreement between simulated and observed streamflow. This method has been shown to yield efficient solutions for WRF-Hydro (Cosgrove et al., 2024; RafieeiNasab et al., 2025), balancing computational efficiency with calibration accuracy.
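
The DDS search logic described above can be sketched in a few lines. This is a generic, simplified illustration of the algorithm of Tolson and Shoemaker (2007) written for maximization, not the PyWrfHydroCalib implementation; the function name `dds`, its arguments and the toy objective are assumptions. In a real calibration, each call to `objective` would correspond to a WRF-Hydro run scored with the KGE.

```python
import math
import random

def dds(objective, lower, upper, x0, max_iter=350, r=0.2, seed=0):
    """Maximise `objective` with a basic Dynamically Dimensioned Search."""
    rng = random.Random(seed)
    n = len(x0)
    best_x = list(x0)
    best_f = objective(best_x)
    for i in range(1, max_iter + 1):
        # Probability of perturbing each parameter decreases with iteration,
        # moving the search from global to local
        p = 1.0 - math.log(i) / math.log(max_iter)
        chosen = [j for j in range(n) if rng.random() < p]
        if not chosen:                      # always perturb at least one
            chosen = [rng.randrange(n)]
        cand = list(best_x)
        for j in chosen:
            span = upper[j] - lower[j]
            x = best_x[j] + r * span * rng.gauss(0.0, 1.0)
            # Reflect at the bounds; fall back to the bound if still outside
            if x < lower[j]:
                x = lower[j] + (lower[j] - x)
                if x > upper[j]:
                    x = lower[j]
            elif x > upper[j]:
                x = upper[j] - (x - upper[j])
                if x < lower[j]:
                    x = upper[j]
            cand[j] = x
        f = objective(cand)
        if f >= best_f:                     # greedy: keep only improvements
            best_x, best_f = cand, f
    return best_x, best_f

# Toy objective standing in for the KGE of a model run (hypothetical)
def toy_score(x):
    return -sum((v - 0.3) ** 2 for v in x)

lo, hi, start = [0.0] * 4, [1.0] * 4, [0.9] * 4
x_best, f_best = dds(toy_score, lo, hi, start)
print(x_best, f_best)
```

Because only one candidate is evaluated per iteration, the total cost is set directly by the iteration budget, which is what makes the method attractive for expensive models.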

Additionally, to address the computational challenges posed by the complexity and grid-based structure of the WRF-Hydro model, which result in longer simulation times compared to other hydrological models (Kiliçarslan, 2022), a sub-domain approach was adopted to optimize calibration efficiency. Instead of calibrating the hydrological basins across the entire Med-CORDEX domain, smaller sub-domains (Fig. 1) were created. Each sub-domain focused on one or two basins (when their calibration date ranges overlapped) and was designed to closely match the Med-CORDEX grid, albeit with minor deviations. Calibration was then performed independently for each basin using an HPC system to ensure efficient utilization of computational resources. On average, the calibration simulations required slightly less than 3000 CPU hours per basin, effectively balancing performance and scalability to address the varying sizes and computational demands of the river basins. Following calibration, the optimized parameters for each basin were reintegrated into the larger Med-CORDEX domain for validation.

Before calibration, WRF-Hydro, driven by a regional coupled model (ENEA–REG), was “spun-up” for five years using default model parameters for each selected basin. This spin-up phase was critical for stabilizing soil moisture initial conditions. The model state at the end of this spin-up period served as a “warm start” for the calibration phase, which was performed for five years for each basin. Each calibration iteration included a one-year spin-up as an acclimation phase to align the model state with current conditions and suppress instabilities from parameter changes (Rafieeinasab et al., 2024; RafieeiNasab et al., 2025). The specific spin-up and calibration periods varied across basins and were selected based on the availability of reliable and continuous observational records (i.e. at least five years, all falling within the 1990–2014 period), and are provided in Table S4. The optimized parameters from the calibration were then used to evaluate the model over the entire available period within 1990–2014, focusing on ensuring consistent performance across the selected basins. Although this evaluation is not fully independent – since the 5-year calibration window is included – it remains largely independent, given the 25-year evaluation window, and was chosen, in line with previous literature (e.g. Cosgrove et al., 2024), to ensure methodological consistency with the default (non-calibrated) WRF-Hydro evaluation, performed over the same full period. This alignment enables a direct, like-for-like assessment of the improvements attributable solely to calibration. Station-observed streamflow data served as the reference for both calibration and validation, providing a robust basis for model evaluation. Figure S3 summarizes the different calibration steps.

2.3 Metrics

The evaluation of the performance of the different hydrological models relied on several complementary metrics to capture various aspects of hydrological behaviour, including correlation, bias, and flow variability. The metrics used in this study are described below:

Spearman's ρ (Spearman, 1904): this non-parametric measure of rank correlation assesses the monotonic relationship between simulated and observed streamflow, providing insight into the consistency of flow ranking without assuming linearity. Spearman's rank-order correlation coefficient, used as an indication of flow timing, was chosen instead of Pearson's coefficient due to its effectiveness in handling nonlinear and non-normally distributed data, such as hydrologic time series (Yue et al., 2002).
Kling–Gupta Efficiency (KGE) (Gupta et al., 2009): the KGE was used as a comprehensive measure of model performance, combining correlation, bias, and variability. A KGE value greater than −0.41 indicates better performance than the average streamflow benchmark, which represents the simplest model where the simulated value is always the mean flow (Knoben et al., 2019). Furthermore, KGE values can be classified to indicate the quality of performance: values ≤ −0.41 are unacceptable, −0.41 < KGE ≤ 0.00 are very poor, 0.00 < KGE ≤ 0.30 are poor, 0.30 < KGE ≤ 0.65 are intermediate, and 0.65 < KGE ≤ 1.00 are good (adapted from Sanchez Lozano et al., 2025).
Relative Standard Deviation (rSD): the rSD compares the variability in simulated and observed streamflow, expressed as a ratio between the standard deviations of the simulated and observed streamflow.
Percent Bias (% Bias): the % Bias quantifies the systematic error in the simulation as a percentage of observed values. This metric highlights over- or underestimation of the total flow by the model.
Low-flow Percent Bias (Bottom 30 %) (Casper et al., 2012): this metric evaluates the model's ability to simulate long-term low-flow conditions by calculating the bias for the bottom 30 % of observed streamflow values, which are critical for drought analysis and ecological studies.
High-flow Percent Bias (Top 2 %) (Yilmaz et al., 2008): to assess the model's performance in simulating peak discharge events, the bias for the top 2 % of observed streamflow values was computed. This is particularly important for understanding flood dynamics and extreme events.
Time Lag: the lag corresponding to the highest cross-correlation between observed and simulated streamflow was calculated to assess the timing discrepancies. This metric helps identify whether the model captures the correct temporal alignment of flow dynamics with observations.

Together, these metrics provide a comprehensive evaluation framework, addressing different aspects of hydrological performance, from overall state to extreme conditions. Their corresponding value ranges and equations are detailed in Table 1.
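As a minimal illustration of how the core metrics listed above could be computed, the following Python sketch evaluates Spearman's ρ, the KGE with its three components, and the percent bias for a pair of daily series. It assumes the Gupta et al. (2009) form of the KGE with the variability ratio defined as in Table 1 (so that γ equals the rSD) and the KGE classes quoted above. The function names and the NumPy-based implementation are illustrative assumptions and are not taken from the evaluation code used in this study.

```python
import numpy as np


def average_ranks(x):
    """Rank values from 1..n, giving tied values their average rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(x.size, dtype=float)
    ranks[order] = np.arange(1, x.size + 1)
    # Replace the ranks of tied values by the mean rank of each tie group.
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    rank_sums = np.bincount(inverse, weights=ranks)
    return rank_sums[inverse] / counts[inverse]


def spearman_rho(sim, obs):
    """Spearman's rho as the Pearson correlation of the ranked series."""
    return float(np.corrcoef(average_ranks(sim), average_ranks(obs))[0, 1])


def kge_components(sim, obs):
    """Return (KGE, r, beta, gamma); gamma is also the rSD."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]   # Pearson correlation
    beta = sim.mean() / obs.mean()    # relative bias
    gamma = sim.std() / obs.std()     # variability ratio
    kge = 1.0 - np.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)
    return float(kge), float(r), float(beta), float(gamma)


def percent_bias(sim, obs):
    """Percent bias of total flow; negative values mean underestimation."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(100.0 * (sim.sum() - obs.sum()) / obs.sum())


def classify_kge(kge):
    """Performance classes quoted in the text for the KGE."""
    if kge <= -0.41:
        return "unacceptable"
    if kge <= 0.00:
        return "very poor"
    if kge <= 0.30:
        return "poor"
    if kge <= 0.65:
        return "intermediate"
    return "good"
```

For example, a hypothetical simulation that doubles every observed value has r = 1 and β = γ = 2, so its KGE is 1 − √2 ≈ −0.41, which coincides with the mean-flow benchmark value quoted above despite perfect timing.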

Table 1. List of evaluation statistical metrics. In particular, d is the difference in independent ranking for simulated and observed values for day i and n is the number of values in each time series. σ_sim and σ_obs are the standard deviations of the simulated and observed streamflow, respectively. QO_i and QS_i are observed and simulated flow values for day i, respectively. r is the Pearson correlation coefficient, β is the relative bias (mean simulated divided by mean observed), and γ is the variability ratio (standard deviation of simulated divided by standard deviation of observed). QO_l and QS_l are the bottom 30 % observed and simulated flow values. Q_min is the minimum flow value in the observed and simulated time series. μ_obs and μ_sim are the means of observed and simulated flow values.


3 Results and discussion

Before comparing the performances of CaMa-Flood and WRF-Hydro models, driven by a regional coupled model ENEA–REG, we first present their simulated discharge time series, alongside observations and the discharge simulated by the ENEA-REG-driven HD model at both 0.5° and 5 min spatial resolutions. These results are shown in Fig. 2 for a representative subset of basins, with the full set of basins provided in Fig. S4. The higher-resolution HD configuration is intended to improve the representation of the drainage network and catchment area, thereby enhancing discharge estimates. However, both HD versions show a clear underestimation of freshwater inflow to the Mediterranean Sea and fail to reproduce the observed temporal patterns.

https://gmd.copernicus.org/articles/19/2881/2026/gmd-19-2881-2026-f02

Figure 2. Observed and simulated daily discharge for the Ebro, Rhone, Tiber, Goeksu and Po rivers for the common 10-year period from 1995 to 2004. Simulations include ENEA-REG–driven WRF-Hydro and CaMa-Flood, as well as the HD model at 0.5° and 5 min spatial resolutions, evaluated near the corresponding gauge stations.


Quantitatively, the HD model exhibits very poor performance across Mediterranean basins. At 0.5°, the overall average KGE is −0.13, with a mean bias of −37.4 %, a Spearman's correlation of 0.36, and a relative SD of only 0.23. At 5 min, the performance improves slightly (KGE = 0.00; % Bias = −16.6 %; r_s = 0.48), but variability remains strongly damped (rSD = 0.25). The most critical shortcoming is the systematic underestimation of extremes: high-flow biases reach −70.3 % (0.5°) and −65.7 % (5 min), while low-flow biases are also negative (−25.3 % and −9.2 %, respectively). Model performances at the basin scale for both HD versions (0.5° and 5 min) are provided in Tables S5 and S6, respectively, offering basin-specific details beyond the aggregated statistics presented here.

These results suggest that the underestimation of freshwater inflow by the HD model is not primarily due to the long-term mean bias – which is moderate in the 5 min configuration – but rather to its inability to capture discharge variability and high-flow peaks.

In contrast, WRF-Hydro and CaMa-Flood visually reproduce the observed temporal dynamics, including variability and extremes, much more closely and therefore clearly outperform the HD model. Both models show strong potential for capturing observed patterns when using runoff from the Noah-MP land surface model – the default LSM in many Euro-Mediterranean regional coupled and Earth system models – which has already been validated against ERA5-Land runoff data (Hamitouche et al., 2025a). This indicates that CaMa-Flood and WRF-Hydro are suitable alternatives to replace the HD model in this modelling framework, despite some visible biases in low- and high-flow reproduction. The following section provides a detailed evaluation of their performances.

3.1 WRF-Hydro vs. CaMa-Flood (default configurations)

In this section, we compare the performance of CaMa-Flood and WRF-Hydro in its default configuration, while a detailed discussion of the calibrated WRF-Hydro is provided in the next section.

Overall, the two models, driven by a regional coupled model (i.e. ENEA-REG), show comparable skill in reproducing flow timing, but differ systematically in their ability to capture variability and discharge magnitude. CaMa-Flood tends to underestimate flow variability and both high- and low-flow magnitudes, while WRF-Hydro typically produces more realistic variability, though sometimes with excessive peaks and overestimation. Both models outperform the HD baseline included in ENEA-REG, demonstrating clear added value; however, neither model performs consistently well across all Mediterranean basins, and each shows strengths and weaknesses depending on basin characteristics and scale (Figs. 3–5).

https://gmd.copernicus.org/articles/19/2881/2026/gmd-19-2881-2026-f03

Figure 3. Taylor diagram showing the performance of CaMa-Flood (orange) and default WRF-Hydro (Default, blue) models, driven by a regional coupled model (ENEA–REG), in different river basins numbered from 1 to 11. Number 12 refers to the mean value. The size and the orientation of the triangles represent the magnitude and under- or overestimation of the percent bias.


https://gmd.copernicus.org/articles/19/2881/2026/gmd-19-2881-2026-f04

Figure 4. KGE values of each model experiment – CaMa-Flood; WRF-Hydro (Default); WRF-Hydro (Calibrated) – across the studied basins.

https://gmd.copernicus.org/articles/19/2881/2026/gmd-19-2881-2026-f05

Figure 5. Scatterplot of low-flow bias vs. high-flow bias in each model experiment – CaMa-Flood; WRF-Hydro (Default); WRF-Hydro (Calibrated) – and river basin.


Both models reproduce the seasonal flow timing reasonably well (Fig. 3). Their mean (0.64 for CaMa-Flood vs. 0.63 for WRF-Hydro) and median (0.64 for CaMa-Flood vs. 0.65 for WRF-Hydro) Spearman's ρ values are nearly identical, confirming that neither model has a systematic advantage in timing skill. Differences emerge at the basin level: WRF-Hydro excels in large basins such as the Danube, while CaMa-Flood performs better in smaller or steep catchments such as the Goeksu. The time-lag analysis (Table 2) reinforces this pattern: WRF-Hydro generally produces shorter delays in peak flow (with an average of 4.8 d vs. 8.5 d). However, this difference is strongly influenced by the large lag simulated by CaMa-Flood in the Danube basin. When the Danube is excluded, the average lag becomes comparable between the two models (5.0 d for CaMa-Flood vs. 5.3 d for WRF-Hydro). The striking 43 d lag in the Danube for CaMa-Flood aligns with the findings of Hamitouche et al. (2025a). By analysing the ENEA-REG and WRF-Hydro simulated runoffs (Fig. S5), we observed a close alignment in timing, indicating that the coupling of the land surface model and routing in WRF-Hydro does not significantly affect runoff timing. This suggests that the substantial lag in CaMa-Flood is primarily due to its routing limitations.
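The time-lag metric discussed here can be sketched as follows. This is a minimal Python example, assuming that the cross-correlation is evaluated as the Pearson correlation of the overlapping samples at each candidate shift. The search window `max_lag`, the function name, and the sign convention (a positive lag means the simulated series is delayed relative to the observations) are illustrative assumptions rather than the settings used in this study.

```python
import numpy as np


def best_lag(sim, obs, max_lag=60):
    """Return (lag, r): the shift in time steps that maximizes the
    correlation between observed and simulated flow.
    A positive lag means the simulation is delayed relative to observations."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    n = obs.size
    best, best_r = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            s, o = sim[lag:], obs[:n - lag]   # compare sim[t + lag] with obs[t]
        else:
            s, o = sim[:lag], obs[-lag:]      # compare sim[t] with obs[t - lag]
        r = np.corrcoef(s, o)[0, 1]
        if r > best_r:
            best, best_r = lag, r
    return best, float(best_r)


# Hypothetical hydrograph with two flow peaks, and a copy delayed by 5 days.
t = np.arange(400)
obs = np.exp(-((t - 150) / 20.0) ** 2) + 0.5 * np.exp(-((t - 300) / 10.0) ** 2)
sim = np.roll(obs, 5)
lag, r = best_lag(sim, obs, max_lag=30)
```

With daily data the returned lag is in days, which matches the unit used in Table 2.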

Table 2. Summary of time lag values for each model experiment – CaMa-Flood; WRF-Hydro (Default); WRF-Hydro (Calibrated) – and each basin.


The two models diverge more substantially in representing flow variability (Fig. 3). CaMa-Flood consistently underestimates variability, often falling below the acceptable 0.8–1.2 range (Lin et al., 2019; Sanchez Lozano et al., 2025). WRF-Hydro displays a broader distribution, with several basins close to observed variability, but also instances of pronounced overestimation, especially in steep alpine basins (e.g., Adige). This contrast reflects inherent differences between a quasi-2D model that solely routes total runoff (CaMa-Flood) and a process-based distributed system (WRF-Hydro), whose representation of soil–vegetation–atmosphere processes can amplify runoff responses and lead to stronger variability signatures.

Both models show a general tendency to underestimate discharge, but CaMa-Flood does so more strongly (Fig. 3). WRF-Hydro achieves lower overall bias (−12.1 % vs. −27.8 %) but occasionally produces substantial overestimations in some basins – particularly where variability is already high. CaMa-Flood, by contrast, displays more uniform behaviour: predominantly negative biases and limited ability to reproduce high flows. These opposite tendencies explain why WRF-Hydro performs better in some mid-size basins, while CaMa-Flood offers more stable (but typically conservative) estimates.

KGE values across basins fall largely within the intermediate category (0.30 < KGE ≤ 0.65; adapted from Sanchez Lozano et al., 2025) for both ENEA-REG–driven models (Fig. 4), showing that neither system consistently outperforms the other. WRF-Hydro reaches its best performance in large temperate basins, while CaMa-Flood excels in some smaller Mediterranean catchments. Notably, both models improve upon the HD model included in ENEA-REG, demonstrating the benefit of dedicated routing schemes over the simpler baseline.

Regarding low-flow and high-flow biases, these hydrological signatures evaluate performance using two segments of the flow duration curve (FDC) (Smakhtin, 2001; Vogel and Fennessey, 1994), as defined by Yilmaz et al. (2008). The high-flow segment (0–0.02 exceedance probabilities) represents watershed response to large precipitation events (Fig. S6) and is used to calculate high-flow bias (% BiasFHV, Eq. 6). The low-flow segment (0.7–1.0 exceedance probabilities) reflects long-term flow sustainability (Fig. S7) and is used to calculate low-flow bias (% BiasFLV, Eq. 5).
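The two flow-duration-curve signatures can be illustrated with a short Python sketch. It assumes the formulations of Yilmaz et al. (2008): a volume bias over the high-flow segment and a log-transformed bias over the low-flow segment, with each series sorted independently into its own FDC. The Weibull plotting position used for the exceedance probabilities, the requirement of strictly positive flows, and the function name are assumptions made for illustration; Eqs. (5) and (6) in Table 1 remain the authoritative definitions.

```python
import numpy as np


def fdc_segment_biases(sim, obs, high=0.02, low=0.70):
    """Return (high-flow bias, low-flow bias) in percent.
    Flows must be strictly positive because the low-flow term uses logs."""
    sim_fdc = np.sort(np.asarray(sim, dtype=float))[::-1]  # descending flows
    obs_fdc = np.sort(np.asarray(obs, dtype=float))[::-1]
    n = obs_fdc.size
    exceed = np.arange(1, n + 1) / (n + 1)  # Weibull exceedance probability

    # High-flow segment (0-0.02 exceedance): bias in flow volume.
    hi = exceed <= high
    fhv = 100.0 * (sim_fdc[hi] - obs_fdc[hi]).sum() / obs_fdc[hi].sum()

    # Low-flow segment (0.7-1.0 exceedance): bias in log-flow volume
    # above each series' own minimum flow.
    lo = exceed >= low
    log_sim = np.log(sim_fdc[lo])
    log_obs = np.log(obs_fdc[lo])
    sim_vol = (log_sim - log_sim.min()).sum()
    obs_vol = (log_obs - log_obs.min()).sum()
    flv = -100.0 * (sim_vol - obs_vol) / obs_vol
    return float(fhv), float(flv)
```

Under these assumptions, the low-flow term is measured relative to each series' minimum, so it responds to the shape of the low-flow segment rather than to a uniform scaling of the flows.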

High-flow and low-flow biases reveal complementary strengths (Fig. 5). CaMa-Flood systematically underestimates both extremes (−27.8 % and −16.3 %, respectively), reflecting a more damped hydrological response. WRF-Hydro shows more variable behaviour: it reduces high-flow underestimation on average (−2.1 %) but can overestimate peak flows in certain basins, while its low-flow errors range from strong underestimation to slight overestimation, with an average of −18.0 %. These contrasting patterns indicate that WRF-Hydro generates a more dynamic flow regime, whereas CaMa-Flood tends toward smoother hydrographs.

Although direct comparisons with previous studies are limited due to differences in model configuration, spatial resolution, forcing data, and basin characteristics, the overall behaviour observed here is broadly consistent with the literature. Applications of WRF-Hydro in various hydrological contexts have generally shown that the model can reproduce streamflow dynamics with reasonable skill, particularly after calibration. For example, coupled and uncoupled implementations of WRF-Hydro have reported efficiency metrics ranging from moderate to high performance (e.g., NSE or KGE values around 0.7–0.8 in calibrated applications), demonstrating the model's capability to capture the timing and magnitude of discharge in different basins (Ndiaye et al., 2026; Senatore et al., 2015; Wang et al., 2020). However, several studies also highlight persistent challenges in reproducing peak flows or extremes, often linked to precipitation forcing errors or parameter uncertainties (Naabil et al., 2017). The behaviour identified in the present study – where WRF-Hydro better captures variability but may occasionally overestimate peaks – therefore aligns with the general characteristics reported in previous evaluations.

In contrast, very few studies have assessed the performance of CaMa-Flood in the Euro-Mediterranean context. Existing applications in this region have mainly used the model for large-scale water-cycle analyses or river discharge estimates rather than detailed basin-scale performance evaluation (e.g., closing the regional water balance), limiting opportunities for direct benchmarking of routing performance. Consequently, the systematic underestimation of variability and extremes found in our work provides new insight into the behaviour of CaMa-Flood when driven by a regional Earth system model over Mediterranean basins.

To explore whether basin size influences model performance, we analysed Pearson's correlation between key performance metrics and the size of the analysed Mediterranean river basins. These basins represent a wide range of upstream areas, from smaller basins like Kopru to larger ones such as the Rhone. The Danube basin was excluded from this analysis due to its significantly larger size compared to the others, which could substantially alter the correlation results. The results for Spearman's ρ indicate that the flow timing of both CaMa-Flood and WRF-Hydro models, driven by a regional coupled model (ENEA–REG), shows a weak relationship with basin size, with both models having a correlation value of −0.13. The flow variability, as well as the time lag, increases with increasing basin size only for CaMa-Flood (correlation of 0.48 and 0.63, respectively). For percent bias, CaMa-Flood is less sensitive to basin size compared to WRF-Hydro (−0.32 vs. −0.67), especially in terms of high-flow bias (−0.18 for CaMa-Flood, −0.58 for WRF-Hydro) compared to low-flow bias (−0.46 for CaMa-Flood, −0.66 for WRF-Hydro). Although these correlations cannot be considered statistically robust due to the limited number of basins analysed, they are still useful for gaining insight into the sensitivity of each model to basin size.

In summary, both models provide added value over the HD baseline and can reproduce Mediterranean discharge dynamics to an intermediate level of skill.

Their major applications include: improving freshwater fluxes into the ocean component to reduce biases in sea surface salinity; predicting hydrological extremes; supporting long-term water availability for planning; and projecting future hydrological impacts of climate change. Decomposing KGE into its components and examining high- and low-flow biases helps to explain the underlying trade-offs in model performance across Mediterranean basins. CaMa-Flood tends to produce smoother hydrographs with smaller bias in some basins but underestimates variability and extremes, whereas WRF-Hydro more accurately captures flow timing and variability but occasionally overestimates peak flows.

These trade-offs have practical implications for different applications. For ocean–land coupling and improving mean freshwater fluxes into the Mediterranean, CaMa-Flood may be sufficient due to its stable estimates of flow volumes. For extremes prediction, which focuses on high- and low-flow events relevant to short-term forecasting and risk assessment, WRF-Hydro is generally better suited due to its richer process representation, more dynamic flow regime, and ability to capture peak flows. CaMa-Flood retains the advantage of simulating floodplain dynamics, including flood depth and extent, which is valuable for flood hazard assessment in smaller or steep catchments. For projections of future hydrological impacts under climate change, which rely on climate model forcings and often assume stationarity of biases over time, both models are appropriate, as moderate differences in extremes do not strongly affect long-term water balance (Hamitouche et al., 2025a; Oda et al., 2024). Additionally, WRF-Hydro is particularly suitable for studies investigating precipitation–soil moisture feedbacks, as it allows full coupling between soil hydrology and the atmosphere, capturing critical land–atmosphere interactions.

Basin characteristics further influence model suitability: WRF-Hydro performs well in large temperate basins, while CaMa-Flood provides stable, conservative estimates in smaller or steep catchments. Overall, model choice should be guided by the intended application, basin characteristics, and the balance between process detail and computational efficiency.

While this study provides a systematic comparison between CaMa-Flood and WRF-Hydro, it is important to underscore that the two models are not driven by identical types of inputs. CaMa-Flood receives total runoff from Noah-MP within the ENEA-REG framework, which in our configuration is provided at daily resolution, although the model can also operate with higher-frequency inputs such as hourly runoff. In contrast, WRF-Hydro uses atmospheric fields generated by ENEA-REG at 6-hourly resolution to internally compute runoff through its embedded Noah-MP land-surface model, even though it is likewise capable of operating with finer-scale atmospheric forcing. Consequently, differences in simulated discharge reflect the combined influence of both runoff generation and routing processes. For this reason, the comparison should not be interpreted as a purely hydrological benchmark in which both systems operate under identical inputs. Instead, it reflects how each model performs within the coupling configuration in which it would realistically operate in a regional climate or Earth system modelling environment. In such frameworks, CaMa-Flood is typically coupled downstream of a land-surface model, while WRF-Hydro employs its own land-surface component driven directly by meteorological fields. Forcing both models with identical runoff inputs would not represent their intended operational use and would fall outside the scope of this study. Our aim was therefore to assess their relative suitability and behaviour in configurations consistent with their implementation within regional coupled systems such as ENEA-REG.

3.2 WRF-Hydro calibration: performance improvements and related hydrological impacts

3.2.1 Performance metrics before and after calibration

The calibration of WRF-Hydro, aimed at optimizing the Kling–Gupta Efficiency (KGE), improved model performance for most basins and metrics. On average, the calibration improved the KGE from 0.32 to 0.48, with the median increasing from 0.45 to 0.60. The quality of performance passes from intermediate to good (KGE > 0.65; adapted from Sanchez Lozano et al., 2025) in several basins such as the Arno, the Danube, the Rhone and the Ebro. After calibration, the Goeksu and Tiber rivers show intermediate performance, while for the Po, Kopru, and Maritsa basins the KGE remained unchanged (Fig. 4).

These unchanged KGE values are particularly noteworthy. They suggest that the default parameters for these basins were already close to optimal, and further calibration did not yield any improvements. This outcome highlights the robustness of the default parameter configuration in specific contexts and emphasizes the need for calibration strategies that are sensitive to basin-specific hydrological characteristics.

The components of the KGE offer further insight into these outcomes. The temporal correlation showed relative stability across most basins, with modest improvements in the Goeksu, Rhone and Arno, while a noticeable decline was found in the Ceyhan. These results indicate that the improved temporal alignment between modelled and observed discharges is not systematic for all basins. In terms of bias, calibration yielded mixed results: the bias decreased substantially for the Goeksu (−33 %) and the Rhone (−8.2 %) basins, while it worsened in the Danube (+4.6 %) and the Ceyhan (+8.2 %). These changes illustrate that while calibration reduced bias in many cases, it occasionally introduced trade-offs. The relative standard deviation also changed, with the average decreasing from 1.18 to 1.08, suggesting improved consistency between simulated and observed flow variability. Relevant improvements were seen in basins such as the Danube, Ceyhan, and Arno (Fig. 6).

https://gmd.copernicus.org/articles/19/2881/2026/gmd-19-2881-2026-f06

Figure 6. Taylor diagram showing the performance of WRF-Hydro (Default, blue) and WRF-Hydro (Calibrated, green) models, driven by a regional coupled model (ENEA–REG), in different river basins numbered from 1 to 11. Number 12 refers to the mean value. The size and the orientation of the triangles represent the magnitude and under- or overestimation of the percent bias.


In addition to these core metrics, calibration also influenced lag time (Table 2), low-flow bias, and high-flow bias (Fig. 5). Lag time improved noticeably, with the average decreasing from 4.8 to 2 d. The Rhone, Ceyhan, and Arno showed substantial reductions, achieving zero lag time after calibration, while slight increases were observed in basins like the Danube. Low-flow bias exhibited minor overall improvement, shifting on average from −18.0 % to −17.3 %, with notable reductions in the Rhone and Goeksu basins, but with a worsening in the Ceyhan. High-flow bias, on the other hand, displayed mixed results: a reduction of the extreme positive biases in the Danube and the Ceyhan together with a slight worsening in the Adige basin.

Overall, the calibration of WRF-Hydro enhanced discharge simulations for many Mediterranean basins, as reflected in the improvements in KGE and reductions in lag time. However, the trade-offs between correlation, bias, and standard deviation, along with the basin-specific variability, highlight areas where the calibration strategy could be refined to better capture the unique dynamics of each basin.

For the Adige basin, WRF-Hydro performs poorly with both default and calibrated parameters. The poor performance is most evident in the strong overestimation of high-flow events, the overall bias, and the variability, and results in a very poor KGE. It can be explained by the substantial corrections made to the drainage network. To disconnect the Adige basin from the adjacent Po River basin, which was necessary due to the coarse spatial resolution of the model, elevation reconditioning was applied. However, these corrections altered the topographic slope, a critical factor in discharge calculations. The changes to the slope resulted in overly fast flows, which in turn caused exaggerated high-flow peaks and impaired the model's ability to simulate discharge accurately for the Adige basin. This finding underscores the sensitivity of WRF-Hydro, like any other hydrodynamic or hydrological model, to topography and the need for careful elevation reconditioning, particularly at coarse spatial resolutions. While such corrections are essential to accurately delineate basin boundaries and drainage networks, they must be implemented cautiously to avoid introducing errors that affect hydrological dynamics.

Excluding the Adige basin from the analysis notably improved the overall performance of the WRF-Hydro model, which then outperforms CaMa-Flood (Fig. 7). For instance, the KGE of the calibrated WRF-Hydro configuration increased from 0.48 (including Adige) to 0.53 (excluding Adige), compared to the default settings (0.32 and 0.40, respectively) and CaMa-Flood (0.37). Similarly, the percent bias improved from 23.9 % to 21.5 %, and the relative standard deviation became closer to 1, indicating a better approximation of flow variability. These improvements underscore the sensitivity of WRF-Hydro's performance to accurate input parameters and the profound impact of specific basin corrections on overall model results.

https://gmd.copernicus.org/articles/19/2881/2026/gmd-19-2881-2026-f07

Figure 7. Performance metrics for each model experiment (with Adige basin excluded). The X represents the mean value.


It is worth noting that while some metrics, such as high-flow bias, showed a decrease in performance, this is primarily because the Adige basin's overestimation of high flows compensated for the negative biases in other basins. Nevertheless, the improvements in the key metrics – KGE, Percent Bias, and relative standard deviation – demonstrate the importance of addressing basin-specific challenges to enhance model reliability.

Post-calibration improvement was actually expected, particularly because several important sources of structural and input uncertainty are not explicitly represented in our modelling framework.

First, none of the models account for human interventions such as reservoirs, hydropower operations, and water withdrawals, which can substantially modify the timing, magnitude, and variability of streamflow in many Mediterranean rivers. This omission is primarily due to the lack of consistent and comprehensive information on reservoir characteristics and management practices across all Mediterranean basins, which span multiple countries and governance systems. Consequently, the simulated discharge reflects a naturalized flow regime, whereas the observations often reflect regulated conditions, a well-documented issue for most Mediterranean rivers (Grill et al., 2019). This inherent mismatch contributes to discrepancies in low-flow periods, peak attenuation, and seasonal timing.

Second, locally calibrated scores help compensate for errors in catchment boundaries (Kauffeldt et al., 2013; Lehner, 2012), primarily due to the coarse resolution of elevation data, as exemplified by the Adige basin. From a purely hydrological perspective, this resolution is inadequate, and a higher resolution on the order of hundreds of meters would be preferable to better represent small basins. However, this is constrained by the very high computational resources required for simulations covering a large domain, such as Med-CORDEX.

Third, calibration also mitigates errors in meteorological forcing, particularly in precipitation data (Beck et al., 2017, 2019), as shown in Sect. S1 and Fig. S8. The regional coupled models generate meteorological forcing at relatively coarse spatiotemporal resolution and rely on parameterized convection, a known source of uncertainty and errors. In contrast, convection-permitting models, i.e. km-scale models that explicitly resolve deep convection, can more realistically represent sub-daily statistics and extreme events (Fosser et al., 2024; Struglia et al., 2025), highlighting an additional limitation of regional models in capturing high-flow and short-duration extremes. Further efforts in bias correction or data assimilation of climate forcing could improve hydrological model performance, pending the extension of convection-permitting models to large domains such as Med-CORDEX, given the substantial computational resources required.

Readers should interpret the findings within this context, and future work incorporating anthropogenic controls, higher-resolution runs, and refined meteorological forcing could further improve model performance.

https://gmd.copernicus.org/articles/19/2881/2026/gmd-19-2881-2026-f08

Figure 8. Comparison of (a) average daily total runoff and (b) the ratio of subsurface to total runoff across each river basin as simulated by ENEA-REG (orange), WRF-Hydro with default parameters (blue), and WRF-Hydro with calibrated parameters (green).


3.2.2 Effects of calibration on runoff generation and partitioning

To better understand the source of performance improvements following calibration, we investigated how the calibrated parameters influenced key hydrological processes – particularly runoff generation and the surface–subsurface flow partitioning. This analysis helps explain the hydrological changes driving model behaviour across the Mediterranean basins. Specifically, we examined simulated total runoff, as it serves as the primary input for CaMa-Flood, and the ratio of subsurface to total runoff. As a reminder, runoff is simulated by the Noah-MP land surface model, which is integrated within WRF-Hydro and embedded in WRF – used to dynamically downscale ERA5 data within the ENEA–REG coupled model. Details on the calibrated parameters, their roles in controlling hydrological processes, and trends across basins are provided in the Supplement (Sect. S2 and Table S7), while Fig. 8 summarizes and compares the average daily total runoff (mm d⁻¹) and the subsurface-to-total runoff ratio simulated by the ENEA-REG model, and by WRF-Hydro using both default and calibrated parameter values. This comparison is essential for understanding the observed performance differences between the CaMa-Flood and WRF-Hydro models, alongside the distinct routing configurations in each model.

Our results indicate that WRF-Hydro (default) predominantly generates higher runoff compared to ENEA-REG, with percentage increases ranging from 11.1 % to 49.8 %, except for the Göksu basin, where it shows lower runoff by 3.6 %. Similarly, the subsurface-to-total runoff ratio simulated by WRF-Hydro exceeds that of ENEA-REG across all basins, with percentage increases ranging from 4.1 % to 119.9 %. These findings underscore the sensitivity of runoff to the dynamic interactions between land surface models, routing processes, and the partitioning of surface and subsurface flows.

Excluding the Po, Maritsa, and Kopru basins – where default and calibrated parameters coincide – the calibration process (WRF-Hydro (calibrated)) led to a remarkable reduction in total runoff for most basins. This decrease brought runoff values closer to, or even below, those simulated by ENEA-REG, with an average reduction between 20.5 % and 48.1 %. However, in the Tiber and Göksu basins, the calibrated parameters produced a marked increase in total runoff. These variations can largely be attributed to changes in parameters that control transpiration and soil evaporation, as the slope parameter consistently decreased, limiting deep drainage.

For the subsurface-to-total runoff ratio, calibration effects varied. In the Göksu basin, there was a slight increase of 3.8 %, while for other basins the ratio decreased, either marginally (e.g., Danube at −1.5 %, Adige at −3.4 %, and Ceyhan at −4.9 %) or significantly (e.g., Rhone at −26.2 % and Ebro at −28.7 %). These changes stemmed from the combined influence of parameters governing infiltration and runoff partitioning. Together, the results highlight the critical role of parameter calibration in modulating both total runoff and its partitioning into surface and subsurface components, thereby influencing hydrological model performance.
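The quantities compared in this subsection can be reproduced with a few lines of Python. The sketch below assumes that the reported percentages are relative changes with respect to a reference simulation, and that the subsurface-to-total ratio is taken over period totals (equivalent to a ratio of time means). The function names are hypothetical and only illustrate the arithmetic.

```python
import numpy as np


def percent_change(new, ref):
    """Relative change of `new` with respect to `ref`, in percent."""
    return 100.0 * (new - ref) / ref


def subsurface_ratio(surface, subsurface):
    """Share of subsurface runoff in total (surface + subsurface) runoff,
    computed from the totals over the whole period."""
    surface_total = np.asarray(surface, dtype=float).sum()
    subsurface_total = np.asarray(subsurface, dtype=float).sum()
    return float(subsurface_total / (surface_total + subsurface_total))
```

Under this reading, a ratio that moves from 0.20 to 0.44 corresponds to a 120 % increase, not to a change of 24 percentage points.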

4 Conclusions

This study assessed whether WRF-Hydro and CaMa-Flood can serve as effective alternatives to the HD routing scheme for improving hydrological simulations within Euro-Mediterranean regional coupled models, and evaluated the extent to which calibration can enhance WRF-Hydro's performance.

Both models, when driven by the ENEA-REG regional coupled system, demonstrated substantial improvements over the HD baseline, particularly in reproducing temporal dynamics and discharge variability across Mediterranean basins. CaMa-Flood delivered stable, conservative estimates with consistent underestimation of variability and extremes, while WRF-Hydro exhibited a more dynamic hydrological response with generally better representation of flow timing, variability, and high-flow behaviour. However, neither model performed uniformly across all basins, reflecting the diverse hydrological regimes of the Mediterranean region. Importantly, this model comparison should be interpreted within the context of their integration into a regional coupled modelling system, rather than as a pure hydrological benchmark with identical runoff inputs or meteorological forcings.

Calibration significantly improved the performance of WRF-Hydro across most basins, increasing KGE, reducing timing lags, and improving the representation of discharge variability. These improvements were basin-dependent and were influenced by different sources of structural and input uncertainty – including the exclusion of human interventions, catchment boundary inaccuracies linked to coarse-resolution topography, and biases in precipitation forcing typical of regional climate models. The Adige basin highlighted the sensitivity of hydrodynamic models to drainage network corrections, underscoring the need for careful topographic preparation in coarse-resolution regional applications.
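For readers who wish to reproduce this kind of evaluation, the following sketch computes the Kling–Gupta Efficiency in the form given by Gupta et al. (2009), together with a simple timing-lag estimate. The lag function (the shift that maximises the Pearson correlation) and the discharge series are illustrative assumptions and do not necessarily match the procedure or data used in this study.

```python
import math
from statistics import fmean, pstdev


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = fmean(x), fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def kge(sim, obs):
    """KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2)."""
    if len(sim) != len(obs) or len(obs) < 2:
        raise ValueError("series must have equal length of at least 2")
    r = pearson_r(sim, obs)            # correlation (timing and shape)
    alpha = pstdev(sim) / pstdev(obs)  # variability ratio
    beta = fmean(sim) / fmean(obs)     # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def best_lag(sim, obs, max_lag):
    """Shift in time steps that maximises correlation (assumed lag definition).

    A positive value means the simulated hydrograph is late relative to
    the observations.
    """
    best_shift, best_r = 0, -2.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            s, o = sim[lag:], obs[:len(obs) - lag]
        else:
            s, o = sim[:lag], obs[-lag:]
        r = pearson_r(s, o)
        if r > best_r:
            best_shift, best_r = lag, r
    return best_shift


# Hypothetical discharge series (m3/s); the simulation is the observed
# hydrograph delayed by one time step.
obs = [1.0, 2.0, 5.0, 9.0, 4.0, 2.0, 1.0, 1.0]
sim = [1.0, 1.0, 2.0, 5.0, 9.0, 4.0, 2.0, 1.0]

print(f"KGE: {kge(sim, obs):.3f}, lag: {best_lag(sim, obs, max_lag=2)} step(s)")
```

Because KGE combines correlation, variability ratio, and bias ratio, inspecting the three components separately can indicate whether a calibration gain comes from better timing, better variability, or reduced bias.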

Overall, the findings demonstrate that both WRF-Hydro and CaMa-Flood are viable alternatives to the HD model for use in regional coupled modelling frameworks such as Med-CORDEX. WRF-Hydro provides stronger performance, especially for applications requiring detailed representation of variability, extremes, and land–atmosphere interactions. CaMa-Flood, with its computational efficiency and capacity to simulate floodplain dynamics, offers a robust and scalable alternative, particularly for Mediterranean basins where stable discharge estimates are sufficient.

The study also provides a calibrated parameter set for WRF-Hydro that may be transferable to hydrologically similar basins or serve as a baseline for higher-resolution applications. Future work should incorporate reservoir operations, enhanced topographic representation, and improved meteorological forcing, to further advance hydrological realism in regional coupled systems.

Code and data availability

ERA5 data (Hersbach et al., 2020), provided by the European Centre for Medium-Range Weather Forecasts (ECMWF), can be freely downloaded from the Copernicus Climate Data Store (https://doi.org/10.24381/cds.adbb2d47, Hersbach et al., 2023). All the model codes used in our study, including ENEA-REG, WRF-Hydro, CaMa-Flood, and the WRF-Hydro calibration package, as well as the discharge observation data used for model evaluation, are available at the following permanent repository: https://doi.org/10.5281/zenodo.16625613 (Hamitouche et al., 2025b). Model output data are available from the corresponding author upon request due to their large volume. However, a step-by-step instruction file, along with all associated materials required to reproduce the results of this study, is publicly available on Zenodo at https://doi.org/10.5281/zenodo.16333943 (Hamitouche et al., 2025c).

Supplement

The supplement related to this article is available online at https://doi.org/10.5194/gmd-19-2881-2026-supplement.

Author contributions

MH, GF, and AA: conceptualisation and methodology; MH: formal analysis, funding acquisition, investigation, visualisation and writing-original draft preparation; MH and AA: simulations; AR: helped with calibration setup; GF and AA: supervision; MH, GF, AA, and AR: validation and writing-review & editing. All authors have read and agreed to the published version of the paper.

Competing interests

The authors declare that they have no conflict of interest.

Disclaimer

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. The authors bear the ultimate responsibility for providing appropriate place names. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Acknowledgements

We would like to thank Stefan Hagemann for providing us with some discharge observation data, and Dai Yamazaki and Kevin Sampson for their valuable help in setting up the CaMa-Flood and WRF-Hydro models.

The analysis was partially carried out on the High Performance Computing DataCenter at IUSS, co-funded by Regione Lombardia through the funding programme established by Regional Decree No. 3776 of 3 November 2020.

Financial support

This paper and the related research were conducted during, and with the support of, the Italian PhD course in Sustainable Development and Climate change (https://www.phd-sdc.it/, last access: 14 June 2025) at the University School for Advanced Studies IUSS. The work was developed within the framework of the project “Dipartimento di Eccellenza 2023–2027”, with financial support from the ICSC Italian Research Center on High-Performance Computing, Big Data and Quantum Computing and with funding from the European Union Next-GenerationEU (National Recovery and Resilience Plan-NRRP, Mission 4, Component 2, Investment 1.4-D.D: 3138 16/12/2021, CN00000013).

This research has also been supported by the CIHEAM Prize for the Best MSc Thesis 2022.

Review statement

This paper was edited by Lele Shu and reviewed by Miaohua Mao and three anonymous referees.

References

Amelia, J.: Development of Two-Way Coupled Atmospheric Hydrological models, J. Climatol. Weather Forecast., 11, 001–002, 2022.

Anav, A., Carillo, A., Palma, M., Struglia, M. V, Turuncoglu, U. U., and Sannino, G.: The ENEA-REG system (v1.0), a multi-component regional Earth system model: sensitivity to different atmospheric components over the Med-CORDEX (Coordinated Regional Climate Downscaling Experiment) region, Geosci. Model Dev., 14, 4159–4185, https://doi.org/10.5194/gmd-14-4159-2021, 2021.

Bates, P. D., Horritt, M. S., and Fewtrell, T. J.: A simple inertial formulation of the shallow water equations for efficient two-dimensional flood inundation modelling, J. Hydrol., 387, 33–45, https://doi.org/10.1016/j.jhydrol.2010.03.027, 2010.

Beck, H. E., Vergopolan, N., Pan, M., Levizzani, V., van Dijk, A. I. J. M., Weedon, G. P., Brocca, L., Pappenberger, F., Huffman, G. J., and Wood, E. F.: Global-scale evaluation of 22 precipitation datasets using gauge observations and hydrological modeling, Hydrol. Earth Syst. Sci., 21, 6201–6217, https://doi.org/10.5194/hess-21-6201-2017, 2017.

Beck, H. E., Pan, M., Roy, T., Weedon, G. P., Pappenberger, F., van Dijk, A. I. J. M., Huffman, G. J., Adler, R. F., and Wood, E. F.: Daily evaluation of 26 precipitation datasets using Stage-IV gauge-radar data for the CONUS, Hydrol. Earth Syst. Sci., 23, 207–224, https://doi.org/10.5194/hess-23-207-2019, 2019.

Beck, H. E., Pan, M., Lin, P., Seibert, J., van Dijk, A. I. J. M., and Wood, E. F.: Global Fully Distributed Parameter Regionalization Based on Observed Streamflow From 4,229 Headwater Catchments, J. Geophys. Res.-Atmos., 125, e2019JD031485, https://doi.org/10.1029/2019JD031485, 2020.

Casper, M. C., Grigoryan, G., Gronz, O., Gutjahr, O., Heinemann, G., Ley, R., and Rock, A.: Analysis of projected hydrological behavior of catchments based on signature indices, Hydrol. Earth Syst. Sci., 16, 409–421, https://doi.org/10.5194/hess-16-409-2012, 2012.

CEH-CEDEX: Centro de Estudios Hidrográficos del Centro de estudios y experimentación de obras públicas, Spain, https://ceh.cedex.es (last access: 28 April 2024), 2024.

Cerbelaud, A., Lefèvre, J., Genthon, P., and Menkes, C.: Assessment of the WRF-Hydro uncoupled hydro-meteorological model on flashy watersheds of the Grande Terre tropical island of New Caledonia (South-West Pacific), J. Hydrol.: Reg. Stud., 40, 101003, https://doi.org/10.1016/j.ejrh.2022.101003, 2022.

Cisterna-García, A., González-Vidal, A., Martínez-Ibarra, A., Ye, Y., Guillén-Teruel, A., Bernal-Escobedo, L., and Skarmeta, A. F.: Artificial intelligence for streamflow prediction in river basins: a use case in Mar Menor, Sci. Rep., 15, 19481, https://doi.org/10.1038/s41598-025-04524-0, 2025.

Cosgrove, B., Gochis, D., Flowers, T., Dugger, A., Ogden, F., Graziano, T., Clark, E., Cabell, R., Casiday, N., Cui, Z., Eicher, K., Fall, G., Feng, X., Fitzgerald, K., Frazier, N., George, C., Gibbs, R., Hernandez, L., Johnson, D., Jones, R., Karsten, L., Kefelegn, H., Kitzmiller, D., Lee, H., Liu, Y., Mashriqui, H., Mattern, D., McCluskey, A., McCreight, J. L., McDaniel, R., Midekisa, A., Newman, A., Pan, L., Pham, C., RafieeiNasab, A., Rasmussen, R., Read, L., Rezaeianzadeh, M., Salas, F., Sang, D., Sampson, K., Schneider, T., Shi, Q., Sood, G., Wood, A., Wu, W., Yates, D., Yu, W., and Zhang, Y.: NOAA's National Water Model: Advancing operational hydrology through continental-scale modeling, J. Am. Water Resour. Assoc., 60, 247–272, https://doi.org/10.1111/1752-1688.13184, 2024.

Djurdjevic, V. and Rajkovic, B.: Development of the EBU-POM coupled regional climate model and results from climate change experiments, in: Advances in Environmental Modeling and Measurements, edited by: Mihajlović, T. D. and Lalić, B., Nova Science Publisher Inc., New York, USA, 29–32, ISBN 9781608765997, 2010.

Drobinski, P., Anav, A., Lebeaupin Brossier, C., Samson, G., Stéfanon, M., Bastin, S., Baklouti, M., Béranger, K., Beuvier, J., Bourdallé-Badie, R., Coquart, L., D'Andrea, F., de Noblet-Ducoudré, N., Diaz, F., Dutay, J.-C., Ethe, C., Foujols, M.-A., Khvorostyanov, D., Madec, G., Mancip, M., Masson, S., Menut, L., Palmieri, J., Polcher, J., Turquety, S., Valcke, S., and Viovy, N.: Model of the Regional Coupled Earth system (MORCE): Application to process and climate studies in vulnerable regions, Environ. Model. Softw., 35, 1–18, https://doi.org/10.1016/j.envsoft.2012.01.017, 2012.

Fosser, G., Gaetani, M., Kendon, E. J., Adinolfi, M., Ban, N., Belušić, D., Caillaud, C., Careto, J. A. M., Coppola, E., Demory, M.-E., de Vries, H., Dobler, A., Feldmann, H., Goergen, K., Lenderink, G., Pichelli, E., Schär, C., Soares, P. M. M., Somot, S., and Tölle, M. H.: Convection-permitting climate models offer more certain extreme rainfall projections, npj Clim. Atmos. Sci., 7, 51, https://doi.org/10.1038/s41612-024-00600-w, 2024.

Galanaki, E., Lagouvardos, K., Kotroni, V., Giannaros, T., and Giannaros, C.: Implementation of WRF-Hydro at two drainage basins in the region of Attica, Greece, for operational flood forecasting, Nat. Hazards Earth Syst. Sci., 21, 1983–2000, https://doi.org/10.5194/nhess-21-1983-2021, 2021.

Gochis, D. J., Barlage, M., Cabell, R., Casali, M., Dugger, A., Fitzgerald, K., Mcallister, M., McCreight, J., RafieeiNasab, A., Read, L., Sampson, K., Yates, D., and Zhang, Y.: The WRF-Hydro Modeling System Technical Description (Version 5.2), NCAR Technical Note, NCAR, 108 pp., https://ral.ucar.edu/sites/default/files/docs/water/wrf-hydro-v511-technical-description.pdf (last access: 27 February 2025), 2021.

GRDC: Data Portal, https://grdc.bafg.de/data/data_portal/ (last access: 28 April 2024), 2024.

Grill, G., Lehner, B., Thieme, M., Geenen, B., Tickner, D., Antonelli, F., Babu, S., Borrelli, P., Cheng, L., Crochetiere, H., Ehalt Macedo, H., Filgueiras, R., Goichot, M., Higgins, J., Hogan, Z., Lip, B., McClain, M. E., Meng, J., Mulligan, M., Nilsson, C., Olden, J. D., Opperman, J. J., Petry, P., Reidy Liermann, C., Sáenz, L., Salinas-Rodríguez, S., Schelle, P., Schmitt, R. J. P., Snider, J., Tan, F., Tockner, K., Valdujo, P. H., van Soesbergen, A., and Zarfl, C.: Mapping the world's free-flowing rivers, Nature, 569, 215–221, https://doi.org/10.1038/s41586-019-1111-9, 2019.

Gupta, H. V, Kling, H., Yilmaz, K. K., and Martinez, G. F.: Decomposition of the mean squared error and NSE performance criteria: Implications for improving hydrological modelling, J. Hydrol., 377, 80–91, https://doi.org/10.1016/j.jhydrol.2009.08.003, 2009.

Hagemann, S. and Dümenil, L.: A parametrization of the lateral waterflow for the global scale, Clim. Dynam., 14, 17–31, https://doi.org/10.1007/s003820050205, 1997.

Hagemann, S., Stacke, T., and Ho-Hagemann, H. T. M.: High Resolution Discharge Simulations Over Europe and the Baltic Sea Catchment, Front. Earth Sci., 8, https://doi.org/10.3389/feart.2020.00012, 2020.

Hamitouche, M., Fosser, G., Anav, A., He, C., and Lin, T.-S.: Impact of runoff schemes on global flow discharge: a comprehensive analysis using the Noah-MP and CaMa-Flood models, Hydrol. Earth Syst. Sci., 29, 1221–1240, https://doi.org/10.5194/hess-29-1221-2025, 2025a.

Hamitouche, M., Fosser, G., RafieeiNasab, A., and Anav, A.: Model codes and observation data for “Regional-scale Hydrologic Model Comparison Including Calibration for Improved River Discharge Simulations into the Mediterranean Sea”, Zenodo [code and data set], https://doi.org/10.5281/zenodo.16333943, 2025b.

Hamitouche, M., Fosser, G., RafieeiNasab, A., and Anav, A.: Step-by-step instructions to reproduce the results of “Regional-scale Hydrologic Model Comparison Including Calibration for Improved River Discharge Simulations into the Mediterranean Sea”, Zenodo [workflow], https://doi.org/10.5281/zenodo.16333943, 2025c.

Hersbach, H., Bell, B., Berrisford, P., Hirahara, S., Horányi, A., Muñoz-Sabater, J., Nicolas, J., Peubey, C., Radu, R., Schepers, D., Simmons, A., Soci, C., Abdalla, S., Abellan, X., Balsamo, G., Bechtold, P., Biavati, G., Bidlot, J., Bonavita, M., De Chiara, G., Dahlgren, P., Dee, D., Diamantakis, M., Dragani, R., Flemming, J., Forbes, R., Fuentes, M., Geer, A., Haimberger, L., Healy, S., Hogan, R. J., Hólm, E., Janisková, M., Keeley, S., Laloyaux, P., Lopez, P., Lupu, C., Radnoti, G., de Rosnay, P., Rozum, I., Vamborg, F., Villaume, S., and Thépaut, J.-N.: The ERA5 global reanalysis, Q. J. Roy. Meteorol. Soc., 146, 1999–2049, https://doi.org/10.1002/qj.3803, 2020.

Hersbach, H., Bell, B., Berrisford, P., Biavati, G., Horányi, A., Muñoz Sabater, J., Nicolas, J., Peubey, C., Radu, R., Rozum, I., Schepers, D., Simmons, A., Soci, C., Dee, D., and Thépaut, J.-N.: ERA5 hourly data on single levels from 1940 to present, Copernicus Climate Change Service (C3S) Climate Data Store (CDS) [data set], https://doi.org/10.24381/cds.adbb2d47, 2023.

Kauffeldt, A., Halldin, S., Rodhe, A., Xu, C.-Y., and Westerberg, I. K.: Disinformative data in large-scale hydrological modelling, Hydrol. Earth Syst. Sci., 17, 2845–2857, https://doi.org/10.5194/hess-17-2845-2013, 2013.

Kiliçarslan, B. M.: Calibration and Evaluation of WRF-Hydro Modeling System for Extreme Runoff Simulations: Use of High-Resolution Sea Surface Temperature (SST) Data, Middle East Technical University, Turkey, https://hdl.handle.net/11511/96689 (last access: 13 June 2025), 2022.

Knoben, W. J. M., Freer, J. E., and Woods, R. A.: Technical note: Inherent benchmark or not? Comparing Nash–Sutcliffe and Kling–Gupta efficiency scores, Hydrol. Earth Syst. Sci., 23, 4323–4331, https://doi.org/10.5194/hess-23-4323-2019, 2019.

Lahmers, T. M., Gupta, H., Castro, C. L., Gochis, D. J., Yates, D., Dugger, A., Goodrich, D., and Hazenberg, P.: Enhancing the Structure of the WRF-Hydro Hydrologic Model for Semiarid Environments, J. Hydrometeorol., 20, 691–714, https://doi.org/10.1175/JHM-D-18-0064.1, 2019.

Lehner, B.: Derivation of Watershed Boundaries for GRDC Gauging Stations Based on the HydroSHEDS Drainage Network; Technical Report Prepared for the GRDC, Bundesanstalt für Gewässerkunde, https://doi.org/10.5675/GRDC_Report_41, 2012.

Lin, P., Pan, M., Beck, H. E., Yang, Y., Yamazaki, D., Frasson, R., David, C. H., Durand, M., Pavelsky, T. M., Allen, G. H., Gleason, C. J., and Wood, E. F.: Global Reconstruction of Naturalized River Flows at 2.94 Million Reaches, Water Resour. Res., 55, 6499–6516, https://doi.org/10.1029/2019WR025287, 2019.

Lionello, P., Abrantes, F., Congedi, L., Dulac, F., Gacic, M., Gomis, D., Goodess, C., Hoff, H., Kutiel, H., Luterbacher, J., Planton, S., Reale, M., Schröder, K., Vittoria Struglia, M., Toreti, A., Tsimplis, M., Ulbrich, U., and Xoplaki, E.: Introduction: Mediterranean Climate – Background Information, in: The Climate of the Mediterranean Region, xxxv–xc, edited by: Lionello, P., Elsevier, Oxford, https://doi.org/10.1016/B978-0-12-416042-2.00012-4, 2012.

Ludwig, W., Dumont, E., Meybeck, M., and Heussner, S.: River discharges of water and nutrients to the Mediterranean and Black Sea: Major drivers for ecosystem changes during past and future decades?, Prog. Oceanogr., 80, 199–217, https://doi.org/10.1016/j.pocean.2009.02.001, 2009.

Momblanch, A., Andreu, J., Paredes-Arquiola, J., Solera, A., and Pedro-Monzonís, M.: Adapting water accounting for integrated water resource management. The Júcar Water Resource System (Spain), J. Hydrol., 519, 3369–3385, https://doi.org/10.1016/j.jhydrol.2014.10.002, 2014.

Naabil, E., Lamptey, B. L., Arnault, J., Olufayo, A., and Kunstmann, H.: Water resources management using the WRF-Hydro modelling system: Case-study of the Tono dam in West Africa, J. Hydrol.: Reg. Stud., 12, 196–209, https://doi.org/10.1016/j.ejrh.2017.05.010, 2017.

Ndiaye, A., Arnault, J., Mbaye, M. L., Camara, M., Kunstmann, H., and Lawin, A. E.: Evaluation of the WRF-Hydro model output based on different rainfall input data over the upper basin of the Senegal River, Hydrol. Res., 57, 1–12, https://doi.org/10.2166/nh.2026.044, 2026.

Oda, T., Iwasaki, K., Egusa, T., Kubota, T., Iwagami, S., Iida, S., Momiyama, H., and Shimizu, T.: Scale-Dependent Inter-Catchment Groundwater Flow in Forested Catchments: Analysis of Multi-Catchment Water Balance Observations in Japan, Water Resour. Res., 60, e2024WR037161, https://doi.org/10.1029/2024WR037161, 2024.

Pinardi, N. and Masetti, E.: Variability of the large scale general circulation of the Mediterranean Sea from observations and modelling: a review, Palaeogeogr. Palaeocl. Palaeoecol., 158, 153–173, https://doi.org/10.1016/S0031-0182(00)00048-1, 2000.

Pinardi, N., Zavatarelli, M., Adani, M., Coppini, G., Fratianni, C., Oddo, P., Simoncelli, S., Tonani, M., Lyubartsev, V., Dobricic, S., and Bonaduce, A.: Mediterranean Sea large-scale low-frequency ocean variability and water mass formation rates from 1987 to 2007: A retrospective analysis, Prog. Oceanogr., 132, 318–332, https://doi.org/10.1016/j.pocean.2013.11.003, 2015.

Rafieeinasab, A., Mazrooei, A., Enzminger, T., Srivastava, I., Dugger, A., Gochis, D., Omani, N., Grim, J., Sampson, K., Zhang, Y., LaFontaine, J., Viger, R., Liu, Y., and Schneider, T.: A WRF-Hydro-based retrospective simulation of water resources for US integrated water availability assessment, Hydrol. Earth Syst. Sci. Discuss. [preprint], https://doi.org/10.5194/hess-2024-262, 2024.

RafieeiNasab, A., Fienen, M. N., Omani, N., Srivastava, I., and Dugger, A. L.: Ensemble Methods for Parameter Estimation of WRF-Hydro, Water Resour. Res., 61, e2024WR038048, https://doi.org/10.1029/2024WR038048, 2025.

Reale, M., Giorgi, F., Solidoro, C., Di Biagio, V., Di Sante, F., Mariotti, L., Farneti, R., and Sannino, G.: The Regional Earth System Model RegCM-ES: Evaluation of the Mediterranean Climate and Marine Biogeochemistry, J. Adv. Model. Earth Syst., 12, e2019MS001812, https://doi.org/10.1029/2019MS001812, 2020.

Ruti, P. M., Somot, S., Giorgi, F., Dubois, C., Flaounas, E., Obermann, A., Dell'Aquila, A., Pisacane, G., Harzallah, A., Lombardi, E., Ahrens, B., Akhtar, N., Alias, A., Arsouze, T., Aznar, R., Bastin, S., Bartholy, J., Béranger, K., Beuvier, J., Bouffies-Cloché, S., Brauch, J., Cabos, W., Calmanti, S., Calvet, J.-C., Carillo, A., Conte, D., Coppola, E., Djurdjevic, V., Drobinski, P., Elizalde-Arellano, A., Gaertner, M., Galàn, P., Gallardo, C., Gualdi, S., Goncalves, M., Jorba, O., Jordà, G., L'Heveder, B., Lebeaupin-Brossier, C., Li, L., Liguori, G., Lionello, P., Maciàs, D., Nabat, P., Önol, B., Raikovic, B., Ramage, K., Sevault, F., Sannino, G., Struglia, M. V, Sanna, A., Torma, C., and Vervatis, V.: Med-CORDEX Initiative for Mediterranean Climate Studies, B. Am. Meteorol. Soc., 97, 1187–1208, https://doi.org/10.1175/BAMS-D-14-00176.1, 2016.

Sanchez Lozano, J. L., Rojas Lesmes, D. J., Romero Bustamante, E. G., Hales, R. C., Nelson, E. J., Williams, G. P., Ames, D. P., Jones, N. L., Gutierrez, A. L., and Cardona Almeida, C.: Historical simulation performance evaluation and monthly flow duration curve quantile-mapping (MFDC-QM) of the GEOGLOWS ECMWF streamflow hydrologic model, Environ. Model. Softw., 183, 106235, https://doi.org/10.1016/j.envsoft.2024.106235, 2025.

Senatore, A., Mendicino, G., Gochis, D. J., Yu, W., Yates, D. N., and Kunstmann, H.: Fully coupled atmosphere-hydrology simulations for the central Mediterranean: Impact of enhanced hydrological parameterization for short and long time scales, J. Adv. Model. Earth Syst., 7, 1693–1715, https://doi.org/10.1002/2015MS000510, 2015.

Shahi, N. K., Polcher, J., Bastin, S., Pennel, R., and Fita, L.: Assessment of the spatio-temporal variability of the added value on precipitation of convection-permitting simulation over the Iberian Peninsula using the RegIPSL regional earth system model, Clim. Dynam., 59, 471–498, https://doi.org/10.1007/s00382-022-06138-y, 2022.

Smakhtin, V. U.: Low flow hydrology: a review, J. Hydrol., 240, 147–186, https://doi.org/10.1016/S0022-1694(00)00340-1, 2001.

Sofokleous, I., Bruggeman, A., Camera, C., and Eliades, M.: Grid-based calibration of the WRF-Hydro with Noah-MP model with improved groundwater and transpiration process equations, J. Hydrol., 617, 128991, https://doi.org/10.1016/j.jhydrol.2022.128991, 2023.

Sofokleous, I., Bruggeman, A., and Camera, C.: The Role of Parameterizations and Model Coupling on Simulations of Energy and Water Balances – Investigations With the Atmospheric Model WRF and the Hydrologic Model WRF-Hydro, J. Geophys. Res.-Atmos., 129, e2023JD040335, https://doi.org/10.1029/2023JD040335, 2024.

Somot, S., Coppola, E., Solmon, F., Jordà, G., Sannino, G., Ahrens, B., Sevault, F., and Reale, M.: Med-CORDEX phase 3: Common protocol for the Baseline runs for the CORDEX-CMIP6 framework, https://doi.org/10.5281/zenodo.11659642, 2024.

Spearman, C.: The Proof and Measurement of Association between Two Things, Am. J. Psychol., 15, 72–101, https://doi.org/10.2307/1412159, 1904.

Stacke, T. and Hagemann, S.: HydroPy (v1.0): a new global hydrology model written in Python, Geosci. Model Dev., 14, 7795–7816, https://doi.org/10.5194/gmd-14-7795-2021, 2021.

Storto, A., Hesham Essa, Y., de Toma, V., Anav, A., Sannino, G., Santoleri, R., and Yang, C.: MESMAR v1: a new regional coupled climate model for downscaling, predictability, and data assimilation studies in the Mediterranean region, Geosci. Model Dev., 16, 4811–4833, https://doi.org/10.5194/gmd-16-4811-2023, 2023.

Struglia, M. V., Mariotti, A., and Filograsso, A.: River Discharge into the Mediterranean Sea: Climatology and Aspects of the Observed Variability, J. Climate, 17, 4740–4751, https://doi.org/10.1175/JCLI-3225.1, 2004.

Struglia, M. V, Anav, A., Antonelli, M., Calmanti, S., Catalano, F., Dell'Aquila, A., Pichelli, E., and Pisacane, G.: Impact of spatial resolution on multi-scenario WRF-ARW simulations driven by the CMIP6 MPI-ESM1-2-HR global model: a focus on precipitation distribution over Italy, Geosci. Model Dev., 18, 6095–6116, https://doi.org/10.5194/gmd-18-6095-2025, 2025.

Suárez-Almiñana, S., Pedro-Monzonís, M., Paredes-Arquiola, J., Andreu, J., and Solera, A.: Linking Pan-European data to the local scale for decision making for global change and water scarcity within water resources planning and management, Sci. Total Environ., 603–604, 126–139, https://doi.org/10.1016/j.scitotenv.2017.05.259, 2017.

Tolson, B. A. and Shoemaker, C. A.: Dynamically dimensioned search algorithm for computationally efficient watershed model calibration, Water Resour. Res., 43, https://doi.org/10.1029/2005WR004723, 2007.

Verma, K. and J., I.: Applicability of SWOT data in calibrating WRF-Hydro hydrological model over the Tawa River basin, Geocarto Int., 38, 2185292, https://doi.org/10.1080/10106049.2023.2185292, 2023.

Vogel, R. M. and Fennessey, N. M.: Flow-Duration Curves. I: New Interpretation and Confidence Intervals, J. Water Resour. Plan. Manage., 120, 485–504, https://doi.org/10.1061/(ASCE)0733-9496(1994)120:4(485), 1994.

Wang, W., Liu, J., Li, C., Liu, Y., Yu, F., and Yu, E.: An Evaluation Study of the Fully Coupled WRF/WRF-Hydro Modeling System for Simulation of Storm Events with Different Rainfall Evenness in Space and Time, Water, 12, https://doi.org/10.3390/w12041209, 2020.

Wu, H., Kimball, J. S., Li, H., Huang, M., Leung, L. R., and Adler, R. F.: A new global river network database for macroscale hydrologic modeling, Water Resour. Res., 48, https://doi.org/10.1029/2012WR012313, 2012.

Yamazaki, D., Oki, T., and Kanae, S.: Deriving a global river network map and its sub-grid topographic characteristics from a fine-resolution flow direction map, Hydrol. Earth Syst. Sci., 13, 2241–2251, https://doi.org/10.5194/hess-13-2241-2009, 2009.

Yamazaki, D., Kanae, S., Kim, H., and Oki, T.: A physically based description of floodplain inundation dynamics in a global river routing model, Water Resour. Res., 47, https://doi.org/10.1029/2010WR009726, 2011.

Yamazaki, D., Baugh, C. A., Bates, P. D., Kanae, S., Alsdorf, D. E., and Oki, T.: Adjustment of a spaceborne DEM for use in floodplain hydrodynamic modeling, J. Hydrol., 436–437, 81–91, https://doi.org/10.1016/j.jhydrol.2012.02.045, 2012.

Yamazaki, D., de Almeida, G. A. M., and Bates, P. D.: Improving computational efficiency in global river models by implementing the local inertial flow equation and a vector-based river network map, Water Resour. Res., 49, 7221–7235, https://doi.org/10.1002/wrcr.20552, 2013.

Yamazaki, D., Ikeshima, D., Sosa, J., Bates, P. D., Allen, G. H., and Pavelsky, T. M.: MERIT Hydro: A High-Resolution Global Hydrography Map Based on Latest Topography Dataset, Water Resour. Res., 55, 5053–5073, https://doi.org/10.1029/2019WR024873, 2019.

Yilmaz, K. K., Gupta, H. V., and Wagener, T.: A process-based diagnostic approach to model evaluation: Application to the NWS distributed hydrologic model, Water Resour. Res., 44, https://doi.org/10.1029/2007WR006716, 2008.

Yu, E., Liu, X., Li, J., and Tao, H.: Calibration and Evaluation of the WRF-Hydro Model in Simulating the Streamflow over the Arid Regions of Northwest China: A Case Study in Kaidu River Basin, Sustainability, 15, https://doi.org/10.3390/su15076175, 2023.

Yue, S., Pilon, P., and Cavadias, G.: Power of the Mann–Kendall and Spearman's rho tests for detecting monotonic trends in hydrological series, J. Hydrol., 259, 254–271, https://doi.org/10.1016/S0022-1694(01)00594-7, 2002.

Zavatarelli, M., Raicich, F., Bregant, D., Russo, A., and Artegiani, A.: Climatological biogeochemical characteristics of the Adriatic Sea, J. Mar. Syst., 18, 227–263, https://doi.org/10.1016/S0924-7963(98)00014-1, 1998.

Short summary

Predicting how much water flows from rivers into the Mediterranean is challenging due to climate change and human impacts. We compared two hydrological models – a global river routing model and a fully coupled land surface–hydrology model – to assess their performance. The coupled model, especially after calibration, better reproduces river discharge and seasonal flow, helping improve flood and drought planning.