Parametrization of lakes water dynamics in the ISBA-CTRIP land surface system (SURFEX v8.1)

Lakes are of fundamental importance in the Earth system as they support essential environmental and economic services such as freshwater supply. They also modify the local hydro-meteorological continuum as a lower boundary of the atmosphere. Sentinels of climate change and anthropization, these open water bodies are facing disruptions of their equilibrium generally leading to a notable reduction of their levels worldwide. Stream-flow variability and temporal evolution are impacted by the 5 presence of lakes in the river network, therefore any change in the lake state can induce a modification of the regional hydrological regime. Despite the importance of the impact of lakes on hydrological fluxes and the water balance, a representation of the mass budget is generally not included in climate models and global scale hydrological modeling platforms. The goal of this study is to introduce a new lake mass module, MLake (Mass-Lake model), into the river routing model CTRIP to resolve the specific mass-balance of open water bodies. Based on the inherent CTRIP parameters, the development of the non-calibrated 10 MLake model was introduced to examine the influence of such hydrological buffer areas on the global scale river routing performances. In the current study, an off-line evaluation was performed for four river networks using a set of state-of-the-art quality atmospheric forcings and a combination of in situ and satellite measurements for river discharge and lake level observations. The results reveal a general improvement in CTRIP simulated discharge and its variability, while also generating realistic lake 15 level variations. MLake produces more realistic streamflows both in terms of daily and seasonal correlation. Excluding the specific case of Lake Victoria having low performances, the mean skill score of Kling-Gupta Efficiency (KGE) is 0.41 while the Normalized Information Contribution (NIC) shows a mean improvement of 0.56 (ranging from 0.15 to 0.94). Streamflow results are spatially scale-dependent, with better scores associated with larger lakes, and increased sensitivity to the width of the lake outlet. Regarding lake levels variations, results indicate a good agreement between observations and simulations with 20 a mean correlation of 0.56 (ranging from 0.07 to 0.92) which is linked to the capability of the model to retrieve seasonal variations. Discrepancies in the results are mainly explained by the anthropization of the selected lakes which introduces highfrequency variations in both streamflows and lake levels that degraded the scores. Anthropization effects are prevalent in most of the lakes studied, but they are predominant for Lake Victoria and are the main cause for relatively low statistical scores for this river. However, results on the Angara and the Neva rivers also depend on the inherent gap of ISBA-CTRIP processes 25 1 https://doi.org/10.5194/gmd-2020-296 Preprint. Discussion started: 10 November 2020 c © Author(s) 2020. CC BY 4.0 License.

these, there was a lack of consideration of lake water mass dynamics (Gronewold et al., 2020) because of both the coarse resolution of global models and the associated increased computational costs. Global Climate Models (GCMs) usually consider lake energy budget without giving much importance on the river-lake connectivity, even if key regions in climate studies such as Scandinavia and Northern America are mainly dependent on this. General Hydrological Models (GHMs) usually represent 95 lakes as large rivers with modified characteristics in order to retrieve the correct downstream river discharge. To address a comprehensive outcomes resulting from long-term water cycle evolution, GHMs need to characterise every key component interacting with each other (Gronewold et al., 2020). In the recent years, many studies have focused on anthropogenic open waters (Hanasaki et al., 2006;Haddeland et al., 2006;Gao et al., 2012) with less attention devoted to the understanding of natural lake global influence on the global water cycle. All this advocates for a realistic representation of lake mass balance 100 in climate studies in order to study their role in the global water budget in addition to flood risk management, drought predictions and in helping stakeholders to implement realistic policies in water resource management. To our knowledge, only a few models consider specific processes driving lake mass balance (Table 1). These models have been used for improving flood forecasting (Zajac et al., 2017), assessing the impact of lakes on river streamflows (Huziy and Sushama, 2017) and understanding the impact of open water bodies in the regional water cycle (Bowling and Lettenmaier, 2010). The main outcome of 105 these studies is that it is necessary to implement lakes in a hydrological model as they affect both the regional and global water transfer. Nonetheless, even the latest research efforts remain at a coarse resolution, which limits the number of lakes that can be represented. These models are often calibrated in order to retrieve local water patterns which limit the ability to implement such schemes at the global scale. Finally and to our knowledge, no mass-balance lake models are effectively integrated within the land surface system for use in climate modelling and global hydrological applications.  ). This climate model contains an improved representation of the coupled thermal and hydrological processes of the land surface called ISBA-CTRIP (Decharme et al.,115 2019). This system is based on the coupling between the Interaction-Sol-Biosphère-Atmosphère (ISBA) land surface model (Noilhan and Planton, 1989) and the CNRM version of the Total Runoff Integrating Pathways (CTRIP) river routing model (Oki and Sud, 1998;Decharme and Douville, 2007). Thanks to recent developments described in detail in Decharme et al.
(2019), CTRIP is now one of the only global model representing the joint effect of floodplains and groundwater on the surface water and energy budget in a climate model. However, the representation of lakes in the model is limited to the resolution of 120 the energy budget by the bulk-model FLake (Mironov, 2008) which does not take lake mass fluxes into consideration.
The purpose of the study is to implement lake processes in the CTRIP river routing model. This paper will examine the impact of introducing this non calibrated lake model MLake at the global scale on river discharge. It will also assess the performance of retrieving correct water storage variations by comparing observed and simulated lake level variations. To do limit in resolution for the physical processes in the current CTRIP model (otherwise, changes to the module formulation and introduction of hydrodynamic processes would likely be necessary). Within the system, ISBA simulates runoff and drainage in response to atmospheric forcing whilst CTRIP, the river routing model, transfers water through the hydrographic network of the resolved watersheds. Note that there are challenges to evaluating such a new model, since global lake datasets remain scarce 130 or incomplete. This is mainly explained by the extensive detailed field measurements required, such as bathymetry profiling and the associated costs (Hollister and Milstead, 2010). This study tries to overcome these limitations by using inherent CTRIP parameters like the river channel width at the lake outlet, which obviously leads to uncertainties. Sensitivity tests are done by prescribing different outlet width configurations and then studying their impact on both the river streamflow and the lake level amplitude and variability for multiple study sites. 135 After presenting the framework in which the lakes are implemented and the datasets used for evaluation and global estimations, Section 2 describes the method for implementing lakes and the methodology chosen to answer the scientific questions.
Section 3 presents the study sites and their main characteristics. Section 5 shows the effects of lakes on downstream flows in the selected watersheds, first compared to a reference ISBA-CTRIP run then to gauge station measurements. Although a 140 sensitivity test is run on the only adaptable parameter, the lake outlet width. Discussions on the main limitations are presented in Section 6. Finally, Section 7 concludes and gives the main findings and perspectives on further developments. The ISBA-CTRIP system (https://www.umr-cnrm.fr/spip.php?article1092; accessed: 01/09/2020; Decharme et al., 2019) sim-145 ulates the surface energy and water budgets for large scale climate and hydrological applications. A schematic of this coupled model is shown in Fig. 1 Spatially distributed, this model has been evaluated globally in off-line mode (i.e. decoupled from the atmosphere and forced at the upper boundary using an optimal blend of observations and numerical weather prediction output) using two sets of atmospheric forcings against in situ measurements and satellite products. The most significant results show improvements in the river discharge simulations, the snowpack representation and the land surface evapotranspiration 150 . Recently, the updated version of ISBA-CTRIP, considering improvements such as wild fires and land cover changes (Séférian et al., 2019), has also shown a better representation of global-scale carbon pools and fluxes (Delire et al., 2020).
Originally the land surface model ISBA simulated several key land surface variables such as surface runoff or soil moisture 155 in response to atmospheric forcings based on a force-restore approach. This scheme represents land processes as a single soilvegetation-snow continuum limiting the prediction of root layer droughts and the heterogeneity of soil properties. Currently, the diffusive version of ISBA is used for hydrological and climate modelling applications. It explicitly resolves both the onedimensional Fourier and Darcy laws for subsurface thermal and mass fluxes, and it accounts for the hydraulic and the thermal the effects of soil organics on the thermal and hydrological properties of the soil. The snow is simulated using a multi-layer snow model based on the work of Boone and Etchevers (2001) which recent improvements in physics and increased vertical resolution as described in Decharme et al. (2016).
ISBA is fully integrated within the surface modelling platform SURFEX (v8.1) (Masson et al., 2013;Le Moigne et al., 2020) 165 developed at the CNRM in order to bring all the models related to the surface parametrisation into one unique software platform. SURFEX allows studies to be performed in offline mode or fully coupled to an atmospheric model extending -de factoits applicability range from local hydrological to large scale climate studies. The distinction of such land processes in SURFEX comes from the global land cover database ECOCLIMAP-II which dynamically renders the type of vegetation and its cover at the chosen spatial resolution of the model for a given application (Masson et al., 2013;Faroux et al., 2013).

170
ECOCLIMAP-II is a 1 km resolution land-use and land-cover database based on satellite products designed for operational and research numerical weather prediction, climate modelling, hydrological forecasting, and in land surface numerical studies within the SURFEX surface modelling platform (Le Moigne et al., 2009). ECOCLIMAP-II details whether a pixel contains one of the four different type of covers: lake, town, land or ocean, and it distinguishes hundreds of plant functional types 175 representing a large variety of ecosystems (Faroux et al., 2013). SURFEX further aggregates the initial covers into upwards of 20 patches which correspond to different land covers or plant functional types. The orography is extracted and upscaled from the 90 m resolution Shuttle Radar Topography Mission to a 1 km-resolution (Werner, 2001). The ECOCLIMAP-II lake cover scheme provides binary information on the presence or not of a lake in the pixel. No other information is provided thus lake cover information is completed with the Global Lake DataBase (GLDB, Kourzeneva et al., 2012;Choulga et al., 2014) which 180 has gridded in situ and estimated lake mean depth at 1 km resolution globally. This global database has been developed to gather lake information and retrieve mean depth information for numerical weather prediction. It already serves as input for correcting land cover used by SURFEX for approximately 15,000 lakes on a 1-km resolution grid. However, a dataset threshold is introduced on lake detection and set at a surface area of 1 km 2 that limits the number of lakes considered in our calculations.
In this research, we used continuous mean depth field recently developed at ECMWF (Choulga et al., 2019) to ease aggregation 185 technique from 1-km to 1/12°. Streamflow routing is simulated using CTRIP (Fig.1) which integrates a dynamic resolution of river flows on each cell grid based on the Manning equation which is dependent on the characteristics of the river section. CTRIP is fully coupled to SUR-FEX and considers the interaction between the rivers, the atmosphere and the soil through the input by CTRIP which then 190 computes the river discharge, water table evolution and surface flooded fraction. Moreover, it explicitly accounts for groundwater processes with the integration of a two-dimensional diffusive aquifer scheme connected to rivers and a parameterization of the capillary fluxes within the soil (Vergnes et al., 2014). Descriptions of the parameterization of flooding processes can be found in . The coupling of ISBA and CTRIP is made through the OASIS3-MCT coupler (Voldoire et al., water table height or floodplain fraction. In addition to the fully coupled configuration, CTRIP can be used in an offline configuration forced by the runoff/drainage coming from ISBA (or other land surface model) simulations and without feedbacks between the water bodies and the soil processes. Further details on the physical processes are presented in Decharme et al. (2019).

200
Each CTRIP pixel represents an unique rectangular river section with its own characteristics. As shown in Fig. 2, instead of working directly with grid cell, each river section is integrated as a node in the network and all nodes are labelled sequentially.
Their number defines the position of the river section in the network for each hydrographic basin. The scheme increasingly iterates on this number and ensures all the upstream masses have been updated before the numerical resolution on a designated node of the network starts. This resolution framework assures the resolution is done from the upstream part to the downstream 205 part of the watershed. In every basin, the head-water cells have the lowest sequence order: one, which is incremented for each downstream node. The general rules of attribution consider that a node can receive water from multiple affluents but can not have multiple downstream sections. Considering the case of an affluent with multiple upstream nodes and in order to avoid conflicts at the confluence, the downstream sequence order SN downstream attribution follows the rule:

210
where SN i,upstream represents the sequence number of the upstream river, i, and SN downstream is the sequence number of the downstream river.
The main driver for the integration of new processes in CTRIP is to both simulate river discharge and to enable the quantification of the impact of climate change on drought and flood risk over the entire globe. It is also a valuable tool which gives 215 estimates of global water resources in the context of global depletion. Regarding the global water budget, the ISBA-CTRIP model improves the simulations of both peak discharges and baseflow, in addition to global terrestrial water storage variations.
However, Decharme et al. (2019) addressed the need to increase the resolution in order to avoid a sub-grid parameterization and in order to consider the water dynamics more precisely. Originally used at a resolution of 1°, then down-scaled at 0.5°, on-going improvements permit the model to run at its current resolution: 1/12°(approximately 6-8 km at mid-latitudes). This 220 resolution guarantees a better discretization of surface and subsurface processes without the need to implement additional river hydrodynamic processes. The river network at 1/12°has been derived by applying the Dominant River Tracing algorithm (DRT; Wu et al., 2012) on the high-resolution river network (3 arcsec) of MERIT-HYDRO (Yamazaki et al., 2019). CTRIP parameters describing river properties as well as floodplain and aquifer characteristics have been derived following the same methodology as for the 0.5°version of CTRIP (for details see Decharme et al. (2019)).

Lake Evaporation
Lake evaporation is simulated using the FLake model (Mironov, 2008). When considered together with the precipitation, an estimation of water mass exchange by the lake with the atmosphere can be made. FLake is a bulk-model capable of simulating the lake energy budget within the lake and at the lake-atmosphere interface (Mironov et al., 2010). FLake is designed mainly for use in numerical weather prediction and climate studies where it helps determining the vertical lake temperature structure, the 230 mixing conditions and the retroactions with the local/regional climate (Balsamo et al., 2012;Le Moigne et al., 2016;Salgado and Le Moigne, 2010). FLake is based on a numerical resolution of a two-layer parametric evolution of the temperature profile and the integral budgets of heat and kinetic energy. The mixed layer is characterized by a uniform temperature and an entrainment equation which estimates the layer depth. Below this first layer, the vertical temperature profile is parameterized in order to represent the thermocline shape based on a self-similarity concept (Kitaigorodsky and Miropolsky, 1970). This model 235 uses external parameters of which the most important ones are the lake mean depth and the extinction coefficient (set to 0.5 m −1 following Le Moigne et al. (2016)). The numerical resolution is based on the evolution of four lake prognostic variables: the surface temperature, the lake bottom temperature, the thickness of the mixed-layer, the shape factor and one parameter: the mean lake depth. An extensive description of the model can be found in Mironov (2008). Before implementing the numerical representation of lake dynamics into the CTRIP model, lakes need to be introduced in the river network at 1/12°. However the ECOCLIMAP-II provides binary information of the lake detection at 1/120°meaning the information needs to be upscaled to the CTRIP resolution. The method is based on a recursive aggregation of neighbouring lake pixels which depends on the GLDB mean depth. In other words, for every pixel at 1/120°, the algorithm scans the surrounding 245 pixels and aggregates those that are connected and have the same mean depth. Each aggregated lake is then identified with a unique number used further when attributing inherent parameters and variables.

MLake: A global scale mass balance lake model
This method is developed for large lake identification but struggles in the regions with a high density of small lakes, e.g.
Finland. For example, estimated lake mean depth in all Boreal zone is based on geological method taking into account a 250 tectonic-plate map and geological maps (Choulga et al., 2014). The geological method assumes that lakes of the same origin and region should have the same morphological parameters, e.g. mean depth. In our study small lakes tend to be aggregated as a unique larger lake that might not represent the local morphology. These anomalies can modify the local hydrology, however considering the scale of the current study, these effects are limited or even can be filtered by averaging.

Integration of lakes in the river routing model (RRM)
At the model resolution of CTRIP, a unique river stretch is attributed to each grid cell. Replacing a river pixel by a lake follows the same logic as water transfer, which is dependent on the riparian topography and its location within the watershed. However, integrating a component which can cover more than one grid cell in the CTRIP river networks is not straightforward. Huziy and Sushama (2017) proposed a distinction between local lakes, covering at least 60 % of a grid cell, and global lakes, which can cover several grid cells. This distinction brings some dynamic limitations as a local lake can only be an extension of the 260 river section that contributes to the downstream flow without being fed by the river itself. On the contrary, a large lake is part of the river network and divides the river in an upstream section that contributes to the total lake inflow, and a downstream section connected to the lake, receiving its mass from the lake outlet. However, it is important to keep a unique method that can adapt to all lakes regardless of their size.

265
Some issues related to the integration of lakes in the river network emerge when considering that lakes add a spatial dimension to the network linked to the fraction of pixel covered. First, the model must estimate a correct partitioning of the runoff between rivers and lakes when both components are located on a pixel. At the 1/12°, a lake can cover a small fraction of the pixel while being actually part of another watershed. This is the case for the Lake Bourget (France, Fig. 3) where a river that flows on another watershed contains most of the runoff of the pixel whilst the lake only captured a small amount of water that is part of 270 the lake watershed. The other issues concern the location of the lake in the river network and which river stretch is actually a part of lake. In some regions, river stretch can be large and have a dynamic close to a lake. Consequently, finding a compromise between the lake spatial extension at different resolutions and the actual lake hydrological dynamic is important. The approach used herein to resolve this issue is to replace a river section by a lake pixel (corresponding to a unique node in the network) when a lake covers at least 50 % of a unique grid cell (Fig. 2). Wherever a lake spreads over several grid cells, two distinct lake 275 masks are necessary. This is important, on the one hand, to ensure that the water flux remains realistic and on the other hand, the introduction of lake mass dynamics should not significantly change the local hydrology.
The lake runoff mask creation is based on the lake information coming from ECOCLIMAP at 1/120°resolution. In fact, this runoff mask corresponds to every CTRIP pixel at the 1/12°resolution that contains at least one ECOCLIMAP lake pixel (at 280 the 1/120°resolution). In other words, this is a mask of the lake fraction at 1/12°without any distinction of the watershed nor the lake fraction. It provides information on the spatial extension of the lake within the river network, and it is used for computing the water mass intercepted by the lake from the LSM (as runoff and drainage). Thus, a second lake mask, called the "network mask" is needed to locate the lake within the river network and to link the considered lake to the correct river.
In CTRIP, an identification number is assigned to every river that allows a distinction between rivers of the same watershed.

285
This identification number comes from the upscaling of the 90 m resolution MERIT HYDRO (Yamazaki et al., 2019). To do this scaling, a function recursively determines every lake pixel at a 1/120°resolution that covers a river stretch of the MERIT HYDRO river network with the same identification number as the river that flows at the outlet. The network mask ensures that all of the lake pixels with the same ID number are coupled within a unique mass-balance process. At the end of each time step, diagnostic variables are distributed on this mask. This method ensures all freshwater lake pixels are effectively linked to the 290 correct river within the entire network and that water mass flowing in a different watershed is not entering the lake.

Lake model
The MLake mass balance equation is based on the difference between the mass fluxes entering and leaving the lake. At each time step, the lake module calculates the prognostic net water storage V lake (kg) over the lake surface area based on the following equation: where t is the time (s), P ol is the over-lake precipitation term (kg.s −1 ), E ol is the over-lake evaporation term (kg.s −1 ), R and D are terms to account for respectively runoff and drainage estimated by ISBA (kg.s −1 ) over the runoff mask, Q in is the inflow entering the lake from the tributaries (kg.s −1 ), Q out is the lake outflow (kg.s −1 ) and Q gw represents the contribution of the lake-groundwater fluxes (kg.s −1 ).

300
The mass balance equation is numerically resolved in two steps: first an estimate of the incoming flows is computed and used to define an intermediate lake volume V * lake . Next, the outgoing water flow estimation based on this intermediate state in order to return to a new lake equilibrium state. Incoming flows consist in the contribution of both the riparian banks and the direct river inflows. The riparian bank runoff and drainage volumes are collected by the lake and computed over the runoff mask as 305 shown in Fig. 4 following the rules: where r S and d S represent the runoff and drainage fluxes, respectively, over the pixel p on the runoff mask ω. The specific inflows flowing into the lake are composed of all the upstream tributaries (with a lower sequence number) connected the network mask following the equation: where q in is the river discharge of the tributary number k and l is the total number of tributaries for the considered lake. Even if it is not applicable for long-term hydrological analysis, and due to a lack of knowledge on the large-scale process, the groundwater flux is often the missing term indirectly retrieved from the residuals of the mass balance resolution. The lateral and vertical groundwater fluxes are very sensitive to the spatial resolution (Reinecke et al., 2020). Groundwater-lake interactions 315 are generally better-understood locally (Bouchez et al., 2015), but the representation of such interactions at a larger scale can be difficult owing to a lack of understanding of the processes involved. As a consequence, only groundwater-river processes already present in the model are activated, meaning there is no interaction between groundwater and lakes which will be integrated in a further version of MLake.
where ∆t is the time step (s) and V (t − ∆t) is the lake volume at the previous time step t − ∆t (kg.s −1 ). Eq. 6 provides an with A ECO the lake area in the ECOCLIMAP-II database (m 2 ).
The outflow is, by definition, linked to the lake water storage assuming a rating curve relation based on an empirical weir relationship that links the discharge to the water head over the crest (Eq. 7). The outflow starts as soon as the lake height exceeds 330 the weir height. The discharge is then function of a hydraulic head which represents the height of water above the weir. This approach mimics the lake outlet dynamic as a waterproof basin which flows out through a counter-slope. The need to model outflow at the global scale restricts the complexity of the parametrization as it needs to take into account all lake types. At the current resolution of the model (i.e. 1/12°), the outlet is assumed to be small enough to be considered as a straight section connected to the downstream river without any friction and to have the same shape as the downstream rectangular river section.

335
This approach is represented in the Figure. 6 The outflow is calculated as: where C d a dimensionless coefficient related to the drag of the weir which is prescribed to be 0.485 (Lencastre, 1963), W weir 340 the width of the outlet equal to the width of the river in the downstream pixel (m), h weir the height of the weir (m) and ρ ω is the volumetric mass of the water (kg.m −3 ).
The river width was first determined over France by comparing the mean annual discharge measurements from the Banque Hydro database (www.hydro.eaufrance.fr) and the river width of the "Systeme Relationnel d'Audit de l'Hydromorphologie des 345 Cours d'Eau" (SYRAH) which leads to the empirical equation (Vergnes et al., 2014): where α and β are dimensionless parameters, respectively, equal to 5.41 and 0.59 (Vergnes et al., 2014). Q mean is the mean annual discharge of the river calculated over the climate period . This empirical exponential function has been The initial lake level is equal to the weir height, which results in an initial lake outflow equal to zero. Eq. 7 incorporates the dependence of the depth on the hydraulic head over the weir. The final lake volume for the time step (t) is derived from the 355 equation: Eq.
(2) calculates a change in lake water storage from which the diagnostic variables, such as surface area and lake level, are estimated. Numerous hydrological models assume the lake storage to be a linear function of the surface area and depth. This solution does not take into account the specific lake bathymetry, and it simulates a realistic hypsographic relation, thus the lake 360 surface area is supposed constant. However, knowing how the lake surface area varies with respect to depth is important for improving over-lake evaporation estimations. With regards to the relative scarcity of global-scale datasets on lake bathymetry, implementing appropriate lake hypsometric curves would require extensive developments that will be carried out in further studies. For simplicity, in the current study, hypsometric curves are assumed to be linear.

Algorithm description
Large scale hydrological simulations including lakes are generated over using three steps as presented in the Figure. 7. Among these, two steps are dedicated to rivers-lakes processes and the last one is organised in order to generate the forcing files from the SURFEX platform.

Preparation of forcings files
Runoff and drainage NetCDF forcing files are generated in offline mode from the land surface model ISBA within the modelling platform SURFEX (Section 2.1). A global FLake simulation allow the inclusion of over-lake evaporation in the forcing data prior to the generation of runoff/drainage. Forcing files are used within the resolution process to attribute inflows contribution to the mass balance (Eq. 3 and Eq. 4).

Initialisation of lakes
This step consists in an externalised procedure that creates a map containing physiographic information and initialised variables. This particular part is currently written with a mix of Python and Fortran90 (working with the Gfortran and Intel Fortran compilers). Several aspects of this step are related to the Section 2.3.1 and Section 2.3.2. NetCDF files are generated from the 380 integration of lake information, aggregated from the ECOCLIMAP database at 1 km resolution and down-scaled to 1/12 o , for the current CTRIP river routing network. Key parameters and variables generated by this pre-processing step are gathered in the Table 2.

385
Global scale NetCDF files containing the physiographic parameters and initial storage variables of the desired configuration are prepared during the initialisation. Thus, the computation of river discharge using Manning's equation and the temporal integration of the lake mass balance equation begins (see Section 2, Eq. 2 & Eq. 7). The numerical resolution and the water transfer is fully written in Fortran90 and divided in an initialisation stage which creates a subset, if necessary, of the global maps on the study zone (see Step 2 of the Fig.7). Then the resolution program spreads the runoff/drainage forcing data on the 390 river/lake network, created during the preparatory stage, and routes the water mass from headwater cells to the outlet of each basin. Diagnostic NetCDF files are written at the input time step (generally daily) and contains two diagnostic data type. On the latitude/longitude dimension, it gives the river discharge, storage, streamflow velocity and stream height on every pixel. On the dimension characterised by the lake ID number, it gives the outflow discharge and the level of lakes.

395
This model tends to be user-friendly with an optimised command interface, allowing users with limited support to operate the code. As soon as the code is compiled, the Fortran option namelist needs to be completed in order to give the desired configuration of the run (input zone, computational time step, diagnostic time step) with the technical support of the SURFEX website (https://www.umr-cnrm.fr/surfex/). Then both the preparatory stage and the master program are run by executing two ready-to-run shell scripts.

Study sites
Four watersheds have been selected in order to assess the impact of lakes on the regional scale hydrology. A map showing the location of the basins is presented in Fig. 8. They have been chosen based on several criteria: the size, their localisation in the drainage basin, and the climate characteristics (in order to assess the sensitivity of the model to different forcings conditions).
These characteristics are summarized in Table 3. The first watershed is the Rhône basin with its outlet located at Beaucaire 405 (France). Flowing from the Furka glacier in Switzerland to the Mediterranean Sea (Rhône Delta), the basin represents 17 % of the French metropolitan area. The Rhône is a socio-economic lever in terms of both quantitative (freshwater resource, industrial needs, sailing ...) and qualitative resource management (ecological state, tourism ...). In its upstream part, the streamflows are dependent on the glacier water supply whereas in its downstream part, the Mediterranean climate directly impacts the discharge and water level associated with flash flood risks. Therefore, these diverse forcings induce a bi-modal hydrological regime.

410
Within this watershed, five lakes are identified at the spatial resolution which must be resolved within the current study, among them Lake Geneva which is one of the largest European freshwater reservoirs with an average volume of 89 km 3 . With a relatively small drainage area compared to other lakes, Lake Geneva creates a link between the mountainous upstream and the fluvial downstream regimes. Located on the upstream part of the Rhône network, it also controls the streamflows and limits the floods during spring. Due to the importance of karstic structures on the downstream Rhône river and especially on the baseflow, 415 this basin is the only study site where the groundwater scheme has been activated.
The second watershed is the Angara river basin in Irkoutsk (Russia). The water mass flowing from Lake Baikal controls the streamflows of the Angara watershed which flows to its confluence with the Yenissei river at Strelka. This watershed was selected in order to study the specific hydrological conditions of Lake Baikal which waters freeze in winter and its prevalence on the regional hydrological system. Known both for its unique endemic ecosystem and its morphometric characteristics, 420 Lake Baikal is the deepest lake in the world (maximum depth of 1,632 m) and the second largest lake in terms of volume (approximately 23,600 km 3 ). One of the lake's characteristics is its surface freezing period (approximately five months) which contributes to its specific hydrological regime.
The third watershed is the upstream part of White Nile river in Jinja (Uganda). Characterized by a dry continental climate, the White Nile originates from the outflow of Lake Victoria, which is the second largest lake in the world (surface area 69,485 425 km 3 ). In contrast to lakes such as Lake Baikal, Lake Victoria has a relatively small drainage area (167,000 km 2 ) and its water balance is driven mainly by the precipitation and evaporation (Vanderkelen et al., 2018). Surrounded by the Great Rift Valley, it is a major socio-economic resource which directly supplies 30 million people and indirectly, over 300 million people living nearby the Nile. Since 1951, the outflow has been regulated by the Nalubaale dam, and a second dam was built in the 1990s by the World Bank. However, the regulation is controlled by an 'Agreed Curve' which intends to mimic natural outflow and links 430 the water releases to the lake levels.
The last watershed is the Neva river basin close to Saint-Petersburg (Russia). This relatively small river (74 km) is the main outlet of Lake Ladoga, the largest European lake. Neva is influenced by the Svir river, at the outlet of lake Onega, which is the second largest European lake. The surface area of these lakes are 17,800 km 2 and 9,800 km 2 (Filatov et al., 2019), respectively. The Ladoga hydrographic basin is complex and represents dozens of lakes that buffer the streamflows within the 435 basin. In addition, these lakes are located in the boreal zone, which are regions where the positive air temperature anomalies are the largest. Ladoga remains partly ice-free until early winter (the freezing season extends from November until the end of May), and therefore it has a significant impact on the regional meteorological conditions such as the enhancement of severe convective snowfall episodes (Eerola et al., 2014). In response, the water temperatures of the lakes, specifically those from the Onega, are sensitive to atmospheric changes because of their relatively low heat capacity (Filatov et al., 2016). The Ladoga 440 drainage area is approximately 97,800 km 2 and the Onega's is 51,540 km 2 . These lakes are particularly affected by changes in river runoff and studies show a decline in the lake levels owing mainly to a regulation of its flows (Hanasaki et al., 2006) and complex interactions with permafrost thawing due to climate change (Karlsson et al., 2015).  (Ledoux et al., 1989).

470
SAFRAN provides an analysis, based on optimal interpolation, of near-surface variables such as daily precipitation, 2-meter relative humidity, 2-meter air temperature, 10-meter wind speed, cloudiness and models visible/infrared radiative fluxes. The ISBA model is driven offline by SAFRAN analysis, and it computes the energy and water budgets in order to generate surface runoff, total evapotranspiration, soil moisture and drainage at an 8 km horizontal resolution. MODCOU uses surface runoff and drainage as inputs for river routing and aquifer water head simulations, respectively, over all of France. SIM also needs 475 physiographic parameters that describe the land cover, soil texture and orography of the studied zone. These parameters are provided by the ECOCLIMAP-II database. This physically based system has several applications in operational, research and climate services: it is used in flood risk forecasting, water resource management and climate projections (Soubeyroux et al., 2008). Further details about the model can

Global scale atmospheric variables
Biases may appear in simulated surface/sub-surface variables in response to specific atmospheric conditions. Uncertainties  2019) showed the better performance of the model using E2O forcings compared to PGF forcings, in particular in terms of river discharge scores mainly due to higher precipitation rates. The runoff estimation for the Angara, White Nile and Neva watershed used in the current study therefore come from the multi-layer diffusive ISBA forced by the ERA-Interim E2O 495 forcings.

Results
This study follows a two-step evaluation by assessing, first, the influence of lakes on the CTRIP streamflows simulation and second, the influence of the lake module on the performance of the model to retrieve streamflows and lake levels compared to the observations. In the following part of the results, particular attention has been paid to the model sensitivity to the lake outlet 500 width, which is the only adjustable parameter.

Impact of lakes on the ISBA-CTRIP simulations
A benchmark study to evaluate the influence of the new lake module on CTRIP-simulated streamflows was first performed consisting in four simulations which are summarized in Table 4 Fig. 9). A general reduction of river discharge variability is observed which is associated with a delay in reaching peak discharges. With the exception of Lake Victoria, lakes have relatively little impact on the time-averaged river discharge, however they significantly reduce the river discharge variability and timing compared to reference simulation ctrip_nolake. The average variability reduction over the four study sites is about 46 510 % (see Table 5 for a statistical summary of the benchmark runs) of the average discharge for the evaluation period 1983-2013.
There is a clear scale-dependence as larger lakes have stronger impacts on streamflows. For example, Lake Geneva reduces the Rhône river discharge variability on average by 22 % with an average discharge increased by 0.7 %, while the Angara river mean discharge decreases by 63 % due to the influence of Lake Baikal. This is explained by the contribution of the lake to the river: the Angara river is directly influenced by Lake Baikal outflows and has no other tributaries before the gauge station in 515 Irkoutsk. In contrast, approximately half of the Rhône discharge contributions at Beaucaire come from the part of the Rhône river flowing out of Lake Geneva, and while the remaining half comes from tributaries (Saone, Isere, Durance) which are not influenced by Lake Geneva. The implementation of lakes tends to smooth the hydrograph, reduce the volume of water transferred downstream during flood events, and increase low flows while conserving approximately the time-averaged discharge (see Table 6). Among the four study sites, the Angara and the White Nile are the most impacted rivers with a decrease of the 520 variability that reaches 55 % and 63 %, respectively.
These results show the sensitivity of the streamflow simulations in relation to the outlet width. As expected, the outlet modulates the water volume that flows into the river by diminishing the response time of the lake to the forcing (Fig. 9). More specifically for Lake Baikal, the variability is increased by 105 % in a configuration where the weir width is increased by a 525 factor of five compared to ctrip_mlake_w1. On the other hand, the weir width has little impact on the streamflow simulations of the Rhône river (the average standard deviation changes 3 %). However, increasing the outlet width improved streamflow dynamics and produced the discharge time series with the strongest decrease in the low-flow period and with quicker responses to the forcing in flood period. This behaviour can also induce a phase shift between outflows and inflows resulting in a period of no flows as seen for Lake Victoria in Fig. 9). Results for Lake Ladoga reveal a counter-intuitive pattern since the introduction 530 of lakes produces an early peak discharge (both in terms of high and low flows) instead of delaying them. Flood waves take some time to propagate through the river while no time delay has been considered for lakes. Combined with a wide weir (with high flow capacity), this tends to make flood waves propagate faster.

Comparison of simulations to observations
In this section, the influence of the lake on the CTRIP model has been assessed by comparing both lake water levels and river 535 discharges to measurements. In this context, the three simulations for each study site ctrip_mlake_w05, ctrip_mlake_w1 and ctrip_mlake_w5 were used with the same characteristics.

Lake water levels
In a basin where several lakes are present, the main lake, defined as the largest lake in terms of both drainage and surface area, is considered. Lake level outputs from the model are constant over all the network lake mask. Due to the initialisation 540 method (the height of the lake crest is equal to the mean depth of the lake), the diagnostic only indicates level variations over an equilibrium level assumed to be reached after a transitory time period. Variations have been assessed by centering these levels on the time-averaged levels of the lake over the period 1983-2013. Lake level variations are shown in Fig. 10.
All of the simulations for Lake Geneva, except for ctrip_mlake_w05, show an inability to capture the range of level varia-545 tions. This is due to peak levels that remain higher than observed levels. Even though the range is not correct, the model captures the seasonal variability with high lake levels associated with snow melting in spring, decreasing levels through summer/autumn and low flows in winter. Moreover, the minimum flow values are better represented in terms of magnitude compared to the peak discharges. Regarding lakes high levels, even if the timing is acceptable, simulations show a systematic over-estimation which can reach 1m for ctrip_mlake_w05. In terms of scores as shown in the Table 7, the correlation remains low (r = 0.28)  (Fig. 11) gives information on the better performance of the ctrip_mlake_w5 configuration which shows skill in retrieving both lake level variability and magnitude with a standard deviation ratio of α=0.9, while both ctrip_mlake_w1 and ctrip_mlake_w05 do not properly simulate the observed 555 lake dynamic (α = 2.4 and α = 3.5, respectively). The underlying reason is that the weir width is impacting the lake level dynamics with a level variability inversely proportional to the lake outlet width. This is physically correct as a larger outlet results in an attenuation of the time needed to transfer the mass from the entry of the lake to the outlet where a smaller outlet increases the retention capacity and the response time of the lake to the forcing. Likewise, the drainage area of Lake Geneva is relatively small thus the concentration time is small which results in a rapid response of the water dynamic to the regional forcings. Last 560 but not least, anthropization can have a significant impact on streamflow within the Lake Geneva basin, in addition to the lake itself since it is regulated by the Seujet dam in Geneva.
In contrast, the model results are much better for Lake Baikal and Lake Ladoga in terms of the seasonal variability and the timing of peak and low flows. For Lake Baikal, results are particularly good before 2002, year when a slight shift begins. Victoria is strongly affected by climate variables due to the predominance of its surface on the basin drainage area, lake surface area represents approximately 42 % of its drainage area. Added to this is a strong anthropization of the outflows which can 585 strongly affect the lake levels. It was therefore decided to focus the analysis on the period before 2004 when the lake is less impacted by the operating rules of its outlet. Even if the outflows are regulated, simulations exhibit good performances in terms of retrieving both the timing and the magnitude of the lake levels before 2004. Moreover, the high water levels in 1998 are well simulated with a peak discharge which is well represented (Fig. 10). The standard deviation over this period shows good results with an α = 1.1 (σ s = 0.37 m, σ o = 0.35 m). In addition, the correlation is very good over this period with a score of 590 0.83, while the RMSD stays low with an average of 0.36 m. The Taylor diagram for Lake Victoria exhibits a best performance of the ctrip_mlake_w1 to retrieve the pre-2004 lake levels. Both a larger or a smaller width deteriorates the correlation and increases the variability of the levels. However these impact are quite small, and the results on the pre-2004 period still gives acceptable scores.

595
The seasonality of the simulated lake levels shows a good agreement with observed levels in accordance with a relatively good correlation (shown in Fig. 12). Lake Geneva is the only lake which exhibits low quality level variations in contrast to Lake Baikal and Lake Ladoga which, despite a temporal shift of approximately two months mainly for low flow periods, shows strong correlation for the seasonal pattern. On these lakes, the model simulated the winter low flows well which were linked to soil freezing and low solar radiation (thus little to no melt), but also to the spring high water period resulting from snowmelt.

600
The strong decrease in Lake Victoria water levels during the period 2004-2006 does not significantly affect the climatological cycle which shows good agreement on the seasonal pattern in terms of representing the wet season high water levels and low flows occurring in October.
At every study site, ctrip_mlake_w5 remains the configuration with the lowest scores which is mainly caused by higher 605 water releases resulting in lower water level variations. On the other hand, both ctrip_mlake_w1 and ctrip_mlake_w05 show better agreement to capture the natural variations of lake levels even if local discrepancies, for example inability to capture high water levels on Lake Geneva or temporal shift for Lake Ladoga, appear.

Impact on river discharge simulations
The simulated daily river discharges for the four study sites are shown on the hydrograph in Fig. 13. Even though the Rhône  Improvements of the NSE and KGE are particularly high over the Rhône river (Table 6). Looking at the distribution of scores along the network, Nash-Sutcliffe scores are higher downstream compared to those at the lake outlet. Compared to the reference simulation ctrip_nolake, NSE scores increase by 19 % at Beaucaire (NSE = 0.69) while NSE_log increases by 88 % (NSE log = 0.64). Lakes introduce a better representation of extreme events on the Rhône river with even better improvements on sustaining low flows. The KGE score is more influenced by bias and variability than the NSE, which is weighted 630 more by the correlation scores. Over the Rhône river, KGE scores are slightly improved (by 13 % at Beaucaire: KGE = 0.85).
Both local and regional streamflow variability and magnitude are better when taking lakes into account with a slight tendency to over-estimate low flows. The NIC score has been calculated using the Nash-Sutcliffe coefficient in order to quantify the contribution of the lake model compared to the baseline scenario. It reveals a mean improvement of 25 % of the NSE scores at Beaucaire which further corroborates the positive effect of the inclusion of lakes dynamics. The lake outlet width which is half of the initial value leads to better results for every metric. The Rhône streamflows are globally improved with the magnitude depending on the location within the network. However, the high frequency dynamics at the lake outlet are not captured in any configuration. In terms of variability, lakes impact the number of peak discharge events and the volume of water transferred during these events. The hydrographs are then smoothed owing to the damping effects of lakes. This is reflected in the seasonal cycle with snowmelt occurring in spring with the greatest streamflows during winter associated with low flows due to mass 640 retention (by the snowpack) in the upstream area.
For the other catchments, the introduction of lakes has a rather small impact on the scores. The main improvements resulting from the inclusion of lakes is a better representation of variability. The analysis of the White Nile simulation is constrained by the discharge measurement availability. Lake Victoria is a buffer for watershed flows and its outflows follows the same pattern 645 as the atmospheric forcings with a succession of low flows during the dry season and peak discharge during the wet season.
Despite the Agreed Curve and the improvements resulting by including the lake model, CTRIP-MLake simulations do not capture well the peak discharge. However, the seasonality is well captured with the succession of increasing discharges during wet season and decreasing discharges during dry season. The effect of lakes is consistent with the Rhône results, but daily discharges are slightly under-estimated. Lake Victoria acts as a large retention area which sharply reduces and delays discharge  (Table 6). However, NSE and KGE scores are not worsened but remain very low and reveal 660 an inadequacy of the model to retrieve White Nile discharge at the outlet of the lake (NSE = -17.6, NSE log = -8.3,KGE = -2.9). The NIC scores show an average improvement of 78 % of the NSE due to the inclusion of lakes. ctrip_mlake_w05 is the configuration that gives the best results with an improvements of 93 % and a deviation ratio of 1.54. Even though there is room for improvement, the representation of Lake Victoria has a significant impact on the White Nile streamflows compared to the reference ISBA-CTRIP simulations.

665
The Angara basin is dominated by Lake Baikal which is the worlds largest freshwater continental water body. This river is also anthropized with three large dams and its outflows are, thus, regulated by man operating rules. Fig. 13 shows the relatively poor performance of the non-calibrated lake module to retrieve anthropized streamflows and only ctrip_mlake_w05 is producing a positive NSE value (Table 6). Even though the daily discharges are not well captured, simulated and observed streamflows show good agreement in terms of seasonality with peak discharges and low flows which are reasonably well represented in time and magnitude. ctrip_mlake_w1 captures the flow variability which is confirmed by a variability ratio of 1.02. With the exception of ctrip_mlake_w5, all of the configurations significantly improve the streamflows simulations.
This reduction of the peak discharge originates from the large retention capacity of Lake Baikal. With a volume of 23,260 km 3 , this lake has a water residence time of approximately 330 years which substantially affects the regional hydrology. The all lake configurations improve the scores but only ctrip_mlake_w05 gives a positive score (NSE =0.26) which is due to a better simulation of low flows (NSE log = 0.23). Even if the NSE is low, the NIC score shows an average improvement of 87 % (ranging from 0.75 to 0.94) owing to the introduction of the lake model. Lake Baikal is covered by ice generally from January to May-June, and it is surrounded by permafrost. This specific seasonal process is the main driver of the regional hydrological pattern with low flows during winter and high peak discharge caused by snowmelt during the Summer. Since the Nash-Sutcliffe 685 score is quite sensitive to flow peaks (making it especially sensitive to the seasonal snowmelt runoff in this basin), it makes more sense to limit the result to both the KGE and NIC performance. Thus, even if all lake configurations seem to improve streamflow simulations, ctrip_mlake_w05 produces the best results.
The Neva river takes its origin in Lake Ladoga which is itself fed by the Onega lake. This region is of particular interest 690 owing to the high lake density which strongly affects the streamflow dynamic. The inclusion of lakes in Scandinavia reveals a significant impact on streamflows as shown in Section 5.1. All lake configurations significantly improve the results in terms of volume transferred with a net decrease in the peak discharge. However, the hydrograph of the ctrip_mlake_w5 is characterized by an over-estimation of the peak discharges which is confirmed by a variability ratio of 1.47. On the other hand, the ctrip_mlake_w05 simulation tends to under-estimate the flow variability (α = 0.65). The average discharge generally 695 captures the observed flows with a difference of only 2 % (Q ctrip_mlake = 2538 m 3 s −1 , Q Obs = 2485 m 3 s −1 ). Adding lakes generally improved the simulated variability (σ ctrip_mlake = 634 m 3 s −1 , σ Obs = 655 m 3 s −1 ) with strong improvements for ctrip_mlake_w1 simulations (the variability ratio is 0.92). However, the hydrographs show a systematic time shift between the simulated and the observed daily discharges (Fig. 14). This shift has significant consequences on the performance metrics by deteriorating the correlation between simulated and observed discharges despite the overall reasonable fit to the hydrograph.

700
As shown in the Table 6, NSE scores are negative in two configurations (NSE = -0.71) and positive for ctrip_mlake_w05 with a strong improvement (NSE = 0.26). In contrast, the KGE score, which reduces the weight of correlation, shows a high score for ctrip_mlake_w05 and ctrip_mlake_w1 simulations (KGE = 0.19). The positive effect of including lakes on this Scandinavian region is supported by the NIC score (which evaluates the improvement brought by the lake module) which increased by 81 % for ctrip_mlake_w05. However, it worsened the streamflow simulations in the ctrip_mlake_w5 configuration (NIC = 0.15).
Seasonal discharge is presented in Fig. 14. Generally speaking, the introduction of MLake allows a better representation of the seasonal cycle of the river discharge for the four study sites. However, these improvements are heterogeneous along the basins. In terms of variability, the impact of including lakes on the seasonal cycle leads to the reduction of both the mean discharge and the temporal variability. For the Rhône basin, the introduction of lakes increased the low flows simulations and 710 reduced the river peak discharges. The best results are for Lake Baikal where the river discharges are sharply reduced and are much closer to those observed. The main result of the analysis of the seasonal cycle is the difficulty of ctrip_mlake_w5 to simulate well the river discharge, with the exception of Lake Geneva. However, it is not possible to point out if the main reason is strictly coming from the sensitivity to the weir width or another process that is not yet represented such as reservoirs water management.

Discussion
Simulations reveal the capability of the non-calibrated CTRIP-MLake system to capture lake level variations and to improve the simulated river discharge. However, it is important to note that, despite the explicit representation of some spatially-distributed processes within the model such as runoff, the model resolves an uni-dimensional water balance equation. This means lakes, regardless of their size, are represented as points in the network. This representation could be problematic for large lakes, 720 such as Lake Baikal, where wind-stress effects on the lake height or internal wave processes (which are not included in the model) could impact the overall lake dynamics. It also means the diagnostic variables are redistributed over the lake network mask and affect the regional performance by introducing local biases. Observed height differences over lakes can reach several meters from one shore to another depending on the wind stress and consequently influence relatively fast time variations of the river discharge. Thus, local and regional assessment would benefit from developing a specific diagnostic computation applied 725 for large lakes. A specific diagnostic could, for example, consider level differences from one shore to another in a simple way without the need to introduce hydrodynamic processes. However, within the scope of the current study, the inherent model gaps have relatively few impacts on lakes with a small surface areas or on regulated lakes. Therefore, long temporal series, such as those which are characteristic of climate studies, render these processes negligible. Daily comparison must be cautiously made and long-term, seasonal analysis must be prioritised. MLake is intended for use in long-term monitoring of large scale 730 basins with respect to the inherent framework of global scale climate studies. In addition, the use of a non-calibrated model restricts the performance for local daily evaluation as lakes are not impacted by rules based on observations that minimize errors through the modification of adaptable parameters.

Lake anthropization
Most of the world's rivers are anthropized leading to an additional number of factors that modify the natural land surface pro-735 cess variability and hydrological variables such as runoff and streamflow (Grill et al., 2019;Best, 2019). Lakes are no exception to this evolution and the lake level fluctuations seen in the current study are consistent with those resulting from worldwide anthropogenic regulation. Regarding the impact on the daily outflows, Lake Baikal seems to have less natural outflows since observed discharge is characterized high variability resulting from a strong anthropization which is impossible to simulate here. This pattern is mainly introduced by the joint effect of the three reservoirs constructed on the downstream river (Irkoutsk,

740
Bratsk and Ust'-Illim reservoir). Beyond a simple local effect, these reservoirs affect the regional streamflows with a signal seen as far as on the Yenisei river (Adam et al., 2007). Among these anthropogenic structures, the Irkoutsk dam, which is the principal regulator of the Angara dam chain. Located just 55 km away from the Baikal outlet, it has increased the lake storage capacity by 37 km 3 . One of the main objectives of this dam is to restrict outflows in order to limit peak discharge and to sustain baseflow while trying to keep a natural cycle characterized by high levels in autumn and low levels at the end of winter. In the 745 dry season, the regulation sustains river baseflow by decreasing of the outflow. During the wet season, regulations control the outflow in order to refill the lake. This specific variability can explain the fairly poor performance of NSE scores (which is significantly reduced by low correlation values). MLake currently does not take any operating rules into account and is intended to only retrieve natural streamflows produced by the hydrological components. The direct consequence of the Angara regulation is the increased river discharge associated with a decrease of the maximal variability by one third that leads to a significant rise 750 of the lake water level (Vyruchalkina, 2004). However, processes that allow water abstraction, include downstream irrigation demand or represent dam operating rules are not modelled which explains most of the discrepancies in both lake level and streamflow estimations.
Similar effects occur for Lake Geneva where the Seujet dam regulates the outflows and thus the lake levels. The dam effect on the seasonal cycle is particularly clear where a cut-off the high water levels occurs in Spring resulting in an absence of peak 755 discharge for the Rhône river. On the other hand, during the summer, the dam helps to sustain baseflows thereby mitigating the impact of droughts in the basin. An inter-comparison of the different study sites reveals that Lake Ladoga is the only lake that is not actually regulated. Neva is flowing on a natural riverbed which includes the gauge stations locations just before the river reaches Saint Petersburg. Despite the temporal shift inherent to CTRIP physical processes (discussed later in Section 6.2), the model evaluation using observations shows good agreement. Anthropogenic impacts on lake levels are reduced on larger lakes 760 since the ratio between regulation dynamics and the lake temporal response to the natural forcing are not in the same order of magnitude. The upstream influence of dams on river discharge does not significantly impact the performance skill.

Historical Lake Victoria level drops
Anthropization has a clear impact on Lake Victoria and can explain many of the discrepancies between model simulations and 765 observations in the current study. The unique gauge station with continuous data is located on the Nalubaale dam complex in Jinja (Uganda) just a few kilometers downstream of the Lake Victoria outlet. In 2000, a second dam was commissioned on the White Nile river at Jinja called the Owen Fall complex. Outflow from the complex is administered under an agreement called the "Agreed Curve" which restricts the outflow rate in order to mimic natural lake outflow. This makes it rather difficult to assess the impact of such anthropization downstream and to temper the conclusion and scores. Several studies have attributed the severe 2004-2006 Lake Victoria level decline to both historical regional drought and the impact of outflow deregulation (Kull, 2006;Sutcliffe and Petersen, 2007;Vanderkelen et al., 2018;Getirana et al., 2020). According to these studies, half of the decline could be the contribution of the dam over-release and the other half could be attributed to a severe drop of the runoff resulting in very low inflows in the lake. More specifically, Vanderkelen et al. (2018) determined that the PERSIANN-CDR precipitation estimation for the watershed decreased by 13 % compared to the time averaged precipitation over the period

Temporal shifts on simulated Boreal river discharges
Shifts between observed and simulated streamflows and levels are not only caused by anthropisation and forcing input, but obviously they also arise from inherent gaps in the ISBA-CTRIP model. Results for the Lake Ladoga outflow, which is relatively free of anthropogenic effects, show a systematic two-month temporal shift of early peak discharges compared to observations. zones. The neglect of these processes generally causes an early peak in snow melt runoff and therefore river discharge. The 800 use of the ISBA Multi-Energy-Budget scheme (Boone et al., 2017) could lead to improvements in estimations by attributing an independent energy budget to the vegetation, the snow and the soil. It also includes specific processes such as interception and unloading of the snow by the canopy. A recent study shows how this model improves the timing of snow melt timing at Boreal forest sites for time scales which are consistent with the errors identified in the current study (Napoly et al., 2020).
A simple sensitivity analysis has been performed in order to provide a broad evaluation on the model response to the width of the lake outlet. In general, the hydrological statistical evaluation metrics are improved with a reduced width and worsened with an increased width (above the initial baseline value). However, there are significant improvements in the response of the lake outflows during extreme flows when the weir width is larger. In these specific cases, reducing the weir width will increase the flood-wave buffer property of the lake. As explained in Decharme et al. (2019), river widths in CTRIP are computed based 810 on the annual mean discharge. The weir width corresponds to a fraction of the total lake circumference. In the model, lake morphometry is reduced to an equivalent lake circle for which the circumference is smaller than that of an equivalent (size) natural lake. This widening produces an overestimation of the outflows which can explain, in part, the better performance of smaller weir width simulations (such as the simulation labelled ctrip_mlake_w0.5 in Section 5.3). Global datasets with information on lake morphometry remain scarce and require extensive human intervention and financial resources. Furthermore,

815
it might be impossible to gather reliable information on such on unsteady parameter which depends on either the downstream river morphology and the actual water head over the crest. A possible improvement would be the estimation of a generic width as a fraction of the lake circumference by clustering lakes based on both the observed outflows and the morphometry type.
The characterization of the lake morphometry is one of the main sources of uncertainty in such models. First, natural lake 820 outlet widths are not constant and are generally a function of the lake level which induces potential outflow overestimation during low flows. The diagnostic level inferred in the model depend on the assumption of a linear lake hypsographic curve leading to the representation of the lake bathymetry as a cylinder. Bathymetry is of particular importance for lakes as it controls most of the inherent biological, physical and ecological processes (Blais and Kalff, 1995;Håkanson, 2005;Yao et al., 2018). In the current study, lake bathymetry influences the residence time and magnitude of variations in both levels and surface area.

825
This limitation will be addressed through the integration of a specific global scale hypsographic curve that can fit most of the lakes morphometry. In order to satisfy global scale properties, the lake morphometry computation must remain computationally efficient and thus should be based on a relatively simple hypothesis that can capture the wide diversity. To do so, a specific study should be done to create a global scale dynamic data set in the SURFEX platform to account for semi-permanent areas surrounding lakes. Further developments will also require the integration of a strategy to correct the land cover type of 830 these flooded areas during the dry season. The dynamic will allow the introduction of wetlands that would interact through sub-surface fluxes with the lake and permit an improved estimation of the evaporation.
Last but not least, evaporation estimations will gain in accuracy with a fully coupled MLake-FLake system that will simulate the feedbacks of the lake within the global hydrological cycle. At the moment, CTRIP is coupled with SURFEX through the 835 ISBA model but MLake is only available for off-line simulations.
Hydrological and meteorological developments based on global scale models are at the core of the CNRM research. Recent changes allow an increased number of processes to be integrated into both climate (CNRM-CM6) and hydrological models (ISBA-CTRIP). Following the recent updates of groundwater and floodplain processes in the land surface and hydrological 840 model ISBA-CTRIP, the purpose of this study was to evaluate the performance of the inclusion of a non-calibrated massbalance lake model in the modelled water cycle. This offline evaluation has been conducted over four river networks using a unique validated atmospheric forcing dataset and a combination of both in situ and satellite measurements for river discharge and lake level observations.

845
Even if the main responses of ISBA-CTRIP to the inclusion of the new lake model are different among the selected test basins, several key improvements to the model simulations were identified. The addition of lakes in the river network reduced the average variability of river discharge by 34 % which lead to a lower number of simulated peak discharges and improved baseflow compared to observations. The Kling-Gupta Efficiency score is not improved to the same degree for all four studied basins. However, improvements are notable on all basins except the White Nile basin. These improved performance was size 850 dependent and the introduction of larger lakes drastically improved the streamflow simulation metrics. The average KGE score for Lake Geneva was 0.81 while it was 0.18 for Lake Baikal. Note that the NIC score has also been estimated which is more appropriate for assessing the actual improvements owing to the introduction of MLake. The mean NIC score for this study is 0.57 compared to the reference run (interval of [0.15-0.93]). The NIC scores are very sensitive to the chosen lake and improvements are notable for Lake Baikal with a mean score of 0.87. The inter-annual variability was improved even if some 855 discrepancies appear such a persistent over-estimation of the mass transferred affecting low flow simulations: the average ratio of river discharge was 0.998 (interval of [0.86-1.45]) with a particularly good simulation of average discharge of the Angara river (a mean ratio of 0.99). Moreover, the model did not correct the early peak discharge in the boreal zone coming from precocious snowmelt, but new model physics to be introduced into ISBA-CTRIP should improve this. It is also important to point out that simple assumptions made for retrieving natural outflows imply that the model is unable to retrieve observed 860 anthropized high frequency dynamics. Regarding lake levels variations, MLake is capable of simulating realistic lake dynamics in all study sites with a mean correlation of 0.56 (ranging from an average of 0.28 for Lake Geneva to an average of 0.83 for Lake Victoria). These results are particularly encouraging for Lake Victoria since droughts are well represented compared to the observations.
The new model parameter is the lake outlet size and an improved method to monitoring or estimate its width could increase 865 significantly the hydrological statistical performance metrics. For the four lakes studied, the simulation with an initial weir width divided by a factor two showed the best scores for both river discharge and lake level simulations. This result needs to be extended to the global scale in order to characterise the systemic improvement for an ensemble of climate and physiographic conditions. Last but not least, the introduction of lakes in the river network is of particular importance in terms of global water flux as the lake water dynamics not only have an effect on the local hydrology but also the streamflow at the outlet of the basin.
Finally the most important characteristic of the model is it's ability to improve the seasonal cycle for both lake level and river discharge for every study site. The next step will be to see if this result holds for more basins over the globe.
Lake dynamics are sensitive to external stresses such as anthropization (dam operating rules, irrigation extraction or water abstraction) which limits the capabilities of the model to retrieve observed discharges for many basins. The lake impacts are 875 heterogeneous in time and space among the watersheds and limit the possibility of capturing a specific and systematic pattern on streamflows. The performance can not be improved in such cases without degrading simulations elsewhere and a specific reservoir model is necessary to correct streamflows locally. On-going research is focusing on creating a global reservoir system that will be added to MLake to improve the representation of dam operating rules. Simulations were also constrained by the off-line use of the energy budget model FLake which computes over-lake evaporation, and further improvements to the model 880 system will be made by coupling MLake to FLake (meaning extracting evaporation computed by FLake from MLake as a first step). In order to propose a fully coupled dynamic model and to take riparian land cover changes into account, the introduction of dynamic cover maps in the SURFEX modelling platform is necessary. This integration of a dynamic cover fraction will replace a proportion of land covered by lakes during periods of high levels, and, conversely, it will lead to an increase of the land fraction during low levels. Thus, it is a first step towards implementing semi-permanent waters as proposed in the map 885 by Pekel et al. (2016). Currently, the main goal of the developments described in this study are to fully couple MLake within the SURFEX platform for improved representation of lake dynamics in global scale hydrological and climate studies. This updated SURFEX-CTRIP-MLake system will help in different domains such as drought risk management and water resource management in a context of global water resource scarcity. It will also contribute as an important component of earth system models by permitting a long-term quantitative assessment of the fully-coupled global water cycle, it's trend and its inherent 890 variability. Within this high-resolution river routing, it is not just a new stage that has been reached in global scale hydrology but also the upper limit of model without considering hydrodynamic processes. Improving the resolution would lead to the inclusion of hydrodynamics processes such as currents and internal waves but would also need a discretization of the sections of the largest rivers.
Code and data availability. The SURFEX v8.1 model platform code, which contains the ISBA and FLake codes, is available in the supplement. All post-processing codes are also available. Finally, model output for all of the study sites and forcing data are available in the supplement. The supplement is available at: https://doi.org/10.5281/zenodo.4013873 1 Performance skill

Discharges evaluation
The simulated river streamflows were evaluated using a set of statistical metrics that are widely used in hydrological modelling.

900
The dimensionless Nash-Sutcliffe efficiency score (NSE) ranges over the interval [-∞;1]. It assesses the performance of a hydrological model by comparing the simulated discharges to a simple model composed of the time-averaged observations.
Positive values provide information on the ability of the model to retrieve the observed discharge dynamic. A value of one indicates a perfect fit between observations and model simulations and a zero-value indicates that the model is able to produce the average of the observations. It is expressed as where t is the time, n the number of values (time steps here), q s,t the simulated river discharge at the time step t, q o,t is the observed river discharge at the time step t and q o the time-averaged observed river discharge.
The second metric used is the logarithmic Nash-Sutcliffe efficiency score which gives more weight to a model's ability to 910 retrieve low flows (compared to flood peaks) and determines more accurately the model systemic over-estimation or underprediction during these periods: Even if very popular, this score is more sensitive to extreme values (e.g. in snowmelt-dominated basins, there can be a relatively high variability in surface runoff with a few large peaks dominating the NSE). In order to prevent these effects, other hydrolog-915 ical metrics are chosen, such as the Kling-Gupta efficiency score (KGE; Gupta et al., 2009;Kling et al., 2012) which equally weights three components: the linear correlation coefficient, r, the coefficient of variation, γ, and the relative variability, α.
In contrast to the NSE, it gives more weight to the bias and the variability at the expense of the correlation coefficient. It is expressed as where α = σs µs / σo µo . σ s and σ o represent the standard deviation of the simulated and observed river discharges, respectively, and µ s , µ o are the time-averaged simulated and observed river discharges, respectively.
Finally, in order to evaluate the lake model contribution to the model performance the Normalized Information Contribution (NIC; Kumar et al., 2009)) was used. This metric provides information on the improvement brought by the considered model 925 relative to the maxim possible score improvement: here the NSE for the reference simulation ctrip_nolake. A positive value gives a measure of the improvement of the model whereas a negative value indicates a degradation of the model performance.
The NIC score, in this study, is applied to the NSE as: where NSE ctrip_mlake is the N DE for the CTRIP-MLake simulations and NSE ctrip_nolake is the NSE corresponding to the 930 reference CTRIP_nolake simulations.

Lake level evaluation metrics
The ability of MLake to reproduce the level variations was assessed using metrics such as the linear correlation coefficient r and the Root Mean Square Deviation (RMSD), which gives an estimate of the quadratic mean of the difference between the predicted and the observed lake level variations: Step 1 : Initialisation of lakes Step 2: CTRIP  Figure 7. Road map of the ISBA-CTRIP-MLake modelling system. In each step, the name of the routine and their extension are in the upper left square, bold text represents the binary or netcdf output files used.   58 https://doi.org/10.5194/gmd-2020-296 Preprint. Discussion started: 10 November 2020 c Author(s) 2020. CC BY 4.0 License.