A hydrological model for root zone water storage simulation on a global scale

The soil water stored in the root zone is a critical variable for many applications as it plays a key role in several hydrological and atmospheric processes. Many studies have been done to obtain reliable soil water information in the root zone layer. However, most of them are mainly focused on the soil moisture in a certain depth rather than the water stored in the entire rooting system. In this work, a hydrological model is developed to simulate the root zone water storage (RZWS) on a global scale. The model is based on a well validated lumped model and has been extended now to a distribution model. To 5 reflect the natural spatial heterogeneity of the plant rooting system across the world, a key variable that influencing the RZWS, i.e. root zone storage capacity (RZSC), is integrated into the model. The newly developed model is evaluated on runoff and RZWS simulation across ten major basins. The evaluation of runoff indicates the strong capacity of the model for monthly simulation with a good performance on time series and distribution depiction. Results also show the ability of the model for RZWS dynamics mimicing in most of the regions. This model may offer benefits for many applications due to its ability for 10 RZWS simulation. However, attentions need to also be paid for application as the high latitude regions are not investigated by this work due to the incomplete latitudinal coverage of the RZSC. Therefore, the performance of the model in such regions is not justified.


Introduction
Soil moisture is one of the critical variables in Earth system dynamics (Sheffield and Wood, 2008) and is claimed to be an essential climate variable by the World Meteorologi- 35 cal Organization due to its key role in several hydrological and atmospheric processes (Legates et al., 2011). The soil water stored in the plant root zone is of great importance in some fields of application, e.g., agriculture, as it represents the reservoir of the plant-available water and mediates 40 numerous subsurface processes (Sabater et al., 2007;Wang et al., 2015;Cleverly et al., 2016). A fundamental limiting factor that constrains crop yields is the water resources in the root zone (Tobin et al., 2017). The water stored in the root zone is also directly linked with one of the important wa-45 ter resources for ecosystems, i.e., green water resources, as green water is defined as the water that originates from precipitation that is stored in the unsaturated soil and eventually consumed by plants through evapotranspiration (Falkenmark and Rockström, 2006;Liu and Yang, 2010).

G. Mao and J. Liu: WAYS
Obtaining reliable root zone water storage is still challenging, as it cannot be directly observed (González-Zamora et al., 2016). Satellite remote sensing itself can only detect the soil water at the surface layer (in most cases, with a depth of 5 cm) and has the shortcoming that it cannot look at the direct observation of root zone water storage for evaluation, in the field of hydrological modeling.
In this study, a global hydrological model is developed to simulate root zone water storage, a key variable for ecohydrological studies. Though many global hydrological mod-60 els (GHMs) have already been developed, most of them are similar regarding the general hydrological component simulations (Sood and Smakhtin, 2015), and the developed model has its unique scheme for root zone process depiction; thus, it enables RZWS simulations with the ability to consider the 65 global spatially heterogeneous rooting systems. The model has input requirements similar to most of the existing GHMs and can also generate general hydrological variables in addition to RZWS. Since it simulates RZWS, which is of great importance for both hydrology and ecology, it will be fur- 70 ther developed in the future for water-and ecosystem-related applications. The newly developed model is named the Water And ecosYstem Simulator (WAYS). The ultimate goal of this study is to test the feasibility of WAYS for RZWS simulation on a global scale, an added-value feature useful for 75 many applications.
2 Model description 2.1 General overview WAYS is a hydrological model implementation in Python. It is a process-based model that assumes water balance at 80 the grid cell level. The development of WAYS is based on a lumped conceptual model with an HBV CE2 -like model structure, called the FLEXCE3 model (Fenicia et al., 2011;Gao et al., 2014a). The FLEX model has been widely used and validated at the basin scale to simulate the soil moisture con-85 tent and root zone water storage (Gao et al., 2014b;Nijzink et al., 2016;de Boer-Euser et al., 2016;Sriwongsitanon et al., 2016). Benefiting from its flexible modeling framework, we have now extended it to a spatially distributed global hydrological model. In addition, some improvements have been 90 made to increase the model capacity at the global scale, e.g., a more sophisticated soil water storage capacity strategy and more land cover support.
WAYS is a raster-based model that calculates the water balance and simulates the hydrological processes in a fully 95 distributed way. It works on a daily time step, and the model structure consists of five conceptual reservoirs: the snow reservoir S w (mm) representing the surface snow storage, the interception reservoir S i (mm) expressing the water intercepted in the canopy, the root zone reservoir S r (mm) describ- 100 ing the root zone water storage in the unsaturated soil, the fast response reservoir S f (mm), and the slow response reservoir S s (mm). Two lag functions are applied to describe the lag time from the storm to peak flow (T lagF ) and the lag time of recharge from the root zone to the groundwater (T lagS ). In ad-105 dition to the water balance equation, each reservoir also has process functions to connect the fluxes entering or leaving the storage compartment (so-called constitutive functions). Figure 1 provides a schematic representation of how the vertical water balance is modeled in WAYS, and the basic equations are shown in Table 1. In Fig. 1, the flowchart represents 5 the conceptualized hydrological cycle in the model, and the schematic drawing shows the corresponding water fluxes and stocks in the real world. Since some of the fluxes are intermediate variables, they are shown in the flowchart but not visualized in the schematic drawing. For instance, R f is the gener-10 ated preferential runoff in the root zone layer before the split of the runoff into surface runoff and subsurface runoff. The effective precipitation P e is the sum of snowmelt and precipitation throughfall. The conceptualized hydrological cycle of the model can be briefly described as follows. The precipita-15 tion that can fall as rainfall or snowfall depends on the temperature. The snowfall will be stored in the snow reservoir, and the rainfall will be intercepted by the canopy before it reaches the surface. After the interception, the rainfall penetrates the canopy and reaches the surface as precipitation 20 throughfall. The effective precipitation that consists of the throughfall and the snowmelt will partially infiltrate into the soil, and the rest runs away as runoff. The runoff is then split into surface runoff and subsurface runoff depending on the texture. A part of the infiltration will be stored in the soil for 25 plants, and the rest will percolate into the deep soil and reach the groundwater table as groundwater recharge. The parameters that regulate the different simulation steps are described below, and the changes we made to the original FLEX model are highlighted. The original lumped model (FLEX) has 28 30 parameters in total that consider four land use types in the basin (Gao et al., 2014a). To reduce the computation cost of calibration and avoid overfitting issues at the global scale, some calibrated parameters are replaced by the empirical values from the literature, e.g., the snowmelt ratio F DD , the ca- 35 pacity of the interception reservoir S i,max , the groundwater recharge factor f s , and the maximum value of groundwater recharge R s,max .

Interception and snow routine
In the WAYS model, the precipitation is allowed to be inter-40 cepted by the canopy or stored as snow before entering into the root zone reservoir.
Interception occurs during the days with rain when the temperature is above the threshold temperature T t . The interception reservoir stores the precipitation intercepted by the 45 canopy before it reaches the soil that will directly evaporate back into the atmosphere. The canopy water balance equation is shown in Eq. (1), where the precipitation P (mm d −1 ) is the inflow, and the precipitation throughfall P tf (mm d −1 ) and the interception evaporation E i (mm d −1 ) are the out- 50 flows. The calculation of the precipitation throughfall P tf is simply based on comparing the rainfall P r (mm d −1 ) to the water already stored in the interception reservoir S i (mm) and the capacity of the interception reservoir S i,max (mm) (Eq. 2). In the FLEX model, the interception evaporation E i is as-55 sumed to be the potential evaporation, and the interception capacity is a calibrated parameter. In WAYS, the interception evaporation E i is calculated based on the potential evaporation E 0 (mm d −1 ), the storage of the interception reservoir S i (mm), and the interception reservoir storage capacity S i,max 60 (Eq. 3) following Deardorff (1978). The interception capacity E i,max is calculated by using Eq. (4), where m c is 0.3 mm and L is the leaf area index, which is calculated based on a modified phenology model in Jolly et al. (2005) obtained by replacing the original vapor pressure stress function with the 65 soil moisture in the model (Wang-Erlandsson et al., 2014).
The snow simulation is based on a simple degree-day algorithm (Rango and Martinec, 1995) that has been successfully applied in hydrological models in many studies (Comola et al., 2015;Bair et al., 2016;Krysanova and Hatter-70 mann, 2017). The water balance in the snow reservoir is described in Eq. (5), and the constitutive equations are shown in Eq. (6) in Table 1. Below the threshold temperature T t ( • C), the precipitation P (mm d −1 ) falls as snow P s (mm d −1 ) and is added to the snow storage S w (mm). Above the thresh-75 old temperature T t , snow melts if it is available at a certain ratio per degree (F DD ). Both the threshold temperature T t and the snowmelt ratio F DD are parameters calibrated in the FLEX model. Following Müller Schmied et al. (2014), T t is set to 0 • C, and F DD is set for different land cover classifi-80 cations from 1.5 mm d −1 per degree to 6 mm d −1 per degree in WAYS. It is also important to be aware that the snowmelt water is conceptualized in the model as directly infiltrating into the soil in the model, thus effectively bypassing the interception reservoir. 85

Root zone routine
The effective root zone routine is the core of the WAYS model. It controls both the evapotranspiration and the runoff generation by precipitation partitioning. Similar to the interception and snow routine, the change of root zone water stor-90 age S rz (mm) over time t (day) is described in Eq. (8), with effective precipitation P e (mm d −1 ) as the inflow and soil moisture constrained evaporation E a (mm d −1 ) and runoff R (mm d −1 ) as outflows. In the FLEX model, the runoff generation is calculated based on the widely used beta function 95 of the Xinanjiang model (Zhao, 1992), which is a function of the relative soil moisture in the unsaturated soil layer. The beta function for calculation of runoff in WAYS is replaced by a modified version from the work of Sriwongsitanon et al. (2016) to link the function to the water storage in the root 100 zone layer. Depending on the root zone water storage S rz , a part of the effective precipitation turns into runoff, and the rest infiltrates into the soil and recharges the root zone layer. The runoff coefficient is determined by both the relative soil water content S rz /S rz,max in the root zone and the shape pa-105 rameter β describing the spatial process heterogeneity over pixels at the global scale. The root zone storage capacity used in WAYS is derived by applying the method in Wang-Erlandsson et al. (2016), which calculates the soil moisture deficit based on satellite-based evaporation and precipitation, while it is a calibrated parameter in FLEX. 5 The soil moisture constrained evaporation, sometimes also known as actual evapotranspiration, is calculated as a function of the potential evaporation leftover E 0 − E i (mm d −1 ), the relative soil water content S rz /S rz,max , the shape parameter β, and the scale parameter C e , which indicates the frac-10 tion of S rz,max above which the transpiration is no longer limited by soil moisture stress. Since the root zone routine connects both the runoff and evapotranspiration, and the runoff generation function has been modified, the actual evapotranspiration function in WAYS is also accordingly revised from 15 the original one in the FLEX model (Sriwongsitanon et al., 2016). The scale parameter C e is set to 0.5 in the FLEX model when applied at the basin scale, and it becomes a calibrated parameter in WAYS at the global scale.

Slow response routine 20
The water balance in the slow response reservoir S s (mm) is simple, with the groundwater recharge R s (mm d −1 ) as the inflow and baseflow Q s (mm d −1 ) as the outflow (Eq. 11). The groundwater recharge R s is depicted in WAYS by applying the splitter function described in Eq. (12). It separates the 25 runoff into preferential flow and groundwater recharge based on the groundwater recharge factor f s , which ranges between 0 and 1. In WAYS, the amount of groundwater recharge is also limited by the maximum groundwater recharge R s,max (mm d −1 ) for each grid cell, which is specified by the soil 30 texture, while there is no constraint on the maximum value for groundwater recharge in the FLEX model. The values of R s,max used in this study are 7, 4.5, and 2.5 for sandy soil, loamy soil, and clayey soil, respectively, following Döll and Fiedler (2008). 35 The groundwater recharge factor f s is a calibrated parameter in the FLEX model, while in WAYS, it is now determined by applying the approach developed by Döll and Fiedler (2008), in which it is a function of global digital maps of the slope, soil texture, geology, and permafrost. The method 40 is simple and computationally inexpensive, and it was validated at the global scale in many subsequent publications, e.g., Döll et al. (2012Döll et al. ( , 2014. All the related parameters are provided in look-up tables in the work of Döll and Fiedler (2008), and the only changes we made are that the input data 45 of the groundwater recharge method, e.g., the global relief data and the global soil texture map, have been accordingly updated based on the newly available data (Hanasaki et al., 2018). The outflow of the slow response reservoir, i.e., the baseflow, is modeled with the function described in Eq. (13), 50 where the baseflow coefficient K s is set to 100 globally, following the work of Döll et al. (2003).

Fast response routine
The preferential flow R f (mm d −1 ) is routed directly into the fast response reservoir S f (mm), and it is divided into surface 55 runoff Q ff (mm d −1 ) and interflow Q f (mm d −1 ). The water balance in the fast response reservoir is shown in Eq. (15). In the FLEX model, it is assumed that the preferential flow is routed into the fast response reservoir based on a lag function that represents the time lag between a storm and preferential runoff generation. In WAYS, we have assumed that the preferential flow will route into the fast response reservoir directly without any delay globally, as it is run at the daily 5 timescale.
Similar to the slow response reservoir, the fast response reservoir is also set as a linear-response reservoir, representing a linear relationship between water storage and water release. The surface runoff generation is only active when the 10 storage of the fast response reservoir exceeds the specified threshold S ftr , with a generation ratio K ff (Eq. 16), while the interflow Q f is simply calculated in proportion to the already stored water in the fast response reservoir using the fraction of 1/K f (Eq. 17).

Additional model adaptation
In addition to the abovementioned model description, some modifications and assumptions are necessary to adapt the model to the global scale. In WAYS, the actual evaporation from open water bodies is assumed to be the poten-20 tial evapotranspiration, and the freezing of open water bodies is not considered in the model. Potential evapotranspiration is derived by the Hamon equation (Hamon, 1961) in the FLEX model, and it is now replaced by the using the Penman-Monteith FAO 56 PM method (Allen et al., 1998) 25 for the following reason. The Hamon method is found to have less robustness in different climatic conditions as well as drawbacks in the daily variability of the PET CE4 simulation due mainly to the relatively simple equation in the Hamon method, as it only employs the average air temperature as an 30 input (Bai et al., 2016;Droogers and Allen, 2002). In contrast, the Penman-Monteith FAO 56 PM method is based on fundamental physical principles and is found to be the most reliable method for potential evapotranspiration estimation when sufficient meteorological data exist (Chen et al., 2005;35 Kingston et al., 2009). The Penman-Monteith FAO 56 PM method is recommended by FAO and other studies based on thorough analyses of PET method intercomparisons (Allen et al., 1998;Jian Biao et al., 2005;Vörösmarty et al., 1998). In the FLEX model, capillary rise from groundwater is also 40 considered. However, in WAYS, the feature for capillary rise simulation is currently disabled, as it cannot be taken into account when no information is available at the global scale. The WAYS model is written in Python version 3.6. To benefit from a supercomputer, the model is designed with full 45 support for parallel computation.

Model setup
For the assessment of model performance, the WAYS model is applied at the global scale with a spatial resolution of 0.5 • for the historical period from 1971 to 2010. Two simulations 50 are conducted based on two products of the global root zone storage capacity from Wang-Erlandsson et al. (2016). The model is calibrated in the period of 1986-1995 and validated in the period of 2001-2010 depending on the availability of the reference data. The model is driven by the climate data set from the Global Soil Wetness Project 3 (Kim, 2017), GSWP3 (http://hydro. iis.u-tokyo.ac.jp/GSWP3/TS2), for the historical period from 60 1971 to 2010. The GSWP3 data set is generated based on the 20th Century Reanalysis Project (Compo et al., 2011). It has been proven to be able to represent realistic submonthly variability over the entire 20th century  and has been used as a forcing data set in several other hydrological 65 modeling studies (Veldkamp et al., 2017;Liu et al., 2017;Tangdamrongsub et al., 2018). The climate variables used in this study include precipitation, minimum temperature, maximum temperature, relative humidity, surface downwelling longwave radiation, surface downwelling 70 shortwave radiation, and wind speed at 10 m. All the variables are on a daily scale and have a 0.5 • spatial resolution; in addition, the wind speed at 10 m is converted to the wind speed at 2 m based on the conversion function in Allen et al. (1998), as it is required by the Penman-Monteith FAO 56 75 PM method for potential evapotranspiration calculations.

Land use data
The land cover data that we used are the global mosaics of the standard MODIS land cover type data product (MCD12Q1) with a spatial resolution of 0.5 • in the year 2001, which 80 are derived from the International Geosphere-Biosphere Programme (IGBP) land cover type classification (17 classes) and are reprojected into geographic coordinates of latitude and longitude on the World Geodetic System (WGS) 1984 coordinate reference system (Friedl et al., 2010). 85

Root zone storage capacity
The root zone storage capacity (RZSC) data are a crucial parameter in WAYS. The global root zone storage capacity data used in this study are from Wang-Erlandsson et al. (2016), derived by using the "Earth observation-based" method. This 90 method determines the soil moisture deficit at the global scale by using the state-of-the-art observation-based precipitation data and satellite-based evaporation data, under the assumption that vegetation optimizes its root zone storage capacity to bridge critical dry periods and does not invest more 95 in its roots than necessary. This method has been well justified (de Boer-Euser et al., 2019) and overcomes the shortcomings of the traditional methods (look-up table approach; field-observation-based approach) at the global scale, such as 6 G. Mao and J. Liu: WAYS Table 1. Water balance and constitutive equations used in WAYS.

Reservoirs
Water balance equations Constitutive equations Reference Rango and Martinec (1995) Note: all the timescale-dependent parameters need to be divided by t to make the equations dimensionally correct and suitable for any other timescales. The "-" symbol in the reference column indicates that the formula is taken from the FLEX model. data scarcity, location bias, and risks of unlikely vegetation and soil combinations due to data uncertainty (Feddes et al., 2001). The method has been shown to increase the model performance at both the basin and global scales (Gao et al., 2014b;Nijzink et al., 2016;Wang-Erlandsson et al., 2016). 5 Moreover, it has been proven to be able to produce plausible root zone storage capacity in boreal regions by investigating the relationship between RZSC and numerous environmental factors, including climate variables, vegetation characteristics, and catchment characteristics (de Boer-Euser et al., 2019). Since there are two global root zone storage capacity products (S R,CHIRPS−CSM and S R,CRU−SM ) presented by Wang-Erlandsson et al. (2016) based on different precipitation and evaporation data sets, and there is no prefer-15 ence for either product, in this study, both RZSC products are used. S R,CHIRPS−CSM covers the latitudes from 50 • N to 50 • S and is derived based on the United States Geological Survey (USGS) Climate Hazards Group In-fraRed Precipitation with Stations (CHIRPS) precipitation 20 data (Funk et al., 2014) and the ensemble mean of three satellite-based global-scale evaporation data sets: the Commonwealth Scientific and Industrial Research Organization (CSIRO) MODIS Reflectance Scaling EvapoTranspiration (CMRSET) data (Guerschman et al., 2009), the Operational 25 Simplified Surface Energy Balance (SSEBop) data (Senay et al., 2013), and the MODIS evapotranspiration (MOD16) data (Mu et al., 2011). S R,CRU−SM covers the latitudes from 80 • N to 56 • S and is derived by using the Climatic Research Unit Time Series version 3.22 precipitation data (Harris et al.,30 2014), together with the ensemble mean of only SSEBop and MOD16 because CMRSET overestimates evaporation at high latitudes (Wang-Erlandsson et al., 2016). Since Wang-Erlandsson et al. (2016) suggested that a Gumbel normalization of RZSC by land cover types with different return peri-35 ods could further improve the model performance, we have accordingly adjusted the RZSC in this study. The two selected global root zone storage capacity products are shown in Fig. S13, and their mean latitudinal values are shown in Fig. S14. Similar patterns and magnitudes of RZSC can be 40 found, and there is good agreement between the two products at different latitudes, especially at low latitudes around the Equator, where the products reflect the fluctuation with high consistency. A large difference is seen mainly in the northern midlatitude area, where the absolute difference in 45 percentage is still less than 20 %.

Calibration data
The WAYS model has a few parameters, and while some of them are obtained independently from the literature, some have to be determined by model calibration (see Ta-50  ble 2). The WAYS model is calibrated against the International Satellite Land Surface Climatology Project (ISLSCP), Initiative II University of New Hampshire (UNH)/Global Runoff Data Centre (GRDC) composite monthly runoff data (Fekete et al., 2011(Fekete et al., ) from 1986 to 1995 at a 0.5 • resolu-55 tion, which are composite runoff data that combine simulated water balance model runoff estimates and monitored river discharge. The ISLSCP II UNH/GRDC composite monthly runoff data also comprise a standard data set in the second phase of ISIMIP (Inter-Sectoral Impact Model Intercomparison Project; ISIMIP2a)  for calibration and validation, as it assimilates discharge measurements at gauge stations and preserves the spatial specificity of the water balance while being constrained by the station 5 observations. The data can be downloaded from the Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC) (https://daac.ornl.gov/cgi-bin/dsviewer. pl?ds_id=994TS3).

Validation data 10
In this study, the ERA-Interim/Land runoff data are used for validation of the runoff simulation, and the NDII is used for the validation of the WAYS model for root zone water storage simulation. Considering the time period of coverage of both data sets (ERA-Interim/Land: 1979, NDII: 2000 present) and the study period  of this work, the period of 2001-2010 is selected as the validation period. For runoff evaluation, ISIMIP2a simulations are also included, as they use the same climate forcing as our study in the same period. The purpose of inclusion of the ISIMIP2a simulations 20 for comparison can be found in the model evaluation section (see Sect. 4).

ERA-Interim/Land runoff data
ERA-Interim/Land is a global land surface reanalysis data set produced by the European Centre for Medium-Range 25 Weather Forecasts (ECMWF) (Balsamo et al., 2015). The gridded data set of ERA-Interim/Land is selected for model evaluation mainly because the current version of the WAYS model does not include a runoff-routing model on the global scale. Therefore, the results are not comparable with ob-30 served gauge data. Since the ERA-Interim/Land data set is well assessed with a quality check through comparison with ground-based and remote sensing observations, it has been used as reference data for many studies (Xia et al., 2014;Dorigo et al., 2017). ERA-Interim/Land runoff data are one 35 of the variables in the ERA-Interim/Land reanalysis data set and are widely used as benchmark data (Alfieri et al., 2013;Orth and Seneviratne, 2015;Reichle et al., 2017) due to their good agreement with the Global Runoff Data Centre (GRDC) data set and large improvement compared to the 40 ERA-Interim runoff reanalysis data, which were used as one of the reference data sets (Wang-Erlandsson et al., 2014;Balsamo et al., 2015). The ERA-Interim/Land runoff data used in this study were downloaded from the ECMWF website (http://apps.ecmwf.int/datasets/TS4) at a 0.5 • resolution and It should be noted that there are other reanalysis runoff data available, such as ERA-Interim, Global Land Data Assimilation System (GLDAS), and National Centers for Environmental Prediction (NCEP) CE5 . However, they show 50 low robustness based on the available research results. For instance, GLDAS v1.0-CLM was found to overestimate runoff globally, and GLDAS v1.0-Noah generated more surface runoff over the northern middle-high latitudes (Lv et al., 2018). GLDAS v2.0-Noah showed a significant un-55 derestimation trend in exorheic basins (Wang et al., 2016). The snowmelt-runoff peak magnitude simulated by GLDAS v2.1-Noah was found to be excessively high in June and July (Lv et al., 2018). NCEP runoff was found to be too high during the winter and too low during the summer in the Missis-60 sippi River basin (Roads and Betts, 2000). ERA-Interim was found to be less close to the observed streamflows compared with ERA-Interim/Land data (Balsamo et al., 2015).

NDII data
NDII was developed by Hardisky et al. (1983) for satellite 65 imagery analysis based on calculations of the ratios of different values between near-infrared reflectance (NIR) and shortwave infrared reflectance (SWIR). NDII has been found to have a strong correlation with the vegetation water content and canopy water thickness (Serrano et al., 2000;Jack-70 son et al., 2004;Hunt and Yilmaz, 2007;Wilson and Norman, 2018). It can also be used to effectively determine the water stress of plants by taking advantage of the property of shortwave infrared reflectance, which has a negative relationship with leaf water content because of the large ab-75 sorption by leaves Friesen et al., 2012;van Emmerik et al., 2015). Recently, Sriwongsitanon et al. (2016) found a promising linkage between NDII and the root zone water storage. Even though NDII reflects the dynamics of RZWS better in moisture stress periods than 80 in moisture-stress-free periods, the general good correspondence between NDII and RZWS indicates that NDII has potential as a proxy for RZWS. Therefore, in this study, NDII is used as the benchmark to assess the performance of the model in RZWS depiction. 85 NDII is calculated by applying the following equation from Hardisky et al. (1983): where ρ 0.85 is the reflectance at the 0.85 µm wavelength and ρ 1.65 is the reflectance at the 1.65 µm wavelength. NDII is 90 a normalized index that ranges between −1 and 1. A low value of NDII indicates high canopy water stress, which also reflects that there is less water content in the root zone (Sriwongsitanon et al., 2016). In our work, NDII is computed based on the satellite 95 data of the MODIS level 3 surface reflectance product (MOD09A1) (Vermote, 2015), which provides an estimate of the surface spectral reflectance of Terra MODIS bands 1 through 7 corrected for atmospheric conditions such as gases, aerosols, and Rayleigh scattering in the sinusoidal projection. 100 The MOD09A1 product is available on an 8-day temporal scale with a 500 m spatial resolution globally from 24 Febru-8 G. Mao and J. Liu: WAYS

10
A global parameter optimization algorithm (Tolson and Shoemaker, 2007), dynamically dimensioned search (DDS), has been applied in this study for model parameter calibration. DDS is designed for computationally expensive optimization problems and has been used in many studies re-15 lated to distributed hydrological model calibration at global and regional scales (Moore et al., 2010;Kumar et al., 2013;Rakovec et al., 2016;Nijzink et al., 2018;Smith et al., 2018). Since the reference data, i.e., ISLSCP II UNH/GRDC data, are at a monthly temporal scale, the runoff simulated by 20 WAYS in the calibration period (1986)(1987)(1988)(1989)(1990)(1991)(1992)(1993)(1994)(1995) is also averaged to the monthly scale for consistency. The criterion of fit for calibration is the Nash-Sutcliffe efficiency (NSE) coefficient, and the DDS optimization algorithm is run with 2000 iterations for each grid cell for parameter estimation, as sug- 25 gested by the author of DDS (Tolson and Shoemaker, 2007).

Model evaluation
To evaluate model performance, simulated runoff and root zone water storage values are compared to the reference data (see Sect

Runoff evaluation
WAYS-simulated runoff values are compared to the ERA- 35 Interim/Land runoff as well as to the multimodel global runoff simulations from ISIMIP2a. ISIMIP is a community-driven global platform that supports model intercomparison studies at both global and regional scales, while ISIMIP2a focuses on the historical period, and all the models are driven 40 by four state-of-the-art climate forcing factors . Since WAYS uses the same driving data as the ISIMIP2a models and the ISIMIP2a simulations have been widely discussed in many studies Müller Schmied et al., 2016;Gernaat et al., 2017;Zaherpour 45 et al., 2018), we also perform a comparison between WAYS and the ISIMIP2a models to further evaluate our model. To make the climate forcing consistent with the WAYS model, only the GSWP3-driven simulations are used for comparison. The evaluation is performed at the monthly scale, even 50 though the WAYS model simulates the runoff at the daily scale, because only monthly runoff data are available for some of the ISIMIP2a models . Figure 2 shows the time series of runoff from reference data and different models. WAYS_CRU in the leg-55 end indicates the runoff simulated by the WAYS model with root zone storage capacity product S R,CRU−SM , and WAYS_CHIRPS denotes the simulation with RZSC product S R,CHIRPS−CSM . First, it can be seen that the two WAYS simulations with different RZSC products show extremely good 60 correspondence in all selected basins. This result is consistent with the investigation of RZSC data sets in Sect. 3.1.3, where there is a high consistency in the two used products, which even show that RZSC itself naturally exhibits high variability along the latitudes (see Figs. S13 and S14). This 65 result confirms the robustness of the RZSC products that we used in our WAYS model for runoff simulation. The results show good agreement between WAYS simulations and the reference data, i.e., ERA-Interim/Land in the selected basins, while the ISIMIP2a models present stark differences in the 70 simulated runoff. For example, the ISIMIP2a models show a clear trend of overestimation in some of the basins (Mississippi, Ganges, Yangtze, Paraná, and Murray-Darling), where the spread of the runoff ensembles is also large. This result occurs partly because some of the ISIMIP2a models are not 75 calibrated at all (Zaherpour et al., 2018), whereas WAYS is calibrated to a composite monthly runoff data set that assimilates the monitored river discharge (Fekete et al., 2011). In some plots, the red line is not visible and is covered by the blue line due to the small differences between the two WAYS runs. WAYS is calibrated using composite monthly runoff data, while the ISIMIP2a models are not calibrated for the simulation.
In the Mekong River basin, all the models show a high consistency in monthly runoff generation, with a narrow spread of the ensemble. This result may be due to the natural characteristics of the Mekong River, i.e., highly predictable timing and size of the wet-season peak. In addition, 5 precipitation in this region is concentrated in an extremely regular wet-season peak under the impact of tropical monsoons (Adamson et al., 2009). The manner in which WAYS outperforms the other models is also observed in the northernmost (Mississippi) and southernmost (Murray-Darling) 10 catchments of our selected basins, while the ISIMIP2a models show extremely large differences in the runoff simulations with large uncertainties. Good performance is particularly highlighted in the Murray-Darling Basin, where the monthly runoff is extremely low, due mainly to the anthro-15 pogenic climate impacts (Cai and Cowan, 2008;Potter and Chiew, 2011), which is extremely difficult for other models to capture without overestimation (see Fig. 2). A slight overestimation is found in the WAYS model in two African basins, i.e., the Nile and Niger. This result can be explained 20 by the general overestimation of the precipitation value in climate forcing data GWSP3 in these regions (Muller Schmied et al., 2016). In these two regions, the ISIMIP2a simulations also show dramatic overestimations. In contrast, the models show a trend of underestimation in another African basin, 25 the Congo. This result might be caused by both the quality of precipitation and the complexity of natural processes here (Tshimanga and Hughes, 2014). Wang-Erlandsson et al. (2014) reported in their work that Congo precipitation and runoff estimates are particularly uncertain in general. It is 30 worth highlighting that the WAYS model can still capture the monthly variability of runoff in this basin well.
To evaluate the ability of the WAYS model to replicate the distribution, a comparison study on the probability of exceedance is conducted, and the result is shown in Fig. 3. The 35 probability of exceedance reveals the model performance at different magnitudes. With a visual inspection, we can see that WAYS is able to reproduce the runoff distribution well with a good match to the ERA-Interim/Land data, especially in the Congo, Paraná, and Mississippi basins, while the 40 ISIMIP2a model simulated runoff is skewed differently than that of the ERA-Interim/Land runoff distribution. In a few basins (Nile, Ganges, Paraná, and Mississippi), some of the ISIMIP2a models even show a bear-sized shift of distribution relative to the reference data, highlighting that these models 45 struggle to simulate the monthly runoff at all different magnitudes. In the Nile and Niger basins, WAYS also shows a slight offset for both simulations, but it still lies within the uncertainty range. The results also show a large uncertainty in the runoff simulations in the upper tails, which reflects 50 Figure 3. The probability of exceedance for monthly runoff simulated by different models as well as the reference data in 10 selected basins. The solid lines in blue and red indicate WAYS simulations with two different RZSC products. The solid line in black indicates the ERA-Interim/Land data, and dashed lines represent the ISIMIP2a model simulations. In some plots, the red line is not visible and is covered by the blue line due to the small differences between the two WAYS runs. WAYS is calibrated using composite monthly runoff data, while the ISIMIP2a models are not calibrated for the simulation. the larger deviation in the high values produced than in the middle-and low-value simulations for the models. Such biases in reproducing the runoff distribution in the ISIMIP2a models, in turn, deliver large ensemble spreads in the time series. 5 To further assess the performance of the WAYS model, three general metrics for runoff comparison are selected for the evaluation, i.e., the NSE, root mean squared error (RMSE), and percent bias (PBIAS). The estimated scores from the monthly runoff time series for WAYS and the ISIMIP2a models are presented in Fig. 4. For better comparison, the NSE values are converted to the 1-NSE values; thus, numbers closer to 0 indicate better performance. The model performance of WAYS is generally better than that of the ISIMIP2a models, and the estimated scores based on dif-15 ferent criteria are also close to the benchmarks. The 1-NSE comparison (Fig. 4a) indicates that the model performance of WAYS in the selected basins, except for the Niger and Nile, is particularly favorable compared to the ISIMIP2a models. In these basins, both of the WAYS simulations (WAYS_CRU 20 and WAYS_CHIRPS) are ranked in the top five (14 model simulations in total in the comparison). In six basins, both of the WAYS simulations have 1-NSE metric scores less than 0.3, resulting in a value of NSE of greater than 0.7. In the Yangtze, Amazon, and Mekong, the WAYS model is even 25 ranked as the best one, with both of the simulations outper- forming the others. The relatively low performance of WAYS in the Niger and Nile is the result of the model slightly overestimating the middle and high runoff values (Fig. 3). The RMSE comparison (Fig. 4b) delivers information similar to the 1-NSE comparison, in which WAYS shows generally bet-5 ter performance. In the Amazon, all the model simulations show large RMSE due to the large value of monthly runoff in this catchment. By examining the percent bias (Fig. 4c), it is evident that the WAYS model performs well in most of the basins, as the scores of the two WAYS simulations are 10 close to the benchmark. A relatively poor performance of the WAYS model in the percent bias assessment is found in the Murray-Darling Basin, with PBIAS values of approximately 100 %, but they are still within the uncertainty range based on a check with other models. This large value may be caused 15 by the extremely low runoff-induced value of the benchmark, in which a slight difference in the absolute value will cause a large difference in the percentage.
Combining the time series analysis, the most commonly used metric examination in hydrology and probability of ex-20 ceedance assessment, our results show a comprehensive assessment of the model performance in runoff simulation. The strong performance of WAYS with the subtle difference between the runoff simulation and reference data in all the tests indicates the particularly favorable applicability of WAYS 25 in runoff simulations across major basins. Even though relatively poor performance is found for two African basins, the biases are still within the uncertainty range based on investigations of other models. Such a trend of overestimation could also be explained by the overestimation of the pre-30 cipitation value in the forcing data in these regions (Muller Schmied et al., 2016). In addition, it is worth acknowledging that global hydrological models show large differences in runoff simulations across basins. Previous studies emphasized that large ensemble spreads from GHMs could be 35 caused by model structural uncertainties (Haddeland et al., 2011;Gudmundsson et al., 2012). The lack of physical process representations, e.g., transmission loss, in the hydrological models can also explain some of the biases between the simulated runoff and the reference data (Gosling and Arnell, 40 2011).
The performance of WAYS is further evaluated against the gauge observations. Since WAYS does not have a native runoff-routing module at the moment, a third-party runoffrouting tool, CaMa-flood, is applied to route the WAYS-45 simulated runoff (Yamazaki et al., 2011). The evaluation results can be found in the Supplement.

Validation of root zone water storage
Similar to the runoff evaluation, the performance of the simulation of root zone water storage by the WAYS model is also evaluated at 10 major river basins in the period from 2001 to 2010. The spatial pattern as well as the RZWS dynamics 5 at different latitudes and in different months can be found in the Supplement of this paper. Since the NDII is a normalized index and on an 8-day temporal scale, the WAYS-simulated root zone water storage is first averaged over an 8-day temporal scale and then normalized to the range between 0 and 10 1 before the comparison. A few time steps are missing in the NDII data set. To keep the compared data sets consistent, only pairwise RZWS data are selected for the model evaluation.
The 8-day NDII values are compared to the 8-day aver-15 aged root zone water storage values of the WAYS model, and the results are shown in Fig. 5 and Table 3. Figure 5 shows a comparison of the time series of NDII and simulated RZWS in the selected basins, and Table 3 presents the corresponding rank correlation (Spearman's ρ) between NDII and RZWS.

20
The RZWS simulated by GEPCI-hydro is not compared to the other model, as the RZWS variable is not available in other GHMs. For ISIMIP2a, some models produced the root zone soil moisture within a fixed depth of the soil profile in the model structure. However, this is still a different variable 25 compared with the root zone water storage. First, it is clear that NDII shows totally different patterns in different basins. Clear seasonal cycles are shown in the Nile, Mekong, and Niger river basins and so on. Camel-like structures are observed in the Ganges and Congo basins, and rel- 30 atively complex patterns are represented in the Mississippi, Murray-Darling, and Amazon basins. The simulated RZWS shows good agreement in the time series with NDII in most of the selected basins. High values of rank correlation are also detected in these regions. Seven catchments of 10 have 35 a rank correlation value higher than 0.7, especially in the Nile, Niger, Paraná, and Mekong, where the correlation coefficients are even higher than 0.9, indicating the strong model performance of WAYS in these basins for root zone water storage simulation, as the NDII reflects the soil water con-40 tent in the root zone (Sriwongsitanon et al., 2016). The two simulated RZWS time series with different root zone storage capacity products also show identical behavior with subtle differences, except in the Yangtze River basin due to the relatively larger differences in the averaged RZSC of the two 45 products (S R,CRU−SM : 135 mm, S R,CHIRPS−CSM : 163 mm) in this basin. In the Ganges and Congo, the NDII time series show a two-humped structure, which the WAYS model can still capture, even though underestimations are detected in some years. The rank correlation coefficients in these two 50 catchments are higher than 0.8. In the Yangtze, a suddenly high value of NDII is found on 25 August in 2008. By investigating the NDII values a few days before and after this date and the precipitation amount in this period, the unrealis- tically high value might be caused by the quality of satellite 55 MOD09A1 data on that day, as it can be affected by many issues, including clouds, shadow, viewing angle, aerosol loading, and so on (Vermote, 2015). Relatively large differences between NDII and simulated RZWS are also found in some catchments. In the Missis-60 sippi, WAYS shows good performance in large-value simulations, while it struggles to simulate low values, with considerable overestimation of them. Therefore, the rank correlation is also relatively low in this catchment, with values of approximately 0.67. The Mississippi River basin is the 65 northernmost catchment of our selected basins. The NDII here shows a totally different pattern compared to the others, while the WAYS-simulated RZWS can barely show a clear seasonal variation. There could be multiple reasons for this overestimation: our model has a relatively simple 70 snowmelt module (degree-day method), which could consequently introduce biases into the simulation, especially in relatively cold regions. Additionally, the relatively uncertain forcing data could contribute to the mismatches between NDII and RZWS, as the largest uncertainties in precipita-75 tion occur mainly at the higher latitudes (Vinukollu et al., 2011). Some studies also reported that precipitation-induced spurious seasonal and interannual variations also exist in the soil moisture in this basin (Yang et al., 2015). In contrast, WAYS shows a trend of underestimation in the Murray-80 Darling Basin. A possible reason could be that deep-rooted plants are widespread across the Murray-Darling Basin and can tap into groundwater (Runyan and D'Odorico, 2010;Lamontagne et al., 2014); thus, the NDII may not be the correct proxy for moisture stress in this region. A vast amount 85 of groundwater drawing from the saturated zone to the root zone could explain such underestimation of RZWS (Leblanc et al., 2011). Other reasons behind these findings could be the underestimated RZSC in this region as well as the intensive human activities, including dam construction, a water 90 diversion system, and river management, which will impact both the RZSC estimation and RZWS simulation (Reid et al., 2002;Kingsford, 2000). In the Amazon, the model can only capture a few downward troughs but shows difficulty in representing the complete complex dynamics of NDII, resulting in the lowest value of the rank correlation (0.593 and 5 0.552) among all the selected basins. The primary reason for this low performance could be the inability of NDII to represent RZWS in relatively wet regions where water stress for plants is low (Sriwongsitanon et al., 2016). Among our selected basins, the Amazon has the highest averaged annual 10 precipitation amount, with a value of 2201 mm yr −1 in the validation period. In this case, the performance of WAYS in the RZWS simulation of such regions cannot be justified.
Overall, these model validation results over the 10 selected river basins deliver generally good evaluated values that sug-15 gest the capability of the WAYS model for RZWS simulation, especially for interannual variability simulation. However, attention should also be paid to some regions, e.g., the basins at high latitudes in the Northern Hemisphere as well as the regions with plenty of precipitation where moisture stress 20 might be low and NDII may not correctly reflect the RZWS dynamics (Sriwongsitanon et al., 2016).

Evaporation evaluation
RZWS has a close link to the total evaporation, as RZWS represents the available water that plants can use. In this sec- 25 tion, the performance of WAYS in evaporation simulation is evaluated against the FLUXNET2015 data. FLUXNET2015 is a global network of micrometeorological flux measurement sites that measure the exchange of CO 2 , water vapor, and energy between the biosphere and the atmosphere (Pas-30 torello et al., 2017). The tower-measured latent heat flux (LF; W m −2 ) is converted to ET (mm d −1 ) using the proportionality parameter between energy and depth units of ET (Velpuri et al., 2013) as follows: 35 where λ is the latent heat of vaporization (2.45 MJ kg −1 ). In total, 108 stations are selected based on the data availability in the period of 1971-2010. The flux tower latent heat is converted to evaporation before the comparison. The correlation coefficients between simulated evaporation and the 40 FLUXNET2015-derived evaporation are then calculated on the monthly scale. The results are shown in Fig. S15. The background is the annual averaged evaporation from WAYS for the period of 1971-2010. The points indicate the comparison results be-45 tween the flux tower and WAYS simulation. The locations of the points indicate the locations of the flux towers, and the colors indicate the correlation coefficient. WAYS is found to have relatively better performance in the US, Europe, and China than in Africa and Australia. However, a few stations 50 near the boundary of the US and Europe also show weak correlations between the simulations and flux tower data. Figure S16 shows the percentage of data points within different intervals of the correlation coefficient. The calculated correlation coefficient is crowded in the interval of 0.6-0.8, while more than half of the stations (56 %) show a correlation coefficient of more than 0.6. The relatively poor performance of the model in some regions could be partially explained by the following reason. FLUXNET2015 corre-5 sponds to point-based observation data, while WAYS simulates the evaporation on grid cells with a 0.5 • spatial resolution. For the comparison, the model simulation in a certain pixel is selected based on the distance between the flux tower and the center of the pixel. The model simulation actually represents an averaged value for a 0.5 × 0.5 • pixel. This averaging will inherently introduce errors when comparing the simulation to station-based data. Similar results are also found in other studies comparing FLUXNET2015 data to either model simulations or remote-sensing-derived evapora-15 tion (Lorenz et al., 2014;Velpuri et al., 2013).
Furthermore, the average monthly evaporation is compared to the FLUXNET2015 data at each flux tower, and the results are shown in Fig. 6. Good correspondence between the model simulation and flux tower data can be found by 20 visual inspection. The points with a higher correlation coefficient show a better relationship between the model simulation and flux tower observation and are distributed closer to the diagonal. The evaluation results confirm the generally good performance of WAYS in monthly evaporation simula-25 tion. The detailed results on evaporation evaluation against FLUXNET2015 are provided in the Supplement as Excel files. In addition, an evaluation of the evaporation simulation is further conducted against LandFluxEVAL, a merged benchmark synthesis product of evaporation at the global 30 scale (Mueller et al., 2013). The results can be found in the Supplement.

The effect of root zone storage capacity on hydrological simulation
RZSC is a key parameter of the WAYS model. Therefore, 35 it is important to investigate how RZSC could affect the model simulation. In addition to the model simulated with satellite-data-derived RZSC products (S R,CHIRPS−CSM and S R,CRU−SM ), we have additionally conducted WAYS simulations with RZSC derived from uncertain root depth and 40 soil data. The uncertain RZSC (S R,LOOKUP−TABLE ) is derived based on literature values of root depth and soil texture data (Müller Schmied et al., 2014;Wang-Erlandsson et al., 2016). Due to the global coverage of the RZSC data (S R,CRU−SM ), only the simulation with S R,CRU−SM is 45 used for comparison. The spatial distribution of the uncertain RZSC is shown in Fig. S17, and the differences between S R,CRU−SM and S R,LOOKUP−TABLE are shown in Fig. S18. It can be seen that there are large differences between the two RZSC products. The simulation with uncer-50 tain RZSC S R,LOOKUP−TABLE shows overestimation globally except for some regions around low-middle latitudes.
The latitudinal-averaged RZSC further confirms the overestimation of S R,LOOKUP−TABLE at middle-high latitudes (Fig. S19).

55
The large differences between these two RZSC data sets also introduce differences in simulated hydrological elements. Figure S20 shows the impacts of RZSC on the model simulation, including runoff, evaporation, and RZWS. A blue color (decrease of RMSE and increase of the ranked 60 correlation) indicates an improvement of the simulated results by replacing the uncertain RZSC (S R,LOOKUP −TABLE ) with satellite-data-derived RZSC (S R,CRU−SM ), while a red color implies the opposite. For comparison, reference data are used for different variables. For runoff, evaporation, and 65 RZWS, the reference data are ERA-Interim/Land (2001, monthly), LandFluxEVAL (1989, monthly), and NDII (2001, respectively. Generally, the model simulations are improved by using the RZSC S R,CRU−SM . This result emphasizes the importance of an ap-70 propriate representation of RZSC in WAYS. A decline of the model performance is also found in some regions at high latitudes and low latitudes. This result can be partially explained by the inherent uncertainty in the S R,CRU−SM data, as they are derived from other data sets. The RZSC deriva-75 tion method itself as well as the input data can also introduce biases (Wang-Erlandsson et al., 2016). Figure 7 shows the RMSE improvements of simulated monthly evaporation for different land cover types obtained by implementing the satellite-data-derived 80 RZSC (S R,CRU−SM ) instead of the uncertain RZSC (S R,LOOKUP−TABLE ). The analysis reveals that the satellitedata-derived RZSC (S R,CRU−SM ) has great potential to improve the evaporation simulation for all kinds of land cover. The largest improvements are found in broadleaf forests. The 85 improvements in the needleleaf forest, mixed forest, and sa- vanna are relatively low. The findings also resonate with another work that used a simple terrestrial evaporation to atmosphere model (STEAM) for evaporation simulation (Wang-Erlandsson et al., 2016).

5
In this study, a global hydrological model has been developed that aims to simulate the soil water volume stored in the entire root zone, a critical variable for ecohydrology-related studies, by considering the global spatial heterogeneity of the plant rooting system. The primary motivation behind the 10 development of WAYS is to improve the integrality of soil water simulation in hydrological models by acknowledging the key role played by RZWS in many applications, as it connects the climate, hydrology, and Earth surface systems (Savenije and Hrachowitz, 2017). Existing models represent 15 the soil profile with different schemes (Devia et al., 2015). However, they still suffer from the structure limitations of the models in reflecting the soil water dynamics for the entire rooting system (Bierkens, 2015;Sood and Smakhtin, 2015). A persistent weakness in the RZWS simulation in the hydro-20 logical models is the lack of direct observations for model evaluation (Sriwongsitanon et al., 2016).
Benefiting from recent progress made in the field of hydrology and remote sensing, the WAYS model is developed based on an advanced lumped model, FLEX (Fenicia 25 et al., 2011;Gao et al., 2014a), and evaluated with a proxy of RZWS, the remote-sensing-based index NDII (Hardisky et al., 1983). NDII is not new, but strong linkage between NDII and RZWS found by Sriwongsitanon et al. (2016) en-lightened our work. This potential candidate as a proxy of 30 RZWS bridges the gaps in the field, where RZWS cannot be directly observed at large scales. The FLEX model is widely used and has been validated for root zone water dynamics simulation but at the basin scale (Gao et al., 2014b;Nijzink et al., 2016;de Boer-Euser et al., 2016;Sriwongsitanon et al., 35 2016). A variety of modifications and extensions are made based on FLEX that allow WAYS to simulate the hydrological cycles at the global scale with an advanced schema in the root zone system. Another key parameter that allows appropriate RZWS simulation in WAYS is the global RZSC 40 recently produced by Wang-Erlandsson et al. (2016). Before that, it was usually obtained by look-up approaches with inherently large uncertainty. RZSC reveals the spatial heterogeneity of the plant rooting system and has a direct relation to RZWS. Moreover, RZSC is produced under the assump-45 tion that plants do not invest more in their roots than necessary to bridge a dry period. Thus, this assumption is also held by our work, and the root zone reservoir (Sect. 2.3) actually defines the part of the unsaturated zone that determines the dynamics of the runoff regime (Sriwongsitanon et al., 2016;50 Savenije and Hrachowitz, 2017).
The major goal of this study is to test the feasibility of WAYS for reliable RZWS simulation. The newly developed model is first validated for runoff and RZWS simulation in 10 major basins across the world and is then further eval-55 uated against station observations, including flux tower and gauge data. Despite regional differences, generally good performance is found for runoff and evaporation simulations. In addition, the WAYS model also shows a good representation of RZWS, with high values of rank correlation in most of the 60 validated regions. The evaluation results confirm the capacity of WAYS as a useful tool to simulate hydrological elements, particularly RZWS, at the global scale. However, we have to highlight that the model shows lower performance in some regions, e.g., the Amazon, in the RZWS simulation, where the reference NDII data may have shortcomings in reflecting RZWS. In these regions where NDII might not be a correct proxy for RZWS, an additional data set could be helpful for evaluation, e.g., the solar-induced fluorescence (SIF), which reflects photosynthesis and thus has a close relation-10 ship with the available water in the root zone. A combination of vegetation index data, such as EVI and NDVI, could also be an alternative, as it represents different characteristics of plants. However, further investigations need to be performed before this combination can be applied. It is also important 15 to note that the high-latitude regions are not covered by one of the key parameters, i.e., root water storage capacity, used by the WAYS model, and only major river basins at middle and low latitudes are investigated. Thus, the performance of the model in the other regions is not justified. This is one 20 of the limitations of this work, and further investigations are needed.
It should also be noted that during the evaluation of RZWS, the reference NDII data represent a normalized index based on surface reflectance and can reflect only the dynam- 25 ics of RZWS rather than the absolute value (Sriwongsitanon et al., 2016). Therefore, a real value-based evaluation could be much more helpful for the model application. This could be another limitation of the work. However, this fact also emphasizes the importance and necessity of this work from 30 the following two aspects: (1) the remote-sensing-based approach, e.g., NDII, is thus far one of the best available methods for root zone information retrieval (Tobin et al., 2017). However, it is still limited in its ability to reflect the real value, which urges model development, as the model has the 35 ability for absolute value simulation. (2) The remote-sensingbased approach works only for historical analysis, which limits its ability to be used in future impact studies. This issue also motivates model development, as the model can work for both past and future studies after appropriate evaluation. 40 Moreover, the current study does not consider the groundwater access and irrigation mainly due to the lack of global information. The groundwater table information is crucial for capillary rise simulation (Vergnes et al., 2014). Capillary rise simulation without proper water table information could sig- 45 nificantly overestimate the evaporation. Thus, the capillary rise flux is ignored in this study. A similar strategy has also been applied by other works due to the absence of the information on the global water table (Döll et al., 2003;De Graaf et al., 2015;Hanasaki et al., 2018). Observations of irriga-50 tion on the global scale are also not available (Leng et al., 2015). Although there are simulated irrigation data available on the global scale, the inherent uncertainties could be propagated in our model simulation. Therefore, irrigation is also not considered at this time. However, this neglect could po- 55 tentially introduce biases into the model simulation in irrigated areas and deep-rooted plant-distributed regions, as both irrigation and capillary rise are an additional supply of soil water recharge. The biases may cause an underestimation of evaporation, especially in the dry summertime (Vergnes 60 et al., 2014). This underestimation could consequently affect the simulation of RZWS and runoff because of the interlinkage of these three elements (Rockström et al., 1999). It is found that ignoring the capillary rise could reduce soil water content in the root zone (RZWS), while the runoff will 65 also be reduced (Vergnes et al., 2014). However, these shortcomings can be simply overcome once the global data are available.
In summary, the newly developed global hydrological model WAYS improves the integrality of soil water simula-70 tion in hydrological models, as it simulates the water stored in the entire root zone. This added-value feature could benefit many applications related to the root zone processes. For instance, the correct representation of RZWS could help researchers in the investigation of land-vegetation-climate-75 water integration, where RZWS plays a key role. The capability of RZWS simulation could also benefit the field of agriculture, as RZWS represents the plant-available water, which is closely linked to the crop yields. Moreover, this can also advance the hydrological model itself, as the wa-80 ter stored in the root zone controls the partitioning of the precipitation into evaporation, infiltration, and runoff in the model (Liang et al., 1994). The precise simulation of variables in the root zone could benefit the simulation of other elements in the model, thus advancing the model simula-85 tion toward an advanced philosophy, i.e., obtaining the right answers for the right reasons rather than simply obtaining the right answers (Kirchner, 2006). In addition, the WAYS model can be further improved by integrating a more sophisticated evaporation module, e.g., the STEAM model devel-90 oped by Wang-Erlandsson et al. (2014), which separates the evaporation fluxes in a more detailed way. Finally, a runoffgeneration module recently developed by Gao et al. (2019), HSC-MCT, could provide another possibility to improve the WAYS model, as it offers another venue for determining one 95 of the key parameters in WAYS (β) independently without calibration. This calibration-free module could actually benefit any conceptual hydrological model.
Code and data availability. The model code is provided through a GitHub repository: https://github.com/argansos/WAYS TS6 . The 100 meteorological data used in this work are available at the data center of the "Global Soil Wetness Project 3" (http://hydro.iis.u-tokyo. ac.jp/GSWP3/ TS7 ). The land use data are available at the Global Land Cover Facility (http://www.landcover.org TS8 ). The root zone storage capacity is collected from the work of Wang-Erlandsson 105 et al. (2016). The runoff data for model calibration are available at the Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC) (https://daac.ornl.gov/cgi-bin/dsviewer.pl? ds_id=994 TS9 ). The runoff data for model evaluation are available at the European Centre for Medium-Range Weather Forecasts (ECMWF) website (http://apps.ecmwf.int/datasets/ TS10 ). The NDII data and simulated hydrological data are available upon request from the corresponding author. 5 Supplement. The supplement related to this article is available online at: https://doi.org/10.5194/gmd-12-1-2019-supplement.

CE2
Please provide a definition.

CE3
Please provide a definition.

CE4
Please provide a definition.

CE5
Please confirm throughout.
Remarks from the typesetter TS1 The composition of Figs. 1-3 and 6-7 has been adjusted to our standards. This also includes language adjustments to Figs. 1-5 and 7.

TS2
Please provide date of last access.

TS3
Please provide date of last access.

TS4
Please provide date of last access.

TS5
Please provide date of last access.

TS6
Please provide date of last access.

TS7
Please provide a reference list entry including creators, title, and date of last access.

TS8
Please provide a reference list entry including creators, title, and date of last access.

TS9
Please provide a reference list entry including creators, title, and date of last access.

TS10
Please provide a reference list entry including creators, title, and date of last access.

TS11
Please note that there is a discrepancy between funding information provided by you in the acknowledgements and the funding information you indicated during manuscript registration, which we used to create this section. Please doublecheck your acknowledgements to see whether repeated information can be removed from the acknowledgement or changed accordingly. If further funders should be added to this section, please provide the funder names and the grant numbers. Thanks.

TS12
Please provide page range.