FORHYCS v1.0: A spatially distributed model combining hydrology and forest dynamics

We present FORHYCS (FORests and HYdrology under Climate Change in Switzerland), a distributed ecohydrological model to assess the impact of climate change on water resources and forest dynamics. FORHYCS is based on the coupling of the hydrological model PREVAH and the forest landscape model TreeMig. In a coupled simulation, both original models are executed simultaneously and exchange information through shared variables. The simulated canopy structure is summarized by the leaf area index (LAI), which affects local water balance calculations. On the other hand, an annual drought index is 5 obtained from daily simulated potential and actual transpiration. This drought index affects tree growth and mortality, as well as a species-specific tree height limitation. The effective rooting depth is simulated as a function of climate, soil and simulated above-ground vegetation structure. Other interface variables include stomatal resistance and leaf phenology. Case study simulations with the model were performed in the Navizence catchment in the Central Swiss Alps, with a sharp elevational gradient and climatic conditions ranging from dry inneralpine to high alpine. In a first experiment, the model was 10 run for 500 years with different configurations. The results were compared against observations of vegetation properties from national forest inventories, remotely sensed LAI and high-resolution canopy height maps from stereo aerial images. Two new metrics are proposed for a quantitative comparison of observed and simulated canopy structure. In a second experiment, the model was run for 130 years under climate change scenarios using both idealized temperature and precipitation change and meteorological forcing from downscaled GCM-RCM model chains. 15 The first experiment showed that model configuration greatly influences simulated vegetation structure. In particular, simulations where height limitation was dependent on environmental stress showed a much better fit to canopy height observations. Spatial patterns of simulated LAI were more realistic than for uncoupled simulations of the forest landscape model, although some model deficiencies are still evident. Under idealized climate change scenarios, the effect of the coupling varied regionally, with the greatest effects on simulated streamflow (up to 60 mm y−1 difference with respect to a simulation with static vegeta20 tion parameters) seen at the valley bottom and in regions currently above the treeline. This case study shows the importance of coupling hydrology and vegetation dynamics to simulate the impact of climate change on ecosystems. Nevertheless, it also

biomass and canopy cover Fuhrer et al., 2006). The increased frequency of extreme droughts will probably be a more important factor than a change in long-term averages (Fuhrer et al., 2006). Additionally, abandonment of high-mountain pastures, driven by socio-economic processes, is an important factor of land-use change (Price et al., 2016), interacting with climate change to allow the tree line shift upwards (Gehrig-Fasel et al., 2007). These developments and predictions highlight the need for an integrated simulation of hydrology, forest dynamics and land use change in Switzerland. 5

Coupled models of hydrology and forest dynamics
Interactions between hydrology and vegetation dynamics are included in various types of dynamic models, with widely different areas of application, levels of complexity, and spatial and temporal resolutions (Fatichi et al., 2016). One such domain are land surface models (LSMs), which represent land surface processes in climate models (e.g. CLM; Lawrence et al., 2011).
The main role of vegetation in these models is the partitioning of energy between sensible and latent heat fluxes. The latter 10 consist of evaporation and transpiration, thus representing the coupling between vegetation, hydrology and the atmosphere.
Over time, the representation of vegetation and evaporative fluxes in LSMs grew increasingly complex, moving from a simple vegetation-independent bucket model in early applications to a detailed description of vegetation processes, in particular physiological processes such as photosynthesis, carbon assimilation and nutrient cycling (see review by Seneviratne et al., 2010).
For a representation of vegetation dynamics, climate models are sometimes coupled with dynamic global vegetation models 15 (e.g. LPJ; Sitch et al., 2003), which can also be used for offline simulations. Although the primary objective of these models is to simulate vegetation patterns, they usually include a representation of terrestrial water balance, and are able to simulate river discharge (Gerten et al., 2004).
While these models typically operate at the global scale, and for computational efficiency simplify vegetation to a single average individual per plant type and coarse grid cell, some ecosystem models have been developed to depict vegetation 20 structure, such as the dynamic vegetation model LPJ-GUESS (Smith et al., 2001), which is based on individual plants. Like LPJ, this model contains a soil hydrology module that can be used to calculate discharge, but such predictions were greatly improved by adding a coupling to a routing scheme (Tang et al., 2013). A sensitivity analysis by Pappas et al. (2013) pointed out that, despite of its detailed mechanistic representation of transpirational demand and stomatal closure that affects carbon uptake, water stress effects are represented inaccurately in LPJ-GUESS. As the various effects of water shortage on trees are 25 well documented (McDowell et al., 2008), this may be seen as a weakness of LPJ-GUESS, especially since important plant functions, like cavitation or leaf area reduction under drought conditions are not implemented yet (Pappas et al., 2013;Manusch et al., 2014).
Another type of models representing the interface of hydrology and vegetation are forest water balance models. These usually simulate local water balance of forest stands to predict the influence of climatic change or forest management practices 30 on growth conditions for trees. Examples include WAWAHAMO (Zierl, 2001) or BILJOU (Granier et al., 1999). These models may be coupled with dynamic forest models (Lischke and Zierl, 2002;Seely et al., 2015), or run with vegetation parameters assimilated from forest inventories (Zierl, 2001;De Cáceres et al., 2015) or remote sensing (Chakroun et al., 2014). Moreover, most forest dynamics models include a water balance module (Bugmann and Cramer, 1998;Seidl et al., 2012). However, its role is usually restricted to quantifying soil moisture stress, and it is seldom used to predict streamflow.
Various hydrological models operating at catchment scale have been used to evaluate the effect of land use change, and the resulting change in vegetation, on streamflow. Such models include distributed models with a high spatial resolution, allowing for a detailed mapping of static vegetation parameters, e.g. DHSVM (Wigmosta et al., 1994). Other models include a simple 5 vegetation growth module, such as SWAT (Watson et al., 2008) or SWIM (Wattenbach et al., 2005). More complex models include a detailed representation of average plant biomass growth, carbon and nutrient cycling, and hydrological processes, such as RHESSys (Tague and Band, 2004), or Tethys-Chloris (Fatichi et al., 2012).
Furthermore, the impact of vegetation dynamics on streamflow has also been studied with models that were not originally developed to this effect. For example, Sutmöller et al. (2011) used the physically-based, distributed hydrological model 10 WaSiM-ETH (Schulla, 2015) to perform hydrological simulations with vegetation parameters derived from an individual based forest model driven by different forest management scenarios, which influence forest structure and species composition. Also, Schattan et al. (2013) applied the semi-conceptual hydrological model PREVAH (Gurtz et al., 1999) with vegetation parameters obtained from the forest landscape model TreeMig (Lischke et al., 2006). In the original versions of both these hydrological models, vegetation parameters were parameterized as a function of season and land cover class only. This one-way coupling 15 impacted mean annual streamflow by about 10 mm year −1 in two large catchments in Switzerland, whereas the effect on local water balance reached 40 mm year −1 in individual cells. Similarly, a modeling experiment by Köplin et al. (2013) showed that including transient land cover changes, such as forest cover increase or decrease or glacier retreat, could substantially affect water balance predictions. Such experiments highlight the importance of considering the dynamics and spatial variability of vegetation properties in hydrological simulations. 20 So far, most of the models combining hydrology and vegetation processes have a strong biogeochemical focus, whereas successional dynamics and interspecific competition are rarely considered. Couplings between models that explicitly simulate forest dynamics and hydrological models have so far mostly been the object of experimental studies. The results from these experiments show that coupling these processes may substantially alter model results and behavior (Sutmöller et al., 2011;Schattan et al., 2013). This gives an opportunity to increase the confidence in simulated impacts of climate change on forested 25 ecosystems. For reliable, country-level predictions of long-term climate impacts on water resources and forest structure and composition in a mountainous country such as Switzerland, a model should: explicitly simulate the feedbacks between forest properties and hydrology; return estimates of streamflow and (evapo)transpiration as well as tree species distribution and biomass; operate at a spatial resolution fine enough to account for the great variability in climate, topography and land cover;

30
be able to simulate the hydrology of non-vegetated areas such as glaciers, bare rock or built-up take into account the possibility of forest expansion and retreat not be too complex, so that it can be run for large areas and long periods at a reasonable computational cost.
None of the models discussed above fulfil all these criteria. This motivated the development of a spatially distributed model combining hydrology and forest dynamics, presented below.

Aims of this work
In this paper, we present a newly developed distributed ecohydrological model, FORHYCS (FORests and HYdrology under Climate change in Switzerland). This model combines two existing models, the hydrological model PREVAH (Gurtz et al., 5 1999) and the forest landscape model TreeMig (Lischke et al., 2006). FORHYCS is spatially distributed and operates on a grid of regular cells. The model outputs are hydrological quantities, such as catchment-integrated streamflow and maps of runoff, transpiration, evaporation or snow cover, and maps of forest properties, such as biomass, leaf area index (LAI), species distribution and tree density. In FORHYCS, the two source models are run simultaneously, and exchange information through shared variables. FORHYCS may be run in uncoupled mode (i.e. parallel simulations of hydrology and forest dynamics, 10 without any transfer of information between the two source models), with a one-way coupling (transfer of information from the forest landscape model to the hydrological model, or vice-versa), or in fully coupled mode.
Like its parent models, FORHYCS may be described as semi-conceptual. Hydrological processes (evaporation, transpiration, soil moisture dynamics and runoff generation) and forest dynamics (growth, mortality, establishment and migration of tree species) are based on physical and ecological theory as well as empirical approaches. Thus, the degree of complexity of 15 FORHYCS is lower than other coupled ecohydrological models, such as RHESSys (Tague and Band, 2004). Also, unlike these models, the focus of the ecological part is on forest dynamics, similarly to the addition to SWIM proposed by (Wattenbach et al., 2005). However, FORHYCS differs from the latter approach in that growth and mortality are simulated at the level of species and size classes, instead of a single biomass pool per cell. Both TreeMig and PREVAH were designed for applications at an intermediate spatial resolution, with a cell size between 100 m and 1 km. 20 The goal of this paper is to explore the interplay of hydrology and forest dynamics in a coupled model. The questions to be answered are (1) How does the coupling impact the behavior and the performance of both TreeMig and PREVAH, compared to uncoupled simulations? (2) Which aspects of the forest-hydrology coupling are of greatest importance for simulation results?
(3) What are the implications of model coupling for simulations under climate change?
The model is tested in five subcatchments of the Navizence catchment (27 to 87 km 2 ), located in the Swiss Central Alps. 25 To answer the first question, a full forest succession is modeled for the period 1500-2015, starting from bare soil. The forest model is run in uncoupled mode to serve as a reference simulation. For the coupled runs, various configurations are tested, with different aspects of the coupling between hydrology and forest dynamics switched on and off. The outputs are then compared to forest inventory data, as well as gridded datasets of leaf area index (LAI) and canopy height. To assess the effect of the coupling on hydrological predictions, simulated streamflow from coupled and uncoupled model runs is compared against a 30 time series of daily measurements. Furthermore, to assess the behavior of the coupled and uncoupled model under climate change, the model is run for a century under artificial climate change scenarios. In these future model runs, the significance of two further aspects of the forest-hydrology coupling are examined: the effect of elevated CO 2 on stomatal resistance, as well as the potential forest expansion into high-elevation meadows as a result of climate and land-use change.
2 Methods and Data

FORHYCS model description
The two original models operate at a different temporal resolution: while TreeMig simulates forest dynamics with an annual time step, PREVAH calculates daily values, with an internal time step of one hour. Both original models are run simultaneously over the same domain and exchange information through shared variables. Figure 1 a) shows the flow of a simulation: hydrology 5 is simulated on a daily basis with vegetation properties given by the forest model for the previous year. This is based on the assumption that effects of current-year water balance on the forest structure and composition are negligible. An annual drought index, influencing tree growth and mortality (Sect. 2.1.3), is calculated in each cell from the transpiration simulated in the hydrological model over the whole simulation year (Sect. 2.1.4). Based on the number of trees per species and height class in each cell, annual maximal values of Leaf Area Index (LAI) and fractional canopy cover (FCC) are calculated. These values are 10 converted to daily values using a temperature-dependent phenology module (Sect 2.1.2). The root-zone water storage capacity (SFC) is updated annually as a function of long-term climate (Sect. 2.1.5). Furthermore, the model includes the effects of snow-induced mortality for some species using an additional mortality function based on snow cover duration (Sect. 2.1.6).

Source models
The semi-conceptual hydrological model PREVAH (Gurtz et al., 1999) in its fully distributed form (Schattan et al., 2013;15 Speich et al., 2015) solves the water balance of each grid cell by calculating evapotranspiration, soil water balance and runoff generation at a sub-daily time step. The core of PREVAH is based on the structure of the widely used model HBV (Bergström, 1992), combined with a conceptual runoff routing scheme (Gurtz et al., 1999). Successive model developments enabled or improved the treatment of interception (Menzel, 1996), snowpack dynamics , glacier runoff (Klok et al., 2001) and groundwater runoff . PREVAH was used, among other applications, to estimate the impact of 20 climate change on discharge (Zappa and Kan, 2007). Climate impact studies were also conducted using the distributed outputs of PREVAH (Speich et al., 2015).
The spatially explicit forest landscape model TreeMig (Lischke et al., 2006) simulates forest establishment, growth and mortality, as well as seed dispersal. Originally based on a gap model (Lischke et al., 1998), TreeMig calculates the number of trees per species and height in each cell, and operates at an annual time step. Inter-and intra-specific competition is rep- 25 resented through light distribution within the canopy, which depends on the distribution of trees of different height within a stand, and thus on leaf area (Lischke et al., 1998). TreeMig was used to predict climate impact on tree species distribution in Switzerland , as well as to simulate forest response to land abandonment (Rickebusch et al., 2007) and the feedbacks between forests and avalanches (Zurbriggen et al., 2014). Abiotic drivers of forest dynamics are represented by three bioclimatic indices: mean temperature of the coldest month, degree-day sum and drought (see Sect. 2.1.3 for the latter).

Canopy structure and leaf phenology
Leaf Area Index (LAI) is calculated in TreeMig to account for mutual shading and competition for light (Lischke et al., 1998).
In FORHYCS, LAI is passed to the hydrological model, replacing the land cover-specific parameterization in PREVAH. The allometric equations of Bugmann (1994) relate leaf area to diameter at breast height (D [cm]). As D is allometrically linked to tree height, leaf area is calculated for each group of trees of the same species and height class, then summed to obtain the 5 value for the whole stand:  [-] are species-specific allometric parameters (Bugmann, 1994), and p d,sp is a dimensionless reduction term accounting for seasonal variations in leaf area, ranging from 10 0 to 1 (see below and in the supplement). The reference area in TreeMig is the size of a forest plot, set to 833 m 2 (1/12 ha), which is the reference area in many gap models (e.g. Bugmann, 1994), so that: where k c is a factor to convert from true to projected leaf area, set to 0.5 for broadleaves and 0.4 for conifers following Hammel and Kennel (2001). Fractional canopy cover f c is also calculated from the number of trees per species and height 15 class, using the equation of Zurbriggen et al. (2014): where CA sp,hc is the total crown area for trees of each species and height class: where k a1,sp and k a2,sp are species-specific allometric parameters, and h hc the (upper) height of height class hc. Minimum 20 values for LAI and f c are set to 0.2 and 0.1, respectively, to account for a minimal cover by grass and herbs, as well as bare stems. Annual maximum LAI may be reduced as a function of drought stress of the previous years, as described below in Section 2.1.3. It is worth noting that leaf area and crown area are calculated independently from each other, both using empirical relationships with tree size.
To obtain daily values for LAI and f c , a leaf phenology module has been implemented. The variable p d,sp reflects the 25 phenological status of species sp in a given cell, and varies between zero (no leaves) and one (full foliage). This module also defines the start and end of the growing season in each cell, which is required to calculate the climatic indices for the rooting depth module (Sect. 2.1.5). In each cell, the growing season lasts as long as p d,sp is greater than 0.5 for the dominant species.
Daily values of p d,sp are simulated in spring with the model of Murray et al. (1989), which depends on the number of chill days in winter and of accumulated growing degree days in summer. When p d,sp reaches one, the foliage is assumed to be fully developed. The onset of leaf senescence in autumn is simulated using the model of Delpierre et al. (2009), which depends 5 on temperature and photoperiod. For broadleaves, the end of the growing season is set 14 days after the onset of senescence, and p d,sp is linearly reduced from one to zero during this period. Although the leaf area is not varied throughout the year for evergreen conifers, p d,sp is still simulated to define the start and end of the growing season. For these species (as well as for the deciduous conifer Larix decidua), the development of p d,sp following the onset of senescence is calculated using the formulation of Scherstjanoi et al. (2014). The parameters for the empirical models of Murray et al. (1989) and Delpierre  (Defila and Clot, 2001). The species-specific parameters and a description of the calibration procedure are given in the supplement.

Drought and its effects on forest growth, mortality and structure
The drought index used in TreeMig is based on the ratio of annual to potential evapotranspiration, as calculated with a module modified from Thornthwaite and Mather (1957). This scheme requires monthly precipitation sum and temperature average 15 for each cell, as well as the plant-available water storage capacity. Evaporative demand is calculated using an empirical, temperature-dependent approach, and monthly soil water balance is simulated based on water supply and demand, after accounting for interception (see Bugmann and Cramer, 1998, for a full description of this scheme). This approach has a number of drawbacks. First, it does not consider the effect of variations in vegetation properties on evaporative supply and demand, thus neglecting feedbacks between vegetation density, transpiration and drought (see e.g. Kergoat, 1998). Also, the  Mather routine does not account for snow-related processes, which can lead to large errors in the estimation of evapotranspiration, diminishing the representativeness of the index for growth conditions (Anderegg et al., 2013). Furthermore, empirical evapotranspiration formulations such as the Thornthwaite-Mather formula rely to a large extent on calibrated values. These may not be transferrable to climatic conditions that differ from the calibration period (Bartholomeus et al., 2015), limiting the ability of the model to make predictions under a changing climate. For these reasons, this abiotic drought index was replaced 25 by a relative transpiration index (Speich et al., 2018a). This index is similar to the evapotranspiration deficit index presented above, but is based on transpiration rather than evapotranspiration: where de and ds are the first and last day of the period for which the drought stress is calculated, and E T,pot,d and E T,act,d are modeled daily canopy transpiration sums [mm/day]. Here, DI is calculated for the entire year, so that de = 1 and ds = 30 365. Furthermore, to account for delayed effects of drought on tree physiology (Hammel and Kennel, 2001), the average of the last three years is used here. Transpiration is reduced from potential rate through the effect of low soil moisture or high atmospheric vapor pressure deficit (VPD) on stomatal resistance (Eq. 14 in Speich et al. (2018a)). The rationale behind this index is based on the fact that stomatal closure is one of the first responses of a plant to water deficit. Therefore, the time during which stomatal resistance is increased due to drought is the time during which adverse physiological effects of water shortage (e.g. cavitation, reduced carbon uptake) are likely to occur, and DI serves as a proxy for all these processes. The effect of VPD was included to account for the effects of high evaporative demand on plant-internal hydraulics (Zierl, 2001). The drought 5 stress function f DS determines the relative drought-induced limitation of annual growth (Bugmann, 1994): where k D T is a species-specific drought tolerance parameter, indicating the value of DI at which growth is completely suppressed. This growth reduction function can take values between zero (complete growth suppression) and one (unstressed conditions). Annual growth of the trees of the same species and height class is the product of a species-specific maximal 10 growth, an environmental reduction function f env , and a further reduction term accounting for shading. This reduction function is the geometric mean of f DS and two other stress functions, representing the effects of temperature and nitrogen availability (Bugmann, 1994). While the effects of temperature are taken into account in this study, nitrogen availability is kept spatially and temporally constant. These two environment-dependent stress functions are further described in Section 1.2 of the Supplement (for a description of the effects of light competition and shading, we refer to Lischke et al. (2006)). The same reduction 15 function is used to simulate mortality in addition to background mortality, and applies if it is more severe than mortality caused by low productivity. Lischke and Zierl (2002) parameterized k DT for the 30 tree species represented in TreeMig by overlaying modelled DI with inventory-derived maps of species distribution. The values range between 0.27 and 0.5. However, this parameterization did not lead to satisfactory simulations of species composition in the case study of this paper. Therefore, species-specific k DT was defined based on a combination of the rankings by Lischke and Zierl (2002) and Niinemets and 20 Valladares (2006). Table S7 in the supplement lists the k DT values used in this study.
FORHYCS accounts for two additional effects of drought stress: a limitation of maximum height and a reduction of annual maximal LAI. The former is parameterized following Rasche et al. (2012), i.e. species-specific maximum tree height may be reduced as a function of the bioclimatic indices DI and DDEGS. The parameter k redmax , which is also species-specific, indicates the fraction of maximum height that can be attained by trees if one of the environmental vitality functions is at its 25 minimum. The more severe of the two reductions (drought or degree-days) is applied. Unlike in the formulation of Rasche et al. (2012), where the reduction is a linear function of the bioclimatic indices, the impact functions (Eq. 6 for drought and S8 for degree-day sum) are used here.
The LAI reduction function follows the formulation of Landsberg and Waring (1997), where the fraction of carbon allocated to roots increases under stress, whereas allocation to foliage and stem decreases. Since allocation is not explicitly simulated in 30 FORHYCS, the following formulation is purely phenomenological. For all size classes of a given species, leaf area is scaled by the ratio of the foliage allocation coefficient under current η l and unstressed conditions η l,u . Eq. 1 is thus modified as follows: The allocation coefficients for foliage are calculated as: where η r and η r,u are the allocation coefficients to roots, and η s and η s,u the allocation coefficients to the stem, under current and unstressed conditions, respectively. Following Landsberg and Waring (1997), η r,u is set to 0.229, and η r increases with increasing stress by the following relation: where f env is the geometrical mean of the drought and low temperature stress functions (Eqs. 6 and S8). The carbon allocated 10 to the stem is related to η r as follows: where p l,s is the ratio of the growth rates of leaves and stems, in terms of their change in relation to diameter at breast height D. In FORHYCS, p l,s is calculated using the allometric equations used to calculate leaf and stem biomass: 15 where k l,1 and k l,2 are allometric parameters for leaf biomass, and k s,1 and k s,2 are allometric parameters for stem biomass (Bugmann, 1994). It is important to stress that FORHYCS does not explicitly simulate carbon assimilation. Hence, Eqs. 9 and 10 are used only to determine a reduction factor for leaf area (Eq. 8) and have no direct influence on simulated tree growth, stem biomass and rooting depth.

Partitioning of transpiration and soil evaporation 20
The implementation of the new drought index (see Sect. 2.1.3 above) required some changes to the evapotranspiration routine in the hydrological model. While the relative transpiration index is based on estimates of actual and potential transpiration, PREVAH does not explicitly differentiate between transpiration and soil evaporation. Therefore, a new local water balance routine was implemented, based on the stand-alone model FORHYTM (Speich et al., 2018a). This module combines the soil water balance formulation of the HBV model (Bergström, 1992), which is also implemented in PREVAH, with the transpiration and evaporation scheme of Guan and Wilson (2009) and a Jarvis-type (Jarvis, 1976) parameterization of canopy resistance. A full description is given in (Speich et al., 2018a).
The parameterization of canopy resistance differs from the original formulation in two ways. First, the effect of atmospheric vapor pressure deficit (VPD) on stomatal conductance is represented with a negative exponential function instead of a linear function. Second, an additional canopy resistance modifier (f 5 ) was implemented, to account for the effect of atmospheric . This function is based on the results of Medlyn et al. (2001): where j c represents the fractional change in conductance in response to an increase in C a from 350 to 700 µmol mol −1 , and was set to 0.1 for coniferous forests, 0.25 for broadleaf forests and 0.18 for mixed forests (the forest type in each cell is determined based on the relative share of above-ground biomass belonging to conifers and broadleaves). These values were 10 set based on the results reported by Medlyn et al. (2001): coniferous species had a value of j c between 0 and 0.2, whereas broadleaves had values up to 0.4. Therefore, for conifers, a value of 0.1 was selected. For broadleaves, as there seemed to be some acclimation for trees growing in elevated CO 2 , a more conservative (than 0.4) value of 0.25 was chosen. The value for mixed forests corresponds to the arithmetic mean of the two. This affects both potential and actual transpiration, so that with all other factors kept constant, increases in C a will reduce the level of drought stress. The rationale for implementing this new 15 water balance scheme is to account for the effect of variations in vegetation properties (e.g. LAI) on physiological drought in forests. As FORHYCS includes the possibility of changing land cover classes in a cell, some non-forested cells may become forested over the course of a simulation. As vegetation parameters (such as LAI and effective rooting depth) are prescribed as a function of land cover for non-forested cells, this shift inevitably introduces an artificial discontinuity in the simulation. To reduce this discontinuity, the new water balance scheme is also used for potentially forested land cover types. On the other 20 hand, for land cover types that cannot become forested, the original water balance scheme of PREVAH (Gurtz et al., 1999) is applied.

Rooting zone storage capacity
The rooting zone water holding capacity, SFC, is calculated as the product of effective root depth Z e and soil water holding capacity κ (Federer et al., 2003). While κ is assumed to remain constant, Z e is assumed to vary as a function of vegetation 25 characteristics and climate. The approach used to parameterize Z e is the carbon cost-benefit approach of Guswa (2008Guswa ( , 2010.
This approach rests on the assumption that plants dimension their rooting systems in a way that optimizes their carbon budget.
The optimal rooting depth is the depth at which the marginal carbon costs of deeper roots (linked to root respiration and construction) starts to outweigh the marginal benefits (i.e. additional carbon uptake due to greater availability of water for transpiration). The implementation of this model in FORHYCS follows the procedure described by Speich et al. (2018b).

30
Effective rooting depth, expressed as an average over the whole cell, is calculated for both overstory (trees) and understory (shrubs and non-woody plants). The storage volume SFC for a given cell is defined as the sum of these two area-averaged rooting depths, multiplied with soil water holding capacity κ. A full description of this implementation is given in Speich et al. (2018b). The underlying equation is: where γ r is root respiration rate [mg C g −1 roots day −1 ], D r root length density [cm roots cm −3 soil], L r specific root of a year] and T mean daily transpiration[mm day −1 ] during the growing season. The left hand side represents the marginal cost of deeper roots, and the right hand side the marginal benefits, and solving for Z e gives the optimal rooting depth. Any equation can be used to relate d T to dZ e . In this implementation, the probabilistic models of Milly (1993) and Porporato et al. (2004) are used for the understory and overstory, respectively. These two models reflect the differing water uptake strategies of grasses and trees (Guswa, 2010). Both models estimate transpiration based on soil water holding capacity κ and long-term 10 averages of climatic indices. Evaporative demand is represented by potential transpiration, and rainfall is represented as a marked Poisson process characterized by the frequency (λ [events day −1 ]) and mean intensity (α [mm event −1 ]) of events. In FORHYCS, these variables are calculated as rolling means with a window of 30 years, including only the growing seasons.
Potential transpiration for the understory and overstory are taken from the calculations of the local water balance module (Sect. 2.1.4), and the rainfall characteristics are taken from modeled effective precipitation (i.e. after accounting for interception). 15 The start and end of the growing season are determined based on the phenology module (Sect. 2.1.2). In addition, mean daily air temperature is calculated over the growing season to adjust respiration rate. The plant-specific parameters in Eq. 13 are summarized in the variable P P o , defined as follows: where γ r,20 is the root respiration rate at 20°C. The actual root respiration rate is dependent on annually averaged temperature 20 via a Q 10 function. For further details, please refer to Speich et al. (2018b). A higher value of P P o indicates a greater difficulty for the plant to develop additional roots. In FORHYCS, P P o was set to 1.263 × 10 −4 for conifers and 1.01 × 10 −4 for broadleaved species. At cell level, P P o was averaged based on the relative share of aboveground biomass belonging to conifers and broadleaves. For the understory, the corresponding parameter P P u is set to 1.512 × 10 −4 .
2.1.6 Snow-cover induced seedling mortality 25 For seedlings of the high-mountain species Larix decidua, Pinus cembra and P. montana, the model also includes the effect of snow-induced fungal infections via the variable F DSA (final day of snow ablation), as described by Zurbriggen et al. (2014).
An additional mortality term is calculated: where a, b and c are empirical parameters fitted by Zurbriggen et al. (2014) for Larix decidua and for the two aforementioned Pinus species. If µ s is greater than background mortality or the mortality term integrating light, temperature and water stress, it is applied instead for the seedlings of these species. This aspect of the model was implemented to examine the feedback between forest dynamics and avalanches on a small spatial scale (Zurbriggen et al., 2014), but was never tested on landscape scale. Here, FDSA is defined as the last day of the year with more than 5 mm snow water equivalent. 5 2.1.7 Uncoupled mode and one-way coupling The methods have so far described the model FORHYCS in its fully coupled version. It is also possible to run FORHYCS in uncoupled mode (without any information transfer between the hydrological and forest models), or with a one-way coupling (information transfer from the forest model to the hydrological model only rameter values for different land cover types are given in Table S6. Non-vegetated land cover classes (e.g. built-up, bare rocks) use the same standard soil parameters as in original PREVAH (Gurtz et al., 1999). Another difference between FORHYCS and PREVAH is the parameterization of canopy resistance. Whereas PREVAH gives a minimum canopy resistance for each land cover class (i.e. normalized by leaf area index), the new water balance module requires a minimum stomatal resistance.
Following Guan and Wilson (2009), minimum stomatal resistance was set to 180 s m −1 for forests, 130 s m −1 for meadows 25 and grasslands, and 210 m −1 for shrubs.
One-way coupling is similar to the modeling experiment of Schattan et al. (2013): vegetation variables from TreeMig are passed to PREVAH, but there is no feedback from the hydrological to the forest model. This configuration uses the abiotic drought index calculated with FORCLIM-E (Bugmann and Cramer, 1998). In this study, TreeMig was run with two soil datasets (BEK and RA2015; see below). In all cases, the hydrological part of the model uses the RA2015 dataset. In this study, 30 rooting zone storage capacity SFC of the hydrological model was kept constant in one-way coupled mode, assuming a rooting depth of 1 m. Enabling climate-dependent adaptation of SFC in this mode would impact simulation results for the hydrology part, but not for the forest.
2.2 The Navizence case study

Catchment description
The Navizence catchment is located in the central Swiss Alps and covers an area of 255 km 2 . To enable the future migration of tree species that are currently not represented in the catchment, the modeling area extends beyond the catchment to form the rectangular area shown on Fig. 2 (1079 km 2 ). The catchment is characterized by a sharp elevational gradient, with elevations 5 ranging from 522 to 4505 m asl. Like in its neighboring valleys, this gradient is reflected in the hydro-climatic conditions.
Due to the shielding effect of mountain ranges, the Rhône valley, where the catchment outlet is located, is the driest region of Switzerland, with mean annual precipitation (MAP;1981 at Sion totaling 603 mm (MeteoSwiss, 2014). However, the valley presents a strong altitudinal precipitation gradient, with MAP exceeding 2500 mm at 3000 m asl, most of it falling as snow (Reynard et al., 2014). many forests were clear-cut between the Middle Ages and the nineteenth century, mostly for fuel (Burga, 1988). In the first half of the twentieth century, the quantitatively most important anthropogenic disturbance factors were litter collecting and wood pasture by goats (Gimmi et al., 2008). Nowadays, these practices have been largely abandoned, and timber harvesting plays a limited role in the region. As a result of past and current anthropogenic factors, the main deviations from potential natural forest composition are (1) silvicultural practices favoring certain species, such as Pinus sylvestris and Larix decidua (2) Effects 20 of litter removal and grazing and (3) a replacement of L. decidua and Pinus cembra by mountain pastures (Büntgen et al., 2006) and dwarf shrubs (Burga, 1988) near the treeline.
Currently, three major hydropower plants are operational in the valley, with a total installed capacity of 164 MW and a mean net annual production of 570 GWh. The main reservoir is the artificial lake Lac de Moiry, located in a lateral valley, with a storage capacity of 77 mio m 3 . A system of pipelines has been built to divert water from the Navizence, as well as from a 25 neighboring catchment, into the lake.

Input data
Three kinds of spatial data are needed to run the model: daily meteorological data, time-invariant physiographic data, and spatially distributed model parameters for PREVAH. These parameters represent environmental factors not related to vegetation, such as snow, glacier and runoff generation processes. Further information on the parameterization of PREVAH is given in Physiographic data consists of information on soil, topography and land cover. Soil is represented in terms of water holding capacity κ [mm water depth mm −1 soil depth] and soil depth. Two different datasets are used for soil properties. In previous applications in Switzerland, both PREVAH and TreeMig used grids of κ and soil depth from a country-wide agricultural suitability map (BfR, 1980; hereafter referred to as "BEK"). The resulting rooting zone storage capacities (SFC [mm]) in the forested cells of the study region range between 3.75 and 110 mm. As this dataset was not specifically developed for use in 5 forests, and some values are implausibly low, Remund and Augustin (2015) generated a new country-wide dataset (RA2015) of rooting zone storage capacity, on the basis 1234 forest soil profiles throughout Switzerland, combined with a lithological map. This dataset gives the volume of water that can be stored in the soil for a depth of up to 1 m, with lower values in cells where soil is assumed to be shallower. SFC in the new dataset ranges from 71 to 223 mm in forested cells of the study region. Figure S1 shows the rooting zone storage capacity in forested cells of the study region for both soil parameterizations. For 10 coupled simulations, only the RA2015 parameterization is used. To facilitate comparison with results from previous studies, the parent models PREVAH and TreeMig are also run with the BEK parameterization.

Comparison data and metrics of agreement
This section describes the data against which model outputs were compared, including three datasets of vegetation properties and one dataset of streamflow measurements. These datasets are used to plausibilize model outputs and serve as a basis for 15 the choice of model configuration. Daily streamflow was obtained from the operator of the power plants. This data includes an acconuting of the amount of water diverted through the different pipelines. From this, time series of natural streamflow were reconstructed, which were used as observations.
The simulated stem numbers and above-ground biomass were compared against data from the first Swiss National Forest Inventory (NFI; Bachofen et al., 1988). As the sampling plots of the NFI are distributed on a regular grid, each plot is a 20 randomly selected from all forest plots in that region, and may not be considered representative for a larger area. It is therefore not sensible to compare simulated and observed biomass at the scale of single inventory plots. Instead, the 245 NFI plots in the study area were aggregated to seven classes based on aspect and elevation, with four elevation bands for North-facing plots and three for South-facing plots. This way, each class has a sample size of at least 30 plots, which ensures that the averages are representative. For comparison with inventory data, simulated biomass was also averaged over the same strata. 25 Simulated LAI was compared to the remotely sensed LAI dataset provided by Copernicus at 300 m resolution (Copernicus Service Information, 2017). This dataset uses measurements of the satellite Proba-V and its temporal coverage starts in January 2014. The LAI 300m rasters were resampled to match the resolution and extent of the model input and output, then stratified as described above. For each cell of the original LAI 300m grids, the maximum value of the 10-day periods contained between May and July of the years 2014 through 2016 was used as an estimate of maximum LAI, independent of intra-annual 30 fluctuations. As forest properties are also shaped by local processes not represented in the model (e.g. disturbance, forest management), a direct cell-by-cell correspondence of simulated and observed values is not expected. Furthermore, cells of the remote sensing dataset are spatially heterogeneous and may include non-forested parts. However, it was still assumed that over larger domains, a good correspondence between simulated and observed LAI would be indicative of good model performance.
Therefore, the simulations and observations were also stratified by elevation and aspect. As the number of cells is relatively large, smaller elevation bands were chosen than for inventory data. The cells are divided between North and South facing aspects, then binned into nine elevation bands. These elevation bands have a fixed width of 200 m, except the lowest (< 700 m asl) and highest (> 2100 m asl). Each zone contains between 100 and 714 forested cells. Model outputs and observations are only evaluated over the areas classified as forests or shrubland in the model input.

5
The recently developed Switzerland-wide vegetation height dataset at 1 m resolution of Ginzler and Hobi (2016), derived from stereo aerial images, was compared against simulated canopy structure. To compare the level of agreement between observed and simulated canopy structure, two novel metrics were introduced. The rationale behind these metrics is illustrated in Figure 3. Both measures are based on the discrete height classes used in TreeMig, and use the cumulative fractional cover of each height class, starting from the highest class. In the illustrative example of Fig. 3 a), the cumulative fractional cover 10 of the height class "5-10 m" corresponds to the visible crown area of the height classes "5-10 m" and "10-15 m", divided by the potentially vegetated area (i.e. excluding the area occupied by a road, shown in grey). In each cell of the model grid, the fractional cover of each height class is calculated from simulated species-size distribution using the procedure described in Eqs. 3 and 4. For each height class i, the cumulative fractional cover f c,i is defined as follows: 15 The procedure used to calculate fractional cover is based on the assumption that trees are randomly distributed in space and accounts for overlap between crowns. For example, applying Eq. 16 to the upper 3 height classes will return the fractional cover for the trees belonging to these classes, accounting for overlap between them. This assumes that shading of lower parts of their crowns by smaller trees can be neglected.  Fig. 3 e) and f). The cumulative fractional cover of each class (starting from the top) is plotted for observations and simulations. The better the agreement, the closer the curves are to each other. Therefore, the second measure 30 of agreement, termed 1-ABC (where ABC stands for "Area Between the Curves"), is defined as the fraction of the plot area not contained between the curves. The plot in Fig. 3 e) applies this to the cell shown in Fig. 3 c), and represents a case with a poor agreement between simulations and observations. Indeed, this area was devastated by a wildfire, and is thus currently very sparsely forested. As this fire is not represented in the model, the simulations indicate a fully developed forest. Even in this extreme case, the ABC does not exceed 40% of the plot area. Therefore, a 1-ABC score of 0.6 can be considered a poor fit. On the other hand, the sample cell in Fig. 3 d) and f) shows a good agreement between observations and simulations, with 1-ABC exceeding 0.99. These two measures of agreement can be used to evaluate the performance of a model by examining their distribution over the whole simulation domain. A better performing model will have a higher proportion of cells with a dH95 close to zero and a 1-ABC close to one. Furthermore, the spatial distribution of dH95 and 1-ABC may give insight into 5 the factors that contribute to agreement or disagreement between simulations and observations.

Simulation experiments
To evaluate the behavior of the coupled model FORHYCS and the importance of the different forest-hydrology couplings implemented, two series of simulation experiments have been conducted. An overview of the different simulation runs is given in Table 1. In a first series of experiments, the simulations start with no forest, and a full succession is modeled. The names 10 for these simulations start with "S". The second part of the name indicates whether these experiments were conducted in fully coupled mode ("F") or in one-way coupled mode ("T"). The next part indicates which couplings are switched on and off, as per Table 1. Finally, for the one-way coupled runs, the fourth part of the name indicates the source of the soil water holding capacity ("BEK" or "RA15", see next paragraph). In a second series of experiments, starting with "C", the model was run with several climate change scenarios (see below). In this set of experiments, the model was run in uncoupled ("U"), one-way 15 coupled ("T") and fully coupled mode ("F"). In addition, a simulation with stand-alone PREVAH was run ("P"). For the fully coupled simulations, two additional experiments were run (see below), testing the effect of CO 2 ("_NCS") and land cover change ("_LC").
In consists of years bootstrapped from the period 1981 to 2000. In two cases (S_T_BEK and S_T_RA15), the model is run in uncoupled mode, and only the forest output is evaluated. This is equivalent to a standard TreeMig run. The difference between the two runs is the parameterization of the rooting zone storage capacity for the (abiotic) drought stress module (FORCLIM-E; Bugmann and Cramer, 1998). In the first case, the storage capacity in each cell is given by the soil depth and water holding capacity given in the Swiss soil map for agricultural suitability (referred to as "BEK"; BfR, 1980), as in previous TreeMig 25 applications in Switzerland (e.g. Bugmann et al., 2014). In the second case, the parameterization of Remund and Augustin (referred to as "RA15"; 2015) is used. As noted in Sect. 2.2.2, the soil water holding capacity is much larger in the RA2015 dataset for most cells. As a result, the (abiotic) drought index also shows great differences between the two model runs. Figure   S1 c) and d) shows the difference in mean annual drought index , and maximum annual drought index between the   seedling mortality). Based on pilot study results, the configuration S_F_noLAred (all couplings switched on, except leaf area reduction) was selected as the best configuration, and the other configurations in Table 1 differ from S_F_noLAred by only one process switched on or off.
The configuration S_F_noLAred is also used for the second set of model runs, which starts in the year 1971 and ends in 2100. In the "delta" set of model runs, the sensitivity to a ramp-shaped climate change is evaluated (see Fig. 4). In the period 5 1971-2015, observed forcing is used. From 2016 to 2100, years are randomly selected from the period 1981-2015 (excluding the abnormally dry and hot year 2003). Furthermore, from 2016 on, daily temperature is incremented by a given amount of degrees dT, and daily precipitation is scaled by a given factor dP. The values of these factors are given in Table 1. To emulate a gradual progression of climate change, these factors are scaled linearly between zero and their full value between 2016 and 2050. The runs C_F_NCS test the impact of the CO 2 effect on stomatal closing, implemented through Eq. 12. In these runs, the 10 CO 2 response function is always set to one, i.e. stomatal response to high CO 2 is switched off, whereas in all other runs, this effect is active. In all the runs presented so far, forest growth is restricted to the currently forested cells. In the runs C_F_LC, forest is allowed to grow in all potentially forested land cover classes. Thus, the potential ecohydrological consequences of land abandonment and rising treelines are examined. In addition to the runs with delta change, three runs were performed with meteorological forcing from downscaled regional climate simulations, generated in the CH2018 project (National Centre for  used in Speich et al., 2015) for reference. There is little difference between the scores of these three runs, and no model version consistently outperforms the others. The last four columns of Table 2    3.2 Forest spin-up with different model configurations Figure 6 shows the aboveground biomass simulated with FORHYCS using the configuration S_F_noLAred (analogous figures for the other configurations are provided in the Supplement; Fig. S9 to S16). In addition, the bar shows the average aboveground biomass of trees in plots of the first Swiss national forest inventory (1982-1986Bachofen et al., 1988). The simulations start 5 in 1500 with no trees. Biomass increases quickly at the beginning, so that in most zones, values close to 100 t ha −1 are reached within the first 40 years. The initial species composition consists of various broadleaf species, among which the maple species Acer campestre and A. pseudoplatanus. The former is more prevalent at lower elevations, and the latter at higher elevations (to avoid overloading the figure, the maple species are grouped together). After this initial state, Pinus sylvestris (mainly at lower elevations) and Larix decidua (mainly at higher elevations) start developing. Finally, Quercus species (lower elevations, mainly 10 Q. pubescens and Q. petraea), Abies alba (north-facing slopes) and Picea abies develop, with the latter becoming dominant at higher elevations. At the end of the simulation, the forest seems to have reached a state in which the relative importance of species changes little over time. At lower elevations, the state at the end of the simulation is close to that reached after the first 100 or 200 years. By contrast, at high elevations, the replacement of L. decidua by P. abies occurs over a much longer time.

Biomass and species composition
At low elevations, simulated biomass is higher than the inventory value, especially on north-facing slopes. By contrast, decidua makes up one third of observed biomass in that elevation band, while it is practically non-existent in the simulations.
Also at higher elevations, the simulated biomass of L. decidua is much lower than in the inventory data. height reduction (S_T_Hmax_BEK and S_T_Hmax_BEK; Figs. S11 and S12), species composition is similar to the results of standard TreeMig, but with much lower biomass.
The configuration S_F_Full differs from S_F_noLAred in that LAI is reduced as a function of drought or low temperatures.
Biomass simulated with this configuration (Fig. S13) is markedly higher than with S_F_noLAred, especially at lower and   (Table 1). The graphs show annual values, averaged over seven clusters of cells. The bar shows the aboveground biomass in the same area, from the first Swiss national forest inventory (1982-1986Bachofen et al., 1988). The limits of the elevation bands were set so that each cluster contains at least 30 forest inventory plots. The dashed line marks the year 1971, from when meteorological data are available. Simulation years before 1971 use meteorological data bootstrapped from the years 1981-2000.

Canopy structure
The distribution of the two metrics of agreement between observed and simulated canopy structure, is shown on Fig. 7 shows a distribution of dH95 that is very similar to S_F_noLAred and S_F_noSmort, its density distribution for 1-ABC differs from that of the other 2 configurations. S_F_Full has a lower density around 0.95, but a higher density between 0.7 and 0.85, indicating a lower degree of agreement between observations and simulation for this configuration.  Table 1. For the observations, the averages for the different elevation and aspect classes range approximately between 2 and 4. The lowest values occur on south-facing slopes below 700 m asl (i.e. close to the bottom of the Rhône valley). LAI of the south-facing slopes increases steadily with elevation up to 1700-1900 m asl, where it reaches a value of 4. For cells over 2100 m asl, LAI is again somewhat smaller. LAI is generally higher on north-facing slopes. There is little difference among the elevation bands between 700 and 2100 m asl. Average LAI is always around 4 in that elevational range. 20 Smaller values occur only in the lowest and highest elevation bands. The TreeMig runs differ greatly in the range and pattern of simulated LAI. For S_T_Hmax_BEK, parameterized with available water capacity (AWC) from the soil suitability map (BfR, 1980), the values range between 2 and 6. The highest values occur in the lowest elevation band and, for north-facing slopes, at 1700-1900 m asl. Except for the lowest elevation band, LAI on north-facing slopes is markedly higher than on south-facing slopes, with differences of up to 2.5. By contrast, for S_T_Hmax_RA15, which uses the AWC from Remund and Augustin 25 (2015), the absolute values are much higher and the variability much lower. For all elevation bands, values range between 6.5 and 7.5. The results for TreeMig runs with height limitation are almost equal to the standard version.

Leaf Area Index
There is less difference in patterns and absolute values among the FORHYCS runs. The absolute values range from 4 to 7.
Spread is lowest for the two highest elevation bands, where all configurations give a value of approximately 6, for both northand south-facing slopes. At lower elevations, there is a clear difference between the two aspect classes, with consistently higher 30 values for the north-facing slopes. The difference between configurations increases with decreasing elevation, and is somewhat higher on south-facing slopes. The configuration S_F_cSFC consistently returns the largest values. On north-facing slopes, the values are lowest for S_F_noLAred. This is also the case at higher elevations on south-facing slopes, whereas at lower Number of cells q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q  Table 1. More values closer to zero indicate a better agreement between observed and simulated canopy structure. As H95 uses the height classes of TreeMig, the results are given as discrete values with an interval of 4.2 m. The lower graph shows the density distribution of the 1-ABC scores (bandwidth=0.0075). The better the agreement between observed and simulated canopy structure, the more values are close to one. elevations, S_noHmax returns the lowest values. In most cases, the results of S_F_Full and S_F_noLAred are similar. Up to 1300-1500 m asl on north-facing slopes, S_F_Full gives somewhat higher values, whereas on south-facing slopes, the value for S_F_noLAred is higher. In all cases, the symbols for S_F_noSmort are indistinguishable from S_F_noLAred. The bottom graph shows the results of the various FORHYCS succession runs, with the configurations listed in Table 1. 3.3 Climate change runs 3.3.1 Differences between uncoupled, one-way coupled and fully coupled model runs Figure 9 shows how area-averaged LAI and rooting zone storage capacity SFC change for all future simulations (the results are shown here as rolling means with a 30-year window; a version without smoothing is given in Fig. S17). LAI and SFC are averaged over two of the strata used in Fig. 6, the lowest stratum of south-facing cells (Fig. 9 a) and c)) and the highest stratum 5 of north-facing cells (Fig. 9 b) and d)). Fig. 9 a) shows the results of the coupled run for the low-elevation south-facing cells.

TreeMig
At the end of the succession run, LAI is approximately 6, and over the next century, the values range between 3.5 and 6.5 for all runs except the dry GCM-RCM chain. Long-term average LAI is lower for warmer and drier runs. n particular, under the dry GCM-RCM chain, LAI starts to decrease sharply around 2050, with values less than 2 at the end of the simulation (Fig.   S17). The plot of annual values (Fig. S17) shows that under warmer scenarios, the variability of LAI is much higher, with 10 LAI decreasing much more in certain years than in less warm scenarios. In other years, annual LAI values strongly converge between scenarios. For SFC, values decrease for drying scenarios and increase for wetting scenarios, with consistently higher values for the GCM-RCM chains than for the delta change runs. The high-elevation north-facing cells ( Fig. 9 b), Fig. S17 b)) also show a decrease in long-term average LAI and an increase in variability under drying and warming scenarios. Also here, the dry GCM-RCM chain stands out, with a sharp decrease in LAI from the middle of the century. SFC initially increases in 15 all scenarios. Around 2050 (when the temperature and precipitation modifiers reach their maximum), SFC starts decreasing, with a faster decrease in drying delta change runs.
Due to the similarity of LAI simulated by TreeMig with and without height reduction (cf. Fig. 8), only the results for the version without height reduction are shown. LAI differs greatly between the two TreeMig simulations, especially for the lowelevation south-facing cells (Fig. 9 c), Fig. S17 c)). With TM_BEK, LAI is approximately at 3.5 at the end of the succession 20 run, and decreases to less than 1.5 in the warmest and driest scenario (T6_P-10). Unlike with fully coupled FORHYCS, the dry GCM-RCM chain leads to values within the range of the delta runs at the end of the fimulations. By contrast, under TM_RA2015, LAI decreases from 6.5 to 5 in the T6_P-10 scenario. Here, as for the fully coupled runs, the dry GCM-RCM chain leads to substantially lower values than the other runs. Inter-annual variability increases with TM_RA2015, but not with TM_BEK (Fig. S17 c)). In the high-elevation north-facing cells, the LAI trajectories diverge between the two TreeMig   show the difference in annual streamflow to the uncoupled run (Fig. 10 a)), for the one-way coupled runs (C_T_BEK and C_T_RA15) and the fully coupled FORHYCS runs (C_F), respectively. In all cases, streamflow is higher in the coupled runs.
For C_T_BEK, the difference is approximately 30 mm year −1 at the end of the succession and increases under idealized climate change. The increase is greater for warmer scenarios, with a difference of 60 mm year −1 for the warmest delta change scenarios (T6_Py) at the end of the simulation. For the GCM-RCM chains, the difference is usually smaller than for most delta change runs. The difference in annual streamflow is much less for C_T_RA15, and ranges between 5 and 20 mm year −1 .

5
Here, the difference in streamflow decreases with idealized climate change, with a more pronounced decrease for the warmer delta change scenarios. For the fully coupled runs, the initial difference is around 30 mm year −1 , which is similar to the initial difference for C_T_BEK. During idealized climate change, the difference does not increase as much as for C_T_BEK: the largest difference in the order of 40 mm year −1 . The only exception is the dry GCM-RCM chain, where streamflow difference starts increasing sharply around 2050 to reach 60 mm year −1 at the end of the simulation. Unlike for TM_BEK, the delta 10 change scenarios with the greatest difference by the end of the simulation are the three warming and drying scenarios.

Effect of CO 2 concentration
The increase of stomatal resistance due to increased CO 2 concentration (Eq. 12) has a minimal effect on streamflow and areaaveraged forest properties (see Fig. S22 in the Supplement). In C_F_NCS (the runs in which Eq. 12 is set to one), streamflow is consistently lower than in the C_F runs. The greatest difference occurs towards the end of the simulation period in the Chippis 15 subcatchment, with differences of up to 10 mm year −1 . The differences are largest in the simulations with a precipitation increase. Regarding vegetation properties, the differences in LAI and SFC, averaged over the strata used in Fig. 6, never exceed 0.05 m 2 m −2 and 1 mm, respectively. Figure 11 shows the difference in annual streamflow (30-year annual means) between a fully coupled run without land-use 20 change (C_F) and a run in which forest is allowed to grow in all cells with a "potentially forested" land cover type (see

Moiry
Streamflow difference [mm/year], (C_F_LC) − (C_F) Figure 11. Difference in simulated annual streamflow (30-year rolling means) between the standard fully coupled runs (C_F) and the coupled runs with land abandonment (C_F_LC) for three contrasting subcatchments. In the land abandonment scenario, forest is allowed to grow in all cells classified as "meadows" and "alpine vegetation". The Chippis subcatchment has relatively few meadows, whereas they occupy about a third of the high-elevation catchment Moiry (see Fig. 2). Forest expansion leads to streamflow reduction due to the higher leaf area, and potentially deeper roots. As most meadows are located at high elevations, the effect of forest expansion on streamflow greatly depends on the warming scenario.

Effect of coupling on hydrological simulations
The plausibilization of simulated streamflow (Sect. 3.1) showed that PREVAH in its original version, as well as coupled and uncoupled FORHYCS yielded similar goodness-of-fit scores in the four gauged subcatchments of this study. The differences between standard PREVAH and the two FORHYCS versions are much larger than between coupled and uncoupled FORHYCS.

5
As noted in Sect. 2.1.7, the main difference between standard PREVAH and uncoupled FORHYCS is the parameterization of rooting zone storage capacity SFC. As this difference is considerable (see Fig. S1), the differences in daily streamflow seen on Fig. 5 are to a large extent due to the differing SFC parameterizations. Due to the small differences between the results of coupled and uncoupled FORHYCS, it is not possible to conclude whether varying vegetation properties has led to an improvement of model performance. In the fully coupled version, simulated LAI is quite close to the default values in 10 PREVAH (cf. Fig. 8; standard summer LAI for forests in PREVAH is 8). As seen on Fig. 8, LAI is smaller when simulated with TM_BEK in most elevation bands. A comparison of the simulated streamflow shown in Sect. 3.1 with output from a oneway coupled simulation with TM_BEK (not shown) shows differences of only 3 mm year −1 between one-way and two-way couplings, except in the ungauged catchment 1 (Chippis), where the difference is 24 mm year −1 . This suggests that the effect of the coupling on hydrological model outputs varies spatially, in line with the findings of Schattan et al. (2013). 15 The relatively modest effect of the coupling on simulated streamflow, especially in the high-elevation subcatchments, is consistent with the findings of Schattan et al. (2013), whose study domain also included the Navizence catchment. They found a differential effect of transient vegetation parameters on simulated streamflow, with the greatest effects at low elevations (where LAI is much lower than the generic parameter value of the hydrological model) and above the current treeline (where the forest may expand in the future). The issue of scale is also of relevance: as forested cells make up a relatively small fraction 5 of each subcatchment (Fig. 2), even an important change in the water balance of some forested cells will have little influence on catchment-integrated streamflow. The most extreme example of change in vegetation parameters in this study in the runs using meteorological forcing from the GCM-RCM model chain representing dry conditions. In this run, forest LAI greatly decreases throughout the study region (Fig. 9), leading to a difference in simulated streamflow of up to 60 mm year −1 between the coupled and uncoupled runs (Fig. 10). This shows the value of including vegetation dynamics for hydrological modeling 10 under severe change.
While leaf area and rooting depth are among the most sensitive vegetation parameters for surface water partitioning (Milly, 1993;Nijzink et al., 2016;Speich et al., 2018a), FORHYCS does not represent all possible impacts of forest dynamics on hydrological processes. For example, forest properties have been related to hydrological model parameters relating to snow (Seibert, 1999) or soil properties (Johst et al., 2008). Badoux et al. (2006) found that forest site type was a good indicator of the 15 dominant runoff processes. While this does not imply a causal relationship between forest characteristics and runoff generation in all cases, some of the differences between runoff processes could be explained by forest properties, such as hydrophobicity of conifer needle litter, which promotes fast runoff processes. On the other hand, forest soils are often associated with low runoff coefficients. Johst et al. (2008) parameterized the soil moisture recharge BETA as a function of land cover type. This parameter, which controls the partitioning of precipitation between plant-available soil moisture and runoff generation, is also 20 used in the local water balance modules of PREVAH and FORHYCS (see Speich et al., 2018a). Speich et al. (2018a) found that BETA was a relatively sensitive parameter for the physiological drought index of FORHYCS. Currently, snow and runoff generation parameters are static in PREVAH and FORHYCS. As they have been obtained through regionalization, it may be challenging to relate their values to specific forest properties. However, representing the effects of forest dynamics on these processes might further reduce the dependence on calibrated and regionalized parameter values.

Effect of coupling on forest simulations
To assess the effect of various forest-hydrology couplings on the performance of the forest models, several outputs and state variables were compared against observations. For various reasons, a perfect match between simulated and observed vegetation properties cannot be expected. First, the model simulates the potential natural vegetation dynamics, without considering forest management or disturbances such as fire or avalanches, which again have impacts on stand age. Second, the succession is mod-of the model's skill. The fully coupled FORHYCS gave reasonable results for biomass, species composition and stand structure (simulated LAI is discussed in the next paragraph). It is important to remember that the coupled and uncoupled forest simulations used two different drought indices, one that depends on transfer variables from PREVAH equations, and one that is calculated before the simulations, respectively (see Sect. 2.1.3). These indices require two different sets of species-specific drought tolerance parameters. Therefore, it is difficult to assess to what extent the differences in model outputs are due to the 5 coupling, or to the different parameterization. In any case, representing the effect of water availability (and low temperatures) on maximum height greatly improved the simulation of canopy structure, also in uncoupled models. By contrast, reducing leaf area as a function of stress leads to poorer results regarding canopy structure, as well as unrealistic biomass fluctuations. Two main effects happen in the model as a result of LAI reduction: the drought index (Eq. 5) is lower than it would be without LAI reduction; and the light distribution is modified, i.e. lower height classes get more light than they would get without LAI reduc-10 tion. These two effects both promote tree growth, which explains why the model simulates higher biomass. This higher growth also eventually leads to greater mortality, after the number and size of trees have grown fast for some years. This explains the more dynamic pattern when LAI reduction is activated (Fig. S13).
Simulated leaf area index (LAI) varies greatly between the TreeMig and FORHYCS runs (Fig. 8). The pattern of LAI simulated with TM_BEK across elevation bands follows the distribution of soil moisture storage capacity (Fig. S1). The flat 15 areas at the bottom of the Rhône valley are the only areas where storage capacity exceeds 100 mm. Therefore, LAI is high for the lowest elevation band. By contrast, storage capacity on the slopes is much lower, so that LAI initially sharply decreases with elevation. With TM_RA2015, water is hardly limiting, so that LAI is high at all elevations. LAI simulated with FORHYCS shows a similar pattern as the remotely sensed data, with an initial increase with elevation, and consistently higher values on north-facing than on south-facing slopes, except at the highest elevations. The absolute values, however, are consistently higher 20 than the observations, sometimes offset by a factor of two. Various factors hinder a direct, quantitative comparison of measured and simulated LAI. First, the 300-by-300 m cells of the remotely sensed dataset may contain non-forested surfaces, such as pastures, clearings, roads or water bodies, even if the cell is classified as forest. The model does not consider this type of spatial heterogeneity. This is especially relevant in regions with a high spatial variability of land cover, as is the case in this study region. Second, remotely sensed LAI is subject to some uncertainty, due for example to the clumping of needles in coniferous 25 forests (Garrigues et al., 2008). Despite a good overall performance, the authors of the validation report for the Copernicus LAI 300 product (Camacho et al., 2016, p.83)  higher than this for this elevation band. At high elevations, the decrease in LAI is not as pronounced in the simulations as in the observations (Fig. 8). These results suggest that the spatial variability of LAI is somewhat underestimated by the model, especially where an environmental factor is particularly limiting.

Effect of coupling on model behavior under climate change
The models used in this study are very similar to those used in the simulation experiment of Lischke and Zierl (2002). In that 5 experiment, they coupled the gap model DisCForM, from which TreeMig was later derived, with a point-scale water balance model which is conceptually similar to the new local water balance module of FORHYCS. They found that the coupling of forest dynamics and water balance had a stabilizing effect on the simulated system under climate change. In their experiment, coupled simulations converged towards low LAI and lower levels of physiological drought. This effect is not visible to the same extent in the simulations conducted here. The effect of warming on forests is less pronounced in the fully coupled runs, 10 as evidenced by the evolution of streamflow differences on Fig. 10. However, it cannot be excluded that this is due to the different drought tolerance parameters, or to the lesser sensitivity of the FORHYCS drought index to changes in temperature.
In contrast to the study of Lischke and Zierl (2002), FORHYCS includes some additional mechanisms through which the system can react to changes in climate, such as the adaptation of rooting depth and maximum tree height. Some processes may even have a destabilizing influence on the system, such as the high fluctuations in biomass introduced by the stress-induced 15 leaf area reduction.
An exception to the generally resilient behavior of forests under climate change are the GCM-RCM runs representative for dry conditions. In these runs, the forest greatly deteriorates throughout the study region, even under the C_F configuration ( Fig. 9). Interestingly, under this meteorological forcing, precipitation is higher than in the delta change scenarios, where this severe reduction of forest LAI does not occur (see Sect. 2.2.4 and Fig. S2). Therefore, the high stress causing this LAI 20 reduction must have been caused mainly by changes in potential evaporation or temporal precipitation distribution. While this study has shown a range of possible model behaviors under various climate change scenarios, future research should examine more formally the effect of different bioclimatic factors on the behavior of forest models under climate change. For example, understanding the physiological significance of bioclimatic drought indices is important to interpret how a forest model responds to different scenarios (Speich, 2019). Also, rooting depth has been shown to be an important interface variable 25 for the coupling of hydrology and vegetation dynamics. In the runs presented in this study, ecoregion-integrated rooting zone storage capacity fluctuated by up to 30 mm over the course of a simulation covering 130 years. The rooting depth formulation used in FORHYCS responds to bioclimatic and edaphic conditions in complex and nonlinear ways (Guswa, 2008(Guswa, , 2010Speich et al., 2018b). As the inclusion of this variable is rather new in dynamic models, its magnitude and dynamics should be plausibilized against empirical evidence, using e.g. inverse modeling (Nijzink et al., 2016)  Results of this modeling experiment have shown that an increase in atmospheric CO 2 concentration has almost no effect on hydrological processes and vegetation dynamics as modeled by FORHYCS. In this implementation, the physiological effect of elevated CO 2 concentration is represented by an additional modifier function to the stomatal resistance parameterization.

5
All else being equal, an increase in C a leads to an increase in stomatal resistance, and thus to a decrease in potential and actual transpiration. This slows down canopy water use, and thus leads to lower levels of simulated physiological drought.
This formulation does not account for other physiological impacts of elevated C a , such as enhanced photosynthetic rates, or possible acclimation effects. The physiological effect of increased C a is a source of uncertainty in forest models, due to widely differing process formulations among models (Medlyn et al., 2011). For example, our results contrast with the simulations of 10 Scherstjanoi et al. (2014), who applied a modified version of LPJ-GUESS and found a crucial influence of C a on simulated future forest biomass in Switzerland. These differences between models are partly due to the knowledge gaps regarding the underlying processes. According to Medlyn et al. (2011), models that do not consider the physiological effects of C a at all are likely to underestimate future forest productivity, whereas some other models are likely to yield overestimates due to an improper representation of other limiting factors. From an ecohydrological point of view, the large-scale effects of increased 15 C a have been the object of a number of recent studies. For example, Trancoso et al. (2017) found that decreases in streamflow in Australian catchments were caused by vegetation greening, which was in turn driven by elevated C a . These studies suggest that the stomatal effects of increased C a (transpiration reduction) are more than offset by enhanced vegetation growth. This is not the case in this study, where the only visible effect was an increase in streamflow, whereas vegetation properties were not affected at all.

Land cover change
In this simulation experiment, allowing the forest to grow in areas currently covered by meadows caused a reduction of streamflow of up to 60 mm year −1 at subcatchment level (Fig. 11). This is a substantially greater effect than in the simulation experiment of Schattan et al. (2013), who found a change in annual runoff in the order of 10 mm year −1 in regions currently above the treeline as they become forested under simulated climate change. Under the scenario used in their study, temperature 25 was projected to increase by 3-4 K by the end of the century. A major difference with this study is that they only varied LAI, whereas in this study, the development of both LAI and rooting depth were simulated. Both of these variables probably had a major impact on simulated streamflow. These spectacular results must be considered in the light of several potential sources of uncertainty in the model formulation. First, FORHYCS does not account for competition by other vegetation types, which may slow down the expansion of forests. Also, other factors that make the current treelines an extreme environment are not 30 considered by FORHYCS, such as the steep slopes and shallow soils. For example, it was shown that during the warmest period of the Holocene, only stunted trees were able to establish at high elevations, although the climate would have allowed a forest to grow (Theurillat and Guisan, 2001). Another aspect to consider is that at the beginning of the simulation, meadows and alpine vegetation types were parameterized with a prescribed rooting depth of 22 cm. This value was set arbitrarily, and if it was actually higher for these vegetation types, the hydrological impact of forest expansion would be exaggerated in the simulations.

Conclusions and Outlook
This study presented a proof-of-concept for a dynamic, spatially distributed model combining hydrological processes and forest 5 dynamics. The main interface variables are leaf area index, rooting depth, as well as a physiological drought index. This model was applied in a case study in a valley with a sharp topographical and hydro-climatic gradient.
The motivation behind developing this model was to apply it to climate change impact studies in which the spatio-temporal forest dynamics and water balance of Switzerland are simulated together. The closer integration of these ecosystem processes would increase the confidence in these model projections, compared to uncoupled models that do not account for changes 10 in the environment besides climate. The research questions were (1) how model coupling impacts the results of simulated water balance and forest dynamics, (2) which aspects of the coupling were particularly relevant, and (3) how model coupling affects simulation results under climate change. From the hydrological point of view, the coupling had only a modest effect on catchment-integrated streamflow, although this effect was not uniform in space: the greatest effects occurred at low elevations, and in regions currently above the treeline. Regarding forest simulations, model results were compared against multiple data 15 sources to examine model behavior and pinpoint potential weaknesses. In a comparison with a new high-resolution canopy height dataset, two new indices of agreement between observed and simulated forest structure were developed. This comparison confirmed the importance of specifying an environmental limitation on maximum tree height, as this greatly improved the realism of simulated canopy structure and biomass. Also, a dynamic parameterization of rooting depth led to better model performance. In combination with remotely sensed LAI data, this model-data comparison showed that the coupled model was 20 better able to reproduce observed spatial patterns, although it also highlighted potential deficiencies in the way drought impacts are represented. Under (idealized) climate change, the forests in the coupled model show greater resilience, which translates into a reduced sensitivity of mean annual streamflow to changes in temperature and precipitation. In some cases, the behavior of the model seems exaggerated, but demonstrates the importance of explicitly modeling relevant processes. This was the case with regard to the possible expansion of forests above the current treeline. On the other hand, the effects of increased 25 CO 2 concentration on plant physiology are less than what observations suggest, highlighting the challenges of incorporating physiological principles into phenomenological models. As these areas are the object of active research, it is expected that new analyses will give the opportunity to test model behavior under these novel conditions, and possibly to improve process formulations.