A large-scale simulation model to assess karstic groundwater recharge over Europe and the Mediterranean

Karst develops through the dissolution of carbonate rock and is a major source of groundwater contributing up to half of the total drinking water supply in some European countries. Previous approaches to model future water availability in Europe are either too-small scale or do not incorporate karst processes, i.e. preferential flow paths. This study presents the first simulations of groundwater recharge in all karst regions in Europe with a parsimonious karst hydrology model. A novel parameter confinement strategy combines a priori information with recharge-related observations (actual evapotranspiration and soil moisture) at locations across Europe while explicitly identifying uncertainty in the model parameters. Europe’s karst regions are divided into four typical karst landscapes (humid, mountain, Mediterranean and desert) by cluster analysis and recharge is simulated from 2002 to 2012 for each karst landscape. Mean annual recharge ranges from negligible in deserts to > 1 m a in humid regions. The majority of recharge rates range from 20 to 50 % of precipitation and are sensitive to subannual climate variability. Simulation results are consistent with independent observations of mean annual recharge and significantly better than other global hydrology models that do not consider karst processes (PCR-GLOBWB, WaterGAP). Global hydrology models systematically under-estimate karst recharge implying that they over-estimate actual evapotranspiration and surface runoff. Karst water budgets and thus information to support management decisions regarding drinking water supply and flood risk are significantly improved by our model.


Introduction
Groundwater is the main source of water supply for billions of people in the world (Gleeson et al., 2012). Carbonate rock regions only constitute about 35 % of Europe's land surface (Williams and Ford, 2006), yet contribute up to 50 % of the national water supply in some European countries (COST, 1995) because of their high storage capacity and permeability (Ford and Williams, 2007). Climate conditions have a primary control on groundwater recharge (de Vries and Simmers, 2002). Climate simulations suggest that in the next 90 years Mediterranean regions will be exposed to higher temperatures and lower precipitation amounts (Christensen et al., 2007). In addition, shifts in hydrological regimes (Milly et al., 2005) and hydrological extremes (Dai, 2012;Hirabayashi et al., 2013) can be expected. To assess the impact of climate change on regional groundwater resources as groundwater depletion or deteriorations of water quality, large-scale simulation models are necessary that go beyond the typical scale of aquifer simulation models (∼ 10-10 000 km 2 ) Additionally, we expect the future variability of climate to be beyond that reflected in historical observations, which means that model predictions should derive credibility via more indepth diagnostic evaluation of the consistency between the model and the underlying system and not from some calibration exercise (Wagener et al., 2010).
Currently available global hydrology models discretize the land surface in grids with a resolution down to 0.25-0.5 • . Parts of the vertical fluxes are well represented, e.g. the energy balance (Ek, 2003;Miralles et al., 2011). But groundwa-ter recharge and groundwater flow are represented simply by heuristic equations (Döll and Fiedler, 2008) or assumptions of linearity (Wada et al., 2010(Wada et al., , 2014. They do not explicitly simulate a dynamic water table or regional groundwater flow. Global models also assume homogenous conditions of hydrologic and hydraulic properties in each of their grid cells, rather than variable flow paths, and they completely omit the possibility of preferential flow. This was criticized in the recent scientific discourse about the need for large-scale hyper-resolution models (Beven and Cloke, 2012;Wood et al., 2011).
The assumption of homogeneity is certainly inappropriate for karst regions. Chemical weathering of carbonate rock and other physical processes develop preferential pathways and strong subsurface heterogeneity (Bakalowicz, 2005). Flow and storage are heterogeneous ranging from very slow diffusion to rapid concentrated flow at the surface, in the soil, the unsaturated zone and the aquifer (Kiraly, 1998). A range of modelling studies have developed and applied karst specific models at individual karst systems at the catchment or aquifer scale (Doummar et al., 2012;Fleury et al., 2007;Hartmann et al., 2013b;Le Moine et al., 2008) but a lack of a priori information of aquifer properties and observations of groundwater dynamics have prohibited their application on larger scales .
Compared to the limited information about the deeper subsurface there is much better information about the surface and shallow subsurface, including maps of soil types and properties (FAO/IIASA/ISRIC/ISSCAS/JRCv, 2012), observations of soil moisture (International Soil Moisture Network; Dorigo et al., 2011) and of latent heat fluxes (FLUXNET; Baldocchi et al., 2001), as well as river discharge (GRDC, 2004). Surface and shallow subsurface information is used for the parameterization and evaluation of the surface routines of present large-scale models. But, although these data also cover Europe's karst regions, it has not been used for the development of large-scale models to simulate karstic surface and shallow subsurface flow and storage dynamics.
The objective of this study is to develop the first largescale simulation model for karstic groundwater recharge over Europe and the Mediterranean. Despite much broader definitions of groundwater recharge (e.g. Lerner et al., 1990), we focus on potential recharge, that is, vertical percolation from the soil below the depth affected by evapotranspiration. We use a novel type of model structure that considers the subgrid heterogeneity of karst properties using statistical distribution functions. To achieve a realistic parameterization of the model we identify typical karst landscapes by cluster analysis and by a combined use of a priori information about soil storage capacities and observations of recharge-related fluxes and storage dynamics. Applying a parameter confinement strategy based on Monte Carlo sampling we are able to provide large-scale simulation of annual recharge including a quantification of their uncertainty. Figure 1. (a) Schematic description of the model for one grid cell including the soil (yellow) and epikarst storages (grey) and the simulated fluxes, (b) its gridded discretization over karst regions and (c) the subsurface heterogeneity that its structure represents for each grid cell.

Data and methods
Due to chemical weathering (karstification) karst systems have a strong subsurface heterogeneity of flow and storage processes (Bakalowicz, 2005) that have to be considered to produce realistic simulations . In this study, large-scale karst recharge is estimated by a modified version of the VarKarst model (Hartmann et al., 2013a. The model has shown to be applicable at various scales and climates over Europe (Hartmann et al., 2013b). To simulate karst recharge we discard the groundwater routines but we use exactly the same surface and shallow subsurface routines. The resulting recharge simulation model, VarKarst-R, is described in the proceeding subsection. The new feature of the large-scale application of the VarKarst-R model is the estimation of its parameters. While previous applications of the model could rely on calibration by observations at the karst system outlet the simulation of large-scale recharge requires a different approach. We developed a new parameter estimation procedure that separates the study area into four karst landscapes by cluster analysis and estimates model parameters and their uncertainty by a step-wise parameter confinement process (explained in Sect. 2.3).

The model
The structure of the VarKarst-R model (Fig. 1a) is based on the conceptual understanding of the surface and shallow subsurface processes of karst regions (Fig. 1c). Their most characteristic feature is the existence of the epikarst that evolves close to the surface because of stronger carbonate rock dissolution. It can be seen as a temporal storage and distribution system for karst recharge (Aquilina et al., 2006;Williams, 1983a). Depending on the rates of infiltration, variability of soil thicknesses and hydraulic conductivities, it can produce slow and diffuse vertical percolation into the carbonate rock or it can concentrate infiltration laterally towards dissolutionwidened fissures or conduits . Ap-plied on a 0.25 × 0.25 • grid (Fig. 1b), VarKarst-R simulates potential recharge, which is the water column vertically percolated from the soil and epikarst. Hence, the previous version of the model is reduced to include only the soil and the epikarst simulation routines but still using the same statistical distribution functions that allow for variable soil depths, variable epikarst depths and variable subsurface dynamics (Fig. 1). This leads to a parametrically efficient process representation. Comparisons with independently derived field data showed that these distribution functions are a good approximation of the natural heterogeneity .
Heterogeneity of soil depths is represented by a mean soil storage capacity V soil [mm] and a variability constant a [-]. The soil storage capacity V S,i [mm] for every compartment i is defined by where V max,S [mm] is the maximum soil storage capacity and N is the total number of model compartments. For the application of a priori information on mean soil storage capacities (Sect. 2.3) V max,S has to be derived from the mean soil storage capacity V soil by (Hartmann et al., 2013b) Preceding work (Hartmann et al., 2013a) showed that the same distribution coefficient a can be used to derive the epikarst storage distribution V E,i from the mean epikarst storage capacity V epi [mm] (via the maximum epikarst storage V max,E , likewise to V max,S in Eq. 2): At each time step t, the actual evapotranspiration from each soil compartment E act,i is derived by reducing potential evaporation according the soil moisture deficit: The temporal water storage of the epikarst is drained following an assumption of linearity (Rimmer and Hartmann, 2012), which is controlled by the epikarst storage coefficients where V epi,i [mm] is the water stored in compartment i of the epikarst at time step t. Again, the same distribution coefficient a is applied to derive K E,i from the mean epikarst storage coefficient K epi . The latter is obtained from the mean epikarst storage coefficient K epi using When infiltration exceeds the soil and epikarst storage capacities, surface flow to the next model compartment Q Surf,i+1 [mm] initiates: To summarize, the model is completely defined by the four parameters a, K epi , V soil , and V epi (Table 2).

Data availability
Forcing for the VarKarst-R model is derived through GLDAS-2, which assimilates satellite-and ground-based observational data products to obtain optimal fields of land surface states and fluxes (Rodell et al., 2004;Rui and Beaudoing, 2013). While precipitation, temperature and net radiation are mainly merged from satellite and gauge observations, snow water equivalent is derived using data assimilation as well as the snow water equivalent simulations of the NOAH land surface model v3.3 (Ek, 2003) driven by GLDAS-2 forcing. Europe's and the Mediterranean's carbonate rock areas are derived from a global map (vector data) of carbonate rock (Williams and Ford, 2006). Each cell of the 0.25 • simulation grid intersecting a carbonate rock region was considered a karst region. The model was calibrated and evaluated with observations of actual evapotranspiration from FLUXNET (Baldocchi et al., 2001) and with soil water content data from the International Soil Moisture Network (ISMN; Dorigo et al., 2011). Only stations within carbonate  rock regions and with ≥ 12 months of available data were used (Fig. 2). Months with < 25 days of observations were discarded. In addition, months with ≥ 50 % mismatch in their energy closure were discard from the FLUXNET data set (similar to Miralles et al., 2011).

Parameter estimation
A lack of a priori information and observations of discharge and groundwater levels that can be used for calibration are the primary reasons why karst models have not been applied on larger scales yet . The parameter assessment strategy we present in the following is meant to overcome this problem by using a combination of a priori information and recharge-related variables. We define typi-cal karst landscapes over Europe and the Mediterranean and apply this combined information to a large initial sample of possible model parameter sets. In a stepwise process we then discard all parameter sets that produce simulations inconsistent with our a priori information and our recharge-related observations.

Definition of typical karst landscapes
Our definition of typical karst landscapes is based on the well-known hydrologic landscape concept (Winter, 2001), which describes hydrological landscapes based on their geology, relief and climate. Constraining ourselves to karst regions that mainly develop on carbonate rock, we assume that differences among the karst landscapes are due to differences in relief and climate, and the consequent processes of landscape evolution including the weathering of carbonate rock (karstification). The carbonate rock regions in Europe and the Mediterranean are divided into typical landscapes using simple descriptors of relief (range of altitude RA) and climate (aridity index AI and mean annual number of days with snow cover DS) within each of 0.25 • grid cells and a standard cluster analysis scheme (k means method). We test the quality of clustering for 2-20 clusters by calculating the sums of squared internal distances to the cluster means. The so-called "elbow method" identifies the point where adding additional clusters only leads to a marginal reduction in the internal distance metric, i.e. the percentage of variance explained by adding more clusters would not increase significantly (Seber, 2009).

A. Hartmann et al.:
A large-scale simulation model to assess karstic groundwater recharge 1733

Model parameters for each karst landscape
We initially sample 25 000 possible model parameter sets from independent uniform distributions using parameter ranges derived from previous catchment-scale applications of the VarKarst-R model over Europe and the Mediterranean (Table 2). We use a priori information and recharge-related observations to assess parameter performance for each karst landscape. A priori information consists of spatially distributed information about mean soil storage capacities as provided by several preceding mapping and modelling studies (Ek, 2003 Fig. 2). Soil moisture is related to recharge because it indicates the start and duration of saturation of the soil during which diffuse and preferential recharge can take place. Actual evaporation is related to recharge because usually no surface runoff occurs in karst regions due to the high infiltration capacities (Jeannin and Grasso, 1997). The difference of monthly precipitation and actual evaporation is therefore a valid proxy for groundwater recharge at a monthly timescale or above. The new parameter confinement strategy is applied to each of the karst landscapes in three steps: 1. Bias rule: retain only the parameter sets that produce a bias between observed and simulated actual evaporation lower than 75 % at all FLUXNET locations within the chosen karst landscape: where m sim,i and m obs,i are the sum of simulated and observed actual evapotranspiration at location i, respectively. The value 75 % was found by trial-and-error, which reduced the initial sample to a reasonable number. The bias rule was not applied on the soil moisture since porosities of the soil matrix were not available, prohibiting a comparison of simulated and observed soil water contents.
2. Correlation rule: retain only the parameter sets that produce a positive coefficient of (Pearson) correlation between observations and simulations of both actual evaporation and soil moisture, at all locations: where AET sim,j and AET obs,j , and θ sim,j and θ obs,j , are the monthly means of simulated and observed actual evapotranspiration, and soil water content at locations i/j , respectively.
3. Application of a priori information: retain only parameter sets in which V soil falls within the feasible ranges that can be derived from a priori information about the maximum soil storage capacity in different karst landscapes (Ek, 2003; FAO/IIASA/ISRIC/ISSCAS/JRCv, 2012; Miralles et al., 2011). We add less than the usual a priori information at the last step to evaluate if the posterior distributions of V soil already adapt to the ranges defined in this confinement step. If they do not, we would conclude that the recharge-related information applied in confinement steps 1 and 2 is biased. If they do, we have indication that the data applied in all three steps is complementary.
Each step reduces the initial parameter sample differently for each of the karst landscapes. The posterior parameter distributions within the confined samples should be different among the karst landscapes if the karst landscapes are properly defined. The rather weak thresholds in step 1 and 2 were chosen to take into account the uncertainties resulting from the differences in scales of observations (point) and simulations (grid cell), and from the indirect observation of recharge (actual evaporation and soil moisture as rechargerelated variables).

Recharge simulations over Europe and the Mediterranean
Recharge is simulated over the carbonate regions of Europe and the Mediterranean from 2002/03 to 2011/12 using the confined parameter samples for each of the identified karst landscapes and the available forcings (Table 1). The mean and standard deviation of simulated recharge for each grid cell and time step are calculated by uniform discrete sampling of a representative subset of 250 parameter sets from each of the confined parameters sets which we regarded to be large enough to provide a reliable measure of spread.

Model evaluation
To assess the realism of simulated groundwater recharge we compare simulated with observed mean annual recharge volumes derived independently from karst studies over Europe and the Mediterranean (Table 3). In addition, we compare our results to the simulated mean annual recharge volumes of two well-established global simulation models: PCR-GLOBWB (Wada et al., 2010(Wada et al., , 2014 and WaterGAP (Döll and Fiedler, 2008;Döll et al., 2003). We furthermore apply a global sensitivity analysis strategy, called regional sensitivity analysis (Spear and Hornberger, 1980), to evaluate the importance of the four model parameters at different simulation timescales ranging from 1 month up to 10 years. This analysis shows (1) which simulated process and characteristics are dominant at a given timescale and (2) which parameters will need more careful calibration when the model is used in future studies. We use  Wellings (1984) the same sample of 25 000 parameter sets that was created for the parameter estimation strategy (Sect. 2.3.2) and assess the sensitivity of four model outputs representative of different timescales: coefficient of variation (CV) of simulated monthly recharge volumes (monthly), CV of simulated 3-month recharge volumes (seasonal), CV of annual recharge volumes (annual), and total recharge over the entire 10-year simulation period (decadal). We do not consider temporal resolutions of less than a month given the assumption that the difference of precipitation and actual evapotranspiration can be a proxy for groundwater recharge and due to uncertainties related to differences in simulation (grid cell) and observation (point).
For each of the identified karst landscapes we choose the 10 locations that are closest to their cluster means (Euclidean distances to relief and climate descriptors; Sect. 2.3.1) as representative locations. In the regional sensitivity analysis ap-proach, we split the parameter sets into two groups, those that produce simulations above the simulated median of one of the four model outputs and those that produce simulations below. We then calculate the maximum distance D(x) between marginal cumulative distribution functions (CDFs) produced by these two distributions for each of the parameters -a large distance D(x) suggests that the parameter is important for simulating this particular output (Fig. 3).

Definition of typical karst landscapes
Cluster analysis resulted in four clusters, which are generally spatially contiguous (Fig. 4) and have quantitatively distinct  cluster means (Table 4). We can attribute particular characteristics to each cluster using the mean values of the clustering descriptors (Table 4): (1) humid hills and plains (HUM) are characterized by an aridity index < 1, a significant number of days with snow cover and low elevation differences.
(2) High range mountains (MTN) have an aridity index of ∼ 1, they also have a significant number of days with snow cover and they show very large topographic elevation differences.
(3) Mediterranean medium range mountains (MED) show high aridity index, only few days with snow cover and high elevation differences. (4) Desert hills and plains (DES) are described by similar altitude ranges as the humid hills and plains but they have a high aridity indices and almost no days with snow cover. The karst landscapes order from north (HUM) to south (DES) based on increasing temperatures and decreasing precipitation amounts. While HUM and DES appear to be separated clearly, MTN and MED mix in some regions, for instance Greece and Turkey where mountainous regions are in close proximity to the coast.

Model parameter estimates for each karst landscape
The three steps of the new parameter confinement strategy resulted in a significant reduction of the initial sample of 25 000 parameter sets (Fig. 5). Each step has a different impact on the reduction among the identified landscapes. For the humid karst landscapes, the correlation rule appears to have the strongest impact while for the mountain and Mediterranean landscapes the bias rule results in the strongest reduction. For the desert landscape only step 3, i.e. application of a priori information, reduces the initial sample because no data were available to apply steps 1 and 2. Considering the parameter ranges for each landscape after the application of the confinement strategy (Table 5), we only achieved a confinement of the distribution parameter a, the soil storage capacity V soil , and slight confinement of the epikarst storage coefficient K epi . The impact of the three confinement steps becomes more obvious when considering their posterior distributions (Fig. 6). The distributions of parameters a, K epi and V soil evolve significantly away from their initial uniform distributions along the confinement steps. In general, changes of the posterior distributions of each landscape's parameter samples are in accordance with the reductions in their number (Fig. 5), though changes are pronounced differently among the parameters. While a and V soil change strongly for HUM, MTN and MED, V epi maintains a uniform distribution across all steps. K epi also exhibits strong changes for HUM but they are less pronounced for MTN and MED. The posterior distributions of the DES landscape do not change except for step 3 due to the lack of information to apply confinement steps 1 and 2.
Step 3 results in a tailoring of the distribution of V soil for all landscapes. For HUM, MTN and MED it can be seen that confinement steps 1 and 2 already pushed the parameter distributions towards their final shape, meaning that the changes in parameter distributions induced by the comparison with observations are consistent with the a priori information about the physical characteristics of the karst.

Recharge simulations over Europe and the Mediterranean
The parameter confinement strategy allows us to apply VarKarst-R over all of Europe and the Mediterranean and to obtain recharge simulations for the hydrological years 2002/03-2011/12. Thanks to the 250 parameter sets that we sampled from the posterior parameter distributions we can include an estimate of uncertainty for each grid cell (Fig. 7). Mean annual recharge ranges from almost 0 to > 1000 mm a −1 with the highest volumes found in northern UK, the Alps and former Yugoslavia. The lowest values are found in the desert regions of Northern Africa. The vast majority of recharge rates range from 20 to 50 % of precipitation. Considering the simulations individually for each karst landscape reveals that the mountain landscapes produce the largest recharge volumes followed by the humid and Mediterranean landscapes (Fig. 8a). The desert landscapes produce the lowest recharge volumes. However, the recharge rates reveal that on average the Mediterranean landscapes show the largest recharge rates, followed by the highly variable mountains (Fig. 8c). Humid and desert landscapes exhibit lower recharge rates. Uncertainties, expressed by the  standard deviation of the 250 simulations for each grid cell, are rather low, seldom exceeding 35 mm a −1 (Fig. 8b). However, expressed as coefficients of variation, most of them range from 5 to 25 % for the humid, mountain and Mediterranean landscapes but for the desert landscape they can reach up to 50 % of the mean annual recharge (Fig. 8d).

Model evaluation
We compare the simulated recharge volumes of our model with recharge volumes assessed from independent and published karst studies over Europe and the Mediterranean (Fig. 9a). Even though there is a considerable spread across the simulations, their bulk plots well around the 1 : 1 line achieving an average deviation of only −58 mm a −1 (Table 6). Considering the individual karst landscapes, there is an over-estimation of recharge for the humid landscapes and an under-estimation for the mountain landscapes. The best results are achieved for the Mediterranean landscapes with only slight under-estimation (Fig. 9a). When we compare the same observations to the simulated recharge volumes of the PCR-GLOBWB (Fig. 9b) and WaterGAP mod- els (Fig. 9c) we find a strong tendency of under-estimation that is strongest for the mountain and Mediterranean landscapes but still significant for the humid landscapes (Table 6). For the humid landscapes absolute deviations are similar for PCR-GLOBWB and VarKarst-R. In addition to comparing simulated and observed annual averages, sensitivity analysis on the model output gives us insight into the realism of the model and the importance of individual model parameters at different timescales (Fig. 10). Our results show that parameters a and V soil have the overall strongest influence on the simulated recharge from a monthly to a 10-year timescale, but their influence decreases toward shorter timescales. Simultaneously, the epikarst parameter K epi gains more importance. This behaviour is most pronounced for the Mediterranean and desert landscapes. The   same is true for V epi , but its overall importance remains much lower, which was also found in the parameter confinement strategy (Fig. 6).

Identification of karst landscapes
The identification of different karst landscapes is a crucial step within our new parameter estimation strategy. The four karst landscapes we identified depend mostly on the choice of climatic and topographic descriptors (Table 4) and the selected number of clusters. Even though neglecting several factors as depositional environments, fracturing by tectonic processes or regional variations in rain acidity, our choice of descriptors is well justified from our understanding of dominant hydrologic process controls as formalized in the hydrologic landscape concept (Winter, 2001) and applied similarly at many other studies (Leibowitz et al., 2014;Sawicz et al., 2011;Wigington et al., 2013). The appropriate choice of clusters for the k means method is less unambiguous (Ketchen and Shook, 1996). The change in number of clusters when the sum of squared distances to our cluster centres only reduces marginally was not clearly definable (Fig. A1). However, choosing only three clusters instead of four would have resulted in unrealistic spatial distribution of clusters. The attribution of north African regions with northern Europe to the same cluster occurred because of their similarity of altitude ranges (Table 4). On the other hand, a selection of five clusters would have resulted in a cluster with properties just between the MTN and the MED clusters and, because of a much stronger scattering, weaker spatial distinction between them. With four clusters, our karst landscapes are similar to the Köppen-Geiger climate regions (Kottek et al., 2006), in particular to the oceanic climate (HUM), the hot and warm summer Mediterranean climate (MED), and the hot desert climates (DES). However, we see deviations when comparing the polar and Alpine climate regions of the Köppen-Geiger with our high range mountain karst landscape, since our landscapes are also defined by their elevation ranges. The borders of these hydrologic landscapes are also uncertain. Natural systems usually do not have straight borders that fall on a grid, as assumed by this analysis. Typical transitions between landscape types are continuous and hence transitions from a parameter set representing one landscape to another parameter set of another cluster should be graded, as well. This will be discussed in the following subsection.

Confinement of parameters
How the three steps of the parameter confinement strategy reduce the initial sample shows which type of data provides the most relevant information for each of the karst landscapes. While the timing of actual evapotranspiration and soil saturation that is expressed by the correlation rule appears to be most relevant for the humid landscapes, the bias rule, which represents the volumes of monthly evapotranspiration, is more relevant for the mountain and Mediterranean landscapes. Swapping the order of the correlation rule and the bias rule would provide the same results for HUM and MTN. But for MED the alternative order increases the importance of timing expressed by the correlation rule, indicating the similar importance of both confinement steps.
The thresholds we set in confinement steps 1 and 2 are not very strict, and the ranges of soil storage capacity we used as a priori information in step 3 are quite large. This compensates for the fact that (1) only recharge-related variables are available rather than direct recharge observations, (2) these variables are not available at the simulation scale (0.25 • grid) but at a point scale, and (3) the transition between the landscapes is more continuous than discrete. Despite these rather weak constraints, the initial parameter sample of 25 000 reduces to quite low numbers between 679 (HUM) and 2731 (MED). All posterior parameters overlap except for the soil storage capacities that are tailored by the a priori information (confinement step 3). Hence, a small number of parameter sets for one landscape is also acceptable for some of the other landscapes, thereby taking into account the continuous transition between them.
All model parameters, except for V epi , show different shapes in their cumulative distribution functions across the karst landscapes. The desert landscape parameters only differ from the initial sample for the V soil parameter due to the lack of information to apply confinement steps 1 and 2. The distribution parameter a is found at the lower values of its feasible range for the humid and mountain landscapes, indicating a significant contribution of preferential recharge. Since altitude ranges are rather low for HUM this may be attributed to a significant epikarst development (Perrin et al., 2003;Williams, 1983b). For MTN, a mixture of epikarst development and topography driven interflow at the mountain hill slopes and valleys can be expected to control the dynamics of karstic recharge (Scanlon et al., 2002;Tague and Grant, 2009). At the Mediterranean landscapes the a parameter adapts to ranges that are rather found at the higher values of its initial range, indicating that there is a stronger differentiation between diffuse and concentrated recharge. This may be due to the generally thinner soils ( Table 5) that limit the availability of CO 2 for karst evolution (Ford and Williams, 2007). Instead, local surface runoff channels the water to the next enlarged fissure or crack in order to reach the subsurface as concentrated recharge (Lange et al., 2003). The epikarst storage coefficient K epi for HUM and MED is at lower values of the initial range, indicating realistic mean residence times of days to weeks (Aquilina et al., 2006;Hartmann et al., 2013a). The MTN landscapes show larger K epi values indicating slower epikarst dynamics most probably due to the reasons mentioned above. The application of a priori information in confinement step 3 automatically tailors the values Figure 9. Observations of mean annual recharge from independent studies (Table 3) versus the simulated mean annual recharge by the VarKarst-R and PCR-GLOBWB models (no data for the DES region available). Figure 10. Sensitivity of simulated recharge to the model parameters at different timescales and in the different karst landscapes. Sensitivity is measured by the maximum distance (D) between the distribution of parameter sets that produce "low" recharge (i.e. below the median) and the distribution producing "high" recharge (above the median). Parameter sets are initially sampled from the ranges in Table 2. of V soil to ranges that we assume to be realistic. The fact that confinement steps 1 and 2 already push the shape of their posteriors towards the a priori ranges corroborates that assumption.
The little changes that occur to the initial distributions of the DES parameter sets elaborate the flexibility of our parameter assessment strategy. The posterior distribution evolves only where information is available (for this landscape on V soil ). This is also evident in the behaviour of parameter V epi . The available information is just not precise enough to achieve identification beyond its a priori ranges. For parameter a in HUM, MTN and MED, a lot of information is derived from the available data and its posteriors differ strongly from its initial distribution, while there is less information to determine K epi . This explicit handling of uncertainties in the parameter identification process allows us to provide recharge simulations over Europe's karst regions with uncertainty estimates that represent confidence for each of the identified karst landscapes.

Realism of spatial patterns
Simulated mean annual recharge amounts for the period 2002/03-2011/12 show a wide range of values, from 0 to > 1000 mm a −1 (Fig. 7). Total water availability (mean annual precipitation) appears to be the main driver for its spatial pattern in many regions, for instance in the former Yugoslavia or northern UK. This is consistent with findings of other studies Samuels et al., 2010). When we normalize the recharge rates by the observed precipitation amounts we find that water availability is not the only control on mean annual recharge volumes. A strong relation of evapotranspiration and karst characteristics and processes was shown in many studies and is also found here (Heilman et al., 2014;Jukic and Denic-Jukic, 2008). Potential evaporation is generally increasing from north to south and has an important impact on recharge rates as well, for instance in the Arabian Peninsula and the Alps. Mountain ranges are considered to be the water towers of the world (Viviroli et al., 2007). Here the MTN landscapes also show the largest recharge volumes due to the large precipitation volumes they receive, though with a considerable spread in our study. HUM and MED landscapes behave similarly with significantly less recharge than MTN. Not surprisingly, there is not much recharge in the desert landscapes at all. But the differences among the clusters shift when con-sidering recharge rates. Due to their thin soils and therefore low soil storage for evaporation (Table 5), the DES karst landscapes transfer up to 45 % of the little precipitation they receive into recharge. The MED landscapes show similarly high recharge rates. Though since their soils are generally thicker than the DES soils, the typical seasonal and convective rainfall patterns of the Mediterranean climate (Goldreich, 2003;Lionello, 2012) might have an important impact, too.
Even though there is still considerable spread in our confined parameter sets, the uncertainty in simulated mean annual recharge volumes is quite low. The uncertainties that follow the limited information contained in the observations are revealed more clearly when we relate the standard deviation of simulated recharge to its mean volumes with the coefficient of variation. The uncertainty for the DES landscape is the largest among the clusters because a priori information is only available for V soil . The uncertainty reduces for the MED and MTN landscapes. The low uncertainties for the coefficient of variation of our recharge simulations for the HUM landscape indicate that the available data contained significant information for confining the model parameter ranges.

Relevance of different recharge processes to simulation timescales
The mean annual water balance of a hydrological system is dominated by the separation of precipitation into actual evapotranspiration and discharge (Budyko and Miller, 1974;Sivapalan et al., 2011). Actual evapotranspiration is controlled by the soil storage capacity V soil and the distribution coefficient a within the VarKarst-R model. Regional sensitivity analysis shows that both parameters are most sensitive to the 10-year and annual timescales (Fig. 10). Both parameters loose some impact at higher temporal resolutions (seasonal or monthly timescale) in favour of the parameters that control the dynamics of the epikarst. This behaviour is consistent with evidence from field and other modelling studies that showed that the epikarst can be considered as a temporary storage and distribution system for karstic recharge Williams, 1983b) -potentially storing water for several days to weeks (Aquilina et al., 2006;Hartmann et al., 2013a). Parameter V epi does not show much sensitivity across all landscapes as suggested by the posterior distributions of the confinement strategy. First of all, this finding indicates that the data we used for our confinement strategy do not bias the general model behaviour. It also shows that for the epikarst storage and flow dynamics, K epi is much more important when simulating at monthly or seasonal resolutions. Furthermore, the results of the regional sensitivity analysis show which parameters are most important at a given timescale. Depending on the purpose, a new study may start with the initial ranges of the model parameters or it might continue with the confined parameter ranges that we found here. The latter would result in slightly different sensitivities ( Fig. A2). For both cases, the epikarst parameters will require more attention when applying the VarKarst-R model for simulations at seasonal or monthly timescales. When working at a smaller spatial scale, combined analysis of spring discharge and its hydrochemistry may provide such additional information (Lee and Krothe, 2001;Mudarra and Andreo, 2011). When working at a timescale of > 1 year, the variability constant a and the soil storage capacity V soil require most attention if one starts from the initial ranges. The distribution parameter is most important when using the confined ranges. Again, spring discharge analysis may help in understanding the degree of karstification (Kiraly, 2003) and the distribution of concentrated and diffuse recharge mechanisms that are controlled by a. In addition, more precise digital elevation models or soil maps may help in better identification of a and V soil . A limitation of the regional sensitivity analysis approach used here is that parameter interactions are only included implicitly, considering parameter interactions with more elaborate methods (Saltelli et al., 2008) may reveal even more characteristics of the VarKarst-R model at different simulation timescales. But this is beyond the scope of this paper.

Impact of karstic subsurface heterogeneity
Even though some deviations occur among the individual karst landscapes, the general simulations of the VarKarst-R model follow well the observations of mean annual recharge rates over Europe and the Mediterranean (Fig. 9). On the other hand, the widely used large-scale simulation models PCR-GLOBWB (Wada et al., 2010(Wada et al., , 2014 and WaterGAP (Döll and Fiedler, 2008;Döll et al., 2003) generally underestimate groundwater recharge (Table 6). The reason for this is the representation of karstic subsurface heterogeneity within the VarKarst-R model, i.e. the inclusion of preferential flow paths and of subsurface heterogeneity. Based on the conceptual understanding of soil and epikarst storage behaviour (Fig. 1c) it allows (1) for more recharge during wet conditions because surface runoff is not generated, and (2) for more recharge during dry conditions because the thin soil compartments will always allow for some water to percolate downwards before it is consumed by evapotranspiration. During wet conditions, both PCR-GLOBWB and WaterGAP will instead produce surface runoff that is subsequently lost from groundwater recharge. During dry conditions, due to its non-variable soil storage capacity, the PCR-GLOBWB model will not produce any recharge when the soil water is below its minimum storage. Separating surface runoff and groundwater recharge by a constant factor, the WaterGAP model will produce recharge during dry conditions, but a constant fraction of effective precipitation will always become fast surface/subsurface runoff resulting in reduced recharge volumes.
This does not mean that the representation of recharge processes in models like PCR-GLOBWB or WaterGAP is gener-ally wrong, but that it can be limited since our analysis shows that the structures of such models need more adaption to the particularities of different hydrologic landscapes. In particular, it adds to the need for incorporating subgrid heterogeneity in our large-scale simulation models (Beven and Cloke, 2012). Karst regions comprise about 35 % of Europe's land surface and our results indicate that presently their groundwater recharge is under-estimated, while surface runoff and actual evaporation are over-estimated. Given the expected decrease of precipitation in semi-arid regions, such as the Mediterranean, and an increase of extreme rainfall events at the same time in the near future (2016-2035, Kirtman et al., 2013), current large-scale simulation models will overestimate both the vulnerability of groundwater recharge and the flood hazard in karst regions in Europe and the Mediterranean. The same is true for the long-term future (end of 21st century; Collins et al., 2013). Of course, an over-estimation of vulnerability and hazard might be the "lesser evil" compared to an over-estimation. But, at times of limited financial resources, excessive investments in ensuring the security of drinking water supply and flood risk management for potential future changes may unnecessarily aggravate the socioeconomic impacts of climate change.

Conclusions
In this study we have presented the first attempt to model groundwater recharge over all karst regions in Europe and the Mediterranean. The model application was made possible by a novel parameter confinement strategy that utilized a combination of a priori information and recharge related observations on four typical karst landscapes that were identified through cluster analysis. Handling the remaining uncertainty explicitly as posterior parameter distributions resulting from the confinement strategy, we were finally able to produce recharge simulations and an estimate of their uncertainty. We found an adequate agreement with our new model when comparing our results with independent observations of recharge at study sites across Europe and the Mediterranean. We further show that current large-scale modelling approaches tend to significantly under-estimate recharge volumes.
Overall, our analysis showed that the subsurface heterogeneity of karst regions and the presence of preferential flow paths enhances recharge. It results in high infiltration capacities prohibiting surface runoff and reducing actual evapotranspiration during wet conditions. On the other hand it allows for recharge during dry conditions because some water can always percolate downwards passing the thin fraction of the distributed soil depths. This particular behaviour suggests that karstic regions might be more resilient to climate change in terms of both flooding and droughts. Drinking water and flood risk management is liable to be based on erroneous information for at least 35 % of Europe's land surface since this is not considered in current large-scale modelling approaches.
However, using recharge directly as a proxy for "available" groundwater resources may not be good in all cases, neither in karst regions nor in other types of aquifers (Bredehoeft, 2002). To precisely estimate the sustainable usable fraction of groundwater the aquifer outflow should be known rather than just the inflow. Furthermore, pumping strategies should consider the geometry and transmissivity of the aquifer. Hence, recharge estimation can be considered only as a first proxy of available groundwater and future studies should focus on the large-scale simulation of karst groundwater flow and storage to further improve water resources predictions in karst regions. A2 Results of the regional sensitivity using initial ranges Figure A2. Sensitivity of simulated recharge to the model parameters at different timescales and in the different karst landscapes, as in Fig. 10 but with sampling parameters from the confined parameter ranges of Table 5.