Evaluating the Atibaia River hydrology using JULES6.1

. Land surface models such as the Joint UK Land Environment Simulator (JULES) are increasingly used for hydrological assessments because of their state-of-the-art representation of physical processes and versatility. Unlike statistical models and AI models, the JULES model simulates the physical water ﬂux under given meteorological conditions, allowing us to understand and investigate the cause and effect of environmental changes. Here we explore the possibility of this approach using a case study in the Ati-baia River basin, which serves as a major water supply for metropolitan regions of Campinas and São Paulo, Brazil. The watershed is suffering increasing hydrological risks, which could be attributed to environmental changes, such as urbanization and agricultural activity. The increasing risks high-light the importance of evaluating the land surface processes of the watershed systematically. We explore the use of lo-cal precipitation collection in conjunction with data from a global meteorological reanalysis to simulate the basin hydrology. Our results show that key hydrological ﬂuxes in the basin can be represented by the JULES model simulations.


Introduction
The Atibaia River basin serves as a major water supply for Campinas and São Paulo (Demanboro et al., 2013;Nobre et al., 2016).The basin is subjected to human impacts such as urbanization and agricultural activities.Increasing hydrological risks have emerged with historic floods in 2009 and 2010 (Campos and Carneiro, 2013), and drought in 2014 and 2015 (Marengo et al., 2015;Nobre et al., 2016), which highlight the importance of evaluating the hydrology systematically.Hydrological models are an essential tool in hydrological sci-ence and catchment management for evaluating the hydrological impacts of climate change or land-use and land-cover change (Buytaert and Beven, 2011).There have been several recent research activities centered on the Atibaia River basin (dos Santos et al., 2020;Prochmann, 2022) due to its importance in the water supply.
Physically based hydrological models are often used to simulate the physical water flux under given meteorological conditions, allowing us to understand and investigate the cause and effect of environmental changes.One such physically based model, commonly used, is the Soil Water Assessment Tool (SWAT), which has been used for flow and sediment estimation for the Atibaia River basin (dos Santos et al., 2020).However, the accuracy of the model highly depends on the model structure as well as the availability and quality of input data.Also, local calibration is usually required due to the few empirical approximations in each model.
The JULES model was developed from the Met Office Surface Exchange Scheme (MOSES) by the UK Met Office (Cox et al., 1999).It can be coupled to an atmospheric global circulation model but is also used as a standalone land surface model that simulates the carbon fluxes (Clark et al., 2011), water, energy, and momentum (Best et al., 2011) between the land surface and the atmosphere.The model is driven by a large dataset of meteorological variables using a physically based approach and has been increasingly used for hydrological assessment (Le Vine et al., 2016;Martínezde la Torre et al., 2019;Zulkafli et al., 2013).Therefore, we examine the model's ability to simulate the land surface processes of the Atibaia watershed.The primary soil types in this area are Ferralsols, Acrisols, Leptosols, and Cambisols (FAO/IIASA/IS-RIC/ISSCAS/JRC, 2012; Ottoni et al., 2018;Rossi, 2017).The required soil parameters are obtained by using the pedotransfer functions (PTFs) of Hodnett and Tomasella (2002), which generate parameters from physical and chemical properties of soil obtained from a large-scale soil database at 0.083 • resolution (FAO/IIASA/ISRIC/ISSCAS/JRC, 2012).The primary land cover is rural (53.0 %), followed by forest (27.6 %), and then urban (12.0 %).Higher percentages of forest are found in the upper basin, whereas urban areas concentrate in the lower basin.We classified the land cover into five vegetated plant functional types (Harper et al., 2018), including tropical broadleaf evergreen trees (BET-Tr), needleleaf evergreen trees (NET), C 3 grasses (C 3 ), C 4 grasses (C 4 ), evergreen shrubs (ESH), and two non-vegetated types: bare soil (BS) and urban (U) using MODIS data (Friedl and Sulla-Menashe, 2015) and reclassified by Houldcroft et al. (2009).
The study region's rainfall presents a seasonal pattern with rainy summers and dry winters (Cavalcanti et al., 2017;Dias et al., 2013).The rainfall regimes are influenced by the passage and frontal systems' intensity (Maddox, 1983;Silveira et al., 2016).The maximum precipitation occurs during the austral summer associated with the South Atlantic Convergence Zone (SACZ) and in the winter the high pressing system of the South Atlantic is predominant (Jones and Carvalho, 2013).For this study, rainfall time series from 2009 to 2019 are provided by three stations from Campinas-IAC and five stations from DAEE (Fig. 1).The missing rainfall records are completed with the records from a neighbor using multi-regression.The temperature, specific humidity, and surface pressure are observed by the Center for Meteorological and Climate Research Applied to Agriculture (CEPAGRI).The air temperature is elevation-adjusted with the lapse rate (γ ) 1.4 • C per 100 m obtained from Campinas-IAC and CEPAGRI data during the study period.
Other meteorological data required include downward short-wave radiation, long-wave radiation, and wind speed, all of which are extracted from the NCEP-DOE Reanalysis II dataset (Kanamitsu et al., 2002).The dataset is available on a T62 Gaussian grid with 192 × 94 points (approximately 2 • scale) and provides a 6-hourly temporal resolution from January 1979 up to the present.The 6-hourly resolution was disaggregated into hourly data using linear interpolation.

The JULES model
JULES simulates the energy exchange between different land surface processes described by Best et al. (2011) and Clark et al. (2011).For each sub-basin, distinct parameters are used to calculate the energy balance of surface temperatures, shortwave and long-wave radiative fluxes, sensible and latent heat fluxes, ground heat fluxes, canopy moisture contents, snow masses, and snow melting rates for each surface type in a grid box.The sub-grid surface heterogeneity is described using a tiled model upon a shared four-layer soil column with a thickness of 0.1, 0.25, 0.65, and 2.0 m from top to bottom.In JULES, precipitation is intercepted by the canopy storage, and then throughfall is then partitioned into surface flow and infiltration into the soil based on the Hortonian infiltration excess mechanism.JULES incorporates one of two different hydrologic models: (1) Probability Distributed Model (PDM) described by Moore (1985) and (2) TOPMODEL dehttps://doi.org/10.5194/gmd-15-5233-2022 Geosci.Model Dev., 15, 5233-5240, 2022 scribed by Beven and Kirkby (1979).In our model setup, we have calculated saturation excess flow by first using the PDM, with the sub-grid distribution of soil moisture (θ) described by a probability function (Clark and Gedney, 2008).
The saturated fraction (f sat ) is described as follows: . (1) S is the areal fraction of grid-box soil water storage.S 0 is the minimum soil water content below which there is no surface runoff saturation excess production by PDM. S max is the maximum grid-box storage, which is equal to the volumetric soil water content (θ s ) multiplied by the soil depth (Z pdm ).The shape parameter (B) is initially set as 1, whereas an alternative value of 0.1 or 0.5 can be used to better represent a more subsurface-flow-dominated hydrology, and a value of 10 to represent a more flash hydrological response.
For the TOPMODEL approach, the saturated fraction (f sat ) is found by integrating the probability distribution function (pdf) of the topographic index (λ).Numerical integration using a two-parameter gamma distribution can be found in Gedney and Cox (2003) as follows: in which a s and c s are fitted parameters for each sub-basin, λ ic is the critical value of the topographic index at which the water table reaches the surface, and f is a decay parameter describing the decrease in the saturated hydraulic conductivity.The mean value and standard deviation of the topographic index data are obtained from Marthews et al. (2015).
For both PDM and TOPMODEL, precipitation over the saturated fraction of the grid generates surface runoff.An instantaneous redistribution of soil moisture is assumed for the infiltration following the Darcy-Richards diffusion equation.The gravity drainage generates the subsurface flow at the lower boundaries.
We evaluated the sensitivity of modeled streamflow to the hydrological parameters shown in Table 1 and calibrated the model to select the most suitable approach for the study region.Soil hydraulic characteristics are estimated using the relationship of Van Genuchten (1980).

Model evaluation
The runoff fluxes simulated by the JULES model require an external river routing model for a reasonable comparison to observed river flows (Best et al., 2011).In this study, we applied a simple delayed function to account for the routing delay in the modeled flow (Q sim ) in each time step (t).Following the routing process, the flow (Q rout ) aggregated from each sub-basin (n) in the outlet is as follows: The delay time (t i ) for each sub-basin is given by dividing the distance from the sub-basin to the outlet (d i ) by flow speed (C), which is set constantly as the average flow speed of 0.40 m s −1 (from 2009 to 2019).We evaluated the sensitivity of hydrological parameters of PDM and TOPMODEL to determine the most suitable model in the upper (3E-063), middle (3D-006), and lower basin (4D-009) using the simulated results from the first year.Soil depth (dz_pdm), shape factor (b_pdm), and the fraction of maximum storage (s_pdm) are evaluated for PDM, and a decay parameter describing the decrease in the saturated hydraulic conductivity (f ) for the TOPMODEL.Following the sensitivity analysis, the full model is run with the parameter combination with the highest Nash-Sutcliffe efficiency (NSE) performance.We evaluated the overall model performance over the modeling period (N) using the NSE, RMSEobservation standard deviation ratio (RSR), and percent bias (P BIAS ), following Moriasi et al. (2007), where Q obs, mean is the mean of observed flow, and Q obs,t and Q mod,t are the observed flow and modeled flow at a daily time step (t).

Sensitivity analysis
We evaluated the sensitivity of hydrological parameters in PDM and TOPMODEL.For PDM, we found lower flow simulated with increasing soil depth (Fig. 2a).However, the average flow only reduced by 2.5 % when soil depth was increased from 0.8 to 2.0.In contrast, we found that the shape factor (Fig. 2b) and minimum soil water content (Fig. 2c) had a higher impact on the simulated flow.When we increased the minimum soil water content, the simulated flow was reduced with more water held in the soil.The average flow was reduced by 17.5 % when the minimum soil water content increased from 0 to 0.4, whereas gradual changes were found in the flow regime.For the shape factor, a lower value simulates a more gradual flow regime (Clark and Gedney, 2008), which in turn lowers the peak level flow estimation and hence increases baseflow generation (Fig. 2b).In this case, the shape (b = 0.1) can change the flow into a subsurface-flowdominated regime (Fig. 3a).We found that the shape factor of 0.5 best describes the flow in the study basin.
We examined the model performance with a combination of soil depth, shape factor, and the minimum soil water content and found the highest performance with the combination dz = 1.0, b = 0.5, and s = 0 in the lower basin, which altered the shape factor alone from the default setup.In the middle and upper basin, an increased value of the minimum soil water content simulated higher performance (dz = 1.0, b = 0.5, s = 0.1).Therefore, we run the full-time series of modeling with these parameter combinations.
For the TOPMODEL, we found that the lower decay parameter increases the baseflow (Figs. 2d,3b) and reduces the magnitude of peak flows.In terms of model performance, we found a value of f = 3.0 simulates the flow in the lower and middle basin well, whereas a lower value of f = 2.0 is more suitable in the upper basin.These values were then selected to be used for the full model simulation. https://doi.org/10.5194/gmd-15-5233-2022 Geosci.Model Dev., 15, 5233-5240, 2022

Hydrological modeling using the JULES model
Table 2 summarizes the evaluation of the JULES model in the upper, middle, and lower basin.For the entire basin (lower), the TOPMODEL shows "good" modeling performance (0.75 > NSE > 0.65, 0.6 > RSR > 0.5), and the PDM shows "satisfactory" modeling performance (0.65 > NSE > 0.50, 0.7 > RSR > 0.6) (Moriasi et al., 2007).Figure 4 shows that lower peak flow and higher baseflow is simulated using TOP-MODEL, which is more suitable for our study basin.In the middle basin, both models are marked as having "good" performance."Unsatisfactory" is marked for both TOPMODEL and PDM in the upper basin.We attributed one possibility to the rainfall data recorded in the upper basin.Although the difference in the average annual rainfall is below 4 % in the whole basin, the variation might be not well represented in the upper basin.However, TOPMODEL still simulates an acceptable result (NSE > 0.15) for daily flow simulation considering the difficulty and high variation at daily time step (dos Santos et al., 2020).In terms of average flow, the bias is marked as "very good" (P BIAS < 10) for all the simulations.
Figure 5 shows the modeling performance of daily flow in the lower basin yearly using TOPMODEL.The modeling performance is marked as "very good" in 3 years, "good" in 1 year, and "satisfactory" in 4 years, with the overall performance marked as "good".The highest modeling performance is simulated in 2010 (NSE = 0.89), whereas a lower score is simulated in 2014 (NSE = 0.32).In the lower basin, most of the peak events and the level of baseflow are well simulated in 2010 (Fig. 6a).The flow regimes are also present in the middle (Fig. 6b) and upper basin (Fig. 6c).In contrast, lower model performance is simulated in the lower basin in 2014.One possibility is that the rainfall in 2014 (858 mm) is considerably lower than the average level (1302 mm).The soil moisture could be lower than the level simulated by the model, which leads to lower evapotranspiration.We found most of the peak events are still detected (Fig. 7), but the level of peak flow and baseflow is generally higher than the observed values.
Our results show that it is possible to use the JULES model for hydrological simulation in the Atibaia River basin.The model performance for daily flow in our study (NSE = 0.74) is higher than the SWAT model's estimation (NSE = 0.61 in the validation period) (dos Santos et al., 2020).However, both research pieces have pointed out that rainfall uncertainty could be one possible reason for the reduction in the model performance.We both found lower modeling performance in the upper basin, since the effects of the under-and overestimated rainfall in a single site could be amplified.Nevertheless, the simulation shows that these rainfall data are still representative during most of the modeling period.

Conclusions
We implemented the JULES6.1 model in the Atibaia River basin for hydrological estimation to evaluate the model performance.We evaluated the sensitivity of hydrological parameters to select the most suitable approach for the study region.We find that both TOPMODEL and PDM can reasonably estimate the flow with appropriate parameter sets.Our results show that the JULES setup can detect most peak events and reasonably estimate baseflow.The uncertainty of rainfall data could reduce the model performance in some periods with higher spatial rainfall variation.Nevertheless, our results show that the model performance is high in terms of daily flow estimation over the modeling period.The JULES model could thus be considered as an available and appropriate option for hydrological evaluation.
Figure 1.Atibaia River basin and the location of the weather stations, flow monitoring stations, and dams.

Figure 2 .
Figure 2. Sensitivity analysis of daily mean flow in the lower basin with (a) soil depth, (b) shape factor, and (c) threshold of minimum soil water content, using PDM, and (d) decay parameters, using TOPMODEL.

Figure 4 .
Figure 4. Daily modeled flow in the lower basin using PDM and TOPMODEL.

Figure 5 .
Figure 5. NSE score of modeling daily flow in the lower basin using TOPMODEL summarized by year.

Figure 6 .
Figure 6.Daily modeled flow, observed flow, and rainfall in the (a) lower, (b) middle, and (c) upper basin in the year 2010 using TOP-MODEL.

Figure 7 .
Figure 7. Daily modeled flow, observed flow, and rainfall in the lower basin in the year 2014 using TOPMODEL.

Table 2 .
Modeling performance as calculated from the observed and modeled flow time series in lower, middle, and upper basins, using (1) TOPMODEL and (2) PDM.NSE: Nash-Sutcliffe efficiency.RSR: RMSE-observation standard deviation ratio.PBIAS: percentage of bias.