Interactive comment on “ Simultaneous parameterization of the two-source evapotranspiration model by Bayesian approach : application to spring maize in an arid region of northwest China ”

General comments and overall evaluation: Bayesian statistics, based on probability theory, is a logical choice for model calibration; it provides parameter estimates by quantifying the uncertainties in the data and model structure. The authors employ Bayesian method to calibrate the Shuttleworth-Wallace model, using eddy-covariance evapotranspiration measurements and daily soil evaporation. The work is interesting but some technical aspects should be clarified and additional analyses should be carried out.


Introduction
In agriculture ecosystems, more than 90 % of all water inputs is lost by evapotranspiration (ET) (Morison et al., 2008), which is defined as the sum of water loss by evaporation (E) from soil and transpiration (T ) from plants (Rana and Katerji, 2000).E and T are influenced by different abiotic and biotic factors (Wang and Yakir, 2000), and the contributions of E and T to the total ecosystem ET are highly variable in space and time (Ferretti et al., 2003).Thus, accurate measurement or estimation of ET and its components (E and T ) is essential for many applications in agriculture, such as irrigation scheduling, drainage, and yield forecasts (Wallace and Verhoef, 2000;Flumignan et al., 2011;Sun et al., 2012).The Shuttleworth-Wallace model (S-W model) (Shuttleworth and Wallace, 1985) takes the interactions between the fluxes from soil and canopy into account, and is physically sound and rigorous.Previous studies have proved that it performs well for row crops such as maize, wheat, cotton, sorghum and vine (Stannard, 1993;Tourula and Heikinheimo, 1998;Anadranistakis et al., 2000;Teh et al., 2001;Lund and Soegaard, 2003;Kato et al., 2004;Ortega-Farias et al., 2007;Zhang et al., 2008).
Despite these studies, there are still some insufficiencies in the application of the S-W model (Hu et al., 2009;Zhu et al., 2013).First, the S-W model is sensitive to the errors in the values of canopy and soil resistances (Lund and Soegaard, 2003).Previous studies mainly focused on the parameterization of the canopy resistance (Hanan and Prince, 1997;Samanta et al., 2007;Zhu et al., 2013), and less attentions have been committed to the parameterization of the soil surface resistance (Sellers et al., 1992;van de Griend and Owe, 1994;Villagarcía et al., 2010).In crop ecosystem, E may contribute significantly to the total ET when leaf area index (LAI) is low (Lund and Soegaard, 2003;Zhang et al., 2008).Thus, simultaneous parameterization of the canopy and soil resistances in the S-W model, based on direct measurement of ET and its components by using a combination of micrometeorological (e.g., eddy covariance methods, Bowen ratio), hydrological (e.g., chambers, microlysimeters) and ecophysiological techniques (e.g., sap flow, stable isotopes) (Williams et al., 2004;Scott et al., 2006), is important to reduce the model error.However, such studies are relatively rare or nonexistent.Secondly, as far as the parameterization method is concerned, abundant evidence has shown that the Bayesian method provides a powerful new tool to simultaneously optimize many or all model parameters against all available measurements, and to quantify the influence of uncertainties (Clark and Gelfand, 2006).Although some pioneering efforts have been made (e.g., Samanta et al., 2007;Zhu et al., 2013), the Bayesian method has been much less frequently used in parameterization of ET models than in the other environmental sciences (van Oijen et al., 2005).Moreover, the Bayesian method, to our knowledge, has not been used to simultaneously optimize the parameters of the S-W model against multivariate data sets (Sect.2.5).Finally, arid northwestern China, one of the driest areas in the world (Zhu et al., 2007(Zhu et al., , 2008)), is characterized by a widely distributed Gobi desert interspersed with many oases in different sizes and shapes.Land surface processes of this heterogeneous region are much more complex than in other regions (Zhang and Huang, 2004).Thus, the applicability of the S-W model in such regions needs to be investigated in detail.
Based on direct measurements of different components of ET obtained by using the eddy covariance technique and microlysimeters over a spring maize field in the arid region of northwestern China from 27 May to 14 September in 2012, the objectives of the present study were to (1) simultaneously parameterize the S-W model using the Bayesian method against multivariate data sets, and (2) verify the performances of the S-W model, and identify the causes of failure and success in simulating ET over the crop ecosystem in arid desert oases of northwestern China.It is expected that this study can not only promote the development of ET model parameterization, but also help us to find a proper direction in modifying the S-W model used in arid regions.

Study site
The study site is located in the Daman (DM) Oasis, in the middle Heihe River basin, Gansu Province, China (100 • 22 20 E, 38 • 51 20 N; 1556 m a.s.l.; Fig. 1).The annual average temperature and precipitation was 7.2 • C and  125 mm (1960-2000), respectively.The potential evaporation is around 2365 mm year −1 , and the dryness index according to the World Atlas of Desertification (UNEP, 1992) is 15.9.The soil type is silt clay loam on the surface and silt loam in the deeper layer.
The study area has an agricultural development history of over 2000 years owing to its flat terrain, adequate sunlight and convenient water resources from the Qilian Mountains.The main crops in the DM Oasis are spring wheat and maize.The spring wheat (Triticum aestivum L.) is generally sown in late March and harvested in the middle 10 days of July, while the maize (Zea mays L.) is sown in late April and harvested in the middle 10 days of September.Stand density of the spring maize is about 37 plants m 2 with row spacing of 40 cm and planting spacing of 7 cm.

Measurements and data processing
The field observation systems at this site were constructed in May 2012 as part of the Heihe Watershed Allied Telemetry Experimental Research (HiWATER) project (see details in Li et al., 2013b).The fluxes of sensible heat (H ), latent heat (λET) and carbon dioxide were measured at the height of 4.5 m using the eddy covariance (EC) system (Liu et al., 2014, manuscript in preparation), which consists of an open-path infrared gas analyzer (Li-7500, LiCor Inc., Lincoln, NE, USA) and a 3-D sonic anemometer (CSAT-3, Campbell Scientific Inc., Logan, UT, USA).The EC data were sampled at a frequency of 10 Hz by a data logger (CR5000, Campbell Scientific Inc.), and then were processed with an average time of 30 min.Post-processing calculations, using EdiRe software, included spike detection, lag correction of H 2 O/CO 2 relative to the vertical wind component, sonic virtual temperature conversion, planar fit coordinate rotation, the WPL (Webb-Pearman-Leuning) density fluctuation correction and frequency response correction (Xu et al., 2014).About 85 % of the energy balance closure (the sum of H + λET against the available energy) was found in EC data (Liu et al., 2011).In addition, the flux uncertainties are directly related to the likelihood function of Bayesian inference (Sect.2.5).Thus, determining the uncertainties in EC measurements is essential for proper parameter estimates.Recently, Wang et al. (2014) systemically studied the flux uncertainties of EC systems equipped in the Hi-WATER experiment.Generally, uncertainties for H (σ r (H ); W m −2 ) by using the method of Mann and Lenschow (1994) tended to be σ r (H ) = 0.14H + 2.7(R 2 = 0.95), and uncertainties for λET (σ r (λET); W m −2 ) tended to be σ r (λET) = 0.13λET + 6(R 2 = 0.93) (Wang et al., 2014).Data gaps due to instrument malfunction, power failure and bad weather conditions were filled using artificial neural network (ANN) and mean diurnal variations (MDV) methods (Falge et al., 2001).The ANN method was applied when the synchronous meteorological data were available; otherwise, the MDV method was used.The gap-filling data were used only to analyze the seasonal and annual variations in ET.
Continuous complementary measurements also included standard hydro-meteorological variables.Rainfall was measuring using a tipping bucket rain gauge (TE525MM, Campbell Scientific Instruments Inc.).Air temperature, relative humidity (HMP45C, Vaisala Inc., Helsinki, Finland) and wind speed/direction (034B, Met One Instruments Inc., USA) were measured at heights of 3, 5, 10 15, 20, 30 and 40 m above the ground.Downward and upward solar and longwave radiation (PSP, The EPPLEY Laboratory Inc., USA) and photosynthetic photon flux density (PPFD) (LI-190SA, LI-COR Inc.) were measured at a height of 6 m.Soil temperature (Campbell-107, Campbell Scientific Instruments Inc.) and moisture (CS616, Campbell Scientific Instruments Inc.) was measured at 0.02, 0.04, 0.1, 0.2, 0.4, 0.8, 1.2 and 1.6 m depths.Three heat flux plates (HFT3, Campbell Scientific Instruments Inc.) were randomly buried at the depths of 0.01 m.The average soil heat fluxes were calculated using the three randomly buried plates.These data were logged every 10 min by a digital micrologger (CR23X, Campbell Scientific Inc.) equipped with an analog multiplexer (AM416) used for sampling and logging data.
Daily soil evaporation was measured using three microlysimeters randomly placed between crop rows (one in the middle of the rows and the other two close to plants on each side of the rows).The microlysimeters with an internal diameter of 10 cm and a depth of 20 cm were filled with an intact soil core and pushed into soil with the top slightly above the soil surface (Daamen et al., 1993;Liu et al., 2002).The average weight loss of these microlysimeters measured using electronic scales with 0.01 g precision was nearly equal to soil evaporation.In order to keep the soil moisture in microlysimeters similar to that between the rows, the soil in the microlysimeters was replaced daily or every 2 days.LAI was measured using the AM300 portable leaf area meter (ADC BioScientific Ltd., UK).The fraction of land cover (f ) was estimated by measuring the projected crop canopy area of selected stands in a corresponding field plot.LAI, f and crop height were measured approximately every 10 days during the growing season, and the gaps were linearly interpolated to a daily interval.

Description of the S-W model
In the S-W model, the ecosystem evapotranspiration (λET; W m −2 ) is separated into evaporation from the soil surface (λE; W m −2 ) and transpiration from the vegetation canopy (λT ; W m −2 ) (Fig. 2), which are calculated as (Shuttleworth and Wallace, 1985;Lhomme et al., 2012) where ET s and ET c are terms to describe evaporation from soil and transpiration from the plant (W m  of evaporation (J kg −1 ); is the slope of the saturation vapor pressure versus temperature curve (kPa K −1 ); ρ is the air density (kg m −3 ); C p is the specific heat capacity of dry air (1013 J kg −1 K −1 ); D and D 0 (kPa) are the air water vapor pressure deficit at the reference height (3 m) and the canopy height, respectively; γ is the psychrometric constant (kPa K −1 ); r c s and r s s are the surface resistance for plant canopy and soil surface (s m −1 ), respectively; r c a and r s a are aerodynamic resistances from the leaf to canopy height and soil surface to canopy height (s m −1 ), and r a a is aerodynamic resistances from canopy height to reference height (s m −1 ).A and A s (W m −2 ) are the available energy input above the canopy and above the soil surface, respectively, and are calculated as where R n and R ns are net radiation fluxes into the canopy and the substrate (W m −2 ), respectively; G is the soil heat flux (W m −2 ).R ns was calculated using a Beer's law relationship of the form in which K A is the extinction coefficient of light attenuation, This can be measured on site (see Sauer et al., 2007), and was set to be approximately 0.41 for spring maize (Mo et al., 2000).

Calculation of resistances in the S-W model
The resistance network of the S-W model is shown in Fig. 2. In this paper, the three aerodynamic resistances (i.e., r a a , r c a and r s a ) were calculated using the same approach suggested by Shuttleworth and Wallace (1985), Shuttleworth and Gurney (1990) and Lhomme et al. (2012).
The canopy resistance (r c s ), which is the equivalent resistance of all the individual stomata in a canopy and depends on the environmental variables, can be calculated using the Jarvis-type model (Jarvis, 1976): where r STmin represents the minimal stomatal resistance of individual leaves under optimal conditions.F i (X i ) is the stress function of a specific environmental variable X i , with 0 ≤ F i (X i ) ≤ 1.Following Stewart (1988) and Verhoef and Allen (2000), the stress functions were expressed as where k 1 − k 3 are constants (units see Table 1); R s is the incoming solar radiation (W m −2 ); T a is the air temperature ( • C) at the reference height; T a,min and T a,max are the lower and upper temperatures limits ( • C), respectively, which are T a values when F 2 (T a ) = 0 and are set at values of 0 and 40 • C (Harris et al., 2004); θ r is the actual volumetric soil water content in the root zone at depth of 0-60 cm (m 3 m −3 ); θ wp is water content at the wilting point (m 3 m −3 ); and θ cr is the critical water content (m 3 m −3 ) at which plant stress starts.The values of θ wp and θ cr are set as 0.11 m 3 m −3 and The bold number means that this parameter was well constrained by the data.
The soil surface resistance (r s s ; Fig. 2) was expressed as a function of near-surface soil water content (Sellers et al., 1992;Verhoef et al., 2006Verhoef et al., , 2012;;Zhu et al., 2013): in which b 1 and b 2 are empirical constants (s m −1 ); θ s is soil water content in the top layer of soil (at depth of 2 cm); and θ sat is the saturated soil water content (m 3 m −3 ), which was estimated empirically through the near-surface soil texture.In summary, there are six site-and species-specific parameters that needed to be estimated in the S-W model associated with soil and canopy resistances, which are b 1 , b 2 , r STmin and k 1 to k 3 (see Supplement 1).

Bayesian inference framework and assimilation scheme
With Bayes' theorem (a complete description was presented in Supplement 2), the posterior distribution of parameter c is generated by where p(c) represents prior probability distributions of parameter c, which is chosen as uniform distributions with specified allowable ranges (Table 1).In general, the parameter ranges were wide enough to include the actual parameter values and to give the optimization freedom.In the test study, we run the S-W model using 4000 parameter vectors which were sampled from the prior distribution using the Latin hypercube sampling (LHS) method (Iman and Helton, 1998), and found that the observed data in most cases were in the range of predicted values (Supplement 1); p(O|c) is the likelihood function, which reflects the influence of the observation data sets on parameter identification; and p(c|O) is the posterior distribution after Bayesian inference conditioned on available observations O.
For each data set (e.g., λET and E), the modeldata mismatch e i (t) (i = 1, 2), which represents a relative "goodness-of-fit" measure for each possible parameter vector (van Oijen et al., 2011(van Oijen et al., , 2013)), is expressed by where O i (t) and f i (t) are observed and modeled (Eqs. 1 or 9) values of the ith data set at time t, respectively.Assuming the model-data mismatch e i (t) follows a Gaussian distribution with a zero mean, the likelihood function for the ith data set (O i (•)) can be expressed by where n i is the number of observations of the ith data set; and σ i (i = 1, 2) represents the residual errors, or standard deviation about model-predicted output of the ith data set.Here, we assumed σ i is the same over the observation time for the ith data set (Braswell et al., 2005).Traditionally, σ i can be included into the analysis explicitly (i.e., assuming σ i is uniform over log σ i ; Gelman et al., 1995) and treated as one the model parameters, which yields a complete posterior distribution of σ i .However, this method artificially increased the parameter dimension of the problem and may result in unreasonable estimations of the parameter values (Kavetski et al., 2006).In this study, σ i was estimated by using the analytical method (Hurtt and Armstrong, 1996;Braswell et al., 2005), which is to find the value of σ i that maximizes log(p(O i (•)|c)) for a given parameter vector.By differentiating log(p(O i (•)|c)) with respect to σ i , we can obtain We then used σ a i to replace σ i in Eq. ( 23).The likelihood function for the multivariate data sets, p(c|O), used for parameter estimation is then defined as the product of the individual p(O i (•)|c)s (Richardson et al., 2010): where m is the number of data sets; When a particular data set O i (•) was not being used in the optimization, we simply set the corresponding likelihood function p(O i (•)|c) to 1. Thus, this framework can be easily used when additional observations are available.In this study, the two data sets used to simultaneously optimize the parameter values were EC-measured half-hourly evapotranspiration (λET; W m −2 ) and microlysimeters-measured daily soil evaporation (E; mm day −1 ).

Metropolis-Hasting algorithm and convergence test
The posterior distribution was sampled using the Metropolis-Hasting (M-H) algorithm (Metropolis et al., 1953;Hastings, 1970), a version of the Markov chain Monte Carlo (MCMC) technique.To generate a Markov chain in the parameter space, the M-H algorithm was run by repeating two steps: a proposing step and a moving step.
In the proposing step, a candidate point c new is generated according to a proposal distribution P (c new |c k−1 ), where c k−1 is the accepted point at the previous step.In the moving step, point c new is treated against the Metropolis criterion to examine if it should be accepted or rejected.It was well recognized that the efficiency of the M-H algorithm was strongly effected by the proposal distribution function.To find an effective proposal distribution P (c new |c k−1 ), a test run of the M-H algorithm with 10 000 simulations was made by using a uniform proposal distribution (Braswell et al., 2005): where c k−1 is the current accepted point, r is a random number uniformly distributed between −0.5 and +0.5, and c min and c max are the lower and upper limits of parameter vector c (Table 1).Based on the test run, we then constructed a normal proposal distribution c new ∼ N(c (k−1) , cov 0 (c)), where cov 0 (c) is the covariance matrix of the parameter vector c from the initial test run (Xu et al., 2006).The detailed description of the MCMC sampling procedure and the code written in MATLAB were presented in Supplement 2.
We ran at least four parallel MCMC chains with 20 000 iterations each, evaluated the chain convergence using the Gelman-Rubin (G-R) diagnostic method (Gelman and Rubin, 1992) (Supplement 3), and thinned the chains (every 20th iteration) when appropriate to reduce within chain autocorrelation, thereby producing an independent sample of 3000 values for each parameter from the joint posterior distribution.

Evaluation of model output estimates
Since the primary interest in application of the S-W model was to reproduce the pattern of water vapor fluxes from different sources (i.e., soil and vegetation) during the whole study period, we used all available data to construct the likelihood function (Eq.25) and to obtain the posterior distribution of the parameters.Then, the performance of the S-W model was evaluated using the coefficient of determination of the linear regression between measured and estimated values of water vapor fluxes, R 2 , representing how much the variation in the observations was explained by the models.Also, the root mean square error (RMSE), mean bias error (MBE), index of agreement (IA) and model efficiency (EF) (Legates and McCabe, 1999;Poblete-Echeverria and Ortega-Farias, 2009) were included in the statistical analysis, which are calculated as follows: where n i is the total number of observations of the ith data set O i (t) is the observed values at time t of the ith data set, O i is the mean of the observed value of the ith data set, and f i (t) is the simulation which was calculated using the posterior median parameter values, and other parameter vectors selected from the parameter chains generated by the MCMC iteration (van Oijen et al., 2013).

Environmental and biological factors
Detailed information on the seasonality of key environmental and biological variables is essential to assess seasonal variation in the actual ET and its partitioning.The seasonal change in net solar radiation (R n ; MJ m −2 day −1 ), air temperature (T a ; • C), air water vapor pressure deficit (D; kPa), wind speed (u; m s −1 ) at the height of 3 m, rainfall and irrigation (mm), soil water content (θ; m 3 m −3 ), and leaf area index (LAI; m 2 m −2 ) are illustrated in Fig. 3.During the study period (DOY (day of year) 147-257), the daily mean R n varied from 2.6 to 18.5 MJ m −2 day −1 with an average value of 11.9 MJ m −2 day −1 .The peaked values were recorded from the end of June to the middle of July .The variation of mean daily T a has a similar trend to R n , varying from 8.8 to 24.9 • C with an average value of around 19.0 • C. D exhibited large diurnal variation ranging from 0 to 3.5 kPa, and the daily mean D was relatively small when the LAI was larger than 3 m 2 m −2 (DOY 197-230).Daily mean u ranged from 0.5 to 3.2 m s −1 , and was close to normal longterm values.Total precipitation during the study period was 104.2 mm with eight events over 5.0 mm (Fig. 3).θ varied greatly over the whole growing season.The variability of θ mainly depended on the irrigation scheduling by the local government (irrigation quota and timing).Soil water content had a peak value (about 0.35 m 3 m −3 ) after irrigation and gradually reduced till the next irrigation (Fig. 3).The LAI showed a clear "one peak" pattern over the whole growing season with relatively high values of 3.5 m −2 m −2 from early July to late August (DOY 184-221).

Posterior distribution of S-W model parameters
The posterior parameter distributions are shown as histograms in Fig. 4 and summarized in Table 1 by posterior medians and 95 % probability intervals.The results showed that the Bayesian calibration against the multivariate data sets was in most cases successful in reducing the assumed prior ranges of the parameters values.Parameters r STmin , b 1 , b 2 and k 2 showed relatively large uncertainty reductions (defined as 1 − CI posterior /CI prior , where CI is the length of the 95 % credible interval) (Fig. 5), and their posterior distributions became approximately symmetric with distinctive modes, while parameters k 1 and k 3 have relatively large variability (widely spread on the prior bounds) (Fig. 4).The global sensitivity analysis with the first-order impact ratio (FOIR) values (Supplement 1) reveals the importance of input parameters in affecting total ecosystem ) also corresponded to the greatest degree of updating in the Bayesian inference.Thus, we thought that the key parameters in the S-W model were well optimized by the Bayesian method against the multivariate data sets.In addition, the correlation coefficient between the posterior distribution of parameters can be used to find groups of parameters that tend to be constrained together (Knorr and Kattge, 2005).In this study, the six calibrated parameters were not significantly intercorrelated with each other with correlation coefficients lower than 0.1 (Supplement 2).The responses of soil surface resistances (r s s ) to soil water content computed using our posterior mean b 1 and b 2 values were very similar to that calculated using equation developed by Ortega-Farias et al. ( 2010) based on direct soil evaporation measurements, but seemed to be more sensitive to changes in soil water content compared with some other studies (e.g., Sun, 1982;Sellers et al., 1992;Zhu et al., 2013;Fig. 6).When just using EC-measured λET data, a relatively wider posterior distribution of b 2 was observed (see Supplement 2).Thus, the daily soil evaporation data helped to well constrain estimates of b 1 and b 2 .The posterior mean value of r STmin from our study was very close to that (20 s m −1 ) reported for spring maize growing in northwestern China obtained by using the least squares fitting method (Li et al., 2013a).The variations of the minimal stomatal resistance (r STmin ) for many natural and cultivated plants have been widely investigated by previous studies (Korner et al., 1979;Pospisilova and Solarova, 1980).Typical values for r STmin vary considerably from about 20-100 s m −1 for crops to 200-300 s m −1 for many types of trees.Thus, our results fell within the range of previous studies.However, some parameters related to canopy surface resistance (i.e., k 1 and k 3 ) seemed to be not well updated (Fig. 4).This may be due to the fact that these parameters may be insensitive to the present available data sets.

Model performance compared with measurements
Having parameterized the S-W model as described above, we ran the model to simulate the half-hourly λET (Eq. 1) and λE (Eq.9) values (W m −2 ).The daily estimations of evapotranspiration (ET; mm day −1 ) and soil evaporation (E; mm day −1 ) were obtained by summing up the half-hourly simulated values.The statistical analysis of observed versus estimated values of water vapor fluxes at different timescales are summarized in Table 2.These results indicated that the parameterized S-W model was able to predict λET on a halfhourly basis with values of R 2 , IA and EF equal to 0.83, 0.93 and 0.74, respectively.However, significant differences exist between measured and modeled half-hourly λET values for the spring maize in the arid desert oasis.The slope (0.84) of the regression equation between the measured and modeled half-hourly λET values was lower than 1 (Table 2, Fig. 7a), which indicated that the S-W model tended to underestimate the half-hourly λET with a MBE value of 24.2 W m −2 .Ortega-Farias et al. ( 2010) also reported that the S-W model underestimated on half-hourly time intervals compared to the EC-measured λET over a drip-irrigated vineyard in a Mediterranean semiarid region during the growing season in 2006.On the contrary, some studies showed that the S-W model overestimated half-hourly λET (e.g., Li et al., 2013a;Ortega-Farias et al., 2007;Zhang et al., 2008).Therefore, the performance of the S-W model seemed to be variable for different crops and places, and there is a need to identify the causes that induced the disagreements between observed and modeled values (discussed below).
The fluctuation of measured and estimated daily ET and E is illustrated in Fig. 8.In this case, a good agreement between measured and estimated daily E was obtained with values of R 2 , IA and EF equal to 0.82, 0.94 and 0.76 (Table 2).The points in plots of measured-versus-modeled daily E fell tightly along the 1 : 1 line (slope = 1.01 and intercept = 0.01 with RMSE = 0.05 and MBE = −0.01;Fig. 7b, Table 2).Also, the 95 % posterior prediction intervals of simulated soil E was narrow.Thus, we thought that the soil resistance in the S-W model was properly parameterized for the spring maize by the measured soil evaporation data.From Fig. 8, we can also observe that the estimated daily ET generally fluctuated tightly with the measured values with relative narrow uncertainties (95 % posterior predication intervals).The values of Table 2. Statistical analysis of measured and estimated, using the median parameter values, half-hourly evapotranspiration (λET; W m −2 ), daily soil evaporation (E; mm day −1 ), and daily evapotranspiration (ET; mm day −1 ) for the spring maize in arid desert oases during the study period.RMSE, MBE, IA and EF were equal to 0.05, 0.14 mm day −1 , 0.94 and 0.79, respectively (Table 2).However, there are 12 days during the study period (111 days) with observations beyond the upper boundary of the 95 % posterior predication intervals (Fig. 8).For example, on 5 July, the estimated, using the median values of the parameters, and measured daily ETs were 2.9 and 4.3 mm day −1 , respectively (Fig. 8).Thus, the causes of the underestimations of ET by the S-W model on these days needs to be carefully checked based on detailed micrometeorological data.This work would help us to modify the model in a correct way and improve the precision of prediction.

Identification of the disagreement/agreement between observed and modeled ET values
The diurnal variation of R n , H and λET (measured and modeled) above the spring maize ecosystem for some selected days is presented in Fig. 9.The uncertainties of H and λET increased with the flux magnitude (Fig. 9), and tended to be approximately 14 and 13 %, respectively (Wang et al., 2014).The relative error for R n was relatively small and estimated to be 1.24 % (Xu et al., 2013).Resulting from the high surface heterogeneities, one special phenomenon, known as the "oasis effect" (Lemon et al., 1957;Oke, 1978) or "cold island effect" (Wang and Mitsuta, 1992;Zhang and Huang, 2004), was often observed on clear days in July and August in the study area and it is characterized as follows: (1) H is very small and even negative (downward) in the afternoon (Fig. 9a-c) due to the microscale advection of hot dry air over the desert to the crop's surface in the oasis (Oke, 1978;Hu, 1994).For example, on 5 July, H was continuously negative from 12:00 to 20:00 BST (Beijing standard time) (Fig. 9a).
A strong advection process can be distinctly detected from the temperature and relative humidity profiles (Fig. 10a, b), in which the highest temperature occurred at a height of 8-18 m. (2) Measured actual λET often exceeded (Fig. 9a) or was equal to (Fig. 9b, c) the local net radiation because of the added energy in the form of downward fluxes of H to the ET process (Evett et al., 2012).Under such conditions, the S-W model significantly underestimated the actual ET values due to the real atmospheric flows that do not correspond to its assumption of horizontal homogeneities (Rao et al., 1974).Thus, to properly represent the advection process in the S-W model, special attention should be paid in simulating ET over crop ecosystems in arid desert oases in future studies.In addition to this situation, slight underestimations were also observed on or shortly after rainy days (Fig. 8).For example, the simulated half-hourly λET was lower than that measured by EC after the rainfall event occurred at 13:00 BST on 17 June (Fig. 9d).We thought that the underestimations by the model on or shortly after rainy days were mainly due to ignoring the direct evaporation of liquid water intercepted in the crop canopy, because no downward H and temperature inversion were observed on this day (Fig. 10c, d).Until now, several canopy interception models have been developed (e.g., Rutter et al., 1971;Mulder, 1985;Gash et al., 1995;Bouten et al., 1996).However, many of them were developed for simulating the rainfall interception by forest ecosystems, and their suitability for crops need to be further investigated.
The diurnal variation of simulated half-hourly λET by the parameterized S-W model has a similar trend to the measurements on clear and advection-absent days during the whole study period (Fig. 9e-h).On these days, H was positive (upwards) at day time (Fig. 9e-h) and no temperature inversion was observed (Fig. 10e, f).Thus, we thought that the parameterization schedule adopted in this study worked well.It also demonstrated that the properly parameterized S-W model can be used in simulating and partitioning ET for homogeneous land surfaces.Hu et al. (2009) reported that the S-W model parameterized by using the Monte Carlo method can successfully simulate ET at four uniform grasslands in China; our previous studies (Zhu et al., 2013) also illustrated that the parameterized S-W model can be used to simulate and partition ET over a vast alpine grassland in the Qinghai-Tibet Plateau.

Discussion
The assessment of model errors remains an outstanding challenge in hydrology (Beven et al., 2008).Identifying the uncertainties related to model parameters and structure needs to have a prominent position in hydrological modeling (Bastola et al., 2011;Brigode et al., 2013).An important issue in identifying the parameter uncertainty is equifinality, where different parameters of the same model yield similar results, making it difficult to distinguish which is correct (see Franks et al., 1997).A variety of recent studies corroborated that the multi-objective calibration against the multiple (orthogonal; see Winsemius et al., 2006) data sets can produce robust parameter estimates (e.g., Engeland et al., 2006;Fenicia et al., 2007;Moussa and Chahinian, 2009;Richardson et al., 2010;Hrachowitz et al., 2013a).In this study, we constructed a Bayesian inference framework to constrain the model parameters using the EC-measured ET and microlysimetersmeasured daily E data sets simultaneously.The results indicated that four of the six main parameters were considerably updated, and simulated λET and E were comparable to the measurements with relatively narrow uncertainties (95 % posterior predication intervals).Using just EC-measured ET data in our test study (see Supplement 2), the optimized S-W model on the simulations of λET were not significantly different from that optimized by the multivariate data sets procedure, but it significantly underestimated E with great uncertainties (Supplement 2).Thus, we can not ensure that the S-W model, optimized using only the EC-measured ET data, can properly partition the total ET into its different components (soil evaporation and plant transpiration), even thought the simulated λET values were in good agreement with measurements.Limited success in estimating processbased model parameters using EC-measured data alone were also reported in previous studies (e.g., Wang et al., 2001;Knorr and Kattge, 2005;Richardson et al., 2010).
With the developments of observation technologies and strategies, major steps forwards have been made in extracting a wide variety of environmental data (Hrachowitz et al., 2013b).Thus this study provided a convenient way to simultaneously constrain model parameters when the new observation data sets are available.However, even with all data sets (EC-measured λET and microlysimeters-measured daily E), some parameters related to canopy surface resistance seemed to be not well updated (Fig. 4).We thought that this may be due to the insensitivity of these parameters (e.g., k 1 , k 3 , T amax , T amin and K A ) to the present available data sets.Thus, direct observations of plant transpiration using sap flow or stable isotope (δ 2 H and δ 18 O) technologies (see Williams et al., 2004), canopy temperature using infrared thermometer and continuous within-and above-canopy radiation using the fourcomponent net radiometer (see Sauer et al., 2007) are needed in the future studies.
The method, as implemented here, used all observations simultaneously to constrain parameters and obtain an optimal match between data and model.After parameter optimization, the main source of model error can be attributed to the model structure.Thus, this method facilitates the detection of the model's structural failures.Until now, numerous models, retaining the S-W model as basis, have been developed for estimating ET or its different components, and they tended to be more and more complex (see Lhomme et al., 2012).However, increasing model complexity is always accompanied by a great danger of equifinality and large uncertainties in forward runs (Beven, 1989;Franks and Beven, 1997).Most importantly, we must ensure that we are on the right direction in modifying the model.In this study, we found that the S-W model applied in arid areas generally failed when local advection occurred (Fig. 9).Thus, we thought that the main structural error of the S-W model as well as its various extensions comes from the ignorance of the effects of advection on the ET processes.A potential solution is to add the additional energy (negative H ) to the available energy term defined in Eq. ( 12) (see Parlange and Katul, 1992).
The distribution of the model-minus-observation residuals, through the likelihood function, may also have an influence on the estimation of posterior parameter distributions (Raupach et al., 2005).However, a priori assessment of these errors may be not easy (Beven, 2001).Figure 11 shows the distribution of the residuals between simulated and observed data sets.The results indicated that the model-minusobservation departures of half-hourly λET flux was better approximated by a double-exponential distribution, which was in agreement with previous studies (Hollinger and Richardson, 2005;Richardson et al., 2006).Thus, the two-tower approach (Hollinger and Richardson, 2005), which can give a prior estimate of the flux data uncertainties, should be applied in the Bayesian inference in future studies.The Cauchy distribution gave a more appropriate approximation for the daily E departures.However, the Cauchy distribution may be not a good choice for the purpose of Bayesian inference, since its first four moments are undefined (Richardson et al., 2008).

Conclusions
This study illustrated the use of the Bayesian method to simultaneously parameterize a two-source ET model against the multivariate data sets for a crop ecosystem in a desert oasis of northwestern China.The posterior distributions of the model parameters in most cases can be well constrained by the observations.Generally, the parameterized model has a good performance in simulating and partitioning ET.However, underestimations were observed on days when the "oasis-effect" occurred.Therefore, in future studies, special attention should be given to proper descriptions of the effects of advection on estimating ET for heterogeneous land surfaces.In addition, the canopy interception model should be coupled with the two-source ET model in long-term simulations.
The Supplement related to this article is available online at doi:10.5194/gmd-7-1467-2014-supplement.

Figure 1 .
Figure 1.Experimental location and instrumentation setting at the Daman (DM) superstation.

Figure 2 .
Figure 2. Schematic diagram of the S-W model.From right to left, r cs and r c a are the bulk resistances of canopy stomatal and boundary layer (s m −1 ), respectively; r s a and r a a aerodynamic resistances from soil to canopy and from canopy to reference height (s m −1 ), respectively; r s s soil surface resistance (s m −1 ).λT transpiration from canopy (W m −2 ), λE evaporation from soil under plant (W m −2 ), and λET total evapotranspiration (W m −2 ).

Figure 4 .
Figure 4. Histograms of samples from the posterior distributions of the parameters.The dashed vertical lines indicate median parameter values.

Figure 5 .
Figure 5. Relative uncertainty reductions in the length of 95 % credible interval form prior to posterior distribution.

Figure 6 .
Figure 6.Comparisons of responses of soil surface resistance (r s s ; s m −1 ) to soil surface water content (θ ; %) .

Figure 8 .
Figure 8. Seasonal variation in daily evapotranspiration (ET; mm day −1 ) and soil evaporation (E; mm day −1 ) measured by the EC system and modeled by the S-W model during the study period in the Daman Oasis.A gap in the time series is caused either by the absence of flux measurements or missing ancillary data.

Figure 9 .
Figure 9.Diurnal variations in R n (W m −2 ), H (W m −2 ), and modeled and measured evapotranspiration flux (λET;W m −2 ).(a-c) represent conditions at which microscale advection occurred at 12:00, 15:00 and 17:00 BST, respectively, (d) represents a rainy day, and (e-h) represent clear and advection-absent days during the study period.A gap is caused either by the absence of flux measurements or missing ancillary data.Modeled λET was presented as median ±95 % posterior predication intervals.

Figure 10 .
Figure 10.The diurnal evolutions of temperature (T a ; • C) and relative humidity (RH; %) profiles from 3 to 40 m above the ground on 5 July 2013 (a).An obvious advection process can be detected from 13:00 to 17:00 BST with high temperature and a low RH layer at the height of 8-18 m on 17 June 2013 (b).A precipitation event occurred at 13:00 BST and resulted in uniform vertical distributions of T a and RH, but no temperature inversion was observed on 11 June 2013 (c).It represented a typical clear and advection-absent day.

Figure 11 .
Figure 11.Histograms depicting the frequency distribution of the model-minus-observation departures for (a) half-hourly λET (W m −2 ) and (b) daily soil evaporation E (mm day −1 ).
, it is critical to assess to what extent the uncertainty in model parameters and model predictions is reduced by the use of additional data and what new observation is required.The Bayesian inference framework used in www.geosci-model-dev.net/7

Table 1 .
Prior distributions and the parameter bounds for the S-W model.These values are derived from the literature; the posterior parameter distribution estimated by MCMC is based on observed data in our site, and are characterized by the mean and 95 % high-probability intervals (lower limit, upper limit).
Legates and McCabe (1999) is the determination coefficient; RMSE is the root mean square error; MBE is mean bias error between measured and modeled values; IA is the index of agreement; ET is model efficiency.These statistical parameters are calculated using formulas given byLegates and McCabe (1999)and Poblete-Echeverria and Ortega-Farias (2009). n