Developing a Parsimonious Canopy Model (PCM v1.0) to Predict Forest Gross Primary Productivity and Leaf Area Index

. Temperate forest ecosystems play a crucial role in governing global carbon and water cycles. However, unprecedented global warming poses fundamental alterations to forest ecological functions (e

a conversion ratio when maximum GPP has been reached. However, it has been shown that modelled GPP saturates at high LAI values (Lee et al., 2019). This may potentially introduce uncertainty when calculating the conversion ratio to simulate LAI. Another more general challenging aspect for these models is the specification of effective model parameters such that they can seamlessly operate at different scales and locations. Previous applications often have been limited to a calibration site (Francés et al., 2007); but they need to be thoroughly cross-validated for their applicability across a diverse range of climatic 100 conditions. The overarching aim of this study are to propose a parsimonious model (i) to simulate daily dynamics of GPP and LAI over deciduous broad-leaved forest at a medium level of complexity (ii) also suitable for integration in existing hydrologic and ecologic models. We simulate processes related to the carbon cycle in the canopy at a forest stand of undetermined size, utilizing the LUE approach with implementation of a phenology submodel. The parsimonious approach and level of model complexity are adapted based on readily available observational datasets across eddy flux tower stations. We apply a global 105 sensitivity analysis to investigate model parameters' sensitivity to the model's output variables (i.e., GPP and LAI). Finally, we assess the generality and robustness of the underlying model parameterizations and demonstrate the model applicability over different sites conducting a cross-location transferability experiment. The PCM model developed and presented in this study aims at providing a parsimonious representation of daily development of biomass of leaf (Bl) coupled to simulated gross primary productivity (GPP) over deciduous broad-leaved forest (DBF) ecosystems. Analogous to most of the LUE models treating the entire vegetation canopy as a big extended leaf (Guan et al., 2021), the PCM operates over forest stand scale and adapts parameters mainly from a biome properties look-up table (BPLUT) . Parameters such as specific leaf area index (SLA) in PCM represent an effective community-weighted 115 parameters. Figure 1 shows a schematic representation of the PCM structure including carbon fluxes/stocks and interconnected processes related to plant canopy for DBF biomes. We focus on simulating Bl, which is related to LAI via the specific leaf area index parameter. The simulated LAI is, in turn, used in the calculation of the GPP.
PCM uses a daily time step during which it simulates the processes of carbon uptake, leaf respiration, carbon allocation, and carbon decay from the leaf pool (canopy) using a mass balance equation (Istanbulluoglu et al., 2012;Yue and Unger, 2015; 120 Pasquato et al., 2015;Melton and Arora, 2016;Ruiz-Perez et al., 2017). The main governing equation to simulate the daily development of GPP(t) and Bl(t) is: PCM. The LAI (related to Bl(t) in Eq. 1) is defined as: where SLA is specific leaf area index, and f cov is the vegetation fractional coverage. In the following sections, the modeling approaches implemented for each submodel component are described in detail. A summary of the model inputs and underlying parameters is provided in Tables 2 and 3, respectively.

Gross Primary Productivity
The theoretical soundness and practical convenience of the LUE concept in estimating terrestrial GPP has been the main core of several model developments (Monteith, 1972;Wei et al., 2017;Running et al., 2000;Arora, 2002;Schaefer et al., 2012;Zhang et al., 2015) at the regional and global scales (Potter et al., 1993;Yuan et al., 2007;Xiao et al., 2004;Running et al., 2000). In this study, we likewise utilize the LUE approach, which theoretically relies on the concept of interception of photosynthetically 135 active radiation by plant leaves and converting it into biomass through energy to biomass efficiency factor (i.e. LUE factor).
As expressed in Eq. 1, the PCM simulation starts with assimilation of the carbon flux (GPP) by leaf component. The GPP flux (Eq. 3) is estimated as a product of incident photosynthetically active radiation (PAR), by fPAR, which is a fraction of PAR being absorbed by plant leaf, and an LUE factor, multiplied by a modifier factor when environmental constraints present ( ).

145
(Eq. 4) is an overall and integrated modifier that corresponds to environmental stress factors. The overall modifier factor diminishes light use efficiency of vegetation from its potential value during unfavorable environmental conditions (Potter et al., 1993). These unfavorable conditions include for example high and/or low temperature fT, water availability fSM, and elevated vapor pressure deficit fVPD stress factors (Zhang et al., 2015;Pasquato et al., 2015).
In general, calculation of across different LUE models can be expressed either in minimum (Eq. 5) or multiplicative (Eq. 6) 150 approaches to integrate different environmental stress factors. On the one hand, models such as Eddy Covariance-Light Use Efficiency (EC-LUE; (Yuan et al., 2007)) uses Liebig law of minimum stress that emphasise the most limiting resource to constrain GPP (Eq.5). On the other hand models such as Carnegie-Ames-Stanford Approach (CASA; (Potter et al., 1993)) and  (Xiao et al., 2004)) follow a multiplicative approach of stresses (Eq.6). In the present study, we opt for the first approach to integrate different stress factors and to calculate the .
The second approach can be written in a multiplicative way as: The individual stress factors are dimensionless scalars ranging between zero (full stress) and one (no stress), and are intro-160 duced in more detail in the following section.

Environmental constrains and GPP
I) Temperature stress factor (fT): The first reduction factor, fT, on GPP due to air temperature is calculated by including two factors corresponding to low temperature ρ l (cold) and high temperature ρ h (heat) stress effects (Eqs. 7,8,9) (Sitch et al., 2003;Fischer et al., 2016;Rödig et al., 2017).
The stress induced by the cold stress factor (ρ l ) can be calculated as: where, The heat stress factor is calculated as: where T (t) is daily mean air temperature, T low and T high are DBF biome-specific parameters representing high and low temperature limits for CO 2 assimilation, respectively. T hot and T cold are the monthly mean air temperature of the warmest and coldest months, respectively, that a DBF biome can cope with, respectively (Boons-Prins, 2010;Bohn et al., 2014;Fischer et al., 2016;Rödig et al., 2017).

II) Vapour Pressure Deficit stress factor (fVPD):
The canopy photosynthesis rate is strongly related to changes of vapour 180 pressure deficit (VPD) (Konings et al., 2017;Xin et al., 2019), as photosynthesis declines due to stomata closure (Yuan et al., 2019) when atmospheric VPD increases. It can be modelled as follows in Eq. 10 (Jolly et al., 2005): where V P D(t) is daily vapour pressure deficit, v min and v max denote lower and upper thresholds for photosynthetic activities, respectively. The fVPD value of one indicates no stress on GPP, whereas there is full stress when the fVPD becomes zero;

185
values between zero and one result in partial and linear reduction on the GPP.
III) Soil Moisture stress factor (fSM): In general, the impact of soil water deficit on photosynthesis in vegetation models is represented as a generic soil moisture stress function using either modeled or field observation soil moisture content (Cox et al., 1999;Granier et al., 2000;Fischer et al., 2016). Here, we use field observations from different vertical soil profiles including volumetric soil moisture content and soil textural properties (wherever available) to calculate the soil moisture stress factor, 190 fSM.
Essentially, the soil moisture influence on plant productivity depends not only on soil moisture over the entire profile but also on the available soil water to the plant roots. Therefore, to estimate the availability of water to plants, the characteristics of the root system, including rooting depths and its distribution at different soil depths, are essential factors to be considered (Ostle et al., 2009). Thus, we include plant rooting distribution in our analysis, following Jackson et al. (1996), to take into account 195 the root fraction at different soil depths, and weight the soil moisture content layer-wise according to the present fraction of roots in that layer. In doing so, we calculated cumulative root fraction (Rc i ) from the surface to a certain depth (d) in the soil profile for each layer (i) using the biome specific parameter, β as follows (Eq. 11) (Jackson et al., 1996): Then, using the cumulative root fraction up to each layer, the root fractions in each layer Ri i are estimated and then multiplied 200 with the corresponding observed soil moisture content of that layer to calculate the soil moisture contribution from each layer individually. Later, by summing up the soil moisture contributions from all individual layers (θ i ), a daily effective soil moisture content, θ(t), over the soil column is obtained .
Similarly to other stress terms, the soil moisture stress factor varies between 0 and 1; and is quantified as follows (Eq. 15).
where θ(t) is daily effective soil moisture, θ r and θ MSW are water storage corresponding to the permanent wilting point and the critical point below which transpiration is limited, respectively. θ MSW , representing minimum soil water content for unstressed photosynthesis (Hartge, 1980;Granier et al., 1999;Fischer et al., 2014), is calculated as follows: in PCM. scw is a physiological threshold defined as critical relative soil water content at which tree transpiration begins to decrease Granier et al. (1999). According to Granier et al. (1999);Fischer et al. (2016) the scw value does not vary significantly between soil and plant species and can be considered as a constant value. The θ r and θ s correspond to soil matric potentials of -1.5 and -0.033 MPa, respectively.
When the daily effective soil moisture content is above a minimum soil water content (θ MSW ; Eq. 16), there is no stress 220 to limit photosynthesis, while below the θ M SW point, there is a linear increase in stress as water content decreases until θ r is reached. At this point, the soil water stress factor becomes zero with full limitation on photosynthesis and GPP (Harper et al., 2021).

Canopy respiration
To allow estimation of daily changes in carbon in the leaf pool (Eq. 1), the release of carbon to the atmosphere from leaf 225 respiration (R e ) has to be calculated. This flux is part of gained carbon (i.e., GPP) consumed for self-maintenance requirements in the leaf pool. In fact, canopy net primary productivity (NPP canopy ), which is net available carbon ready to be allocated among different plant pools, is the sum of photosynthetically carbon uptake by plants (GPP) reduced by carbon loss via leaf respiration (R e ) (Pasquato et al., 2015;Running et al., 2000;Melton and Arora, 2016).
We use the well-established modified Arrhenius equation (Eq. 17) (Lloyd and Taylor, 1994;Sitch et al., 2003;Perez, 2016) 230 to calculate the leaf respiration. The R e flux is a function of air temperature, carbon mass of leaf pool, and a tissue-specific carbon to nitrogen ratio, given as: where rr represents the leaf respiration rate, Bl the carbon mass of leaf pool (leaf biomass), p 1 , p 2 , p 3 are parameters in the Arrhenius equation, CN r is carbon to nitrogen ratio in leaves, and T is daily mean air temperature.

Vegetation phenology module
We incorporated a phenology submodel into our model using the approach defined in Yue and Unger (2015). This submodel calculates temperature-dependent phenological factors for spring and autumn, f ST and f AT respectively. These factors range from 0 to 1 throughout the year, to determine the timing of spring budburst (once the spring temperature dependent factor sets up to increase above zero), maturity (when the spring temperature-dependent factor approaches to 1), autumn senescence (once the 240 product of autumn temperature-dependent and photo-period factors start off to decrease below 1), and dormancy phenophases (once the product of autumn temperature-dependent and photo-period factors approach zero). The second phenological factor in the autumn and dormancy phenology is photo-period (f dl ) factor and depends on day length. The photo-period factor together with the temperature-dependent factor regulate the leaf senescence. The phenology submodel determines the above-mentioned four phenological transition dates on which a simple allocation of assimilated carbon to the leaf pool is based. Below, we 245 provide details of each phenological factor and events. I) Spring phenology (f SP ): The growing season starts with the budburst day, which is the beginning of canopy development and the time when green tips of leaf show up. It is estimated using a temperature-dependent phenological factor f ST as follows (Eq. 18): where GDD is growing degree day, Gb is budburst threshold value, L g is a parameter for growing length in degree day. The accumulation of growing degree day (GDD) (Eq. 19) from winter solstice day is calculated as below: where T 10 is 10-day average air temperature, T b is base temperature for the budburst (5 • C).
G b in the estimation of f ST (Eq. 18) is a threshold value for budburst to occur and is calculated as follows: where a, b, and r are parameters for the budburst threshold. NCD is counted as number of chill days between the previous winter solstice day and the beginning of the successive year. Given the GDD and G b estimates, temperature-dependent phenological factor (f ST ) is then applied to calculate the spring phenology (f SP ) (Eq. 21).
260 II) Autumn phenology (f AP ): For the autumn phenology the product of two phenological factors, temperature f AT and photo-period f dl factors, is considered to estimate timing of senescence and dormancy. The autumn temperature-dependent factor, f AT , (Eq. 22), is obtained as follows: where F s is a threshold in degree day for leaf fall, and L f is a threshold in degree day for the duration and length of the leaf 265 falling period (more detail can be found in Yue and Unger (2015)). FDD (Eq. 23) is an accumulative falling degree day from summer solstice day which is known as a cumulative cold summation method (Yue and Unger, 2015) and it can be calculated as: where T 10day is 10-day average air temperature,T s is base temperature for leaf fall at 20 • C.

270
In addition to temperature factor f AT , autumn senescence timing is regulated via photo-period factor f dl , which is calculated where dl is the day length in minutes. dl min and dl max are the lower and upper limits of day length for the period of leaf fall, 275 respectively. The autumn phenology (f AP ) is finally calculated as a product of f AT and f dl (Eq. 25): The predicted phenological transition dates from spring f SP and autumn f AP phenology factors determine the budburstmaturity and senescence-dormancy events, respectively. Based on this information, a fractional allocation to and decay from the leaf pool is considered (as detailed below). and (2) allocation driven by allometric constraints. The first scheme uses a fixed allocation ratio to individual plant's carbon pools (e.g. used in CASA (Friedlingstein et al., 1999) or BIOM-BGC (Hidy et al., 2016)). In this scheme, the allocation ratio is constant within different plant development stages. In the second scheme, a fraction of carbon is allocated in such a way that it satisfies allometric relationships that exist between various plant compartments (Malhi et al., 2011;Gim et al., 2017). In the 290 case of allocation to leaf, the allometric relationship is based on the relative mass of canopy -so-called maximum L b -that a plant can support with a certain stem mass and height. We adopted an allocation scheme that mainly depends on an updated daily carbon status of the leaf pool. We use the maximum values of balanced LAI supported by the system (Eq. 26) based on a previous study conducted by Fleischer et al. (2013). Instead of considering it as a fixed value, we vary L b within a range of ±1m 2 /m 2 , and consider it as one of the model parameters.
where λ(t) is the carbon allocation ratio to the leaf pool and L b is the maximum LAI that can be supported by plants.
Provided with the identified major phenological transition dates from the phenology submodel -i.e., budburst, maturity or steady growth, senescence, and dormancy -the calendar year is accordingly divided into four main stages. During the early growing season, once the climate condition becomes favourable to plant growth and budburst occurs, carbon allocation to leaf, 300 λ (Eq. 26), is relatively a large fraction. This means that the largest part of carbon will be partitioned towards leaf and is being https://doi.org/10.5194/gmd-2022-87 Preprint. Discussion started: 2 May 2022 c Author(s) 2022. CC BY 4.0 License.
used for growth during the early growing season (Gim et al., 2017). Given the value for balanced LAI supported by the system (Fleischer et al., 2013), the carbon allocation slowly decreases with an increase in LAI until the leaf mass reaches that balanced LAI. As soon as the canopy approaches a full leaf state (i.e. maturity phenophase), the carbon allocation ratio to the leaf is held at its minimum -a small portion is used for maintenance respiration during this steady growth stage. We set the leaf allocation 305 ratio during the maturity phase to a value of 5% from the assimilated carbon, following the recent version of the Noah-MP model's leaf allocation scheme (Gim et al., 2017).
After the steady growth and maturity phase, the leaf senescence phase approaches and the leaf-loss processes start to play the main role in moderating the mass-balance of canopy and the corresponding LAI seasonality. The loss of carbon via the leaf fall in PCM is simulated based on the calculated senescence and dormancy transition dates via the phenology submodel, such 310 that when the simulation time-step approaches to the senescence date, the model linearly decreases the leaf biomass until the leaf biomass reaches to nearly zero at the beginning of the dormancy phase.
Concerning the leaf loss processes, PCM also accounts for the leaf losses due to cold stress (O C ) (Eq. 27), drought stress (O D ) (Eq. 29), and normal loss of the leaf (O N ) (Eq. 30) following schemes of the CLASSIC model (Melton and Arora, 2016).
The leaf loss due to the cold stress is given by: where, O Cmax is the maximum leaf loss rate parameter and Cs is a cold stress factor value. The cold stress factor (Eq. 28), ranging between 1 (full stress) and 0 (no stress), is calculated as: where T(t) is air temperature and T c is a biome specific temperature threshold below which leaf damage is expected.

320
Similar to the O C , the leaf loss rates due to drought stress O D (Eq. 29) is calculated using the fSM stress factor (through the soil moisture stress submodel) and a O Cmax maximum leaf loss rate parameter associated with the drought stress.
The third leaf loss term represents the loss rates due to a Normal decay O N driven by biome specific leaf lifespan (τ = 1 for DBF in Eq. 30) given by: Finally, the total decay of leaves D(t) consists of contributions from all individual losses (Melton and Arora, 2016); and can be given as follows (Eq. 31): In summary, the proposed PCM model comprises the submodels mentioned above in a hierarchical chain, starting with the carbon uptake via the initial leaf biomass state variable and continues with daily partitioning of that assimilated carbon together with daily decay from leaf compartment to calculate the leaf biomass production increment. This biomass increment is later added up to the state variable from the previous time step to update the leaf biomass for the current time step. Finally, to update the LAI that is required for the GPP estimation over the next time step, the current leaf biomass is converted to LAI according 335 to Eq. 2.

Study sites and datasets
This study focuses on deciduous broad-leaved forests biome type and We selected tower sites distributed over Europe and North America to ensure a representative spatial coverage. Sites were excluded if data of fewer than five years were avail-340 able. We further screened out the data at each site to the years with minimal gap in input data, in particular, photosynthetic photon flux density variable and its associated frequent and long missing data at some sites. Applying the above criteria, nine sites with varying temporal coverage were retained for the analyses (Fig. 2). The general site information is presented in Table 1. Daily flux and meteorological forcing data are from ecosystem stations available from the free fair-use FLUXNET2015 Tier 1 global collection database (https://fluxnet.org/data/download-data/, last access: June 2021) (Pastorello 345 et al., 2020). The input data required to drive the PCM comprises: air temperature (T), photosynthetic active radiation (PAR) (i.e. converted from PPFD in µmol m −2 s −1 ) and vapor pressure deficit (VPD) ( Table 2). The tower-based GPP estimations, GPP_NT_VUT_REF from the FLUXNET2015 dataset are used for model calibration. We used the first year of the time series as a warm-up period, during which the chilling days and thermal requirement in phenology submodel are counted.
Optional variables to establish the model include soil moisture (SM) and soil textural properties, are required to simulate 350 the soil moisture stress development. However, we investigate the soil moisture stress impact only at the Hohes Holz (DE-HoH) site in Germany with soil moisture data available up to 80 cm depth. In regard to calculating the soil moisture stress in PCM, a pedotransfer function following Zacharias and Wessolek (2007) is implemented to estimate site-specific θ s and θ r values. This (pedotransfer) submodel receives soil textural properties (sand, clay contents, and bulk density) obtained from field observations of spatially distributed soil profiles as input. It provides the required field capacity (θ s ) and permanent wilt-355 ing point (θ r ) to calculate θ M SW and the corresponding soil moisture stress term fSM in the calculation of (Eq. 5). To maintain the consistency with the vertically weighted soil moisture, θ s and θ r are estimated as weighted average values of individual layer-specific θ s and θ r taking the respective root fractions as a weighting factor. Other required parameters in the model related to different processes, are listed in The impact of water availability on the canopy photosynthesis (i.e., soil water deficit and atmospheric water deficit), in vegetation models is structured in two ways; individually or in combination with each other. In some models, water stress is quantified as an overall stress from both atmosphere and soil-moisture (GLO-PEM; Prince and Goward, 1995), (Biom-BGC; Hidy et al., 2016), while some other models account for the water stress due to either the atmospheric drought (CASA; Potter et al., 1993),

370
MOD17 algorithm ) or soil moisture drought (EC-LUE; Yuan et al., 2007). In order to determine, how stress should be represented in the final version of PCM, we conduct two sets of preliminary model experiments to examine: (1) whether inclusion of fSM, additionally to the other stress factors affects the results, and (2) Figure S1 and Figure S2). With regard to specific considerations in LAI simulations, the model starts with the simulation using a fixed initial LAI state variable to begin the carbon assimilation once weather conditions become more favourable for plant growth. Following the CABLE model parameterizations (Li et al., 2018), we set the initial LAI value to 0.35. We also consider a local maximum LAI (so-called L b in this study), obtained from 380 reported values in literature (Fleischer et al., 2013), that individual mature forest can sustain at canopy closure. However, the local maximum LAI is, later in the calibration step, allowed to vary within ±1m 2 m −2 of the reported value. The L b constrains the simulated LAI up to the reported value at each site across years.

Global sensitivity analysis
Despite the simplicity of parsimonious models, assessing model robustness remains a fundamental step when building and developing a model. One of the powerful and invaluable tools for robustness assessment is global sensitivity analysis (GSA) to test the underlying model parameterizations and inform about sensitive model parameters for the subsequent parameter inference. In general, the GSA can be performed to understand the influence of parameters perturbations on modeled simulations and to determine the informative parameters that contribute the most to an output behavior (Iooss and Lemaître, 2014;Cuntz et al., 2016;Rakovec et al., 2014). In this study during the GSA, the parameters vary over boundaries reported in literature's.

390
In case there were no bounds available for some parameters (e.g., phenological parameters from Yue and Unger (2015)), we varied them at ± 20% level of their default values. We utilize the Sobol' variance-based sensitivity method (Saltelli et al., 1999) with Sobol2002 formula (Saltelli, 2002), in which decomposition of the output variance is performed in terms of Sobol' indices. The Sobol' First order index (Si) and total-order Sobol' index (ST) express the share of output variance associated with a given parameter i and the share of output variance where all parameters are varied except the parameter i, respectively. These indices range between 0 to 1; with zero value indicating that the model output is entirely insensitive to the respective parameter changes. The closer the value to 1, the more important and sensitive the respective parameter is. Generally, the model parameters deem sensitive, if the sensitivity index is above a certain threshold value. Here in this study, we report the total-order Sobol' index and set the selection threshold at 1% (Tang et al., 2007), meaning that if the variation of a given parameter contributes to a change greater than 1%, then that parameter is recognized as an informative parameter. In contrast, 400 non-informative parameters are reported as parameters with Sobol' indices below 1%. Given the focus of the present study on two main output variables (i.e. GPP and LAI), we use the time mean for both outputs over the entire period for the sensitivity analysis at each study site. However, the results are expected to differ not only according to the site and selected target output but also between the individual years if a specific year is of interest to be investigated (Göhler et al., 2013;Hou et al., 2012). To conduct the sensitivity analysis, we opt to choose all coefficients in the empirical equations as adjustable parameters (Table 3).

405
It helps to explore the model sensitivities of often hidden parameters to properly constrain the model (Cuntz et al., 2016).
Overall, we apply the global sensitivity analysis in all study sites for the common 29 parameters and analyse the sensitivity of the soil moisture stress parameters together with other parameters only for the DE-HoH site at which representative soil moisture data at different depths, down to 80 cm into the soil, was available. Given the importance of the number of model evaluations required to conduct the Sobol' sensitivity analysis (Nossent and Bauwens, 2012) and the stability of sensitivity 410 indices, we also check the convergence of the Sobol' indices through a visual assessment of diagnostic plots.

Parameter estimation
Based on the results of sensitivity analysis, informative and non-informative parameters are identified. Later, we fixed the non-informative parameters to their corresponding reported values in literature (see Table 3 for details) and the remaining informative parameters are inferred using a Monte Carlo approach (Kuczera and Parent, 1998). The parameters were calibrated 415 against the GPP_NT_VUT_REF time series from the corresponding flux tower measurements (global Fluxnet Tier1 network accessed on 13 February 2021) (Pastorello et al., 2020). It is important to note that besides the maximum LAI value we did not use LAI field observations in the calibration process as LAI is not readily available from the FLUXNET dataset. Instead some LAI observations (obtained from site contacts) were used in the model verification step. The first year of the dataset is considered a spin-up period. The rest of the time-series are divided into two sub-periods. The first half is used for the calibration 420 phase, and the remaining years to independently evaluate the model performance (i.e., over the out-of-calibration set). A total of 10 000 parameter sets was sampled from their a priori defined ranges (Table 3) in each study site to estimate the parameters and simulate the GPP flux and LAI. Model performance was quantified using a group of performance metrics, including Kling-Gupta efficiency (KGE) (Gupta et al., 2009), Root Mean Square Error (RMSE), and coefficient of determination (R 2 ).
We selected an ensemble of informative model runs that simultaneously lie within the top 5% of all the performance metrics.

Site-specific verification and model generalization
The second half of the GPP time series at each study site was used for the model verification step. In addition to the atsite verification, it is also equally important to consider the generality of the model structure including underlying model parameterizations. To this end, we considered an independent (spatial) verification approach -so called cross-validation -for assessing the robustness of model parameterizations beyond the conditions during which they were calibrated. The relevance of 430 the cross-validation to the modeling framework, is that transferable models can be used beyond the spatial and temporal limits of their underlying data, especially in the face of pervasive scarcity of observational data to constrain model parameterizations (Yates et al., 2018). Therefore, as the next step in our modeling framework, and after performing the site-specific calibration and verification, a cross-validation of the model is conducted to come up with a compromise solution (here parameter set) applicable across the study sites, following the approach of Zink et al. (2016). In doing so, the behavioral parameter sets found 435 from on-site calibration for each study site are grouped together as one unique set of all possible behavioral parameters. Then the model is run using all possible parameter sets and the respective performance metric (i.e., KGE) for each parameter set at each investigated site is estimated. After that, the mean values of KGEs corresponding to each parameter set over all study sites are calculated. Finally, a set of parameters associated with the highest mean KGE score is recognised as a compromise solution.
Here the goal of this analysis is to investigate the generality of the underlying model structure, and to allow inference 440 of a common (compromise) set of model parameters for the PCM for a broader applicability (i.e., beyond the calibration sites).

Results and Discussion
In the following, we first show and discuss findings from the global sensitivity analysis and site-specific parameter calibration. This is followed by a discussion of the site-specific model performance. Finally, we present the results of a cross-validation to test the generality of underlying model parameterizations. This also allows us to propose a standard set of PCM parameters for 445 locations where calibration is not possible.

Sensitivity analysis
Here, we explore the sensitivity of the output variables (i.e. GPP and LAI) to the model parameter variations using Sobol' method at each study site. Although a direct comparison of PCM parameters sensitivities from this study with similar models in other studies is difficult due to difference in model structures and representation of photosynthesis processes, one can gain 450 insights by comparative assessments among conducted studies. For instance, the light utilization in LUE-oriented GPP models is based on photon absorption and photosynthetic efficiency of incident light (Frost-Christensen and Sand-Jensen, 1992).
Hence, one can compare the significance of the LUE parameter of our model with that of the quantum yield of photosynthesis which is a measure of photosynthetic efficiency in the Farquhar equation (Farquhar et al., 1980) in several land surface models.
As it can be seen from Figure 3a (mean GPP) and b (mean LAI), different sensitive parameters are associated with the different 455 output variables. However, for the same output variable, all sites more or less share a similar informative set of parameters, although the magnitudes differ. In the following, we show and discuss the sensitivity of the model outputs to different PCM parameters.

Parameter sensitivity for GPP estimation
We first investigate the sensitivity of GPP output to the model parameters. Figure 3a shows the total-order Sobol' index of all Next environmental factor constraining the GPP is soil moisture stress. Here, we identify β (Eq. 11) and θ r (Eq. 15) as sensitive 480 parameters. We can only consider and discuss the soil moisture stress-related parameters in the DE-HoH site due to the lack of soil moisture data at other sites. The investigated sensitivity of fSM-related parameters are shown in the supplementary Figure S3. Similar findings of a pronounced impact of parameters controlling soil moisture availability (e.g., θ r and β) on simulated GPP has been reported by Li et al. (2016) for the CABLE and JULES models. From a soil science perspective, the θ r is often a fixed value of soil water content corresponding to soil matric potential of 1500 kPa (Zhang and Han, 2019) and 485 is typically not considered as a parameter. However, our result shows that the θ r might not be considered as fixed. While the functional form of θ r can be deduced based on pedo-transfer functions (Zacharias and Wessolek, 2007), empirical coefficients of such functions representing the linkages between θ r and soil textural properties (e.g., sand, clay contents) can be regarded as model parameters (Samaniego et al., 2010;Kumar et al., 2013;Schweppe et al., 2021).
The SLA parameter (Eq. 2), as one of the structural parameters, is also a major contributor to the simulated GPP. Its sen- 3.1.2 Parameter sensitivity for LAI estimation 510 We further explore the parameter sensitivity for LAI output similar to the GPP described above. In general, a similar set of sensitive parameters were identified for GPP and LAI outputs across sites ( Figure 3b). However, some differences were also detected: parameters such as L b , f cov , L g , p 2 , and p 3 show substantial sensitivity, while the sensitivity to v min was almost negligible. Regarding the similarity of parameters between GPP and LAI, it is worth noting that the calculations of GPP and LAI depend on each other since assimilated carbon (i.e.,GPP) is partly converted to leaf biomass from which the LAI is 515 calculated, and used in turn for the GPP calculation in the next time step. Therefore, LAI output should roughly be sensitive to the same set of parameters as the GPP output. The result in Figure 3b shows that LUE, C, and k, directly involved in the GPP formulation, have considerable influence on the LAI output. These parameters, in particular the LUE, strongly control the assimilated carbon and consequently affect the modelled LAI.
Figure 3b also shows a major contribution of SLA (Eq. 2), f cov (Eq. 2), and L b (Eq. 24) to the LAI output. Similarly to the 520 LUE for GPP, the SLA is central for the calculation of LAI (Eq. 2) and thus shows by far the largest sensitivity. Since the LAI output in the model depends on GPP, the studies reporting the SLA impact on GPP might apply for LAI output as well (Li et al., 2016;Arsenault et al., 2018). The f cov parameter represents the fractional vegetation coverage per unit area and is a critical parameter in calculating forest GPP (Ma et al., 2015). Ma et al. (2015) assumed 100% forest coverage in their calculation of GPP, from which LAI was calculated. They showed how this inappropriate assumption (i.e., 100% forest coverage) can 525 exaggerate the forest area when calculating forest GPP (and consequently the LAI) rather than considering a realistic coverage.
Here in the PCM, the f cov parameter is allowed to vary between 60% to 95% as an adjustable parameter (based on Fluxnet2015 Dataset description of percent coverage greater than 60% at DBF sites; http://sites.fluxdata.org/). We observe that fractional vegetation coverage substantially influences the simulation of LAI. In agreement with Ma et al. (2015), our result supports the importance of the fractional coverage (f cov ) as an important structural parameter in carbon cycle modelling studies. The L b 530 parameter (Eq. 24), also exhibits a marked sensitivity for the LAI output (Figure 3b) because this parameter is a direct factor allowing the canopy to reach to its maximum. Next important contribution of parameters to the LAI output are those governing the leaf phenology in the phenology submodel, L g (Eq. 18), F s (Eq. 22), b (Eq. 20), r (Eq. 20) (Figure 3b). To the best of our knowledge, these parameters have not been studied elsewhere within a sensitivity analysis framework, and therefore we could not perform any comparative assessment. Parameters b and r contribute to the simulation of leaf budburst day, F s contributes 535 to the identification of leaf fall day, and L g parameter influences the LAI output estimation through its influence on the length of the growing season. The F s parameter exhibits higher sensitivity and a larger between-site variation than other parameters ( Figure 3b). This parameter represents the necessary amount of cold accumulation in degree day to trigger the leaf fall event.
For instance, lower cold degree days accumulation would lead to an early leaf fall and leaf shedding. Therefore, the betweensite variation of this parameter might not be surprising, given the differences in temperature and accumulated cold degree days 540 among study sites.
Other additional parameters that showed sensitivity for the LAI output are p 2 , and p 3 (Eq. 17). These parameters belong to the canopy respiration process in the modified Arrhenius equation (Eq. 17). They are typically considered as fixed parameters e.g., in TETIS-VEG model (Perez, 2016), in LPJ-ML model (Schaphoff et al., 2017), while here we varied these parameters within 20% of their nominal value. Notably, these parameters showed greater sensitivity for the LAI estimation than that of 545 the GPP. It might partly be due to the reduced assimilated carbon (GPP) by canopy respiration which in turn might decrease the available carbon to be allocated to leaf biomass and affect the resultant LAI. Furthermore, the evaluation of Sobol' indices convergence (see Figure 4) showed relative stability of sensitivity indices at around 8 000 model evaluations.

Site specific model assessment
We conduct site-specific parameter estimation for all available eddy-covariance (EC) flux tower study sites ( Figure 5). For 550 this, only informative parameters identified in the sensitivity analysis are calibrated and the others are fixed (Table 3). For model parameter calibrations we used the first half of the available time series and the remaining years for verification (Table   1). Calibration and verification of the model are only performed for the GPP flux because direct LAI measurements are not available at all of the flux sites. Figure 5 shows the seven-day mean of simulated GPP for a set of ensemble members for each study site during both the calibration and verification periods. Since the different sites were operational at different times 555 and some sites (e.g. DE-Hai) cover a relatively long time period, we show only five years of simulation at each site: the last three years of calibration and the first two years of verification periods ( Figure 5). A complete simulation for each site during the entire available times series is provided in the Supplementary Figure S4. In addition, Table 4 summarizes the model performance in simulating GPP during calibration and verification periods at different sites. In general, the model achieved KGE values of above 0.65, RMSE of less than 2.5 gCm −2 day −1 , and R 2 values of above 0.65 over all study sites. We compare the performance of our model to other modeling studies, whenever there is either an identical site to our study or a similar biome type (i.e., DBF) presented. To this end, our results are similar to Yue and Unger (2015) who found a high correlation of more than 0.8 and RMSE lower than 3 g C m −2 d −1 for the GPP simulations at DBF forest sites in a global setting using the Yale Interactive terrestrial Biosphere model. Another study conducted by Asaadi et al. (2018) showed a quite good model performance using the CLASS-CTEM land surface model  applied at US-Ha1(1998-565 2013) and US-MMs (1999-2006 flux tower sites, with R 2 value of 0.99 accompanying RMSE of 0.62, and R 2 value of 0.98 accompanying RMSE of 1.07 g C m −2 d −1 at US-Ha1 and US-MMs, respectively. In a recent study, Holtmann et al. (2021) assessed the daily carbon fluxes over the DE-HoH forest during 2015-2017 using the FORMIND model (Fischer et al., 2016).
They showed that the simulated and measured GPP correlates with an R 2 of 0.82 and RMSE of 9 M gCha −1 a −1 equivalent to 2.46 g C m −2 d −1 using FORMIND model.

570
Taken together, our model exhibits a reasonable validity and performs equally well in estimating temporal dynamics of GPP (Table 4) compared to other more complex land surface and biogeochemical models. The PCM shows skill in capturing GPP at most of the investigated sites; although it overestimates GPP at the IT-Ro1 during summer. IT-Ro1 is located in a Mediterranean climate exposed to dry summers (Vicca et al., 2016). The forest ecosystems in Mediterranean type climate are affected by water limitation which can affect the GPP flux significantly (Cueva et al., 2021). The lack of soil moisture data probably contributed 575 to the misrepresentation of GPP at this site. This is in agreement with previous studies that found similarly poor modeling performance across sites located in the Mediterranean climate in central Italy in dry summer periods when simulating GPP (Maselli et al., 2012;Chiesi et al., 2011;Fibbi et al., 2019). In addition, water limitation impact on GPP could be related to the irregular occurrence of extreme events (e.g., European-wide drought 2018  . In the next step, we also examine the model's overall performance in reproducing GPP in terms of multi- year average of GPP at each site. Figure 6 shows that the model can generally explain the spatial variation of GPP with an R 2 as high as 0.90.
As an independent verification step, we evaluate the PCM simulations of LAI against field-measurements data at some study sites where observational data were made available via personal contacts with site investigators.  Table 5 summarizes the model performance in simulating LAI during a common period of observed and 590 modeled data.
The simulated LAI captures reasonably well the observed LAI seasonality at almost all the sites. It demonstrates the capability of the model in capturing canopy status at different phenophases. However, there are some pronounced deviations between modelled and observed LAI at some sites (US-Ha1, DE-HoH) during the dormancy phase and autumn leaf decline period.
Given the deciduous nature of those ecosystems , it is likely that the higher winter values of field measurements compared

Spatial model verification and model generalization
Eventually, we conduct cross-validation of parameter transferability for all sites against tower-derived GPP data (Section 2.2.5). regions. It has been shown that maximum LUE varies in different geographical locations (Jung et al., 2007), and this is in line with our on-site calibration result indicating largest maximum LUE at DK-Sor (northernmost site with a cold and moist climate) and lowest at IT-Ro1 (a relatively drier Mediterranean site) sites. Thus applying a compromise value for LUE at these two site would result in a bias in GPP estimation. Previous studies (Wang et al., 2010;Madani et al., 2014) showed a large variation in maximum LUE not only between different biomes but also even within an individual biome and plant functional 615 type. Concerning the large spatial variability of maximum LUE, several factors such as spatial heterogeneity of vegetation, canopy densities, ages, soil nutrition, leaf nutrient content have been mentioned in previous studies (Wang et al., 2010;Madani et al., 2014). Some methods such as spatially explicit estimation of optimum LUE (Madani et al., 2014) or introducing pixellevel maximum LUE (Wang et al., 2010) have been employed in satellite-based LUE models to overcome this shortcoming.
They argued that the assumption of a constant maximum LUE (i.e. based on standard MODIS-base GPP algorithm and a 620 Biome Property Look-Up Table; Heinsch et al., 2003), needs to be reexamined so that spatial heterogeneity between individual plant functional types is represented. Therefore, the uncertainty introduced by the fixed maximum LUE may be reduced and ecosystem productivity modeling would be improved.

Limitation and opportunities
While the model performs well, in general, on simulating the GPP, some inconsistencies in the observed and modelled GPP would be variation/depletion of deep soil moisture storage (Jung et al., 2009). Since the model does not represent established internal feedback for carrying over effect after extreme events (Reichstein et al., 2013) and only consider the soil moisture up 635 to 80 cm depth, thus the current model version would not reflect on such a process and GPP is likely to be overestimated.
Another limitation in our simulation is a lack to account for a possible effect of diffuse light on GPP response in the current model structure. We observed the potential role of diffuse light on GPP during some mismatch periods between eddy flux tower and modelled GPP across some of the sites (e.g., DE-HoH year 2107, FR-Fon year 2012, and US-Ha1 year 2010) (see Figure S1). The model underestimates GPP during these periods based on a lower PAR input, however, the observations show 640 greater GPP despite lower input PAR. This is in line with findings of Knohl and Baldocchi (2008), where they investigated the effect of diffuse light on the forest ecosystem and discussed how diffuse radiation can lead to an increase in carbon uptake.
Enhancement of GPP due to diffuse light is related to more evenly distribution and more efficient light penetration within canopy (Yuan et al., 2014). Integration of such effect in the current model by introducing a time-varying LUE (conditionvarying) (Wei et al., 2017) instead of the fixed LUE would improve the simulation. In particular, under unprecedented global 645 warming and climate change, future changes in cloud cover and aerosol concentration are expected to modify the solar radiation and the subsequent ecosystem productivity (Durand et al., 2021;Meyer et al., 2014). Regarding LAI simulation, one limitation is that, at some points, the model cannot increase in LAI in the initial onset of LAI as fast as that of observation in the early growing period. In previous studies, it has been shown that the inclusion of non-structural carbon storage at the beginning of green-up might help to overcome this issue (Asaadi et al., 2018) Samaniego et al., 2010;Kumar et al., 2013, available at https://www.ufz.de/mhm) with a proven predictive power in simulating root-zone soil moisture dynamics (Boeing et al., 2021). The spatially simulated soil moisture derived from mHM would provide an alternative to (limited) soil moisture observations required for GPP simulation. In particular, in the face of 660 ongoing and future climate changes and increasing occurrence of droughts (Harper et al., 2021), reliable simulations of soil moisture are invaluable information to capture plant drought responses for the carbon cycle and climate feedbacks (Harper et al., 2021). Finally, in doing so, we expect an improved capability of the hydrological model to represent the coupled water and carbon (i.e., GPP/LAI in this study) components. season was shown to be almost insensitive to LAI changes. This indicates that GPP is saturated at a particular value of LAI and any further increase in LAI does not necessarily increase the resultant GPP. We also tested the robustness and generality of the underlying model structure, identifying a compromise set of model parameters applicable to all sites (region-wide). The results show that the model application is possible without site-specific calibration and yet yielding reasonable model quality.

685
The model's skill in capturing the LAI dynamics -that was not used in the parameter inference process -further confirms the robustness of the presented model structure. Given the scarce soil moisture information, we expect that simulated soil moisture derived from a hydrological model would improve the representation of GPP simulations, particularly at semiarid regions or in drought events. We envision that the medium complexity of the presented model allows a seamless integration into large scale hydrological models to better represent ecohydrological aspects of ecosystems. We plan to implement the 690 PCM model into the existing hydrologic models (e.g., open-source mesoscale Hydrologic Model; mHM), thereby enabling an improved representation of coupled water and carbon fluxes, in the face of a changing environment. To allow for a seamless estimation of carbon and water fluxes, we plan to include implementation of a robust regional parameter inference approach (e.g., establishing regionalized LUE parameter through a multiscale parameterization approach (Samaniego et al., 2010)) and performing extensive cross-validation experiments to ensure credible model simulations across a wide range of spatial domains.  (3) Re (17) T Air GDD, fSP (18-21)

Spring Phenology
Canopy Bl Gain