Comment on gmd-2021-287

L23-27: The conclusions are strongly focusing on the stress-gradient hypothesis, which was not the main focus of this study. I am convinced that there are other, more general conclusions you could make. For example: How are your findings relevant for recent and upcoming studies using the LPJ-GUESS model? What are conclusions that also consider your findings regarding the model parameters? What do your results imply for further model development?

We therefore think that providing both UA and SA, where roughly UA = SA * uncertainty, is the best practice. Typically, the UA results will be more relevant from the perspective of predictive uncertainty, while SA results will be more relevant to understand if the model behaves as expected.
We have summarized this in the text: "It is important to note that uncertainties and sensitivities have different interpretations, and which of these two is more relevant strongly depends on the purpose. The calculated percental sensitivities can be interpreted as percentage change in the corresponding output, when changing a parameter value 1% in the prespecified range. The calculated uncertainties per parameter/driver can be interpreted as relative proportion of the overall uncertainty budget coming from environmental drivers and parameters. For scenario analysis, e.g. comparing different cut intervals of forests, sensitivities provide a direct estimate of the model response, e.g. how much biomass changes when the cut interval is changed. For a comparison of different model forecasts, uncertainties are usually more relevant. If a reduction of uncertainty via a model-data comparison is the purpose, both measures are important, as parameters with high sensitivities can contribute more or less predictive uncertainty, depending on their input uncertainty." We would love to expand on this in the paper, but there is already some discussion at the end of section 2.5, and we had the feeling that an extended discussion on the sense of UA and SA would be distracting here. No action taken so far, but we would be happy to do so if the reviewer wishes.
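To illustrate the rough relation UA = SA * uncertainty, a minimal sketch follows. It is not the analysis used in the manuscript: the linear toy model, the parameter names `a` and `b`, and the assumed input standard deviations are all hypothetical. Under first-order error propagation with independent inputs, a parameter with a high sensitivity but a narrow input uncertainty can contribute less to the output uncertainty than a weakly sensitive parameter with a wide one.

```python
def toy_model(a, b):
    """Hypothetical linear stand-in for a model output (not LPJ-GUESS)."""
    return 10.0 + 4.0 * a + 0.5 * b


def local_sensitivity(model, point, name, step=1e-6):
    """Central finite-difference slope of the output for one input."""
    up = dict(point)
    down = dict(point)
    up[name] += step
    down[name] -= step
    return (model(**up) - model(**down)) / (2.0 * step)


# Assumed reference values and input uncertainties (standard deviations).
reference = {"a": 1.0, "b": 1.0}
input_sd = {"a": 0.1, "b": 2.0}

# SA: how strongly the output reacts to each input.
sens = {name: local_sensitivity(toy_model, reference, name) for name in reference}

# UA (first order): sensitivity scaled by the input uncertainty.
contrib = {name: abs(sens[name]) * input_sd[name] for name in reference}

# Share of each input in the total output variance (independent inputs assumed).
total_var = sum(c ** 2 for c in contrib.values())
share = {name: contrib[name] ** 2 / total_var for name in contrib}

for name in reference:
    print(name, "sensitivity:", round(sens[name], 3),
          "uncertainty share:", round(share[name], 3))
```

In this toy setting `a` is the more sensitive input, yet `b` dominates the uncertainty budget because its assumed input uncertainty is much wider.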
Section 3.1 mean sensitivities: It might be worth noting some opposite behaviour in the sensitivity results as well, e.g. TSB and GPP are negatively sensitive to temperature except for F. sylvatica. For NBP, something similar occurs for lambdamax, respcoeff, turnoverroot, krp and emax, amongst the important parameters.
Author's Response: We agree with the reviewer and accordingly added this to the text. The text reads now the following: "Mixed stands were less sensitive to changes in parameters than mono-specific stands (Fig. 1). For monospecific simulations, species sometimes showed different magnitudes and even directions of sensitivities, especially Fag. syl. was more strongly affected by bioclimatic limits and Pin. syl. showed higher sensitivity to environmental drivers (temperature and solar radiation) than the other species. Moreover, TSB and GPP are negatively sensitive to temperature except for Fag. syl. For NBP, the direction of sensitivities changes between species for the non-water-stressed ratio of intercellular to ambient CO2 (lambdamax), the respiration coefficient (respcoeff), the root turnover (turnoverroot), an allometric constant (krp) and the maximum evapotranspiration rate (emax)."
Also please consider sticking to the same order of parameters (x-axis) in both figures. Why is the mixed (*) symbol so much faded for NBP and TSB drivers? Finally, for sensitivity it makes sense (like authors say SA/UA have different interpretations) but isn't it a bit uncommon to visualize "negative contribution" to uncertainty? I mean, the caption says negative relationship not negative contribution, but still with the y-axis it reads confusing.
Author's Response: Thank you for the helpful comments on the figures. We have now changed Fig. 2, such that it has the same ordering as Fig. 1 (aligned Fig. 1 labels as well). We only show the absolute value of these effects.
corrected b) and c) because overall this seems to be off somehow
Radar charts (or whatever you would call them) look almost identical to the eye; having more grids (inner circles) might aid the eye, or one could plot an "average" polygon with a solid black line on each for relativity.

implications for future model development
The management and the nitrogen cycling module are the most recent improvements of the LPJ-GUESS model (Smith et al., 2014; Lindeskog et al., 2021). Compared to previous sensitivity and uncertainty analyses, the high contributions of the nitrogen fixation to the predictive uncertainty of TSB and GPP (Fig. 2 a,c) are novel, though not surprising, as nitrogen is an important factor for the productivity of most temperate and boreal ecosystems (Vitousek and Howarth, 1991). The main reason why few earlier studies report those uncertainties is that vegetation models have only recently begun to integrate nitrogen cycling and limitation (e.g. B. Smith et al., 2014). The management module showed only small uncertainties, which could be due to the narrow parameter ranges for the cut interval and thinning intensity reflecting typical forest owners' choices. As forest owners usually try to maximize their profits (Johansson, 1986; but see Brazee and Amacher, 2000) and thus biomass production, low sensitivities of the management module are not surprising. A more suitable and important test case and application of the management module is a historical reconstruction of foliage projective cover data or similar outputs of the LPJ-GUESS model.
Our study helps to guide the model application, discussion of uncertainties and model development of LPJ-GUESS and other DGVMs. First, future model applications and model comparisons should focus on mortality as these processes contribute high uncertainties for carbon-related projections (see Fig. 1-3). Thereby, it should be investigated if these uncertainties stem from the intra-specific variability of the parameters themselves (Bolnick et al., 2011), if parameters are just not identifiable (see Marsili-Libelli et al., 2014), or if a model data comparison could reduce uncertainties in the parameters (e.g. Hartig et al., 2011). Using time series inventory data might help as it is informative for constraining mortality modules (Cailleret et al., 2020). Second, small sensitivities of establishment related parameters are surprising as we know that not all three investigated species can effortlessly establish across all of Europe, e.g. Fag. syl. can only establish on locations with no extreme drought and heat and no extreme winter frosts (Bolte et al., 2007). Thus, either we missed important parameters of this module, or the parametrization of the model needs to be updated. Third, when introducing new processes or coupling with other models (e.g. Forrest et al., 2020) calculating interactions helps to get a first impression where these new processes influence other model processes and potentially detect missing links. Moreover, future model applications can interpret their results with regard to the sensitivities in different factors (Saltelli et al., 2019) and discuss uncertainties and the causing factors, when used in policy advice (Laberge, 2013)."
337-338: I'm confused. On lines 71-72 authors said high sensitivities to water-related parameters were found: "Additionally, LPJ-GUESS showed high sensitivity to [...] water-related parameters (minimum canopy conductance not associated with photosynthesis, maximum daily transpiration, Pappas et al., 2013; Zaehle et al., 2005)." Please clarify.
Author's Response: Thank you, it is true that we have missed this. We corrected it in the revised version.
358-359: I thank the authors for explaining the potential cause of the negative effect, but while temperature affects TSB and GPP negatively, how does it affect NBP positively in the model?
Author's Response: We cannot say for sure, but note that, in contrast to TSB and GPP, the calculation of NBP also includes respiration and disturbances. So we speculate that respiration decreases more strongly with temperature than GPP (there is also a strong neg. interaction of the respiration coefficient and temperature for NBP, see Fig. A4). However, respiration itself depends not only on temperature but also on different factors like precipitation. A detailed discussion of the complex mechanisms behind this positive influence is out of scope for this manuscript, and in combination with the comments of the second reviewer we decided not to cover this topic in the revised manuscript.

360-361: Not just in magnitude but also in direction?
Author's Response: Thank you, we have added the direction to the revised manuscript.

367: Random forest results could be mentioned in the results section.
Author's Response: Done.
372-373: Does the finding "nitrogen-induced uncertainty decreases with increasing temperatures" correspond to the general statement of "limiting factors change across environmental conditions"? Or did the authors mean to cite a more specific ecological principle / hypothesis here?
Author's Response: We indeed meant the general principle that limiting factors change across environmental conditions. With the manuscript changes, however, this sentence is no longer in the manuscript.
373-377: Since the authors emphasize the stress-gradient hypothesis in the abstract and conclusion, I wonder if they can elaborate more and clarify the reasoning here to convince the reader. I had to read these sentences multiple times and it is still not clear to me how it follows from "decrease of uncertainty contributions of structure-related parameters on the temperature gradient" to the stress-gradient hypothesis which states where the physical environment is relatively benign (harsh) competitive (facilitative) interactions should be the dominant structuring forces. Water, mortality, establishment and photosynthesis parameters' uncertainty contributions also increase on the temperature gradient, which seem to indicate more competitive interactions to me.
Author's Response: We have now elaborated more on these questions in the manuscript and guided the reader towards our thought process. The entire paragraph now reads the following: "Interestingly, our results of decreased uncertainty contributions of structure-related parameters and increased contributions of environmental drivers on the temperature gradient (Fig. 4) also seem in line with the stress-gradient hypothesis (Maestre et al., 2009), an empirically-observed pattern which states that in stressful environments, positive interactions should occur more often than in benign environments (e.g. Callaway, 2007). For the ecosystem that we consider, we interpret increasing temperature as increasing stress (e.g. Ruiz-Pérez and Vico, 2020), and structure as the best indicator for competitive interactions as the structure dictates resource allocation (e.g. bigger crown, but identical stem diameter leads to more photosynthesis; more sapwood to heartwood turnover requires less NPP). With this interpretation, one would conclude that under increasing stress, the importance of competition-related parameters decreases in the model, as expected from the stress-gradient hypothesis. We acknowledge that a fair amount of interpretation is needed to arrive at this conclusion, and we do not claim that this result lends evidence to the empirical discussion about the generality of the stress-gradient hypothesis, but we find it noteworthy that such a large-scale pattern emerges in the model from lower-level processes, without having been imposed (see also Levin, 1992)."
378: Unless it was an artefact of the analysis protocol (i.e. lines 341-342).
Author's Response: We have erased this topic from the manuscript, as we wanted to concentrate on the stress-gradient hypothesis and discuss this topic in more depth.
385: After conducting the analyses, (I know they say their results are robust to these choices but) were the authors happy with the uncertainty characterizations they initially came up with? I.e. do they still think these would be their best guess at this point or were there any parameters/drivers in particular that they wished they had varied/treated differently, for future studies?
Author's Response: We have no reason to think that our choices were bad, but we also would not know how to judge that from the analysis we made. In effect, the results about UA are contingent on the uncertainties that are assumed, but there is no way to prove from the results of our study whether these assumptions were good.
We realize, however, that our methods to quantify uncertainties were crude, in particular regarding the number of experts involved in the study. We do not think that this is a major limitation of the study, as we expect that experts should at least qualitatively agree about relative uncertainties, but including more experts and their opinion about plausible parameter ranges, for example, would have allowed us to more exactly reflect the current view of the community about the uncertainty in each parameter.
We added a sentence about this into the text.
have not been included in earlier SA/UA studies of LPJ-GUESS. While the process of nitrogen is mentioned and discussed, no insights are given regarding the management modules. An attentive reader can find out in Table 1 that two management-related parameters were considered in the analysis (cutinterval and thinning_intensity), which are however not related to any sensitivity/uncertainty according to Fig. 1&2. Please extend on the changes in model structure and include and discuss the findings in the discussion section.
Author's Response: We are unsure why the reviewer says these parameters are "not related". The parameters certainly appear in Fig. 1/2, but it is true that they are not particularly sensitive.
Upon finding that these parameters are not sensitive, one could of course argue post-hoc that the update of the SA compared to previous studies appears of lesser importance; however, this could have been known only after conducting the study, so testing the sensitivity (and possible interactions) of these new parameters is necessary to know if the old results are still valid. In that sense, we see no problem with the motivation of our study.
It is true though that we had not put much emphasis on the sensitivity to management so far. We now elaborate on the changes in the management module of the model and included a section in the discussion about it.
2) Drivers vs parameters: It is not at all surprising that the environmental drivers contributed most uncertainty and had the highest sum of interactions in a climate-sensitive dynamic vegetation model such as LPJ-GUESS. These drivers are inputs to all important processes in the model (i.e., primary production/growth, plant biogeography, soil hydrology, C exchange, etc.). Moreover, the variation ranges deduced from the different climate change scenarios are considerable. Hence, all the climate change simulation studies with LPJ-GUESS build on the sensitivity of the model to the climatic drivers. I can see that the added value of this study is that the uncertainty contributions can be attributed to the individual climatic drivers and analyzed across a temperature gradient. Yet, I think the authors should make really clear that parameters and drivers have different roles in process-based models. At least to me, it appears a bit as if you are comparing apples and oranges. In view of the potential insights for the LPJ-GUESS modelling community, a separate analysis of both would show patterns much better (right now, 'drivers' mask all other changes in parameters in Fig. 3, which is not really informative).
Author's Response: We fully agree with the reviewer that drivers and parameters are conceptually different. This is why we refer to them as drivers and parameters throughout the manuscript, and never call temperature, for example, a model parameter. We have added a few sentences in the discussion to clarify this: "As the model is sensitive to parameters and environmental drivers, and because these influence each other, we treated them in a combined sensitivity and uncertainty analysis (Saltelli et al., 2019), however, when interpreting it should be kept in mind that the one group relates to uncertainties in the model, while the other is external, so the two are conceptually very different." That being said, from the point of view of a sensitivity or uncertainty analysis, both are factors that the model is sensitive to, or that can contribute to uncertainty, and that should therefore be included in an SA / UA. The need to include as many factors as possible to produce valid SAs/UAs is explicitly highlighted by Saltelli et al., who stress in chapter "5.4. Recommendations for best practice":
"Both uncertainty and sensitivity analysis should be based on a global exploration of the space of input factors, be it using an experimental design, Monte Carlo or other ad-hoc designs."
We would therefore argue that we follow the best advice in the literature, and that it would be counterproductive to make any changes to the methods.
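As an illustration of why a joint, global exploration of parameters and drivers matters, a minimal sketch follows. The toy model is a pure interaction between one hypothetical parameter and one hypothetical driver; it is an assumption made for illustration only and does not represent LPJ-GUESS or the sampling design of the manuscript. A one-at-a-time perturbation around the reference point finds no effect at all, whereas Monte Carlo sampling of both factors together reveals a clearly non-zero output variance.

```python
import random


def toy_model(param, driver):
    """Hypothetical output that depends only on the parameter-driver interaction."""
    return param * driver


# One-at-a-time (OAT) perturbation around the reference point (0, 0):
# each factor is varied over its full range while the other stays fixed.
oat_param_effect = toy_model(1.0, 0.0) - toy_model(-1.0, 0.0)
oat_driver_effect = toy_model(0.0, 1.0) - toy_model(0.0, -1.0)

# Global exploration: sample both factors jointly from their assumed ranges.
rng = random.Random(42)  # fixed seed so the sketch is reproducible
samples = [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for _ in range(10000)]
outputs = [toy_model(p, d) for p, d in samples]

mean_output = sum(outputs) / len(outputs)
output_variance = sum((y - mean_output) ** 2 for y in outputs) / (len(outputs) - 1)

print("OAT effects:", oat_param_effect, oat_driver_effect)
print("Variance under joint sampling:", round(output_variance, 3))
```

For two independent factors that are uniform on [-1, 1], the variance of their product is 1/9 in expectation, so the sampled variance should be close to that value even though both OAT effects are zero.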
3) I find it surprising that there are hardly any differences of the relative uncertainties across the environmental zones in Fig. 3. For example, the different species do not seem to show species-specific uncertainties to water-related parameters across space. I would also expect that not all species are able to grow in all environmental zones, which should somehow become visible at range limits. In a subsequent analysis, the authors focused on the uncertainty contributions across a mean annual temperature gradient (Fig. 4). In line with Fig. 3, the changes in uncertainty contributions are rather small (e.g., approx. 11% for temp between a 5° and a 20° site; e.g. Southern Sweden vs. Southern Spain). I wonder, whether these changes are statistically significant. Could you please provide some details (e.g., plots of the linear regressions including simulated TSB and R2). Given these results, I got the impression that the authors oversold the results in this regard (L23-27; L371-380).
Author's Response: First, regarding the concern that not all species should be able to grow in all environmental zones: we are not sure about the entire environmental zone, but it is indeed so that some species were not able to grow on some plots. If that was so, the species was excluded from the analysis for the given plot. This was noted in the methods section: "To calculate mean sensitivities/uncertainties for each species, we averaged site-specific sensitivities over all sites with an average annual biomass production greater than 2 tC/ha. We have chosen this threshold because smaller values indicate that the environment is not suitable for the species, however, for each site at least one species was able to establish." Moreover, note that we also varied the bioclimatic limits in the SA, meaning that boundaries for where species can grow are likely softer than if there would be a fixed parameter.
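The site filter described in the quoted methods text can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the manuscript: the record layout, the site labels and all numbers are hypothetical, and only the 2 tC/ha threshold comes from the text above.

```python
THRESHOLD_TC_PER_HA = 2.0  # threshold quoted from the methods section

# Hypothetical site-level results: one record per site and species.
site_results = [
    {"site": "A", "species": "Fag. syl.", "production": 3.1, "sensitivity": 0.40},
    {"site": "B", "species": "Fag. syl.", "production": 0.5, "sensitivity": 0.90},
    {"site": "C", "species": "Fag. syl.", "production": 2.5, "sensitivity": 0.20},
    {"site": "A", "species": "Pin. syl.", "production": 4.0, "sensitivity": 0.10},
    {"site": "B", "species": "Pin. syl.", "production": 2.2, "sensitivity": 0.30},
]


def mean_sensitivity_by_species(records, threshold=THRESHOLD_TC_PER_HA):
    """Average site-specific sensitivities over sites above the production threshold."""
    totals = {}
    for record in records:
        if record["production"] > threshold:
            running_sum, count = totals.get(record["species"], (0.0, 0))
            totals[record["species"]] = (running_sum + record["sensitivity"], count + 1)
    return {species: s / n for species, (s, n) in totals.items()}


mean_sens = mean_sensitivity_by_species(site_results)
for species, value in mean_sens.items():
    print(species, round(value, 3))
```

In this made-up example, site B is dropped for Fag. syl. because its production falls below the threshold, so the species mean is taken over sites A and C only.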
Regarding the concern of the reviewer that changes across the environmental zones are small: we would argue it is somewhat subjective if a 10% change is small or not. As the reviewer notes in point 2, LPJ-GUESS is climate sensitive, so it would be very surprising if temperature would affect model results in Scandinavia, but not in Spain. Note that what we display here are changes in percentage points in uncertainty contributions.
Moreover, we also vary the bioclimatic limits of each species and thus species basically have at least some parameter combinations for which they then can establish. Putting these two things together, the magnitude of differences in relative contributions across environmental zones does not seem surprising to us.
As requested by the reviewer, we now also report summaries of the fitted regressions (p-values and R2).
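For readers who want to see what such a regression summary contains, a minimal sketch of an ordinary least-squares fit of an uncertainty contribution against mean annual temperature follows. The data points are invented for illustration and are not results from the manuscript. The sketch reports the slope, R2 and the t-statistic of the slope; the p-value would follow from that t-statistic with n - 2 degrees of freedom using a t-distribution from a statistics library, which is deliberately left out here.

```python
import math

# Hypothetical data: mean annual temperature (deg C) and uncertainty contribution (%).
temperature = [5.0, 10.0, 15.0, 20.0]
contribution = [10.0, 14.0, 15.0, 21.0]

n = len(temperature)
mean_x = sum(temperature) / n
mean_y = sum(contribution) / n

# Sums of squares and cross-products around the means.
sxx = sum((x - mean_x) ** 2 for x in temperature)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(temperature, contribution))
syy = sum((y - mean_y) ** 2 for y in contribution)

slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual sum of squares and coefficient of determination.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(temperature, contribution))
r_squared = 1.0 - sse / syy

# Standard error and t-statistic of the slope (n - 2 degrees of freedom).
slope_se = math.sqrt(sse / (n - 2) / sxx)
t_statistic = slope / slope_se

print("slope:", round(slope, 3), "R2:", round(r_squared, 3), "t:", round(t_statistic, 2))
```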

4) Discussion:
The discussion section needs considerable improvement. Various topics are mentioned, but there is very little substance and added value to many of the raised points (e.g., often only one sentence mentions the importance of a process and refers to a study that found similar effects). A better selection of the critical points and an in-depth discussion of these issues would greatly improve the manuscript.
Author's Response: We agree with the reviewer that we have a lot of topics to discuss in the manuscript. We have now focused on the most relevant topics which we discuss in greater detail.
Moreover, the discussion should be better embedded in the existing body of literature, both regarding model-based and empirical/physiological studies, especially if the authors keep the comparison with empirical results as one of the four main objectives of this study (L102-103). Thereby, please make clear whether the reference you refer to is a model-based study or a field study.
Author's Response: We have now carefully revised the discussion to make clearer in which context this study is important and how it connects to the current literature.
Also, please be careful with the wording, e.g., a positive or negative effect of a parameter or driver can be explained by the fact that a certain ecological effect (which has been found by empirical studies) is integrated in the model formulation. Such an effect can be 'in line with empirical studies' but it cannot prove an effect as LPJ-GUESS is just a model.
Author's Response: We apologize if our wording gave the impression to the reviewer that we think that we prove an ecological effect by finding it in LPJ-GUESS. This was never our interpretation.

What we intended to do when comparing LPJ-GUESS behavior to ecological hypotheses is to show that the model behaves similarly to what is reported or at least conjectured based on field observations.
We hypothesize that the model should behave similarly to such field observations, and if it does, we primarily interpret this as evidence for the fact that the model behavior is plausible. We have carefully revised the manuscript to only make claims that are supported by this study.
Please also discuss processes, which turned out to be related to low sensitivities/uncertainties according to your simulation setup (e.g., establishment, management). Currently the discussion only considers the processes that turned out to be important, but it lacks an explanation why these patterns occur. For instance, no effect was found for management. This definitely needs explanation. Or, only small effects were found for establishment. Can this be explained by the spin-up and what are the implications for other simulation setups?
Author's Response: This is a good point. We have now dedicated a paragraph to a more in-depth discussion of the results regarding the newer modules and modules of low sensitivity in LPJ-GUESS.
"Second, small sensitivities of establishment related parameters are surprising as we know that not all three investigated species can effortlessly establish across all of Europe, e.g. Fag. syl. can only establish on locations with no extreme drought and heat and no extreme winter frosts (Bolte et al., 2007). Thus, either we missed important parameters of this module, or the parametrization of the model needs to be updated."
Given the motivation the authors outline in the introduction, it would be worth covering the following points: How are your findings relevant for recent and upcoming studies using the LPJ-GUESS model? What do your results imply with regard to further model development efforts? How could the robustness and reliability of the model projections be increased?
Author's Response: We agree that model development is a motivation for our analysis, but it is by far not the only one. In general, we see at least 3 purposes for an SA / UA. We have slightly modified the discussion to convey these ideas. "Our study helps to guide the model application, discussion of uncertainties and model development of LPJ-GUESS and other DGVMs. First, future model applications and model comparisons should focus on mortality as these processes contribute high uncertainties for carbon-related projections (see Fig. 1-3). Thereby, it should be investigated if these uncertainties stem from the intra-specific variability of the parameters themselves (Bolnick et al., 2011), if parameters are just not identifiable (see Marsili-Libelli et al., 2014), or if a model data comparison could reduce uncertainties in the parameters (e.g. Hartig et al., 2011). Using time series inventory data might help as it is informative for constraining mortality modules (Cailleret et al., 2020)."

Minor comments
Abstract L23-27: The conclusions are strongly focusing on the stress-gradient hypothesis, which was not the main focus of this study. I am convinced that there are other, more general conclusions you could make. For example: How are your findings relevant for recent and upcoming studies using the LPJ-GUESS model? What are conclusions that also consider your findings regarding the model parameters? What do your results imply for further model development?
increasing stress, the importance of competition-related parameters decreases in the model, as expected from the stress-gradient hypothesis. We acknowledge that a fair amount of interpretation is needed to arrive at this conclusion, and we do not claim that this result lends evidence to the empirical discussion about the generality of the stress-gradient hypothesis, but we find it noteworthy that such a large-scale pattern emerges in the model from lower-level processes, without having been imposed (see also Levin, 1992)."

L382-400: well written
Author's Response: Thanks

Figures and Tables
Table 1: Could you order the parameters by Group for better readability?
Author's Response: Done.
Fig. 1: The species result symbols are very small, please improve. The sensitivities of the three species are quite different from each other for some parameters (also different directions occur!). Please use the same y-axis for better comparability between the three outputs (also for Fig. 2).
Author's Response: We have enlarged the species symbols and used the same y-axis.
Author's Response: We have now colored the bars according to the processes and moved the labels more into the middle.