Continental-scale evaluation of a fully distributed coupled land surface and groundwater model ParFlow-CLM (v3.6.0) over Europe
- 1Institute of Bio- and Geosciences Agrosphere (IBG-3), Forschungszentrum Jülich GmbH, Jülich, Germany
- 2Bureau of Meteorology, Melbourne, Australia
Abstract. High-resolution large-scale predictions of hydrologic states and fluxes are important for many multi-scale applications including water resource management. However, many of the existing global to continental scale hydrological models are applied at coarse resolution and/or neglect lateral surface and groundwater flow, thereby not capturing smaller scale hydrologic processes. Applications of high-resolution and more complex models are often limited to watershed scales, neglecting the mesoscale climate effects on the water cycle. We implemented an integrated, physically-based coupled land surface groundwater model, ParFlow-CLM version 3.6.0, over a pan-European model domain at 0.0275° (~3 km) resolution. The model simulates three-dimensional variably saturated groundwater flow solving the Richards equation and overland flow with a two-dimensional kinematic wave approximation, which is fully integrated with land surface exchange processes. A comprehensive evaluation of hydrologic states and fluxes, resulting from a 10-year (1997–2006) model simulation, was performed using in-situ and remote sensing observations including discharge, surface soil moisture (SM), evapotranspiration (ET), snow water equivalent and water table depth. Overall, the uncalibrated PF-CLM-EU3km model shows good agreement in simulating river discharge for 176 gauging stations across Europe. Comparison with satellite-based datasets of SM shows that PF-CLM-EU3km performs well in semi-arid and arid regions, but simulates overall higher SM in humid and cold regions. Comparisons with Global Land Evaporation Amsterdam Model (GLEAM) and Global Land Surface Satellite (GLASS) ET datasets show no significant differences, both across the European domain (on average the difference is -0.09 and 0.30 mm d⁻¹ for GLEAM and GLASS products, respectively) and within regions (R > 0.9).
The large-scale high-resolution setup forms a basis for future studies, demonstrating the added value of capturing heterogeneities for improved water and energy flux simulations in physically-based fully distributed hydrologic models over very large model domains. This study also provides an evaluation reference for climate change impact projections and a climatology for hydrological forecasting, considering the effects of lateral surface and groundwater flows.
Bibi S. Naz et al.
Status: open (until 01 Sep 2022)
RC1: 'Comment on gmd-2022-173', Anonymous Referee #1, 24 Jul 2022
The model evaluation paper of Naz et al. describes a version of the established ParFlow-CLM model applied over Europe and evaluates its hydrological components.
ParFlow-CLM is an established modeling tool, and a publication of a model evaluation paper that builds a foundation for future scientific use is certainly something I would like to support. Unfortunately, in the current stage, the manuscript does not deliver on this goal and seems to purposely hide model shortcomings. In the current version, I can only suggest significant revisions.
I would like to focus on two aspects that are currently flawed. Firstly, the paper's motivation could be much clearer from the beginning. In the light of the many publications that already exist on ParFlow and CLM, what is the added value of this model evaluation paper? What is the model's purpose within the range of continental and global models? What questions can it help to answer? Outlining this much more clearly from the beginning will be helpful for the scientific community in making this publication a helpful reference for future research.
Secondly, I cannot accept the current evaluation of the groundwater component. The authors use groundwater in the title and motivate the model's usefulness with the argument of an active groundwater component but provide a not convincing evaluation. I do not expect the model to be able to perfectly represent the water table. Still, I think we can only progress if we are open about our models' shortcomings and clearly communicate uncertainties. Poor model performance is not a reason for not publishing something as long as there is a proper discussion on the causes. Currently, the paper is not doing that and uses oversimplified evaluation methods to obfuscate the actual model behavior. Furthermore, existing literature and models are omitted as well.
Additional notes:
* While I know that it is difficult to find a repository to host a large amount of data, I implore the authors to consider whether selected model outputs could be made available in the spirit of Open Science principles!
* Is it really necessary to use the overcomplicated PF-CLM-EU3km as a name? Why not stick with ParFlow-CLM in the paper? If it is a very different model, why is that not the name used in the title?
L. 1: How are these large-scale models useful for water resource management? I see how they are helpful for large-scale policy and fostering scientific understanding but are they really useful for management? Please also define what high-resolution means in brackets - people have very different interpretations about that, and it is changing fast.
3: How is the coarse spatial resolution linked to the lateral fluxes and groundwater components - isn't that mixing up things? What small scale processes specifically?
4: what does more complex refer to? Complex in what regard?
11: what is PF-CLM-EU3km? It has not been introduced; quantify good agreement
17: this is the first time heterogeneities are mentioned. Is it implied that this is a result of the higher spatial resolution? This should be explained.
Fig. 1 c) WTD in log scale without indicating what red is. Is that deeper than 100 m? How deep is it? Why is the WTD so deep near larger rivers? Why so shallow in mountainous regions? What is the reasoning here why this is plausible? Is it plausible in the light of the performance of other large-scale models?
415: I get the problem of inconsistent WTD elevation data. Still, this should be solvable for at least some regions in Europe. I feel that the authors feared that the model performance would be judged too harshly. Whatever the reason, the solution shown here is not acceptable. Furthermore, you can't simply select only the cells that simulate WTD < 10!! This is the range in which almost all models do a good job. This is not advancing our science. This is far from ok.
Please show how much the model deviates from observations. You motivate your paper with the statement that representation of groundwater is essential and then skip a proper evaluation of your model.
I suspect it will not perform perfectly - no large-scale model currently can, and you are providing some reasonable answers by referring to Gleeson et al., 2021, which is good but not enough. Please provide a more extensive discussion on how the performance differs from other existing research.
417: ?? = Fig. 7
RC2: 'Comment on gmd-2022-173', Anonymous Referee #2, 03 Aug 2022
This manuscript is an implementation of the ParFlow-CLM at high resolution (3 km) focused upon the European domain. The validation of the model performance is a wide-ranging analysis based upon remotely sensed soil moisture, and ET, as well as ground-based data products of soil moisture, SWE, ET, groundwater depth, and streamflow. It is generally well written, although there is a lack of focus in the key findings. The authors attributed deviations from observed site level behavior (e.g. positive SM and ET bias) primarily to uncertainties with the incoming atmospheric forcing. However, it seems likely that uncalibrated parameters could have just as easily led to these biases.
The authors motivate the analysis by claiming high spatial resolution combined with a representation of lateral groundwater flow is necessary for improved region-wide prediction of hydrological variables. However, this reviewer did not find compelling evidence to demonstrate these assertions from this analysis alone, partly because the model skill was not put in the context of other simulations. For example, implementing a coarse version of ParFlow-CLM, or a version without lateral groundwater flow, could have better demonstrated these points.
This manuscript is, in fact, complementary to a similar implementation of ParFlow-CLM for the CONUS domain (O’Neill et al.). Yet, the authors do not fully address this point until late in the conclusions, and miss an opportunity to provide a more rigorous comparison between the CONUS and European domain performance with ParFlow-CLM.
It is challenging to evaluate this manuscript because in one sense the methods behind the model implementation and evaluation are useful to the LSM or hydrology community. This validation approach (use of statistics based on comparison to RS and site-based observations) could be used as a template for benchmarking other models. Furthermore, this ‘evaluation of a previously published model’ does fulfill one of the criteria for publication in GMD. On the other hand, the comparison between the model simulation and remotely-sensed and ground based observations lacked a clear focus. Detailed comments are below.
Line 21: It is a bit confusing what the authors mean by high resolution hydrological modeling, and large-scale hydrologic processes. Better quantification?
Line 30: LSM’s are also used commonly for carbon and nitrogen cycling research. Both LSM’s and GHMs solve water balance equations.
Lines 40-50: The author seems to be conflating two things: issues of spatial resolution, or issues related to physical processes. It is true a coarse scale model will not capture fine scale hillslope topography which could be important for watershed scale studies, but is this necessary for global scale climate models?
Line 77: You need to spell out remote-sensing (RS) the first time you use it.
Line 90: What is the difference between Parflow-CLM, PF-CLM and PF-CLM-EU3km?
Line 97: Renaming a model to PF-CLM-EU3km usually means you have changed the model equations/structures/parameterizations. I don’t think the authors do that here; it is simply the PF-CLM or ParFlow-CLM model run over a certain spatial domain (Europe) and at 3 km resolution. A ‘new’ model hasn’t been designed or developed.
Section 2.0.2 It is completely unclear what is novel about your implementation of ParFlow-CLM other than the domain and resolution. This seems like a model application and not novel development.
Line 134: Not clear what ‘inscribing’ into the Eur-11 grid means.
Line 144: CLM3.5 is from the Community Land Model, different than the Common Land Model (CLM) described here within ParFlow-CLM.
Section 2.0.4 It seems unlikely that nine years of spin-up would be enough to reach equilibrium between the prescribed vegetation conditions and the subsurface soil moisture state. Did the authors check that the hydrological variables approached an equilibrium? It is also unusual to spin up with a single year (1997); you would want to spin up over several years (a decade if possible) to capture variation in the meteorological forcing.
Line 269: “Because of the explicit lateral groundwater and surface flow representation, we show that the PF-CLM-EU3km model is able to resolve multi-scale spatial variability in hydrological states and fluxes such as simulated river flow, SM, ET and WTD distributions which are strongly correlated with the river network and topography as shown in Fig. 1.”
I am not sure I found any evidence of this causal relationship.
Line 339: “The difference is explained by the shallow groundwater system simulated only by PF-CLM-EU3km, which contributes to the saturation of the deeper soil layers leading to higher soil water content, whereas the standalone CLM3.5 model applies a simple approach to simulate groundwater recharge and discharge processes in a single column and neglects explicit lateral groundwater flow.”
It appears here that the authors are attempting a comparison between CLM3.5 (the Community Land Model), which was used as the LSM to develop the ESSMRA product, and PF-CLM-EU3km, claiming that the differences in SM can be accounted for by differences in the treatment of lateral groundwater flow. This is a complicated comparison for many reasons, one of them being that the ESSMRA product incorporates the ESA-CCI ‘observations’, whereas PF-CLM-EU3km does not. This is not a controlled comparison that would support the claim that lateral groundwater flow is the cause of the differences.
It's also extremely confusing that CLM3.5 (Community Land Model) is not the same as the “CLM” (Common Land Model) in PF-CLM-EU3km.
Figure 4: Not clear what we can hope to learn by comparing 3 separate SM products against each other. Would it not be more helpful to compare the performance of the SM products against in-situ site ISMN observations? I see that this comparison is pushed to the supplement.
Line 387: “Previous studies of PF-CLM-EU3km also indicate……”
Apparently this exact implementation of this configuration of the CLM ParFlow has been done before? Still failing to see the novelty of the study?
Figure 5: It would be more compelling to show mean seasonal cycles for a sampling of sites (model vs. flux tower ET) across a variety of biomes. Seasonal correlations (as shown) should be strong, just based on phenology of vegetation, as well as increase/decreases in SW radiation. You show regional plots in Figure 6, but running at high resolution grid (3 km) should allow you to make direct comparison to flux tower ET data. It is less compelling to show seasonal variation with GLEAM and GLASS given these are data products.
Line 417: “Figure ??” typos show up a few times in this manuscript.
Line 469: “Our comparison of simulated SWE with observed SWE reveals an overprediction of SWE in the Eastern regions which is more likely to be related to the uncertainties in precipitation.”
I don’t follow how the authors came to this conclusion. Could not biases in SWE be a result of uncertainties in temperature, or from issues with the snow/energy balance model which simulates accumulation and depletion of snowpack? If some sort of evaluation against in-situ site atmospheric observations was performed that could provide more credibility.
Line 481: “The rigorous evaluation of the PF-CLM-EU3km model over Europe together with the recent study by O’Neill et al. (2021) which evaluated model performance over CONUS paves the way towards a global application of fully distributed physically-based hydrologic models.”
This is the first time, at the end of the manuscript, that the authors mention this serves as a companion paper to the CONUS implementation of the same model. This manuscript would have been much more compelling if a comparison in performance had been discussed between the CONUS and EU implementations throughout. Alternatively, the benefit of a high-resolution implementation of this model with lateral subsurface flow could have been quantified against other LSMs at coarse resolution or lacking lateral subsurface flow.
Line 483: “The protocol of evaluation metrics and methods presented in this study and in O’Neill et al. (2021) can be used as a framework to benchmark future PF-CLM-EU3km model implementations to further improve model simulations in the areas that have been identified or to explore the impacts of groundwater on simulated hydrological states and fluxes by comparing with other existing global land surface model applications.”
Again, it would be more compelling if this manuscript performed a direct comparison of performance against the CONUS implementation or existing global land surface model applications to demonstrate improved utility/skill.
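The spin-up question raised above for Section 2.0.4 can be made concrete with a simple convergence check. The following is a minimal sketch under stated assumptions, not the authors' procedure: it assumes one domain-integrated storage value is available per spin-up cycle, and the function names, the 1 % tolerance and the example values are all hypothetical.

```python
def spinup_relative_changes(storage_per_cycle):
    """Relative change of a storage metric between consecutive spin-up cycles."""
    if len(storage_per_cycle) < 2:
        raise ValueError("need at least two spin-up cycles")
    changes = []
    for prev, curr in zip(storage_per_cycle, storage_per_cycle[1:]):
        if prev == 0:
            raise ValueError("storage must be non-zero to compute a relative change")
        changes.append(abs(curr - prev) / abs(prev))
    return changes


def spinup_converged(storage_per_cycle, tol=0.01):
    """True if the last cycle-to-cycle change is below tol (hypothetical threshold)."""
    return spinup_relative_changes(storage_per_cycle)[-1] < tol


# Made-up storage values (arbitrary units), one per repeated forcing year.
example_storage = [100.0, 110.0, 114.0, 114.5]
is_converged = spinup_converged(example_storage)
```

A check of this kind only shows that the state has stopped drifting under the repeated forcing year; it does not address whether that state is representative of multi-year variability.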
RC3: 'Comment on gmd-2022-173', Anonymous Referee #3, 05 Aug 2022
In their manuscript Naz et al. evaluate a pan-European, high-resolution (0.0275°) simulation with the coupled land surface groundwater model ParFlow-CLM, using observations and re-analysis data for streamflow, near-surface soil moisture, evapotranspiration, water table depth and snow water equivalent. In general, the manuscript is well written, the metrics for evaluation seem to be appropriate and the authors go into great detail discussing the potential sources for some of the biases – with respect to possible shortcomings of the model but also of the observational data.
Having said that, there was one aspect of the evaluation that did not fully convince me, namely the evaluation of the simulated water table depths, where only the anomalies were investigated. I understand that it may not be easy to define the reference elevation, but with the sophisticated groundwater fluxes being the key component of the model that sets ParFlow-CLM apart from most LSMs, the authors should really think about discussing a comparison of the absolute values, maybe indicating the uncertainty due to the reference surface elevation. Also, I did not understand why the authors limited their comparison to those points with simulated WTD < 10 m.
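The distinction between anomaly-based and absolute comparisons can be illustrated with a small sketch. It is an illustration only: the water table depth values are made up, the function names are hypothetical, and nothing here reproduces the manuscript's actual evaluation. It shows that a correlation computed on anomalies is blind to a constant offset, which only a metric on absolute values (here the mean bias) reveals.

```python
import math


def mean(values):
    return sum(values) / len(values)


def anomalies(series):
    """Deviations of a series from its own temporal mean."""
    m = mean(series)
    return [v - m for v in series]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long series."""
    ax, ay = anomalies(x), anomalies(y)
    num = sum(a * b for a, b in zip(ax, ay))
    den = math.sqrt(sum(a * a for a in ax) * sum(b * b for b in ay))
    return num / den


def mean_bias(sim, obs):
    """Mean of simulated minus mean of observed values (absolute comparison)."""
    return mean(sim) - mean(obs)


# Made-up WTD series in metres: the simulation matches the observed
# dynamics exactly but sits 20 m deeper.
obs_wtd = [2.0, 2.5, 3.0, 2.5]
sim_wtd = [v + 20.0 for v in obs_wtd]

r_anom = pearson_r(anomalies(sim_wtd), anomalies(obs_wtd))  # perfect correlation
bias = mean_bias(sim_wtd, obs_wtd)                          # 20 m offset
```

Reporting both quantities side by side, with the reference elevation uncertainty stated, would be one way to address the concern without discarding the anomaly analysis.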
However, my main concern is that I found it somewhat difficult to connect the results to the motivation outlined in the (very well written) introduction of the paper. A large part of the latter is focused on the shortcomings of LSMs and GHMs and their (admittedly extremely simple) representation of subsurface processes. So I would have welcomed a comparison between ParFlow-CLM and a CLM version without ParFlow, possibly the one that is part of ParFlow-CLM, or with an LSM that includes some simple parametrization of groundwater flow (e.g. CLM5 [Felfelani et al., 2021]). Furthermore, the authors indicate that the resolution of the model is important, which I am very willing to believe. Yet they do not show how this affects the simulations in the case of their model. Here, a convincing case might have been made by comparing their simulation to the 12 km runs in Shastra et al. (2021). If the authors do not want to include an inter-model or inter-resolution comparison, maybe they could think about a different approach to the paper: as an alternative, the authors could have referred to the study of O’Neill et al. (2021) from the beginning and then set up the paper as a comparison of ParFlow-CLM simulations of Europe and of the CONUS region.
Finally, it is not always easy to estimate whether the model really captures a given variable well or not. E.g., the authors state that the model “appropriately captures the seasonal cycles” of the WTD (l. 419). However, with only 20% of the investigated cells exhibiting an R > 0.5, it is debatable whether or not this is appropriate. Again, it would have been much more straightforward if the simulation had been compared to a different model / resolution and the question would have simply been about better or worse than XYZ. Without such a comparison, I am not sure that all of the claims made by the authors – e.g. “the added value of capturing heterogeneities for improved water and energy flux simulations in physically-based fully distributed hydrologic models over very large model domains” (l. 16 ff) – are substantiated by their results.
Additional comments:
l. 144) (Annoying detail, but) I think that here CLM refers to Community Land Model, while CLM was defined in l. 121 for the predecessor Common Land Model.
l. 171) Why do you loop a single year to force the model? Doesn’t that include the risk of running the model to a non-representative equilibrium state? Also, how did you decide that a 9-year spin-up is enough and how were the states initialized, that a 9-year spin-up is sufficient?
l. 218) What specific data was assimilated?
l. 226) I think it could also be really interesting to compare SM profiles at the stations in addition to the top layer SM.
l. 269 ff & Fig1) As you indicate a strong dependency on topography, could you maybe include a plot of the topography in Fig. 1. Also, why is the SWC so low and the WTD so high right next to the river?
l. 289 f.) In case of the Rhine (gauges 2-5) the model appears to underestimate the discharge quite a bit, would this still be explainable by human impacts? Or could it not point to an underestimation of P-ET?
l. 290) I am not sure that everyone is so familiar with the KGE as to immediately know what the range of values indicates. Could you maybe add a very brief explanation here?
Fig2.) I found it a bit hard to identify the gauges in subplot a, do you think it would be possible to zoom in over the center of the first subplot?
l. 298) I think something went wrong referencing the figure.
Fig3) Could you clarify that the color-code in panel c is the same as in b?
l. 339 ff) How can you be sure that the differences are a result of the different treatment of the lateral groundwater flow? I thought that between CLM3.0 and 3.5 there were also major changes in the terrestrial hydrology – e.g. a TOPMODEL approach to runoff generation and changes to the evaporation calculation?
l. 352) Not Fig. 4c?
l. 353) The R values in subplot 4c go beyond this range.
Fig 4.) When comparing ESACCI and ESSMRA in subplot b, these seem to agree much better than ParFlow-CLM agrees with either of the two datasets. As ESSMRA is the closest to a second model that is shown in the study, one could come to the conclusion that the added complexity of the explicit treatment of groundwater fluxes in ParFlow-CLM does very little to improve the near surface soil moisture. Thus, it would be very helpful if the authors could describe in more detail what was assimilated in ESSMRA, because if it was soil moisture directly then the good agreement between ESACCI and ESSMRA is not very surprising. Otherwise it would be very interesting to understand why the ESSMRA appears to be so much closer to ESACCI.
l. 387) Could this overestimation of ET also be a reason for the underestimation of streamflow in the Rhine?
l. 417) I think something went wrong referencing the figure.
l. 419) Here I was a bit surprised at the comparatively low R values. Given that precipitation is prescribed based on observations and that both streamflow and ET show a much better correlation with the observations, does this indicate that the model is missing something important in the representation of the groundwater dynamics?
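For readers unfamiliar with the Kling-Gupta efficiency (KGE) mentioned in the comment on l. 290, a minimal sketch follows. It assumes the original 2009 formulation, KGE = 1 - sqrt((r - 1)² + (α - 1)² + (β - 1)²), where r is the linear correlation, α the ratio of simulated to observed standard deviation and β the ratio of simulated to observed mean; which variant the manuscript uses is not stated in this discussion. The function name and the example discharge values are hypothetical.

```python
import math


def kge(sim, obs):
    """Kling-Gupta efficiency (2009 form, assumed here); 1.0 is a perfect match."""
    n = len(obs)
    if n != len(sim) or n < 2:
        raise ValueError("sim and obs must have the same length (at least 2)")
    mu_s = sum(sim) / n
    mu_o = sum(obs) / n
    sd_s = math.sqrt(sum((s - mu_s) ** 2 for s in sim) / n)
    sd_o = math.sqrt(sum((o - mu_o) ** 2 for o in obs) / n)
    cov = sum((s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)) / n
    r = cov / (sd_s * sd_o)   # correlation: timing and shape
    alpha = sd_s / sd_o       # variability ratio
    beta = mu_s / mu_o        # mean (bias) ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Made-up discharge values: a simulation that doubles every observed value
# has perfect correlation but alpha = beta = 2, so KGE = 1 - sqrt(2).
obs_q = [1.0, 2.0, 3.0, 4.0]
sim_q = [2.0 * q for q in obs_q]
score = kge(sim_q, obs_q)
```

KGE ranges from minus infinity to 1, and because it combines correlation, variability and bias, a low score can be traced back to whichever of the three components departs most from 1.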
Model code and software
ParFlow Version 3.6.0, Smith et al., 2019, https://doi.org/10.5281/zenodo.4639761