the Creative Commons Attribution 4.0 License.
Deep Dive into Global Hydrologic Simulations: Harnessing the Power of Deep Learning and Physics-informed Differentiable Models (δHBV-globe1.0-hydroDL)
Abstract. Accurate hydrological modeling is vital to characterizing how the terrestrial water cycle responds to climate change. Pure deep learning (DL) models have been shown to outperform process-based ones while remaining difficult to interpret. More recently, differentiable, physics-informed machine learning models with a physical backbone have been shown to systematically integrate physical equations and DL, predicting untrained variables and processes with high performance. However, it was unclear whether such models are competitive for global-scale applications with a simple backbone. Therefore, we use, for the first time at this scale, differentiable hydrologic models (full name δHBV-globe1.0-hydroDL, abbreviated δHBV) to simulate the rainfall-runoff processes for 3753 basins around the world. Moreover, we compare the δHBV models to a purely data-driven long short-term memory (LSTM) model to examine their strengths and limitations. Both LSTM and the δHBV models provide competent daily hydrologic simulation capabilities in global basins, with median Kling-Gupta efficiency (KGE) values close to or higher than 0.7 (and 0.78 with LSTM for a subset of 1675 basins with long-term records), significantly outperforming traditional models. Moreover, regionalized differentiable models demonstrated stronger spatial generalization ability (median KGE 0.64) than a traditional parameter regionalization approach (median KGE 0.46) and even LSTM for ungauged region tests in Europe and South America. Nevertheless, relative to LSTM, the differentiable model was hampered by structural deficiencies in cold or polar regions, highly arid regions, and basins with significant human impacts. This study also sets the benchmark for hydrologic estimates around the world and builds foundations for improving global hydrologic simulations.
Status: final response (author comments only)
RC1: 'Comment on gmd-2023-190', Anonymous Referee #1, 06 Nov 2023
This study presents a simulation using a differentiable model at 3753 basins globally. It represents an advance in the field and is thus worthy of being published somewhere. However, the used model and the design of the experiments are almost the same as those published in the authors' previous papers. Readers of GMD would expect more progress in those aspects. Thus, I suggest a major revision.
Major comments
- The authors should introduce more details of the experimental design, such as the metrics used in the training, how many experiments (temporal generalization, PUB, and PUR; correct me if I am wrong), and the purpose of the experiments. Some details may have been presented somewhere else. I find this manuscript difficult to follow without reading the authors' previous publications.
- L124: what are the criteria for the selection? Can you describe the erroneous cases? Do the erroneous cases include the data processing error described in L350?
- L306-L317: Can you discuss more about comparing the traditional regionalization method and PUR? The PUR regionalization method can utilize a large number of observations to calibrate/train the differentiable model, whereas the traditional method can only use very limited samples. In other words, PUR may have a much higher chance of finding the optimal parameters than the traditional method.
Minor comments
- Title: the phrase, global hydrologic simulations, is misleading. The simulations are conducted at 3753 basins across the globe. It represents a concept different from the "global hydrologic simulations."
- L127: is the classification from Beck et al., 2020b?
- L26 & L212: How is the subset of 1675 basins selected? What is the objective of the selection?
- L223: can you describe more about the structural issues? Why does the explicit solution scheme introduce numerical errors here?
- L291-L293: please rewrite the sentence. It is difficult to read.
- L398, "the underrepresentation of the processes...": this conclusion is too general. The difficulty of representing arid/polar/anthropogenic processes is known before reading this paper. The conclusion should be specific.
Citation: https://doi.org/10.5194/gmd-2023-190-RC1
AC2: 'Reply on RC1', Chaopeng Shen, 17 Jan 2024
CEC1: 'Comment on gmd-2023-190', Juan Antonio Añel, 19 Nov 2023
Dear authors,
I have checked the "Code and Data Availability" section in your manuscript. I would like to point out that the sentence saying that an updated version of the code and data will be available upon acceptance is misleading. Obviously, if your manuscript is accepted for publication, you have to publish it with the most updated code and data, and we do not accept "upon acceptance" statements. Right now, your statement seems to imply that you have not published your code and data in advance, however, they are included in the Zenodo repository. I would like to ask you to modify the sentence to avoid confusion.
Also, please clarify which is the PyTorch version that you have used for your work.
Regards,
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/gmd-2023-190-CEC1
AC1: 'Reply on CEC1', Chaopeng Shen, 08 Jan 2024
Thanks. The code itself was indeed published previously. It is applied to a new (larger) dataset to yield new insights. This model was not previously applied on the global scale. We will clarify it in the revised manuscript. Thanks.
Citation: https://doi.org/10.5194/gmd-2023-190-AC1
RC2: 'Comment on gmd-2023-190', Anonymous Referee #2, 10 Jan 2024
Overall, the manuscript is well-written with a clear research objective, innovative hybrid models, and solid results. The study compared the performance of the commonly used LSTM model with two differentiable hydrological models (static and dynamic parameters) in a temporal generalization experiment and conducted a comparative analysis for the traditional “prediction in ungauged basins” problem. The model evaluation results provide valuable insights into improving the mechanism of the hydrologic models, confirming the strong localization and extrapolation capabilities of differentiable models. Below are some key comments and concerns:
L161-166: From my understanding, the original parameter calibration process of the HBV model has been replaced by the parameter calibration process of the gA neural network. If this is the case, it is still necessary to run the HBV model. How does this approach compare to the traditional hydrological model calibration methods in terms of modeling speed? Has it resulted in time and labor savings in the modeling process?
L200-205: Employing FHV and FLV is a thoughtful model evaluation strategy. In contrast to relying solely on integrated metrics like KGE, FHV and FLV offer a more thorough evaluation of model performance. Has the author considered whether integrated metrics such as KGE are suitable for capturing the distribution characteristics of state variables? Additionally, is the loss function used during the neural network parameter determination suitable for the data's distribution characteristics? This is crucial, as KGE includes a term related to Pearson correlation coefficient, which may not be applicable to distributions beyond normal. Such considerations are essential to avoid potential misguidance in the model calibration process.
L208-212 & Figure 3: Are the evaluation results corresponding to the training set, the testing set, or the overall results from both? The assessment outcomes should be provided separately for the training set and the testing set to enable a clearer evaluation of the model's performance and identification of any issues within the model.
L222-223: The structural issues might explain the errors with peak flow in differentiable models. However, it raises a question as to why LSTM also exhibits errors in peak flow. As mentioned earlier, the utilization of inappropriate evaluation metrics could contribute to the low FHV across all models. It requires a more in-depth consideration of how metrics impact the calibration results.
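For readers following the metric discussion above, the sketch below computes KGE in its commonly used form, 1 − sqrt((r − 1)² + (α − 1)² + (β − 1)²), where r is the Pearson correlation between simulated and observed flow, α is the ratio of their standard deviations, and β is the ratio of their means. It is an illustration only, not the manuscript's code: the function name `kge`, the NaN handling, and the sample values are hypothetical, and the manuscript may use a different KGE variant or a different training loss.

```python
import numpy as np


def kge(sim, obs):
    """Kling-Gupta efficiency for paired simulated and observed series.

    Hypothetical helper for illustration; pairs containing NaN are dropped.
    """
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)

    # Keep only time steps where both values are available.
    valid = ~(np.isnan(sim) | np.isnan(obs))
    sim, obs = sim[valid], obs[valid]

    r = np.corrcoef(sim, obs)[0, 1]   # linear (Pearson) correlation
    alpha = sim.std() / obs.std()     # variability ratio
    beta = sim.mean() / obs.mean()    # bias ratio

    return 1.0 - np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


# Made-up, right-skewed flow values (one large peak), as is typical of streamflow.
obs = np.array([1.0, 2.0, 3.0, 4.0, 10.0])

perfect = kge(obs, obs)        # all three terms vanish, so KGE is 1
doubled = kge(2.0 * obs, obs)  # r stays 1, but alpha and beta both become 2
print(perfect, doubled)
```

Because r, α, and β are all moment-based, a few large peaks in a skewed flow record can dominate each term, which is one reason segment-based metrics such as FHV and FLV are reported alongside KGE.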
Citation: https://doi.org/10.5194/gmd-2023-190-RC2
AC3: 'Reply on RC2', Chaopeng Shen, 17 Jan 2024
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
645 | 319 | 25 | 989 | 17 | 16 |
Cited
3 citations as recorded by Crossref.
- Identifying Structural Priors in a Hybrid Differentiable Model for Stream Water Temperature Modeling, F. Rahmani et al., doi:10.1029/2023WR034420
- On the need for physical constraints in deep learning rainfall–runoff projections under climate change: a sensitivity analysis to warming and shifts in potential evapotranspiration, S. Wi & S. Steinschneider, doi:10.5194/hess-28-479-2024
- A comprehensive study of deep learning for soil moisture prediction, Y. Wang et al., doi:10.5194/hess-28-917-2024