Repeatable high-resolution statistical downscaling through deep learning
- Institute of Hydrology and Meteorology Technische Universität Dresden, Germany
Abstract. One of the major obstacles to designing solutions to the imminent climate crisis is the scarcity of model projections with high spatio-temporal resolution for variables such as precipitation. This kind of information is crucial for impact studies in fields like hydrology, agronomy, ecology and risk management. The datasets with the currently highest spatial resolution on a daily scale for projected conditions fail to represent complex local variability. We used deep learning (DL) based statistical downscaling (SD) methods to obtain daily 1 km resolution gridded data for precipitation in the Eastern Ore Mountains in Saxony, Germany. We built upon the well-established climate4R framework, modifying its base code and introducing DL architectures based on skip connections, such as U-Net and U-Net++. We also aimed to address the known general reproducibility issues by creating a containerized environment with support for multiple GPUs and TensorFlow's deterministic operations. The perfect prognosis approach was applied using the ERA5 reanalysis and the ReKIS (Regional Climate Information System for Saxony, Saxony-Anhalt, and Thuringia) dataset. The results were validated with the VALUE framework. The introduced architectures show a clear performance improvement when compared to previous SD benchmarks. Characteristics of the DL model configurations that promote their suitability for this specific task were identified, tested and discussed. Full model repeatability was achieved when employing the same physical GPU.
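Repeatability in the sense of the abstract means that two runs with the same configuration on the same hardware yield bit-identical results. The following is a minimal sketch of how such a check could look, using only the Python standard library. The `train` function is a hypothetical stand-in for a training run and is not the code of the study, which relies on TensorFlow's deterministic operations inside a container.

```python
import hashlib
import random
import struct


def train(seed, n_weights=8, n_steps=100):
    """Hypothetical stand-in for a training run: all randomness comes from `seed`."""
    rng = random.Random(seed)
    weights = [rng.uniform(-1.0, 1.0) for _ in range(n_weights)]
    for _ in range(n_steps):
        # Pseudo "update" of one randomly chosen weight.
        i = rng.randrange(n_weights)
        weights[i] -= 0.01 * rng.gauss(0.0, 1.0)
    return weights


def fingerprint(weights):
    """Hash the exact binary representation, so any bit-level difference shows up."""
    raw = struct.pack(f"<{len(weights)}d", *weights)
    return hashlib.sha256(raw).hexdigest()


run_a = fingerprint(train(seed=42))
run_b = fingerprint(train(seed=42))  # same seed: expected to match run_a
run_c = fingerprint(train(seed=43))  # different seed: expected to differ

print("repeatable:", run_a == run_b)
print("seed-dependent:", run_a != run_c)
```

Comparing hashes of the raw bytes rather than rounded values is a deliberate choice here: a tolerance-based comparison would hide exactly the small floating-point differences that non-deterministic GPU operations tend to introduce.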
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Journal article(s) based on this preprint
Dánnell Quesada-Chacón et al.
Interactive discussion
Status: closed
RC1: 'Comment on gmd-2022-14', Anonymous Referee #1, 27 May 2022
This paper evaluates a suite of related deep learning models for downscaling precipitation data over the Ore Mountains region in Germany. In addition to evaluating U-Net and U-Net++ architectures with a number of different options, the authors also explore the use of containerization to enable repeatability of the experiment, and provide a Singularity container with the associated code and environment.
This paper is well-written and of interest to the readers of GMD, and the topic is highly relevant and of value to the scientific community. I have only minor revisions to suggest. It's a good paper, and I enjoyed reading it.
Good work!
Suggested revisions:
The manuscript has too many novel abbreviations. I recommend replacing all of the following with their full expansions, both in the text and in figures:
CC - climate change
TF - transfer function
OM - Ore Mountains
EOM - Eastern Ore Mountains? (This abbreviation is never defined)
SN - Saxony
DD - Dresden
I recommend changing the following abbreviations:
IB - international borders - these are obvious from context and can be omitted from the legend
m.a.s.l - meters above sea level - I would call this simply "elevation (m)"
Consider also expanding:
SD - statistical downscaling
DL - deep learning
Although these abbreviations are not unknown, I think the text would read more clearly without them.
Line 107: change "the Fichtelberg with 1215 m.a.s.l and the Kahleberg (905 m.a.s.l.)" to "the Fichtelberg (elevation 1215 m) and the Kahleberg (elevation 905 m)"
---
URLs should be listed as part of a reference, not included in the text of the paper. For example, on line 132, you could simply write "The code needed to recalculate these results can be found on GitHub (Quesada-Chacon, 2022), with all the modifications..."
Please replace the URLs on lines 113, 132, 198, 212, 214, and 217 (and any others I may have missed) with citations.
---
In section 3.2, please add a definition of the Bernoulli-Gamma loss function used.
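For orientation, a Bernoulli-Gamma loss is commonly written as the negative log-likelihood of a mixed distribution: a Bernoulli term for rain occurrence and a gamma density for the amount on wet days. The sketch below assumes that general form; the parameter names `p` (rain probability), `alpha` (gamma shape) and `beta` (gamma scale) are illustrative and are not taken from the manuscript, whose exact definition may differ.

```python
import math


def bernoulli_gamma_nll(y, p, alpha, beta, eps=1e-12):
    """Negative log-likelihood of one precipitation amount y under a
    Bernoulli-Gamma mixture (assumed general form, not the paper's code).

    y     : observed amount (0 for a dry day)
    p     : predicted probability of rain, in [0, 1]
    alpha : predicted gamma shape parameter, > 0
    beta  : predicted gamma scale parameter, > 0
    """
    if y <= 0.0:
        # Dry day: only the Bernoulli term contributes.
        return -math.log(max(1.0 - p, eps))
    # Wet day: log of p times the gamma density evaluated at y.
    log_gamma_pdf = (
        (alpha - 1.0) * math.log(y)
        - y / beta
        - alpha * math.log(beta)
        - math.lgamma(alpha)
    )
    return -(math.log(max(p, eps)) + log_gamma_pdf)


def mean_nll(observed, predicted):
    """Average the loss over days; `predicted` holds (p, alpha, beta) tuples."""
    losses = [bernoulli_gamma_nll(y, *params) for y, params in zip(observed, predicted)]
    return sum(losses) / len(losses)


dry_loss = bernoulli_gamma_nll(0.0, p=0.3, alpha=2.0, beta=1.5)
wet_loss = bernoulli_gamma_nll(1.0, p=1.0, alpha=1.0, beta=1.0)
print(dry_loss, wet_loss)
```

In a network trained with such a loss, the three parameters would be predicted per grid cell and day, so the output layer carries three values per pixel rather than a single precipitation amount.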
---
Line 165: A sentence or two discussing why batch normalization and spatial dropout were included as options (i.e., what effect they have and when one would want to use them) would be valuable to the reader.
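As background to this request, spatial dropout differs from ordinary dropout in that it zeroes entire feature maps instead of individual activations, which suits convolutional layers where neighbouring pixels are strongly correlated. The sketch below illustrates only that idea on nested Python lists; the function name and data layout are assumptions for illustration and do not reflect the manuscript's implementation.

```python
import random


def spatial_dropout(feature_maps, rate, rng):
    """Drop whole channels with probability `rate` (training-time behaviour).

    feature_maps : list of channels, each a 2D list of floats
    rate         : fraction of channels to drop on average, in [0, 1)
    rng          : random.Random instance, passed in so runs are repeatable
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    # Surviving channels are rescaled so the expected activation is unchanged.
    keep_scale = 1.0 / (1.0 - rate)
    out = []
    for channel in feature_maps:
        if rng.random() < rate:
            out.append([[0.0 for _ in row] for row in channel])
        else:
            out.append([[v * keep_scale for v in row] for row in channel])
    return out


rng = random.Random(0)
maps = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(6)]
dropped = spatial_dropout(maps, rate=0.5, rng=rng)

for index, channel in enumerate(dropped):
    print(index, channel)
```

Each output channel is either entirely zero or entirely rescaled, never a mixture, which is the property that distinguishes this scheme from element-wise dropout.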
---
Although many writers use the passive voice in scientific writing, active writing using the first person is much clearer and easier to understand. The authors could improve the paper by switching to active voice throughout; for example, on line 123, replacing "The reanalysis dataset employed as a predictor is ERA5..." with "For the predictor, we used the ERA5 reanalysis..." makes the text much easier to follow. Since this would require extensive editing, I do not expect the authors to make this change, but mention it merely for the sake of future papers.
---
Figure 4:
I think this figure would be easier to understand if the green-to-orange color bar were reversed, with orange indicating lower values (worse performance) and green indicating higher values (better performance).
---
Line 294: rather than giving Lat & Lon coordinates, I think it would be clearer to point to this location as the southeast corner of the region.
---
Minor corrections:
Line 59: change "1 500" to "1500"
Line 120: change "which leads to a number of 1916 pixels" to "giving a region with 1916 pixels"
Line 123: change "with a spatial resolution" to "which has a spatial resolution"
Line 209: I think you want "interface" here, not "interfere"
Line 225: change "functions" to "function"
Line 227: change "improved significantly" to "significantly improved"
Line 252, 289: change "yet," to "although" (no comma)
Line 323: change "observations" to "failures" (not a correction, but clearer)
Line 334: change "Besides" to "In addition"
Line 380: change "Yet" to "However"
Line 500: "https://doi.org/" has been doubled in this URL
- AC1: 'Reply on RC1', Dánnell Quesada-Chacón, 02 Aug 2022
RC2: 'Comment on gmd-2022-14', Anonymous Referee #2, 24 Jun 2022
General comments:
In this paper, the author employed deep learning models to downscale rainfall at a regional scale (1 km) over the Eastern Ore Mountains in Saxony. The author used different deep learning algorithms, including the state-of-the-art U-Net and U-Net++ models, and compared their performance with CNN1 (benchmark). The aim of this work was not only to downscale precipitation but also to explore the repeatability aspect of the downscaling experiment. The findings in this paper are very interesting. In general terms, this paper falls within the scope of this journal, the figures and tables are well organized, and the results are properly discussed. However, the paper needs extensive revision, especially the introduction section.
Specific comments:
Abstract:
The abstract is well written; however, some very interesting findings mentioned in the conclusion could also be included in the abstract.
Introduction:
- The introduction requires an extensive language revision.
- The flow of the paragraphs needs to be adjusted.
- The IPCC report citation is wrong; this is the correct citation (IPCC 2021).
- “2011-2020, being the warmest on record.” and “while 2020 tied with 2016 as the hottest”: something is not correct here. The author needs to check which years are the hottest, (2020 and 2021) or (2020 and 2016).
- “…and simultaneous numerous heat waves”: this sentence should be improved.
- “On a smaller…in Germany in decades.” This sentence is ambiguous and too long. It should be divided into two or three sentences.
- The same remark applies to the second paragraph: the first sentence is too long. The author is advised to rewrite it using short and clear sentences.
- “which depending on the application can be a…”, this does not sound correct.
- “which depending on the application can be a…into climate4R (Iturbide et al., 2019).” The author is advised to rewrite this paragraph.
- The references used in paragraph 4 are old, the author might consider exploring recent papers.
- The section where the term “reproducibility” is explained is too long; the author might consider summarising it.
Data:
- “The raw station data…to Deutsch (1996) for the amounts.” This sentence is not clear.
- “https://github.com/dquesadacr/Rep_SDDL”: this link is not accessible.
- The author indicated that the precipitation dataset was used as a predictand, while several variables were considered from the predictor. In the training phase, shouldn’t the author use the same variable from the predictor and the predictand to train the model?
Methods:
- In the caption of Figure 1, the author didn’t mention which variable is considered to calculate the relative bias. Is it precipitation?
- Table 1. Replace “d” with “day”.
- The links provided are not accessible: https://github.com/dquesadacr/Rep_SDDL, https://bit.ly/dl-determinism-slides-v3
- The focus of this work was on precipitation; however, the author also mentioned that several variables are selected from the predictor (zonal and meridional wind, temperature, geopotential, and specific humidity). How did the author use these variables to downscale precipitation?
- It is advised to add another figure to show the details of the model used (including the resolution of the input and output); the author is referred to Figure 3 in Baño-Medina et al. (2019).
Results and discussion
Minor comments:
- “This could be applied to by …”: this sentence needs correction (line 285).
- Figure 4. On which years are these matrices calculated? Is it the validation period (2010-2015)?
Conclusion
Minor comments:
- The author mentioned that 5 variables from the predictor (ERA5) were used to downscale precipitation; however, in the conclusion, the author stated that 20 variables were used. Which one is correct?
- AC2: 'Reply on RC2', Dánnell Quesada-Chacón, 02 Aug 2022
Data sets
Predictors and predictand for "Repeatable high-resolution statistical downscaling through deep learning". Quesada-Chacón, Dánnell. https://doi.org/10.5281/zenodo.5809553
Model code and software
Singularity container for "Repeatable high-resolution statistical downscaling through deep learning". Quesada-Chacón, Dánnell. https://doi.org/10.5281/zenodo.5809705
dquesadacr/Rep_SDDL: Submission to GMD. Quesada-Chacón, Dánnell. https://doi.org/10.5281/zenodo.5856118