Randomized Block Nonparametric Temporal Disaggregation of Hydrological Variables RB-NPD (version1.0) – model development

Lee, Taesam; Ouarda, Taha B. M. J.

doi:10.5194/gmd-2022-274

Preprints

https://doi.org/10.5194/gmd-2022-274

Submitted as: model description paper

26 Jan 2023

Status: this preprint was under review for the journal GMD. A final paper is not foreseen.

Abstract. Stochastically simulated data of hydrological variables are used in critical water-related risk management, for example to assess existing flood protection structures and future mitigation frameworks. Simulated annual data often need to be disaggregated to a finer time scale, because water resource management and flood mitigation plans are made at fine scales such as monthly or quarter-monthly. This study proposes a randomized block length for the nonparametric disaggregation model, since one of the major weaknesses of that model is the repetition of similar patterns in the disaggregated data. Preserving the long-term dependence structure was a further focus, since sustained high flows cause devastating damage to inundated areas. The proposed model was compared with existing parametric and nonparametric disaggregation models. The annual net basin supplies (NBS) of the Lake Champlain–Richelieu River (LCRR) Basin were used to test the proposed model by reproducing the critical statistics of the 2011 flood in the LCRR Basin, which was sustained for a few months. The results show that the existing parametric and nonparametric models have limitations and shortcomings and do not provide sufficient temporal dependence. In contrast, the proposed random block-based nonparametric disaggregation (RB-NPD) model, further enhanced by the genetic algorithm mixture, can be a comparable alternative, and its enhancement is suitable for disaggregating the annual NBS data for the LCRR Basin.

This preprint has been withdrawn.

Received: 11 Nov 2022 – Discussion started: 26 Jan 2023

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Interactive discussion

Status: closed

RC1: 'Comment on gmd-2022-274', Anonymous Referee #1, 30 Mar 2023
I received the manuscript as reviewer for the first time. The manuscript deals with a modification of a disaggregation scheme to improve the lag-1 autocorrelation at annual crossings and to increase the randomness of the disaggregated time series (less repeating patterns). This modification is studied with a focus on a single flood in (probably) Canada. I’m not too familiar with the applied disaggregation models but I can follow the methods introduced and see the differences between the versions. I have to disappoint the authors, I cannot recommend a publication of the manuscript. This decision is based mainly on the missing novelty and only partly on the manuscript style. Please find my major comments below, followed by more specific comments.
First of all a positive thing: I like the validation of a modified approach by comparing it to a number of existing methods. Often authors choose the method for comparisons which is expected to perform worst, which is not the case here. However, the authors fail to convince me about the novelty of their approach. I’m missing the big step forward for science. Neither the abstract nor the summary conveys a clear message about why to use this method instead of others. Finding ‘a comparable alternative’ is not worth publishing in my opinion. Also, if there is a step forward, it is hidden in the 20+ figures the authors have used. The missing novelty is my main reason for rejecting this manuscript. Also, limit the manuscript to 10 figures to keep it readable.

The structure of the manuscript is confusing. Sect. 2 ‘Mathematical background’ is followed by Sect. 3 ‘Model development’ and Sect. 4.2 ‘Application methodology’. It takes some cross-reading to find the final applied methods. Why not use a classical/conservative structure (Intro/Data/Methods/Results/Disc+Summary)? I don’t see any benefit for this manuscript resulting from the uncommon structure.

The authors are writing about ‘hydrological data’ and ‘hydrological variables’. As far as I can see the manuscript is about runoff, but this is mentioned nowhere (figures with absolute values are missing units). The developed method will not work for all hydrological variables (e.g. rainfall has higher intermittency), so please be as concise as possible.

Specific comments
General: Please use the terms ‘fine’ and ‘coarse’ instead of ‘high’ and ‘low’ to characterize resolution. The latter two can be confusing. Monthly values are not of ‘high’ resolution for all hydrologists; some treat even daily values as ‘coarse’ (e.g. urban hydrology).
Abstract: Novelty is missing, main result is missing. Study area is missing (French-sounding catchment, could be in Europe, Africa, North America, Asia). In one figure I see Quebec, so I guess it’s Canadian. This should be mentioned, along with a hydroclimatic characterization, so that readers can judge whether the method would work in a different climate.
L72-79 The project is irrelevant for the scientific study, it should be a step forward for hydrological science in general, not a project report.
Eq. 9 What are j and J?
Eq. 10&11 What is the influence of the mutation on the inter-annual correlation? The authors point out in several places that it is important to increase lag-1 autocorrelation on New Year’s Eve, but they should not focus on lag-1 only. There are several studies that show that good representations of the lag-1 autocorrelation can come with high deviations in lag-2 or longer autocorrelations.
L250 ‘chosen to popularity’ Please state only scientific reasons.
L272 Why 200 realisations? Please justify this choice, so that others can determine the methodologically equivalent number for their catchment.
References: Of 27 publications (a very short list for the number of applied methods, indicating too rough a literature review and no in-depth discussion of the results), the authors are involved in 11. Of the remaining 16 references, the latest were published in 2011 and 2004. This reference list is not state-of-science and is highly biased, implying that there are no other researchers active in this scientific field, which is not the case.
Citation: https://doi.org/10.5194/gmd-2022-274-RC1
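
The question about the number of realisations can be explored empirically. The following Python sketch is illustrative only: it does not use the RB-NPD model or the NBS data, but a synthetic AR(1) process as a stand-in, and the function names (`lag1_autocorr`, `ar1_series`, `convergence_table`), the series length of 120, the coefficient 0.5, and the checkpoint counts are all assumptions. It reports how the ensemble mean of the lag-1 autocorrelation and its standard error change as more realisations are included, which is one possible way to judge whether a count such as 200 is sufficient.

```python
import random
import statistics


def lag1_autocorr(series):
    """Sample lag-1 autocorrelation of a single series."""
    mean = statistics.fmean(series)
    num = sum((a - mean) * (b - mean) for a, b in zip(series[:-1], series[1:]))
    den = sum((x - mean) ** 2 for x in series)
    return num / den


def ar1_series(length, phi, rng):
    """Synthetic AR(1) series; a stand-in for one simulated realisation (assumption)."""
    x, out = 0.0, []
    for _ in range(length):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def convergence_table(n_max=200, length=120, phi=0.5, seed=1,
                      checkpoints=(25, 50, 100, 200)):
    """Ensemble mean and standard error of the statistic at each checkpoint."""
    rng = random.Random(seed)
    stats = [lag1_autocorr(ar1_series(length, phi, rng)) for _ in range(n_max)]
    rows = []
    for n in checkpoints:
        subset = stats[:n]
        mean = statistics.fmean(subset)
        std_err = statistics.stdev(subset) / n ** 0.5
        rows.append((n, mean, std_err))
    return rows


table = convergence_table()
for n, mean, std_err in table:
    print(f"{n:4d} realisations: mean acf1 = {mean:.3f}, standard error = {std_err:.4f}")
```

If the standard error at the chosen count is small relative to the differences between the models being compared, the count is arguably adequate for that statistic; other statistics (for example extremes) may need more realisations.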
RC2: 'Comment on gmd-2022-274', Anonymous Referee #2, 04 Apr 2023
In this study, the authors present an approach for disaggregating annual streamflow data while capturing the interannual (monthly) relationship. In particular, this addresses the weaknesses of existing parametric and nonparametric models, specifically the repetition of similar patterns in disaggregated data and the lack of sufficient temporal dependence. The proposed model is tested using the annual net basin supplies of the Lake Champlain–Richelieu River Basin to reproduce the critical statistics of the 2011 flood.
Undoubtedly, the authors have undertaken a significant amount of work to develop a model, write code, and apply it to a catchment. However, there are some underlying issues in its current form. Firstly, the novelty of the approach is only incremental, which may limit its impact in the field. Additionally, the introduction lacks focus and clarity, making it challenging to discern the study’s motivation. Furthermore, the description of the key results and the discussion thereof are brief. Significant work is needed to make the manuscript ready for publication, so I cannot recommend it for publication in its current form. The general and some specific comments are below.
General comments

In general, I do not find any issue inherently with the statistical approach employed. However, since the work is incremental, I expected a stronger justification for the utility or application of the approach, which was not clearly presented in the paper.

The overall motivation is not stated clearly in the introduction section. It was not until I reached the data description section that I could understand the actual problem the paper aims to solve. After that, I figured out that the disaggregation would be performed on the simulated stochastic series at an annual scale. The introduction section could have been clearer in explaining the purpose of the study. In the introduction section, the term “hydrological variable” is used frequently, but the focus is solely on streamflow. Similarly, there are few instances where the 2011 flood was mentioned. When the flood and high-resolution data were mentioned, I was thinking of sub-daily scales, which were clearly not the focus of the paper.

The paper spends too much time discussing methods that multiple studies have already shown to have the limitations mentioned in the paper. Given that the proposed approach in the manuscript aims to improve auto-correlation, a summary of the evaluation of all other approaches in relation to auto-correlation would have a greater impact. The results and the corresponding discussion are brief. For example, Figures 20 and 21 need more explanation of what each figure conveys in general, and what it means for the current study.

In the paper, much focus has been given to the 2011 flood. As a model development paper, the focus should be on the method and the 2011 flood event should be primarily used for validation purposes.

A critical issue in the paper is the lack of a literature review. There are a variety of disaggregation approaches that are used in many applications, from daily to sub-daily, monthly to daily, annual to monthly, and in streamflow, rainfall, and other hydrometeorological variables. However, the references cited in this paper are very limited. Moreover, almost all of the cited literature is outdated, with only a few papers on the reference list published after 2010. Additionally, a significant proportion of the references are by the authors themselves.

In addition to making the model code available through Mendeley Data, it would be beneficial to include a basic description of the code in the paper. Basic information, such as that it was written in MATLAB and can be run in MATLAB and Octave, would help potential readers. Additionally, the code would be easier to navigate if it contained comments, especially a description of each function.

There are too many figures in the paper, and some of them are redundant. For example, Figures 4 - 6 convey a similar message and do not provide any additional information. These can be combined into one figure that supports the overall story and the rest could be added to the supplement. Similarly, Figures 4 and 8 are almost identical other than “Lag-1 autocorrelation” for month 1. Rather than repeating similar metrics that are not crucial to the paper’s message, including only relevant information in the figures would help readers focus on the core message of the paper.

Specific comments.
Data description- I suggest adding a panel to Figure 3 that shows monthly seasonality and variability.
L72-79 – considering this is a data description paper, the context of the project feels out of place.
L272 – Is the number of series generated enough for convergence of the statistics presented in the study?
L276-278 – I do not think the whiskers always extend to 1.5 IQR. Rather they extend to the lesser of 1.5 IQR or the farthest data point.
L332, L346. What is the reason for the underestimation in months 5 and 6? If it is the gamma function, why is it not impacting other months?
Almost all the figures – Spell out the axis labels properly. Also, add units in parentheses, e.g. “Months” instead of “mon”, “Minimum (mm)” instead of “min”, “Lag-1 autocorrelation” instead of “acf1”, etc. I think almost all of the figures should be revisited.
Citation: https://doi.org/10.5194/gmd-2022-274-RC2
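
The remark on L276-278 about box-plot whiskers can be checked with a small example. The Python sketch below assumes the common Tukey convention (fences at 1.5 IQR beyond the quartiles, whiskers drawn to the most extreme data points that still lie inside the fences); whether the preprint's figures follow this convention is not stated here. The function name `tukey_whiskers`, the sample values, and the "inclusive" quartile method are illustrative choices.

```python
from statistics import quantiles


def tukey_whiskers(values, k=1.5):
    """Return (lower_whisker, upper_whisker, outliers) under the Tukey convention."""
    data = sorted(values)
    # Quartiles by linear interpolation; other quartile definitions give slightly
    # different fences.
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    # Whiskers stop at the most extreme observations inside the fences,
    # not at the fences themselves.
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return inside[0], inside[-1], outliers


sample = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 20.0]
low, high, out = tukey_whiskers(sample)
print(low, high, out)  # 1.0 8.0 [20.0]
```

Here the quartiles are 3 and 7, so the upper fence sits at 13, yet the upper whisker ends at 8, the farthest observation inside the fence, and 20 is drawn as an outlier.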

Viewed

Total article views: 1,509 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
1,184	256	69	1,509	84	107

Views and downloads (calculated since 26 Jan 2023)

Month	HTML	PDF	XML	Total
Jan 2023	62	8	3	73
Feb 2023	66	10	1	77
Mar 2023	36	12	2	50
Apr 2023	25	13	2	40
May 2023	4	2	0	6
Jun 2023	18	3	0	21
Jul 2023	25	6	0	31
Aug 2023	26	7	0	33
Sep 2023	35	6	1	42
Oct 2023	30	5	0	35
Nov 2023	9	3	0	12
Dec 2023	28	4	2	34
Jan 2024	21	4	2	27
Feb 2024	26	8	3	37
Mar 2024	33	9	2	44
Apr 2024	35	5	6	46
May 2024	34	5	4	43
Jun 2024	34	4	3	41
Jul 2024	14	2	0	16
Aug 2024	17	4	1	22
Sep 2024	19	1	0	20
Oct 2024	10	0	10
Nov 2024	30	2	1	33
Dec 2024	16	5	0	21
Jan 2025	13	2	1	16
Feb 2025	13	4	0	17
Mar 2025	14	4	3	21
Apr 2025	8	8	0	16
May 2025	11	10	2	23
Jun 2025	10	5	5	20
Jul 2025	10	6	0	16
Aug 2025	35	10	1	46
Sep 2025	266	9	1	276
Oct 2025	38	16	4	58
Nov 2025	23	18	4	45
Dec 2025	14	13	4	31
Jan 2026	17	6	6	29
Feb 2026	28	4	2	34
Mar 2026	31	13	3	47

Viewed (geographical distribution)

Total article views: 1,482 (including HTML, PDF, and XML) Thereof 1,482 with geography defined and 0 with unknown origin.

Latest update: 26 Mar 2026

Short summary

The current study proposes a random block-based nonparametric disaggregation model that resolves a weakness of the existing nonparametric disaggregation models while preserving long-term persistence. The proposed model shows superior performance in disaggregating the net basin supply of the LCRR basin in the Great Lakes, which experienced the worst flood in 2011.

