Articles | Volume 18, issue 24
https://doi.org/10.5194/gmd-18-10185-2025
© Author(s) 2025. This work is distributed under the Creative Commons Attribution 4.0 License.
Estimation of local training data point densities to support the assessment of spatial prediction uncertainty
Download
- Final revised paper (published on 19 Dec 2025)
- Preprint (discussion started on 14 Nov 2024)
Interactive discussion
Status: closed
Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
- RC1: 'Comment on egusphere-2024-2730', Anonymous Referee #1, 07 Dec 2024
  - AC2: 'Reply on RC1', Fabian Schumacher, 20 May 2025
- CEC1: 'Comment on egusphere-2024-2730 - No compliance with the policy of the journal', Juan Antonio Añel, 08 Dec 2024
  - AC1: 'Reply on CEC1', Fabian Schumacher, 11 Dec 2024
    - CEC2: 'Reply on AC1', Juan Antonio Añel, 12 Dec 2024
- RC2: 'Comment on egusphere-2024-2730', Anonymous Referee #2, 12 May 2025
  - AC3: 'Reply on RC2', Fabian Schumacher, 20 May 2025
- RC3: 'Comment on egusphere-2024-2730', Anonymous Referee #3, 21 May 2025
  - AC4: 'Reply on RC3', Fabian Schumacher, 29 May 2025
Peer review completion
AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload
AR by Fabian Schumacher on behalf of the Authors (25 Jun 2025)
ED: Referee Nomination & Report Request started (05 Sep 2025) by Yongze Song
RR by Anonymous Referee #2 (12 Sep 2025)
RR by Anonymous Referee #1 (16 Sep 2025)
ED: Publish as is (18 Sep 2025) by Yongze Song
AR by Fabian Schumacher on behalf of the Authors (20 Oct 2025)
Thanks for the opportunity to review this article. The manuscript "Estimation of local training data point densities to support the assessment of spatial prediction uncertainty" presents a new methodology designed to improve the spatial prediction quality of ML models by additionally considering data point density. Overall, the work is solid enough for publication and has very high potential; the presentation quality (writing, data visualization, etc.) is good; the study background, research aim, and methodology introduction are clear; and the methodology has the potential to stimulate further innovation in spatial interpolation methods. However, I have the following major concerns and recommend a 'major revision' for this work.
Major scientific concerns (these need explanation, clarification, or more quantitative work):
1. The LPD appears to be designed for very large-scale spatial interpolation. However, in the cases shown for Europe and South America, the sampling point density is determined by the real-world distribution of stations (i.e., the locations of the recording stations), so the point density pattern is always fixed. Does the LPD also work for non-fixed sampling locations in more realistic cases? I noticed that you present a random distribution pattern in the simulation study, but such a random distribution at that large spatial scale might not be realistic given the physical conditions. Could the LPD also work for smaller-scale cases (in-situ soil sampling, for instance)?
2. The ML model in this article is a random forest (RF). RF is indeed widely used for spatial interpolation and mapping. However, one major concern is how spatial patterns are fed into RF as useful information. How does RF understand point density, or use point density to improve model accuracy? Within the mechanism of RF and other ML models (tree shape, for instance), how is the tree shape changed by integrating the LPD? Does it necessarily lead to model improvement?
3. Another major concern is that, given the content presented, the research significance and contribution (LPD+RF) is not linked directly to the major spatial interpolation methods (especially traditional methods such as Kriging). The main contribution is shown as an improvement from the LPD after comparison with the DI only.
4. It is common knowledge among scientists that no method is suitable for all cases. Readers and I would also like to see in what kinds of spatial cases (patterns, scales, etc.) or scenarios the LPD model is more powerful than others. Under what kinds of circumstances can the LPD be improved (in the discussion or future work)?
Minor issues:
1. Please check for grammar issues throughout the article in the next version. Here are some examples of minor errors: "at a spatial resolution of 10 minutes" (Line 106); "with increasing dimensionality the computational and memory effort increases drastically." (Line 54, a little hard to understand).
2. Information on all 19 of your bio variables is missing. I think it would be helpful to provide a summary table describing these variables.