the Creative Commons Attribution 4.0 License.
A diffusion-based kernel density estimator (diffKDE, version 1) with optimal bandwidth approximation for the analysis of data in geoscience and ecological research
Maria-Theresia Pelz
Markus Schartau
Christopher J. Somes
Vanessa Lampe
Thomas Slawig
Abstract. Probability density functions (PDFs) comprise basic information about the variability of observed or simulated variables within a system of interest. In geoscience, data distributions are often expressed by a parametric estimation of their PDF, such as a Gaussian distribution. At present there is growing attention towards non-parametric estimation of PDFs, where no prior assumptions about the type of PDF are required. A common tool for such non-parametric estimation is a kernel density estimator (KDE). Existing KDEs are valuable but incomplete because of the difficulty of specifying optimal bandwidths for the individual kernels. A diffusion-based KDE provides a useful approach to mitigate the difficulty in identifying bandwidths that resolve desired details of multi-modal data while being insensitive to noise. We therefore designed and developed a new implementation of a diffusion-based KDE as an open-source Python tool. We tested our implementation on artificial and real marine biogeochemical data, both individually and against other popular KDEs. Our estimator is able to detect relevant multiple modes and resolve data close to the boundary while suppressing details induced by noise and individual outliers. The convergence rate is comparable to that of the Gaussian estimator, but with a generally smaller error, most notably for small data sets with up to around 5000 data points. We exemplify and discuss the general applicability of such KDEs for data-model comparison in geoscience, in particular for sparse data. We also provide an example of how our approach can be efficiently utilized for the derivation of plankton size spectra in ecological research.
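The bandwidth difficulty mentioned in the abstract can be illustrated with a minimal sketch of a plain, fixed-bandwidth Gaussian KDE written with NumPy. This is not the diffKDE algorithm and does not use its interface; the function names `gaussian_kde` and `count_modes`, the synthetic bimodal sample, and the three bandwidth values are assumptions chosen only for illustration.

```python
import numpy as np

def gaussian_kde(data, grid, bandwidth):
    """Plain Gaussian KDE on `grid` with one shared bandwidth (not diffKDE)."""
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Standardized distance between every grid point and every data point.
    z = (grid[:, None] - data[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2.0 * np.pi))
    # Average the per-point kernels to obtain the density estimate.
    return kernels.mean(axis=1)

def count_modes(density):
    """Count strict interior local maxima of a sampled density."""
    inner = density[1:-1]
    return int(np.sum((inner > density[:-2]) & (inner > density[2:])))

rng = np.random.default_rng(0)
# Hypothetical bimodal sample: two well-separated normal clusters.
sample = np.concatenate([rng.normal(-2.0, 0.5, 200),
                         rng.normal(2.0, 0.5, 200)])
grid = np.linspace(-6.0, 6.0, 601)
dx = grid[1] - grid[0]

modes = {}
mass = {}
for h in (0.05, 0.4, 3.0):  # too small, moderate, too large (assumed values)
    density = gaussian_kde(sample, grid, h)
    modes[h] = count_modes(density)
    mass[h] = float(np.sum(density) * dx)  # probability mass on the grid
    print(f"bandwidth={h}: modes={modes[h]}, mass on grid={mass[h]:.3f}")
```

Under these assumptions, a very small bandwidth tends to report many spurious modes caused by sampling noise, whereas a very large bandwidth merges the two clusters into a single mode. Avoiding this kind of manual bandwidth choice is the motivation for the optimal bandwidth approximation named in the title.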
Maria-Theresia Pelz et al.
Status: final response (author comments only)
- RC1: 'Comment on gmd-2023-17', Anonymous Referee #1, 26 Jun 2023
  The comment was uploaded in the form of a supplement: https://gmd.copernicus.org/preprints/gmd-2023-17/gmd-2023-17-RC1-supplement.pdf
- RC2: 'Comment on gmd-2023-17', Anonymous Referee #2, 21 Jul 2023
  The comment was uploaded in the form of a supplement: https://gmd.copernicus.org/preprints/gmd-2023-17/gmd-2023-17-RC2-supplement.pdf
- AC1: 'Comment on gmd-2023-17', Maria-Theresia Pelz, 07 Sep 2023
  Dear editor,
  Please find attached our answers to the review comments. The document is organized as follows: the first 15 pages address the points made by RC1, the following 9 pages address those made by RC2, and the final pages show the proposed changes in the updated version of our manuscript as tracked changes.
Maria-Theresia Pelz et al.
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
428 | 129 | 15 | 572 | 6 | 7 |