https://doi.org/10.5194/gmd-2021-174

Submitted as: model description paper | 27 Jul 2021

Review status: this preprint is currently under review for the journal GMD.

SciKit-GStat 1.0: A SciPy flavoured geostatistical variogram estimation toolbox written in Python

Mirko Mälicke
  • Institute for Water and River Basin Management, Karlsruhe Institute of Technology (KIT), Germany

Abstract. Geostatistical methods are widely used in almost all geoscientific disciplines, e.g. for interpolation, re-scaling, data assimilation or modelling. At its core, geostatistics aims to detect, quantify, describe, analyze and model the spatial covariance of observations. The variogram, a tool to describe this spatial covariance in a formalized way, is at the heart of every such method. Unfortunately, many applications of geostatistics focus on the interpolation method or the result rather than on the quality of the estimated variogram, not least because estimating a variogram is commonly left to the computer, and some software implementations do not even show a variogram to the user. This is an oversight, because the quality of the variogram largely determines whether the application of geostatistics makes sense at all. Furthermore, until a couple of years ago the Python programming language lacked a mature, well-established and tested package for variogram estimation.
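
To make the notion of an estimated variogram concrete, the following is a minimal sketch of the classical Matheron estimator, γ(h) = 1 / (2 N(h)) · Σ (z_i − z_j)², evaluated over equal-width lag classes. It is a plain-Python illustration and not SciKit-GStat's implementation; the function name `empirical_variogram`, its parameters and the sample data are assumptions made for this example.

```python
from itertools import combinations
from math import dist


def empirical_variogram(coords, values, n_lags, maxlag):
    """Matheron semi-variance per equal-width lag class (hypothetical helper)."""
    width = maxlag / n_lags
    sq_sums = [0.0] * n_lags  # sum of squared differences per lag class
    counts = [0] * n_lags     # number of point pairs N(h) per lag class

    for i, j in combinations(range(len(coords)), 2):
        h = dist(coords[i], coords[j])  # Euclidean separating distance
        if h > maxlag:
            continue  # pairs beyond the maximum lag are ignored
        # Lag classes are [k*width, (k+1)*width); the last one also includes maxlag.
        k = min(int(h / width), n_lags - 1)
        sq_sums[k] += (values[i] - values[j]) ** 2
        counts[k] += 1

    # Semi-variance is half the mean squared difference; empty classes yield None.
    gamma = [s / (2 * n) if n else None for s, n in zip(sq_sums, counts)]
    upper_edges = [(k + 1) * width for k in range(n_lags)]
    return upper_edges, gamma, counts


# Made-up observations along a short transect (illustrative data only).
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 4.0, 7.0]

edges, gamma, counts = empirical_variogram(coords, values, n_lags=2, maxlag=3.0)
for edge, g, n in zip(edges, gamma, counts):
    print(f"lag <= {edge}: semi-variance = {g}, pairs = {n}")
```

Inspecting the pair count of each lag class alongside the semi-variance is one simple way to judge how well supported an experimental variogram is before any model is fitted to it.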

Here I present SciKit-GStat, an open-source Python package for variogram estimation that fits well into established frameworks for scientific computing and puts the focus on the variogram before more sophisticated methods are applied. SciKit-GStat is written in a mutable, object-oriented way that mimics the typical geostatistical analysis workflow. Its main strengths are ease of use and interactivity, and it is therefore usable with little or even no knowledge of Python. During the last few years, other Python libraries covering geostatistics have developed alongside SciKit-GStat. Today, SciKit-GStat can interface with the most important ones. Additionally, established data structures for scientific computing are reused internally, so that the user does not have to learn complex data models just to use SciKit-GStat. Common data structures along with powerful interfaces enable the user to use SciKit-GStat together with other packages in established workflows, rather than forcing the user to stick to the author's programming paradigms.

SciKit-GStat ships with a large number of predefined procedures, algorithms and models, such as variogram estimators, theoretical spatial models or binning algorithms. Common approaches to estimating variograms are covered and can be used out of the box. At the same time, the base class is very flexible and can be adjusted to less common problems as well. Last but not least, care was taken to aid the user as much as possible in implementing new procedures or even extending the core functionality, so that SciKit-GStat can be extended to use cases not yet covered. With broad documentation, a user guide, tutorials and good unit-test coverage, SciKit-GStat enables the user to focus on variogram estimation rather than implementation details.
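
As an illustration of what a theoretical spatial model is, the sketch below writes out the textbook spherical variogram model, which rises from the nugget and levels off at nugget plus sill once the range is reached. It is a standalone example under stated assumptions: the function name `spherical` and its parameter names are chosen for this sketch and are not taken from the package.

```python
def spherical(h, effective_range, sill, nugget=0.0):
    """Textbook spherical variogram model (hypothetical helper, not the package API).

    h               separating distance (lag), h >= 0
    effective_range distance at which the model reaches its plateau
    sill            variance contribution added on top of the nugget
    nugget          semi-variance in the limit h -> 0
    """
    if effective_range <= 0:
        raise ValueError("effective_range must be positive")
    if h < 0:
        raise ValueError("h must be non-negative")
    if h >= effective_range:
        # Beyond the range, observations are treated as spatially uncorrelated.
        return nugget + sill
    ratio = h / effective_range
    return nugget + sill * (1.5 * ratio - 0.5 * ratio ** 3)


# Evaluate the model at a few lags, with made-up parameters.
for lag in (0.0, 2.5, 5.0, 10.0, 20.0):
    print(lag, spherical(lag, effective_range=10.0, sill=2.0, nugget=0.5))
```

Fitting such a function to the experimental semi-variances is what turns a set of binned estimates into a variogram that downstream methods such as kriging can use.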

Status: open (until 19 Oct 2021)

Model code and software

mmaelicke/scikit-gstat: A scipy flavoured geostatistical variogram analysis toolbox Mirko Mälicke, Egil Möller, Helge David Schneider, Sebastian Müller https://doi.org/10.5281/zenodo.4835779

SciKit-GStat companion code Mirko Mälicke https://doi.org/10.5281/zenodo.4817675

Viewed

Total article views: 437 (including HTML, PDF, and XML)
  • HTML: 335
  • PDF: 97
  • XML: 5
  • Total: 437
  • BibTeX: 3
  • EndNote: 3

Latest update: 28 Sep 2021
Short summary
I present SciKit-GStat, a well-documented and tested Python package for variogram estimation. The variogram is the core tool of geostatistics, on which almost all other methods rely. Geostatistical interpolation and field generation are widely used in geoscience, e.g. for data assimilation or modeling. While SciKit-GStat focuses on effective and intuitive variogram estimation, it can interface with other prominent packages and make its variograms available for a multitude of methods.