The design, deployment, and testing of kriging models in GEOframe with SIK-0.9.8

. This work presents a software package for the interpolation of climatological variables, such as temperature and precipitation, using kriging techniques. The purposes of the paper are (1) to present a geostatistical software that is easy to use and easy to plug in to a hydrological model; (2) to provide a practical example of an accurately designed software from the perspective of reproducible research; and (3) to demonstrate the goodness of the results of the software and so have a reliable alternative to other, more traditional tools. A total of 11 types of theoretical semivariograms and four types of kriging were implemented and gathered into Object Modeling System-compliant components. The package provides real-time optimization for semivariogram and kriging parameters. The software was tested using a year’s worth of hourly temperature readings and a rain storm event (11 h) recorded in 2008 and retrieved from 97 meteorological stations in the Isarco River basin, Italy. For both the variables, good interpolation results were obtained and then compared to the results from the R package gstat.

Abstract. This work presents a software package for the interpolation of climatological variables, such as temperature and precipitation, using kriging techniques. The purposes of the paper are (1) to present a geostatistical software that is easy to use and easy to plug in to a hydrological model; (2) to provide a practical example of an accurately designed software from the perspective of reproducible research; and (3) to demonstrate the goodness of the results of the software and so have a reliable alternative to other, more traditional tools. A total of 11 types of theoretical semivariograms and four types of kriging were implemented and gathered into Object Modeling System-compliant components. The package provides real-time optimization for semivariogram and kriging parameters. The software was tested using a year's worth of hourly temperature readings and a rain storm event (11 h) recorded in 2008 and retrieved from 97 meteorological stations in the Isarco River basin, Italy. For both the variables, good interpolation results were obtained and then compared to the results from the R package gstat.

Introduction
Meteorological forcing data such as rainfall, temperature, and solar radiation are the dominant controlling factors for the hydrological cycle, energy balance, and ecosystem processes (Ly et al., 2013). These data, in addition to being important themselves, are the natural input to distributed and semi-distributed hydrological models. Their quality and precision affect the accuracy of results (Xu and Singh, 1998;Stooksbury et al., 1999;Balme et al., 2006;Abera et al., 2017). In fact, all the surface water models require a reliable precipitation dataset that is complete both in space and in time. Quite often, however, datasets of hydrological variables suffer from errors and missing data; therefore, filling the gaps in time series by estimating the missing values is a common approach to solving this problem (Eischeid et al., 2000;Saghafian and Bondarabadi, 2008;Di Piazza et al., 2011;Adhikary et al., 2015). Several algorithms for the spatial interpolation of meteorological data are available in the literature: Thiessen polygons (e.g., Thiessen, 1911;WMO, 1994), inverse distance methods (Ly et al., 2013), interpolation with splines (e.g., Mitášová and Mitáš, 1993;Hutchinson, 1995), kriging (e.g., Krige, 1951;Matheron, 1981;Goovaerts, 1997), and other types of interpolation (e.g., Robeson, 1992;Li and Heap, 2011, and references therein). Their performances have been assessed by several authors, among others Tabios and Salas (1985) and Jarvis and Stuart (2001), who have concluded that kriging is one of the best techniques for the spatial interpolation of climatological variables. More specifically, Creutin and Obled (1982) and Tabios and Salas (1985) demonstrated that for monthly rainfall and storm totals kriging is preferable to other rainfall interpolation methods. Goovaerts (2000), Lloyd (2005), Basistha et al. (2008), and Ly et al. (2011) confirmed these results.
Generally, kriging can be applied to a wide range of datasets (e.g., Stahl et al., 2006;Phillips et al., 1992), and allows for the estimation of the variance of interpolated quantities (e.g., Li and Heap, 2011). The interpolation can be im-proved with the use of auxiliary variables, such as terrainrelated parameters (e.g., relief, slope, and aspect), as investigated in Attorre et al. (2007). Not surprisingly, Carrera-Hernández and Gaskin (2007) found that the use of elevation as a secondary variable improves temperature prediction.
However, kriging can be computationally more demanding than other interpolation techniques. To overcome this problem, most applications that implement kriging interpolators use either long time series with long time steps, such as daily, (Verfaillie et al., 2006;Buytaert et al., 2006), monthly, or yearly time steps (Hevesi et al., 1992;Goovaerts, 2000;Boer et al., 2001;Todini, 2001), or short time series with shorter time steps (such as rainfall events) (e.g., Haberlandt, 2007). Having tools that implement efficient computations could help to extend the interpolation method to real-time processes.
Based on these premises, we set ourselves two objectives with this work. The first was to provide an efficient and precise tool for spatial estimations and interpolation of environmental quantities. The second was to make use of an implementing strategy that favors the usability of the software, its maintenance, its inspection, and its extension and, hopefully, makes scientific work easier. This second goal comes under the contemporary efforts to promote open science (e.g., https://www.fosteropenscience.eu/, last access: 7 June 2018). However, in order to maintain the right focus, we will not discuss the open-science aspects and philosophy directly in this paper; rather we will present the kriging software and its design.
The Spatial Interpolation Kriging package (version 0.9.8) (GEOframe-SIK, henceforth simply SIK) is presented here. It is a package that makes estimates of any spatially distributed environmental data at hourly steps (or sub-hourly when it is reasonable). SIK is designed according to the Object Modeling System v3 (OMS3) framework ; as such it is compatible with the GEOframe-NewAGE system (Formetta et al., 2014;Bancheri, 2017). As a consequence, the package can be integrated with other GEOframe-NewAGE components and connected to them at run time to form a variety of modeling solutions (MSs). In this work, the SIK package is presented as four components: 1. The first is used for the production of the experimental semivariograms.
2. The second is used for the production of the theoretical semivariograms.
3. The third is used for the kriging interpolation.
4. The last is used for an automatic and easy jackknife resampling to assess the error of estimates.
SIK inherits some previous code used, for instance, in Formetta et al. (2014) and Abera et al. (2017). In particular Abera et al. (2017) assessed the effects of interpolation of precipitation on long-term mean annual runoff.
To make the code more flexible, easily extensible, and maintainable, SIK was completely refactored and a systematic use of design patterns (DPs) (Gamma, 1994;Freeman et al., 2004) was introduced.
-Dakota (Surfpack). C++ software with flexible interface that provides optimization algorithms, uncertainty quantification, parameter estimation, and sensitivity analysis for supporting computational models and simulators (Adams et al., 2009).
-PyKrige. Python package that performs 2-D and 3-D ordinary and universal kriging computation with flexible design for custom variogram implementation (Murphy, 2014).
gstat. R package (computational core coded in C) that supports block kriging, simple, ordinary, and universal (co)kriging, and many other features (Pebesma, 2004), (Gräler et al., 2016). It is historically the leading software in this field.
While GIS-based tools, such as QGIS and GRASS (v.kriging) kriging, are easily included into scripts leveraging GIS capabilities, they are not easily included into complicated MSs. As well as being open source, SIK is the only Java-based and component-based software of those mentioned above. Moreover, it implements a quick way to plug in to hydrological models and automatic calibration algorithms. We decided to compare the performances of SIK and R gstat since the latter is one of the most widely used tools in the scientific community.
The present paper is organized as follows. First, some preliminary information on kriging interpolation is given in Sect. 2. Then the structure of the package and the informatics are presented in Sect. 3. Section 4 describes the study area and the experimental setup. The results of the application of the SIK package to temperature and rainfall datasets are discussed in Sect. 5. Finally, a comparison of results with the interpolation of both datasets obtained with R gstat is presented in Sect. 6.

Algorithms required for kriging
Kriging is a group of geostatistical techniques used to interpolate the value of random fields based on spatial autocorrelation of measured data (Isaaks and Srivastava, 1989;Goovaerts, 1997;Kitanidis, 1997). The theory is briefly summarized in Appendix A.
Three main variants of kriging can be distinguished (Goovaerts, 1997): simple kriging (SK), which considers the mean to be known and constant throughout the study area; ordinary kriging (OK), which accounts for local fluctuations of the mean, limiting the stationarity to the local neighborhood (in this case the mean is unknown); kriging with a trend model (here detrended kriging, DK), which considers that the local mean varies within the local neighborhood.
The trend can be, for example, a linear regression model between the investigated variables and an auxiliary variable, such as elevation or slope. According to the procedure shown in Goovaerts (1997), DK is performed as follows: (i) the trend is subtracted from the original data and OK of the residuals performed, and (ii) the final interpolated values are the sum of the interpolated values and the previously estimated trend.
Variants of OK and DK are local ordinary kriging (LOK) and local detrended kriging (LDK), respectively. In this case the estimate is only influenced by the measurements belonging to a neighborhood, which are usually defined either in a maximum searching radius or as a set number of stations which are closer to the interpolation point. In the LDK case, the trend is estimated locally too, and therefore it can take local trend variations into account.
The SIK package implements both OK and DK since local mean may vary significantly over the study area and the SIK assumption about the mean could be too strict (Goovaerts, 1997).
The workflow of the main algorithm for solving an interpolation problem with kriging can be summarized in the following steps: 1. get the data from gauges, 2. build the empirical semivariogram, 3. fit a theoretical model to the semivariogram, 4. use the theoretical model for solving the kriging system, 5. produce continuous surface maps or pointwise time series of the quantity desired in any point of the domain, 6. calculate estimation errors.
The last step underlines that we are interested not only in estimating a variable (temperature, rainfall intensity, or other scalars) but also in evaluating the errors of our estimate. In addition to the spatial variable estimate, kriging also returns a variance of the estimate. However, Goovaerts (1997) states that the standard deviation cannot be used as a direct measure of estimation precision since the kriging variance is only a ranking index of data geometry (and size) and not a measure of the local spread of errors (Deutsch and Journel, 1992).
Therefore, to estimate the errors produced by kriging interpolations, we chose the leave-one-out (LOO) crossvalidation technique (Efron, 1982;Isaaks and Srivastava, 1989;Martin and Simpson, 2003;Aidoo et al., 2015). LOO cross validation consists of removing one data point at a time and performing the interpolation for the location of the removed point by using the remaining stations. The approach is repeated until every sample has been, in turn, removed and estimates are calculated for each point. This procedure is straightforward but cumbersome if performed manually. Therefore, a special module (component) was programmed and implemented to do it. LOO estimates errors just over the location where measures are available and, eventually, these errors can be interpolated themselves to obtain an error estimation at any point of the spatial domain.

Design, deployment, and use cases of the SIK package
On the basis of the analysis of the mathematical problems and of the use cases delineated in the previous section, the design of the software was organized into four OMS3 components, the logic of which is explained below.

Overall design of the SIK components
The component-based environmental modeling framework OMS3  was chosen for the development of the SIK code. Components are self-contained building blocks, modules, or units of code (e.g., Argent, 2004;Van Ittersum et al., 2008). Each component implements a single modeling concept and the components can be joined together to obtain an MS that can accomplish a complicated task. The OMS user does not need extensive knowledge of OMS libraries. As its authors state in David et al. (2013), "there are no interfaces to implement, no classes to extend, no polymorphic methods to override and no framework-specific data types to use". In addition, when the workflow allows it, components are run in parallel without any special effort by the computer programmer (this property is often called "implicit parallelism"). In addition to minimizing couplings, the advantage of building within a modular software framework is the production of a code that is more flexible and easier to maintain and be inspected by third parties. Multiple algorithms can be implemented within the same component or in various components and inserted in the MS as alternatives. Thus, inside the same chain of tools, different candidate solutions to the same hydrological problem can be compared. More details on OMS3 can be found in David et al. (2013), Formetta et al. (2014), and Bancheri (2017). It is clear that the adoption of such a framework as the basis for our programs has an impact on the software design. However, the initial implementation of kriging, used in Formetta et al. (2014) and in Abera et al. (2017), was designed to group all the tasks into a single component. In this case we thought it useful to split it into four components: the first is SIK-EV, related to the production of experimental variogram from data; the second is SIK-TV, related to the selection and parameter estimation of the theoretical variogram; the third is SIK-K, for the solution and mapping of the kriging system; and the fourth is SIK-LOO, to manage the error estimation. SIK-LOO does not work alone to produce its results, it uses the other three components to generate the spatial estimates of errors. Figure 1 shows the MS for the interpolation and validation process in OMS. Each component is represented by a rectangle with rounded corners containing the name of the component itself and its inputs and outputs. Arrows represent the connections between components. This representation is also used in the subsequent sections. The inputs for SIK-EV are the time series of the measured variables and the geometry of the measuring stations, in shapefile format with the spatial coordinates. The outputs are the experimental variogram values and the distance vector.
The distance vector, the name, and the parameters for the theoretical semivariogram models (sill, nugget, and range), are the inputs of SIK-TV. Particle swarm optimization (PSO) (Eberhart and Kennedy, 1995) is the component that optimizes the theoretical model parameters. Further inputs for the calibrator are the objective function to be optimized and other internal parameters, such as the number of iterations and the tolerance. PSO can be connected to the SIK-K component (blue dashed arrow) or to the SIK-LOO component (red dashed arrow). Inputs for SIK-K are the shapefiles with the coordinates of the measuring stations and the interpolation points, the measurement data, the DEM, and the optimized parameters of the semivariogram model. Final outputs are either time series or maps of interpolated values, which can be visualized directly in a GIS system.
Using the alternative connection, PSO can be connected to SIK-LOO, which implements the iterative procedure necessary to estimate errors of interpolation. Given n spatially distributed measures, n − 1 measures are used for the interpolation, while the remaining one is used for comparison to produce the error estimate. The operation is repeated n times, each time excluding a different gauged location, to obtain a set of n estimated errors. Because our package needs to deal with time-varying fields, the operation is repeated for each time step, when measures are available, and the site error is actually a temporal mean over a period.

Internal class design characteristics
Each of the four OMS3 components presented in the previous section can contain alternative solutions. For example, in SIK-TV the software design allows for multiple theoretical variogram models, while in the SIK-K component the four types of kriging listed in Sect. 2 were implemented.
In principle, we could have implemented a single component for every single type of variogram and kriging but this would have exploded the number of software modules to maintain. However, to "close the code to modification and keep it open to extensions" (Martin, 2002) and to maintain the code abstract enough to avoid code disruption at any addition ("program to an interfaces"; e.g., Gamma, 1994), we adopted the use of DPs (Gamma, 1994). This was a further enhancement with respect to the previous version of the SIK package.
In general, DP implement rules that allow us, for instance, to separate code parts that are going to vary from those that are going to remain the same. The adoption of these DPs, once their rationale is understood, makes the code easier to be read and maintained. While largely known among programmers, DPs are not widely known in the scientific community, which has remained largely impervious to these techniques, and just a few examples of good practice can be found in the scientific literature (Gardner and Manduchi, 2002;Donatelli and Rizzoli, 2008;Rouson et al., 2011).
The various theoretical semivariogram models or kriging types to be chosen at run time were encapsulated by using the Simple Factory class (Freeman et al., 2004). In this way, adding a new type of variogram to, or deleting an obsolete one from, the SIK-TV is straightforward and requires few changes, which are confined to just a class. Figure 2 shows the implementation of the Simple Factory class for the choice of the theoretical semivariogram model. The concrete classes, "Bessel" and "Spline", implement the same interface, "Model". Simple Factory (named Simple-ModelFactory in the Figure) generates objects of a concrete class from given information (i.e., a string containing the name of the chosen model). The component SIK-TV (named TheoreticalVariogram in the Figure) class uses the pattern to obtain the object of the concrete class.
The dependency inversion principle, according to which high-level modules should not depend on low-level modules, (Eckel, 2003), was also strictly respected in all the programming. The dependencies of the classes are not demanded by the concrete subclasses but only by the abstract classes and interfaces, (Ellis et al., 2007). Thus any changes in a concrete (sub)class do not affect the overall structure of the program and remain limited to it.    allows one to estimate the relevant variables of the hydrological budget: after the spatial interpolation of precipitation, it is possible to simulate the components of the energy budget, shortwave and longwave radiations, the snow processes of accumulation and melting, and the potential evapotranspiration. These are then used as inputs for the rainfall-runoff model, embedded reservoirs model, (Bancheri, 2017), which provides discharge and actual evapotranspiration. Thanks to the MS, it is possible, for example, to test the impact of the type of kriging used on the rainfall-runoff model outputs or to perform a validation of the SIK package using remotely sensed data, e.g., MODIS data product (Turk and Miller, 2005;Hall et al., 2006;Abera et al., 2017). Further information about this MS solution is presented in Bancheri (2017).

Use cases
A second MS is presented in Fig. 4. This MS interpolates temperature maps, which are the inputs for the shortwave and longwave radiation components and for the Shymansky-Or evapotranspiration (SO-ET) component, built after Schymanski and Or (2017). Inputs of the SO-ET component are the maps of interpolated temperature, the shortwave and longwave radiation maps, and the DEM. Outputs are the ET maps and leaf temperature. Obviously, it is possible to use SO-ET instead of potential ET in Fig. 3. In that case, two sets of kriging act concurrently to give maps of temperature and rainfall. Another different scheme (not shown) is obtained when the parameters of the radiation decomposition model (Formetta et al., 2013(Formetta et al., , 2016, i.e., those parameters which are used to determine the attenuation of radiation due to the atmosphere, are set to be spatially varying. In this case a further (third) kriging set is run.

SIK development as an RRS tool
Here, we delineate the practices implemented in building the SIK package for making it a reproducible research system (RRS) (e.g. Formetta et al., 2014).
Although the initial code (let us call it v0.1) was already available from a control version system under a GPL v3 license (www.gnu.org/licenses/gpl-3.0.en.html, last access: 7 June 2018), the repository was owned by the original author. A non-personal repository was judged to be better suited to host a collaborative work. Therefore, for SIK and its companion tools the collective GEOframe organization repository was created under GitHub (www.github.com, last access: 7 June 2018), using Git (www.git-scm.com, last access: 7 June 2018), and can be found at the following link: www. github.com/geoframecomponents (last access: 7 June 2018).
Code v0.1 did not include a building tool. These tools can be considered a modern evolution of the UNIX "make" (e.g., www.gnu.org/software/make/, last access: 7 June 2018) and take care of gathering the various concurring libraries and linking them to form the final executable file. In our case, the possible choices for Java projects include Apache Ant (http: //ant.apache.org, last access: 7 June 2018), Maven (www. maven.apache.org, last access: 7 June 2018), and Gradle (www.gradle.org, last access: 7 June 2018). All of these pro-vide ways to solve the software dependencies. Both Maven and Gradle can download and update the remote resources needed. Our final choice was Gradle since it uses a more concise syntax, thanks to the use of the Groovy language (www.groovy-lang.org, last access: 7 June 2018), compared to the XML (www.w3.org/XML, last access: 7 June 2018) used by Maven. Using building tools also allows abstraction from the use of integrated development environments (IDEs). In the current Java market there are at least three major IDEs for managing large projects: NetBeans (https://netbeans.org/, last access: 7 June 2018), Eclipse (www.eclipse.org, last access: 7 June 2018), and IntelliJ (www.jetbrains.com, last access: 7 June 2018). All of them support both Gradle and Maven, and Ant and can import a Gradle or Maven (or Ant) project seamlessly. These tools are widely used by programmers, but rarely by scientists, who are increasingly struggling with the difficulties of maintaining their own code. With these tools, researchers could master others' codes more easily, especially if they are open source. Therefore, we think that adopting a proper building tool is useful in promoting collaborative work and open science.
Another important step in the management of the code was the implementation of a continuous integration system (https: //jenkins.io/, last access: 7 June 2018).
It ensures the building and testing of the source code at each commit, forcing the good practice of preparing tests for each software module developed.  Continuous integration (Meyer, 2014) is the practice of merging all developer working copies to a shared mainline several times a day. Unit tests (Beck, 2003) are built with the code and run each time the merging is performed. The continuous integration service automatically builds the executable codes, checks if the tests are performed correctly, and returns a positive answer if all is carried out properly. Eventually, major code commits are tagged with release numbers, under the GPL v3 license. For this purposes, we chose to use Travis CI (https://travis-ci.org, last access: 7 June 2018), which uses GitHub as a web-based Git repository hosting service, and is a good choice for a continuous integration service.

SWRB
Since GitHub is a repository and not an archival system, we decided to use Zenodo (www.zenodo.org, last access: 7 June 2018) to provide our products with a Digital Object Identifier (DOI) and then we put the entire project, as used to obtain the results presented in this work, on Open Science Framework (www.osf.io, last access: 7 June 2018). The assignment of the DOI allows researcher peers to retrieve exactly that code in the foreseeable future. This could be important when reconstructing which software version was used in a paper and, perhaps, it could make life easier within research groups.

Study area and data description
To test the performances of the modeling solutions presented in Figs. 3 and 1, we used the SIK components to interpolate temperature and rainfall data from 97 stations located in the Isarco River valley, Italy, shown in Fig. 5 and detailed in the Supplement. The Isarco River is a left tributary of the Adige River, in the Trentino-Alto Adige region, northern Italy.
The catchment area is about 4200 km 2 and the altitude ranges from 210 to 3400 m a.s.l. The river length is about 95 km and the discharge is about 78 m 3 s −1 yearly average at its confluence with the Adige River near Bolzano. The geological and geomorphological conditions of the valley are homogeneous and represented by the very thick Palaeozoic series of the "Complesso Vulcanico Atesino". The formation is deeply affected by the effects of Quaternary glacial and post- glacial erosion. The cross section of the Isarco Valley along the Bolzano-Ponte Gardena stretch is characterized by very steep and rugged slopes. Climate is typically alpine, characterized by dry winters, snow and glacier melt in spring, and humid summers and autumns. The land is mainly used for agriculture in the upper part, while in the lower part of the basin, the narrow valley is mainly occupied by civil infrastructures.
Data used for the testing were provided by Provincia Autonoma di Bolzano (local government), and collected into the Adige database (http://abouthydrology.blogspot.it/2016/09/ the-adige-database-or-database-newage.html, last access: 7 June 2018) during the CLIMAWARE and GLOBAQUA projects. The DEM of the study area was downloaded from the U.S. Geological Survey EarthExplorer (https:// earthexplorer.usgs.gov, last access: 7 June 2018) and it has 100 m × 100 m cell size.

Simulation setup
In the available dataset (2003-2013) we identified the year with the smallest number of missing data, which was 2008, and then we used it to test the SIK components.
A quality check was made to eliminate any outliers. Also, the spatial distribution of the no value was analyzed in order to assess the number of bins of distances in which to compute the semivariance. In fact, to reduce the number of points in the experimental semivariogram, the pairs of locations are grouped based on their distance from one another. This grouping process is known as binning. For each time step, we found that about 10 % of stations were not recording data. Therefore, since the mean number of active stations for each time step was 70-80, we decided to use eight bins. This choice was also supported by a visual inspection of the shape of the experimental semivariance, which confirmed that by using eight bins the number of stations involved were neither too low nor too high.
In order to assess the goodness of SIK performances, two applications were performed: an interpolation of 1 year of hourly temperature data; an interpolation of a rainfall event, also at hourly time steps.
First, the analysis of the semivariance was performed and experimental semivariograms were fitted using all 11 theoretical models. The model that gave the best fitting was then used for the interpolation of the temperature and rainfall variables using the four types of kriging. Kriging performances were assessed using the LOO cross validation. The two local cases (LOK and LDK) were performed using a fixed number of closer stations. In particular, we decided to use 10 stations for the temperature case since it was a good compromise between the distance among the stations and the mean number of recording stations for each time step. Regarding the local interpolation of precipitation, the number of closer stations was five, given the prevalently convective nature of summer precipitation and the lower number of active gauge stations for each time step. Finally, results obtained from the interpolation of the temperature dataset were compared to the results obtained with R gstat, in order to assess the differences between the two packages, their easiness of use, and their performances.

Application of SIK to a temperature dataset
The first application of SIK components was carried out using the temperature dataset. The hourly experimental semivariograms were computed and then fitted using the 11 available theoretical models. Figure 6 shows the results of the fitting of the experimental semivariogram for a single time step on 15 June 2008. The black dots represent the experimental semivariance, while each colored curve represents a different optimized theoretical model. The Y and X axes show the values of the semivariance γ (h) and the distances in meters, respectively. Table 1 reports the main indexes of goodness of fit (GOFs), namely NSE, RMSE, R 2 , and PBIAS (see Appendix B for a list of abbreviations), computed between the experimental semivariogram and the 11 theoretical semivariogram models. The aforementioned GOFs are defined in Appendix C and were obtained using the R package hydroGOF (https://cran.r-project. org/web/packages/hydroGOF/hydroGOF.pdf, last access: 7 June 2018), which computes the GOFs between measured and simulated values (in this case experimental semivariance and theoretical semivariance). All the semivariogram models gave satisfactory results, with large values of NSE (0.72 : 0.92) and R 2 (0.73 : 0.92) and low values of RMSE (2.14 • C : 3.99 • C) and PBIAS (−3.80 % : −7.90 %), confirming the accuracy of the calibration procedure.
In order to asses the goodness of the interpolation, we performed the LOO cross validation using the optimized hourly values of sill, nugget, and range for the Bessel model, which is one of the best semivariograms according to the previous results. Figure 7 shows the results for the four types of kriging in terms of NSE. Each point represents the averaged monthly NSE over the 97 meteorological stations. The two local cases were performed using the 10 closest stations to the interpolation point.
For both the OK and LOK cases the performances were very poor (NSE < 0.5), indicating that mean temperature might have been a better predictor than the interpolation.
A strong trend between temperature and elevation (R 2 ∼ 0.9) was detected during the quality check phase (which was

Application of SIK to a rainfall dataset
The application to a rainfall dataset was made at event scale; specifically, a rainfall event of 11 h between the 29 and 30 June 2008. The event was chosen because it was the longest and most intense recorded by the highest number of stations for 2008. Figure 9 shows the box plots of the 11 hourly semivariograms with eight bins of lag distance, while the red line represents the best theoretical semivariogram, which in this case was obtained using the Bessel model. The Y and X axes show the values of the semivariance γ (h) and the distances in meters, respectively.
The optimized values of range, nugget, and sill were then used for the four types of kriging interpolations. Figure 10 compares the results obtained for two stations (ID 1152 and ID 1270) chosen at different elevations (943 and 2100 m a.s.l., respectively). All the interpolators were able to capture the rainfall peak at midnight for both stations. Comparing the volumes of cumulative rainfall, shown in Table 2, all the interpolators gave good results for station ID 1152, while there is an overestimation in the case of station ID 1270; this is due to peaks detected but not recorded between 21:00 CET and the 00:00 CET. Table 2 also shows the GOFs between the measured and the interpolated rainfall for the four types of kriging and the two stations. The performances are overall good in the case of station ID 1152 and the best interpolator is the LDK computed using the five closest stations. Results for station ID 1270 are generally worse, with the highest NSE of 0.62 for the LOK case. This could be due to the higher elevation of the station, which led to a series of interpolated peaks that were not recorded.
The spatial interpolation of the precipitation was also performed for each pixel of the DEM, applying the LOK and the Bessel semivariogram model. Figure 11 shows the results of the interpolation for 30 June 2008 at 00:00 CET. As it appears from the map, the rainfall intensities are higher in the river valley, with a value of 9.8 mm h −1 measured at station ID 1152. The bubble plots of the RMSE obtained between the measured and the interpolated values are overlapped. The size of a bubble represents the magnitude of the error: the largest error is obtained for station ID 90133 (Z = 1246 m a.s.l.).

A qualitative comparison with the R package gstat
A comparison between SIK and the R package gstat was made in order to highlight their differences and similarities, and to justify the deployment of an alternative software. We performed a qualitative comparison between the two softwares accounting for design, the implemented features, and the accuracy of the results. Benchmarks or quantitative performance comparisons would not have been useful or completely truthful since the "velocity" of computation (a classic quantitative comparison) depends on too many factors, some of which are described below. Moreover, in our opinion, the   two tools that we analyzed have different purposes. This can be seen just by looking at the features of the relative programming languages. The gstat software is developed in C with a part of the code in R language. It must be executed using the various R environments. SIK is developed in Java (7) as a group of OMS components and it can be executed within the OMS console, as a stand-alone Java program, or embedded in other codes in languages that support Java bindings. Java is slower than third-generation languages such as C. However, in the course of Java development several optimizations, such as "just-in-time compilation" and "adaptive optimization", have been introduced to improve the performance of its Java virtual machine (JVM). These techniques identify recurrently executed algorithms, so-called "hot spots", and dynamically recompile them at run time. Eventually, the hot spots gain valuable computational speed. C is one of the fastest compiled languages. But only the computational core of gstat is coded in C; the management of temporal steps, such as "for-loops", and data structures must be scripted in R. Undoubtedly, R is a very powerful programming language, Figure 10. Comparison among the four types of kriging and the measured rainfall.
Precipitation (mm h -1 ) Figure 11. Spatial interpolation of the precipitation applying LOK and the Bessel semivariogram model. The bubble plot of the RMSE is overlapped.
mainly because of its flat learning curve and easy syntax and semantics, but it is fully interpreted, which makes it very slow. As a result, the comparison of the speed of computation for a single temporal kriging interpolation is unfair against Java since the JVM cannot exploit its optimization tools for a single computation. Conversely, the comparison of the speed of computation for a year of hourly kriging interpolations is biased against R because temporal steps affect most of the computational time. In terms of functionality, gstat computes both omnidirectional and directional semivariograms, while SIK does not implement directional semivariograms yet (although we have included this feature on the software wish list). Furthermore, gstat provides four more theoretical semivariogram models with respect to SIK: Matern, Matern with Stein's parameterizations, Wave, and Legendre. Adding the desired theoretical model to any SIK-TV component would be easy and straightforward, thanks to the DP implemented, as shown in Fig. 2, but they are not available yet. Regarding the estimates that the two packages offer, these are usually different. Comparisons were made with both the temperature and rainfall datasets used in Sect. 4. Semivariograms were computed using the same number of bins and cutoff distance. Figure 12 shows the results of the temperature interpolations performed with SIK and gstat, in terms of NSE, RMSE, PBIAS, and R 2 : the overall performances of both tools are very good. The NSE values are always above 0.65, while the RMSE is always lower then 2 • C. Figure 13 shows the results of the precipitation interpolation performed with SIK and gstat, in terms of NSE, RMSE, PBIAS, R 2 , and cumulative volumes. Also in this case, both softwares are able to reproduce the rainfall event well, simulating the peaks. The results obtained for station ID 1152 are very good for both softwares, with a NSE > 0.9. Both softwares show slightly worse results for station ID 2170, with lower values of NSE and R 2 , higher RMSE, and an overestimation of the total rainfall (19 mm with gstat and 20 mm with SIK, compared to the 16 mm recorded by the gauges).
In conclusion, gstat is a powerful, flexible tool to obtain fast results with fast scripting in answer to single, specific questions (with some implementation efforts user-side); SIK is a tool that is ready to be integrated into broader MSs, specifically because of its OMS-compliant design. The interpolations of both temperature and rainfall confirm the quality and accuracy of the predictions obtained using the SIK package, demonstrating that it is a good competitor of R gstat.

Conclusions
This paper presents a new modeling package for the spatial interpolation of environmental variables. It includes 11 theoretical semivariogram models and four types of kriging interpolations. To test the performance of the SIK package, two applications were performed: the interpolation of 1 year of temperatures and the interpolation of a rainfall event. Data were retrieved from a dataset of 97 stations located in the Isarco Valley in Italy and the resolution of the interpolation grid data was 100 m.
Several characteristics make the SIK package a good competitor tool among those available in the literature. From the user perspective, it can be used as a stand-alone; it can be plugged in to the hydrological modeling system GEOframe-NewAGE; it can be used with all OMS-compliant components, such as calibration tools for the optimization of the parameters; it includes a tool for the automatic estimation of errors; its results are presented in data formats that can be visualized directly by GIS; a variety of MSs can be obtained, according to user needs; it is faster than gstat in everyday use, under certain conditions.
From the programmer perspective, the implementation of DP makes the package easy to maintain and suitable for future improvements. All the elements are close to modification and open to extension. Further developments of the package are easy and straightforward. Examples of such developments might include integrating new types of kriging, implementing a different selection method of the gauge stations, and the addition of nonlinear relationships between the interpolated variable and an auxiliary variable.
The interpolations of both the temperature and the rainfall gave very good results, with a high agreement between the measured and the interpolated variables. The tests also show how it is possible to choose between 11 variograms and four kriging alternatives and to compare the outcomes easily. Conversely, the single rainfall event did not show trend with elevation.
In comparison with gstat, the SIK package proved to be a good alternative, regarding both the easiness of use and the accuracy of the interpolation.
Code availability. An OSF project with all the components needed to reproduce the results shown in this paper has been created and is available at the following link: https://osf.io/24rgv (Rigon, 2018). The interested researcher can find the entire OMS project, containing input data, output, .sim files, jar files, and the R script used for the plots, at the following link: https://doi.org/10.5281/zenodo. 1244034 , as well as in the OSF project. Moreover, the links to the source codes and to the documentation of the SIK components are also available in the OSF project. In particular, for the present work, version 0.9.8 is the version of the codes of the GEOframe-SIK package that we used, available at the following link: https://github.com/geoframecomponents/Krigings/tree/v0. 9.8 (Serafin, 2018) Appendix A: Kriging theory Kriging is a group of geostatistical techniques used to interpolate the value of random fields based on spatial autocorrelation of measured data, (Isaaks and Srivastava, 1989;Goovaerts, 1997;Kitanidis, 1997). The measurement value z(x α ) and the unknown value z(x), where x is the location given according to a certain cartographic projection, are considered as particular realizations of the random variables Z(x α ) and Z(x) (Isaaks and Srivastava, 1989;Goovaerts, 1997). Let the estimation of the (true) random variable Z(x) be Z λ (x). It is obtained as a linear combination of the N random variables at surrounding points, denoted as x α with α = {1, N }, as in Goovaerts (1999): where m(x) and m(x α ) are the expected values of the random variables Z(x) and Z(x α ); λ(x α ) at varying α is the N-uple of weights assigned to the random variable Z(x α ) at measured sites. The superscript λ in Z λ (x) denotes that this new random variable is parameterized by the weights. These are chosen to satisfy the condition of minimizing the error of variance of the estimator σ 2 λ , that is argmin under the constraint that the estimate is unbiased, i.e., The latter condition implies that N α=1 λ α (x α ) = 1.
As shown in various textbooks, e.g., Kitanidis (1997), the above conditions create a linear system with the unknown being the N-uple of weights, and the system matrix dependent on the semivariograms (defined below in a simplified case). In synthetic notation, the linear system can be written as where is the matrix of two point variograms (defined below), is the N-uple of unknown weights and B (the socalled known term) is an N-uple containing the variograms between the ungauged site and the measured sites. Further information is required for Eq. (A5) to be a solvable linear system. In fact, B is still unknown at this stage.
If isotropy of the spatial statistics of the quantity analyzed is assumed, then the semivariogram is given by (e.g., Cressie and Cassie, 1993) where N h denotes the number of observation points at location x i at distance h from x for any h. When random variables are substituted by their available realizations (i.e., z(x i ), indicated with normal letters) an empirical semivariogram is obtained. In order to be extended to any distance, γ (h) needs to be fitted to a theoretical semivariogram model, i.e., an assumed function form, as those detailed in Appendix D. The fitting to the theoretical semivariogram model is also necessary to obtain B. In fact, when a theoretical semivariogram is selected, only position information for the ungauged location is required to obtain its semivariogram with respect to any of the measured locations. Once B has been determined, the system (A5) can be solved. This procedure is clearly delineated in the literature and explained for instance in Kitanidis (1997). Optimized semivariogram models are used to estimate the weighted parameters of the kriging algorithm. The coefficient of determination, R 2 , is the proportion of variance in the dependent variable that is predictable from the independent variable(s): where M i is the true value, M i is the mean of M i , and S i is the predicted value. It varies between 0 and 1, where 1 corresponds to the maximum agreement between predicted and true values.
-Nash-Sutcliffe efficiency The Nash-Sutcliffe efficiency (NSE) is a normalized model efficiency coefficient. It determines the relative magnitude of the residual variance compared to the measured data variance (Nash and Sutcliffe, 1970): where S i is the predicted value and M i is the observed value at a given time step. It varies from −∞ to 1, where 1 corresponds to the maximum agreement between predicted and observed values.
-Percent bias Percent bias (PBIAS) measures the average tendency of the simulated values to be larger or smaller than the corresponding measured ones. The optimal value of PBIAS is 0, with small values indicating accurate model simulation. Positive values indicate overestimation bias, while negative values indicate model underestimation bias.
where S i is the predicted value and M i is the observed value.
-Root-mean-square error The root-mean-square error (RMSE) is given by where M and S represent the measured and simulated time series, respectively, and N is the number of components in the series.
Appendix D: List of semivariogram models implemented in SIK Using n to represent nugget, h to represent lag distance, r to represent range, and s to represent sill, the 11 theoretical semivariogram models most frequently used in literature are as follows. The Supplement related to this article is available online at https://doi.org/10.5194/gmd-11-2189-2018supplement.
Author contributions. MB, GF, and FS developed the model code integrated in the GEOframe-SIK package. MB, FS, MB, and WA designed the experiments and performed the simulations. RR planned the research, coordinated and supervised all its phases, and provided the financial support with his funding. MB prepared the paper with contributions from all co-authors.
Competing interests. The authors declare that they have no conflict of interest.
Acknowledgements. The authors acknowledge Trento University project CLIMAWARE (http://abouthydrology.blogspot.com/ search?q=climaware, last access: 7 June 2018) and European Union FP7 Collaborative Project GLOBAQUA ("Managing the effects of multiple stressors on aquatic ecosystems under water scarcity", grant no. 603629-ENV-2013.6.2.1), which partially financed this research. The authors would also like to acknowledge the EU and MIUR for funding the project STEEP STREAMS through the ERA-NET WaterWorks2014 Cofunded Call. This ERA-NET is an integral part of the 2015 joint activities developed by the Joint Programming Initiative "Water challenges for a changing world" (Water JPI).
Edited by: Rolf Sander Reviewed by: two anonymous referees