Fast approximate Barnes interpolation: illustrated by Python-Numba implementation fast-barnes-py v1.0

Zürcher, Bruno K.

doi:https://doi.org/10.5194/gmd-16-1697-2023

Articles | Volume 16, issue 6

https://doi.org/10.5194/gmd-16-1697-2023

Articles | Volume 16, issue 6

Development and technical paper

27 Mar 2023

Development and technical paper |

| 27 Mar 2023

Fast approximate Barnes interpolation: illustrated by Python-Numba implementation fast-barnes-py v1.0

Bruno K. Zürcher

Abstract

Barnes interpolation is a method that is widely used in geospatial sciences like meteorology to remodel data values recorded at irregularly distributed points into a representative analytical field. When implemented naively, the effort to calculate Barnes interpolation depends on the product of the number of sample points N and the number of grid points W×H, resulting in a computational complexity of $O (N \cdot W \cdot H)$ . In the era of highly resolved grids and overwhelming numbers of sample points, which originate, e.g., from the Internet of Things or crowd-sourced data, this computation can be quite demanding, even on high-performance machines.

This paper presents new approaches of how very good approximations of Barnes interpolation can be implemented using fast algorithms that have a computational complexity of $O (N + W \cdot H)$ . Two use cases in particular are considered, namely (1) where the used grid is embedded in the Euclidean plane and (2) where the grid is located on the unit sphere.

Download & links

Article (PDF, 6360 KB)

Download & links

How to cite.

Received: 24 Apr 2022 – Discussion started: 19 Jul 2022 – Revised: 14 Jan 2023 – Accepted: 16 Jan 2023 – Published: 27 Mar 2023

1 Introduction

In the early days of numerical weather prediction, various methods of objective analysis were developed to automatically produce the initial conditions from the irregularly spaced observational data (Daley, 1991), one of which was Barnes interpolation (Barnes, 1964). However, objective analysis soon lost its significance in this field against statistical approaches. Today, the Barnes method is still used to create grid-based isoline visualizations of geospatial data, like in the meteorological workstation project NinJo (Koppert et al., 2004) for instance, which is jointly developed by the Deutscher Wetterdienst, the Bundeswehr Geophysical Service, MeteoSwiss, the Danish Meteorological Institute and the Meteorological Service of Canada.

Barnes interpolation $f (x) : D ⟶ R$ for an arbitrary point x∈D and a given set of sample points x_k∈D with observation values f_k∈ℝ for $k = 1, \dots, N$ is defined as

\begin{matrix} (1) & f (x) = \frac{\sum_{k = 1}^{N} f_{k} \cdot w_{k} (x)}{\sum_{k = 1}^{N} w_{k} (x)}, \end{matrix}

with Gaussian weights

\begin{matrix} (2) & w_{k} (x) = e^{- \frac{d (x, x_{k})^{2}}{2 σ^{2}}}, \end{matrix}

a distance function $d : D \times D ⟶ R$ and a Gaussian width parameter σ.

If Barnes interpolation is computed in a straightforward way for a regularly spaced W×H grid, the computational complexity is given by $O (N \cdot W \cdot H)$ as easily can be seen from the 3-fold nested loops of Algorithm A given below. Consequently, for big values of N or dense grids, a naive implementation of Barnes interpolation turns out to be unreasonably slow.

The fact that the Gaussian weight function quickly approaches 0 for increasing distances leads to a first improvement attempt, which consists of neglecting all terms in the sums of Eq. (1), for which the weights w_k drop below a certain limit w₀, e.g., w₀=0.001. This is equivalent to only taking into account observation points x_k that lie within the distance $r_{0} = σ \sqrt{- 2 \ln w_{0}}$ from the interpolation point x. Thus, the described procedure requires the ability to quickly extract all observation points x_k that lie within a distance r₀ from point x. Data structures that support such searches, e.g., are so-called k–d trees (Bentley, 1975) or quadtrees (Finkel and Bentley, 1974).

This improved approach actually reduces the required computational time by a constant factor, but the computational complexity remains in the same order. To see this, note that a specific sample point x_k contributes to the interpolation value of exactly those grid points that are contained in the circular disk $B_{r_{0}} (x_{k}) = {q \in D ∣ d (x_{k}, q) < r_{0}}$ of radius r₀ around it. Also note that in a regularly spaced grid, the number of affected grid points is roughly the same for each sample point. If now the number of grid points W and H in each dimension is increased by a factor of κ – i.e., the grid becomes denser – the number of grid points contained in $B_{r_{0}} (x_{k})$ grows accordingly, namely by a factor of κ², which shows that the algorithmic costs rise in direct dependency to W and H. Since this is obviously true for the number of sample points N as well, the improved algorithm also has complexity $O (N \cdot W \cdot H)$ .

In this paper, we discuss a new and fast technique to compute very good approximations of Barnes interpolation. The utilized underlying principle of applying multiple convolution passes with a simple rectangular filter in order to approximate a Gaussian convolution is well known in computational engineering. In image processing and computer vision, for instance, Gaussian filtering of images is often efficiently calculated by repeated application of an averaging filter (Wells, 1986).

The theoretical background of the new approach is presented in Sects. 2 and 3. Thereafter, we investigate two use cases for the domain D:

i.
D=ℝ² – the Euclidean plane with the usual distance $d (p, q) = ∥ p - q ∥_{2}$ in full detail in Sects. 4 and 5.4,
ii.
$D = S^{2} = {p \in R^{3} ∣ ∥ p ∥_{2} = 1}$ – the unit sphere, where d(p,q) is the spherical distance between p and q, as a broad outline in Sect. 5.5.

2 Conclusions from central limit theorem

For a set ${X_{k}}_{k = 1}^{n}$ of independent and identically distributed (i.i.d.) random variables with mean μ and variance σ², the central limit theorem (Klenke, 2020) states that the probability distribution of their sum converges to a normal distribution if n approaches infinity, formally written as

\begin{array}{l} P [\frac{X_{1} + \dots + X_{n}}{\sqrt{n}} \leq a] \\ (3) & ⟶_{n \to \infty}^{} \frac{1}{\sqrt{2 π} σ} \int_{- \infty}^{a} e^{- \frac{1}{2} {(\frac{t - μ}{σ})}^{2}} d t . \end{array}

Without loss of generality, we assume in the further discussion that μ=0. Let p(x) denote the probability density function (PDF) of the scaled random variables ${\frac{1}{\sqrt{n}} X_{k}}_{k = 1}^{n}$ that consequently have the variance $\frac{σ^{2}}{n}$ . Since the PDF of a sum of random variables corresponds to the convolution of their individual PDFs, we find that on the other hand

P [\frac{X_{1} + \dots + X_{n}}{\sqrt{n}} \leq a] = \int_{- \infty}^{a} p^{* n} (x) d x,

where p^∗n(x) denotes the n-fold convolution of p(x) with itself (refer to Appendix A). With that result, we can write the relationship (3) equivalently in an unintegrated form as

\begin{matrix} (4) & p^{* n} (x) ⟶_{n \to \infty}^{} \frac{1}{\sqrt{2 π} σ} e^{- \frac{x^{2}}{2 σ^{2}}}, \end{matrix}

which leads directly to

Approximation 1. For sufficiently large n, the n-fold self-convolution of a probability density function p(x) with mean μ=0 and variance $\frac{σ^{2}}{n}$ approximates a Gaussian with mean 0 and variance σ², i.e.,

p^{* n} (x) \approx \frac{1}{\sqrt{2 π} σ} e^{- \frac{x^{2}}{2 σ^{2}}} .

Note that this approximation is valid for arbitrary PDFs p(x) with mean μ=0 and variance $\frac{σ^{2}}{n}$ .

A particularly simple PDF is given by the uniform distribution. We therefore define a family of uniform PDFs as ${u_{n} (x)}_{n = 1}^{\infty}$ , of which each member u_n(x) has mean 0 and variance $\frac{σ^{2}}{n}$ . These uniform PDFs can be expressed by means of elementary rectangular functions:

\begin{matrix} (5) & r_{n} (x) = \{\begin{array}{ll} 1 & for | x | \leq \sqrt{\frac{3}{n}} σ \\ 0 & otherwise \end{array} where n = 1, 2, \dots, \end{matrix}

such that

u_{n} (x) = \frac{1}{2 \sqrt{\frac{3}{n}} σ} r_{n} (x) = \{\begin{cases} \frac{1}{2 \sqrt{\frac{3}{n}} σ} & for | x | \leq \sqrt{\frac{3}{n}} σ \\ 0 & otherwise \end{cases},

where $n = 1, 2, \dots$ . From this definition, it is clear that u_n(x) is actually a PDF with mean E[u_n]=0 and variance

\begin{array}{l} Var (u_{n}) = \int_{- \infty}^{\infty} x^{2} \cdot u_{n} (x) d x = \frac{1}{2 \sqrt{\frac{3}{n}} σ} \int_{- \sqrt{\frac{3}{n}} σ}^{\sqrt{\frac{3}{n}} σ} x^{2} d x \\ = {\frac{1}{2 \sqrt{\frac{3}{n}} σ} \cdot \frac{1}{3} x^{3}|}_{- \sqrt{\frac{3}{n}} σ}^{\sqrt{\frac{3}{n}} σ} = \frac{1}{2 \sqrt{\frac{3}{n}} σ} \cdot \frac{2}{3} {(\sqrt{\frac{3}{n}} σ)}^{3} = \frac{σ^{2}}{n}, \end{array}

as postulated. According to the convergence relation (4) and Approximation 1, the series of the n-fold self-convolutions ${u_{n}^{* n} (x)}_{n = 1}^{\infty}$ converges to a Gaussian with mean 0 and variance σ². The converging behavior can actually be examined visually in Fig. 1, which plots the n-fold self-convolution of the first few family members.

https://gmd.copernicus.org/articles/16/1697/2023/gmd-16-1697-2023-f01

Figure 1From left to right, blue indicates the plot of the PDFs of the uniform distributions u₁(x), u₂(x), u₃(x) and u₄(x) for σ=1; black shows their self-convolutions $u_{2}^{* 2} (x)$ , $u_{3}^{* 3} (x)$ and $u_{4}^{* 4} (x)$ . The area covered by the PDF of the normal distribution is indicated in gray.

Download

The central limit theorem can also be stated more generally for i.i.d. m-dimensional random vectors ${X_{k}}_{k = 1}^{n}$ , refer for instance to (Muirhead, 1982; Klenke, 2020). Supposing that the X_k has a mean vector μ=0 and a covariance matrix Σ, we can follow the same line of argument as in the one-dimensional case. Let p(x) be the joint PDF of the scaled random variables ${\frac{1}{\sqrt{n}} X_{k}}_{k = 1}^{n}$ , which therefore have a zero mean vector and the covariance matrix $\frac{1}{n} Σ$ . Then the limit law in m dimensions becomes

p^{* n} (x) ⟶_{n \to \infty}^{} \frac{1}{(2 π)^{\frac{m}{2}} \sqrt{\det Σ}} e^{- \frac{1}{2} x^{T} Σ^{- 1} x} .

For the remainder of the discussion, we fix the number of dimensions to m=2 and for the sake of readability, we write the vector argument x in its component form (x,y), if it is appropriate. In case the random vectors X_k are isotropic, i.e., do not have any preference in a specific spatial direction, the covariance matrix is a multiple of the identity matrix Σ=σ²I and, in two dimensions, the limit law simplifies to the following:

\begin{matrix} (6) & p^{* n} (x) ⟶_{n \to \infty}^{} \frac{1}{2 π σ^{2}} e^{- \frac{1}{2 σ^{2}} ∥ x ∥^{2}} . \end{matrix}

In the following step, we are aiming to substitute p(x) with the members of the family ${u_{n}^{(2)} (x)}_{n = 1}^{\infty}$ of two-dimensional uniform distributions over a square-shaped domain, which are defined as

\begin{array}{l} u_{n}^{(2)} (x, y) = u_{n} (x) \cdot u_{n} (y) = \frac{n}{12 σ^{2}} r_{n} (x) \cdot r_{n} (y) \\ = \{\begin{array}{ll} \frac{n}{12 σ^{2}} & for | x |, | y | \leq \sqrt{\frac{3}{n}} σ \\ 0 & otherwise \end{array} for n = 1, 2, \dots . \end{array}

With this definition, the members $u_{n}^{(2)}$ have a mean vector 0 and an isotropic covariance matrix $\frac{σ^{2}}{n} I$ , and hence satisfy the prerequisite of the limit law in Eq. (6). Also note that $u_{n}^{(2)} (x)$ is separable because $u_{n}^{(2)} (x, y) = u_{n} (x) \cdot u_{n} (y)$ . As a consequence of the latter, the n-fold self-convolution of $u_{n}^{(2)} (x)$ is itself separable, i.e.,

\begin{array}{l} {(u_{n}^{(2)})}^{* n} (x, y) = {(u_{n} (x) \cdot u_{n} (y))}^{* n} \\ = {u_{n}}^{\overset{x}{*} n} (x) \cdot {u_{n}}^{\overset{y}{*} n} (y) \\ (7) & = {(\frac{n}{12 σ^{2}})}^{n} {r_{n}}^{\overset{x}{*} n} (x) \cdot {r_{n}}^{\overset{y}{*} n} (y), \end{array}

where the operators $\overset{x}{*}$ and $\overset{y}{*}$ denote one-dimensional convolution in the x and y direction, respectively (refer to Appendix B). Substituting p(x) in Eq. (6) with the right-hand side (RHS) of Eq. (7), we obtain

{(\frac{n}{12 σ^{2}})}^{n} {r_{n}}^{\overset{x}{*} n} (x) \cdot {r_{n}}^{\overset{y}{*} n} (y) ⟶_{n \to \infty}^{} \frac{1}{2 π σ^{2}} e^{- \frac{1}{2 σ^{2}} ∥ x ∥^{2}},

or expressed as

Approximation 2. For sufficiently large n, the n-fold self-convolution of the two-dimensional uniform probability density function $\frac{n}{12 σ^{2}} r_{n} (x) \cdot r_{n} (y)$ approximates a bivariate Gaussian with mean vector 0 and covariance matrix σ²I, i.e.,

{(\frac{n}{12 σ^{2}})}^{n} {r_{n}}^{\overset{x}{*} n} (x) \cdot {r_{n}}^{\overset{y}{*} n} (y) \approx \frac{1}{2 π σ^{2}} e^{- \frac{1}{2 σ^{2}} ∥ x ∥^{2}},

where r_n(x) is an elementary rectangular function defined as

r_{n} (x) = \{\begin{array}{ll} 1 & for | x | \leq \sqrt{\frac{3}{n}} σ \\ 0 & otherwise \end{array} for n = 1, 2, \dots .

3 Barnes interpolation as series of convolutions

Let φ_μ,σ(x) denote the PDF of a two-dimensional normal distribution with mean vector μ and isotropic variance σ², i.e.,

φ_{μ, σ} (x) = \frac{1}{2 π σ^{2}} e^{- \frac{1}{2 σ^{2}} ∥ x - μ ∥^{2}} .

Note that φ_0,σ(x) corresponds to the RHS of Approximation 2. Further, let δ_a(x) denote the Dirac or unit impulse function at location a with the property $δ_{a} * f (x) = f (x - a)$ . Then we can write

\begin{array}{l} φ_{μ, σ} (x) = \frac{1}{2 π σ^{2}} e^{- \frac{1}{2 σ^{2}} ∥ x - μ ∥^{2}} = δ_{μ} * (\frac{1}{2 π σ^{2}} e^{- \frac{1}{2 σ^{2}} ∥ x ∥^{2}}) \\ = δ_{μ} * φ_{0, σ} (x) . \end{array}

Thus, a Gaussian-weighted sum as found in the numerator of Barnes interpolation (1) for the Euclidean plane ℝ² can be written as convolutional operation

\begin{array}{l} \sum_{k = 1}^{N} f_{k} \cdot e^{- \frac{1}{2 σ^{2}} ∥ x - x_{k} ∥^{2}} = 2 π σ^{2} \sum_{k = 1}^{N} f_{k} \cdot φ_{x_{k}, σ} (x) \\ = 2 π σ^{2} \sum_{k = 1}^{N} f_{k} \cdot (δ_{x_{k}} * φ_{0, σ}) (x), \end{array}

and due to the distributivity and the associativity with scalar multiplication of the convolution operator, Eq. (8) follows:

\begin{array}{l} = 2 π σ^{2} \sum_{k = 1}^{N} (f_{k} \cdot δ_{x_{k}}) * φ_{0, σ} (x) \\ (8) & = 2 π σ^{2} (\sum_{k = 1}^{N} f_{k} \cdot δ_{x_{k}}) * φ_{0, σ} (x) . \end{array}

Substituting φ_0,σ(x) with Approximation 2, we obtain the following for sufficiently large n:

\begin{array}{l} \sum_{k = 1}^{N} f_{k} \cdot e^{- \frac{1}{2 σ^{2}} ∥ x - x_{k} ∥^{2}} \\ \approx 2 π σ^{2} {(\frac{n}{12 σ^{2}})}^{n} (\sum_{k = 1}^{N} f_{k} \cdot δ_{x_{k}}) * ({r_{n}}^{\overset{x}{*} n} (x) \cdot {r_{n}}^{\overset{y}{*} n} (y)) . \end{array}

For the denominator of Barnes interpolation (1), we can use the same expression but set the coefficients f_k to 1. Since the common factors in the numerator and the denominator cancel each other, we can state

Approximation 3. For sufficiently large n, Barnes interpolation for the Euclidean plane ℝ² can be approximated by

\begin{matrix} (9) & f (x, y) \approx \frac{(\sum_{k = 1}^{N} f_{k} \cdot δ_{x_{k}}) * ({r_{n}}^{\overset{x}{*} n} (x) \cdot {r_{n}}^{\overset{y}{*} n} (y))}{(\sum_{k = 1}^{N} δ_{x_{k}}) * ({r_{n}}^{\overset{x}{*} n} (x) \cdot {r_{n}}^{\overset{y}{*} n} (y))}, \end{matrix}

provided that the quotient is defined.

In other words, Barnes interpolation can very easily be approximated by the quotient of two convolutional expressions, both consisting of an irregularly spaced Dirac comb, followed by a sequence of convolutions with a one-dimensional rectangular function of width $2 σ \sqrt{3 / n}$ , executed n times in the x direction and n times in the y direction. As the convolution operation is commutative, the convolutions can basically be carried out in any order. The sequence shown in Approximation 3, evaluated from left to right, is however especially favorable regarding the computational effort.

Approximation 3 can be stated in a more generalized context as well, i.e., also for non-uniform PDFs. In the special case of using a normal distribution with mean 0 and variance σ², we can even formulate a convolutional expression that is equal to Barnes interpolation. Refer to Appendix C for more details.

4 Discretization

In a straightforward way, Approximation 3 leads to a very efficient algorithm for an approximate computation of Barnes interpolation on a regular grid Γ that is embedded in the Euclidean plane ℝ². Let

Γ = {(i \cdot Δ s, j \cdot Δ s) \in R^{2} ∣ 0 \leq i < W, 0 \leq j < H}

be a grid of dimension W×H with a grid-point spacing Δs. Without loss of generality, we assume that all sample points x_k are sufficiently well contained in the interior of Γ. In what follows, we differentiate discrete functions from their continuous function counterparts by enclosing the arguments in brackets instead of parentheses and write, for instance, g[i] for g(x) in the one-dimensional case and g[i,j] for g(x,y) in the two-dimensional case.

In an iterative procedure, we now compute the convolutional expression that corresponds to the numerator (or denominator) of Approximation 3 simultaneously for all points [i,j] of grid Γ. The intermediate fields that result from this iteration are denoted by F^(m), where m indicates the iteration stage.

The first step consists of discretizing the expression $\sum f_{k} \cdot δ_{x_{k}}$ , i.e., by injecting the values f_k at their respective sample points x_k into the underlying grid. For this purpose, the field F⁽⁰⁾ is initialized with 0. From the definition of Dirac's impulse function in two dimensions,

\begin{array}{l} δ_{0} (x, y) = lim_{α \to 0} \frac{1}{α^{2}} r_{α} (x, y) \\ where r_{α} (x, y) = \{\begin{cases} 1 & for | x |, | y | \leq \frac{1}{2 α} \\ 0 & otherwise \end{cases}, \end{array}

we deduce for its discrete version that the grid cell containing the considered sample point receives the weight $1 / Δ s^{2}$ , while all other cells are left unchanged with weight 0. Since a sample point x_k in general does not coincide with a grid point (refer also to Fig. 2) and in order to achieve a good localization, the Dirac impulse $δ_{x_{k}}$ is distributed in a bilinear way to its four neighboring grid points according to step 5 of Algorithm B.1.

Note that if a grid point [i,j] is affected by several sample points, the determined weight fractions are accumulated accordingly in the respective field element F⁽⁰⁾[i,j]. For the final calculation of quotient (9), the factor $1 / Δ s^{2}$ cancels out and can therefore be omitted in Algorithm B.1, but for reasons of mathematical correctness, it is shown here. Since we have N input points and we perform a fixed number of operations for each of them, the complexity of Algorithm B.1 is given by 𝒪(N).

https://gmd.copernicus.org/articles/16/1697/2023/gmd-16-1697-2023-f02

Figure 2The nearer a sample point x_k is located to a grid point, the larger the weight assigned to it. Grid point [i,j], e.g., receives a weight of $(1 - w_{x}) \cdot (1 - w_{y})$ .

Download

The other algorithmic fragment we require, which is implemented by Algorithm B.2, is the computation of a one-dimensional convolution of an arbitrary function g(x) with the rectangular function r_n(x) as defined in Eq. (5). Employing the definition of convolution, we obtain

\begin{array}{l} h (x) = g * r_{n} (x) = \int_{- \infty}^{\infty} g (x - t) \cdot r_{n} (t) d t \\ (10) & = \int_{- τ}^{τ} g (x - t) d t = \int_{x - τ}^{x + τ} g (t) d t, \end{array}

where $τ = σ \sqrt{3 / n}$ . In other words, the convolution g ∗ r_n at point x is simply the integral of g(x) in the window $[x - τ, x + τ]$ . Transferred to a one-dimensional grid with spacing Δs, the rectangular function r_n(x) reads as rectangular pulse:

r_{T} [k] = \{\begin{array}{ll} 1 & for | k | \leq T \\ 0 & otherwise \end{array} k \in Z,

with a width parameter T∈ℕ₀ that is gained by rounding $τ / Δ s$ to the nearest integer number:

\begin{matrix} (11) & T = ⌊\frac{τ}{Δ s} + \frac{1}{2}⌋ = ⌊\sqrt{\frac{3}{n}} \frac{σ}{Δ s} + \frac{1}{2}⌋ . \end{matrix}

Then, Eq. (10) translates in the discrete case to

\begin{array}{l} h [k] & = g * r_{T} [k] = \sum_{i = - \infty}^{\infty} g [k - i] \cdot r_{T} [i] \cdot Δ s \\ = \sum_{i = - T}^{T} g [k - i] \cdot Δ s = \sum_{i = k - T}^{k + T} g [i] \cdot Δ s, \end{array}

where element h[k] corresponds to h(k⋅Δs) and g[i] to g(i⋅Δs). Equivalently to the continuous case, the value h[k] results up to a factor Δs from putting a window of length 2T+1 centrally over the sequence element g[k] and summing up all elements covered by it.

Assuming that we already computed h[k−1], it is immediately clear that the following value h[k] results from moving the window by one position to the right and thus can be obtained very easily from h[k−1] by adding the newly enclosed sequence element g[k+T] but subtracting element $g [k - T - 1]$ that falls outside the window.

As in the earlier case of Algorithm B.1, the factor Δs cancels out in the final calculation of Eq. (9) and can therefore also be omitted here. Thus, Algorithm B.2 has 2T additions in step 1 and another 2(L−1) additions in the loop of step 3. Assuming that T is much smaller than L, an algorithmic complexity of 𝒪(L) results.

Now we are able to formulate Algorithm B.3 that computes convolutional expressions as found in the numerator and denominator of Approximation 3.

Note that due to the commutativity of $\overset{x}{*}$ and $\overset{y}{*}$ , the outer loop over index k can be moved inward within the loops over the rows and the columns, respectively. With this alternate loop layout, the field is first traversed row-wise in a single pass, where each row is convolved n times with r_T in one sweep. Subsequently, the field is traversed column-wise and each column is convolved n times. In such a way, more economic strategies with respect to memory access can be achieved and moreover, this loop order is very well suited for parallel execution, such that Algorithm B.3 can be computed very efficiently. Since in practice n is chosen constant (proven values for n lie between 3 and 6), the algorithmic complexity is 𝒪(W⋅H).

Algorithms B.1 and B.3 now allow us to state the final Algorithm B, which implements the approximate computation of Barnes interpolation Eq. (9).

If the denominator Q⁽ⁿ⁾[i,j] in step 8 is 0, which is the case if the grid point [i,j] has a greater distance than 2nT from the nearest sample point in at least one dimension, the corresponding field value F[i,j] is undefined and set to NaN.

Since Algorithms B.1 and B.3 are invoked twice and step 8 employs another W⋅H divisions, the overall algorithmic complexity of the presented approach is limited to $O (N + W \cdot H)$ , which is a drastic improvement compared to the costs of $O (N \cdot W \cdot H)$ of the naive implementation.

5 Results and further considerations

5.1 Test setup

The described Algorithm B – hereinafter denoted with “convolution” – was tested on a dataset that contained a total of 3490 QFF values (air pressure reduced to mean sea level) obtained from measurements at meteorological stations distributed over Europe and dating from the 27 July 2020 at 12:00 UTC. More specifically, we considered the geographic area $D = [- 26$ ^∘ E, 49^∘ E]×[34.5^∘ N, 72^∘ N]⊂ℝ², equipped with the Euclidean distance function defined on D×D, i.e., we measured distances in a first examination directly in the plate carrée projection neglecting the curved shape of the earth. The values of the QFF data range from 992.1 to 1023.2 hPa.

The convolution interpolation algorithm is subsequently compared with the results of two alternate algorithms. The first of them – referred to as “naive” – is given by the naive Algorithm A as stated in the Introduction. The second one – denoted with “radius” – consists of the improved naive algorithm that considers in its innermost loop only those observation points whose Gaussian weights w exceed 0.001 and thus are located within a radius of 3.717σ around the interpolation point. For this purpose, the implementation performs a so-called radius search on a k–d tree, which contains all observation points. Such a radius search can be achieved with a worst-case complexity of $O (\sqrt{N})$ .

All algorithms were implemented in Python using the Numba just-in-time compiler (Lam et al., 2015) in order to achieve compiled code performance using ordinary Python code. The tests were conducted on a computer with a customary 2.6 GHz Intel i7-6600U processor with two cores, which is in fact only of minor importance since the tested code was written in single-threaded manner. All time measurements were performed 10 times and the best value among them was set as the final execution time of the respective algorithm.

5.2 Visual results

In general, the Barnes method yields a remarkably good interpolation and results in an aesthetic illustration for regions where the distance between the sample points has the same order of magnitude as the used Gaussian width parameter σ. However, if the distance between adjacent sample points is large compared to σ, this method exhibits some shortcomings because then the interpolation converges towards a step function with steep transitions. This effect can be clearly identified, for example, in the generation of plateaus of almost constant value over the Atlantic Ocean in Fig. 3a. In the limit case, if σ→0, the interpolation produces a Voronoi tessellation with cells of constant value around a sample point that are bordered by discontinuities towards the neighboring cells.

https://gmd.copernicus.org/articles/16/1697/2023/gmd-16-1697-2023-f03

Figure 3Panel (a) shows exact Barnes interpolation with the naive algorithm for 3490 sample points depicted in red, a 2400×1200 grid with a resolution of 32 grid points per degree and $σ = {1.0}^{\circ}$ . Panel (b) shows approximate Barnes interpolation with the convolution algorithm for the same settings as for the naive algorithm above. The applied 4-fold convolution uses a rectangular mask of size 57. Areas where the denominator of Eq. (9) drops below a value of 0.001 or is even 0 are rendered with a darker shade.

The comparison of the isoline visualizations in Fig. 3a and b shows an excellent agreement between the two approaches in the well-defined areas. The result for the radius algorithm is consistently similar to the other two and is therefore not depicted.

Note that the shaded, i.e., the undefined areas of Fig. 3b correspond to those areas where Barnes interpolation produces the plateau effect. In this sense, one can state that the convolution algorithm filters out the problematic areas in an inherent way.

5.3 Time measurements

For a grid of constant size, the measured execution times in Table 1 show a linear dependence to the number N of considered sample points for the naive and the radius algorithms, while they are almost constant for the presented convolution algorithm. The costs of the injection step are obviously more or less inexistent compared to the costs of the subsequent steps of the algorithm. This fact is in total agreement with the deduced complexity $O (N + W \cdot H)$ , since in our test setup, the grid size W×H clearly dominates over N.

Table 1Execution times (in s) of the investigated algorithms for varying numbers of sample points. The grid size of 2400×1200 points with a resolution of 32 points per degree and the Gaussian width $σ = 1.0^{\circ}$ are kept constant. The convolution algorithm applied a 4-fold convolution.

Download Print Version | Download XLSX

https://gmd.copernicus.org/articles/16/1697/2023/gmd-16-1697-2023-f04

Figure 4Plot of execution times from Table 1 against number of sample points. Both axes use a logarithmic scale.

Download

Also note that between the naive and the convolution algorithms, the speed-up factor roughly ranges between 25 and 1000 for the considered number of sample points.

If the grid size is varied, all considered algorithms reveal a linear-like dependence to the number of points in the grid as can be seen in Table 2 and Fig. 5. For smaller grid sizes, the convolution algorithm provides a speed-up factor of around 2000 compared to the naive implementation, but for bigger grids, the factor drops below 1000.

This effect can be explained by the fact that the crucial parts of the convolution algorithm access memory for smaller grids with a high spatial and temporal locality and thus make optimal use of the highly efficient CPU cache memory (Patterson and Hennessy, 2014). For bigger grids, the number of cache misses increases, which result in a slightly degraded performance.

Table 2Execution times (in s) of the investigated algorithms for varying grid sizes and resolutions, respectively. The number of sample points N=3490 and the Gaussian width $σ = 1.0^{\circ}$ are kept constant. The convolution algorithm applied a 4-fold convolution.

Download Print Version | Download XLSX

https://gmd.copernicus.org/articles/16/1697/2023/gmd-16-1697-2023-f05

Figure 5Plot of execution times from Table 2 against number of grid points. Both axes use a logarithmic scale.

Download

As is to be expected, the Gaussian width parameter σ has no decisive impact on the execution times measured for the naive and the convolution algorithms (refer to Table 3 and Fig. 6). The radius algorithm, on the other hand, shows a quadratic dependence, since the relevant area around a grid point – and thus also the average number of sample points to be considered – grows quadratically with the radius of influence, which in turn depends linearly on σ.

The unstable time measurements for the naive algorithm are caused by a peculiar execution time behavior of the exponential function. Although the value of Eq. (2) is less than the smallest representable float and therefore results in 0 for distances d>38.6σ, its computational cost increases to a multiple of the time needed for shorter distances. With growing σ, this computational overhead is less and less noticeable in the test series.

Table 3Execution times (in s) of the investigated algorithms for varying Gaussian width parameters σ. The number of sample points N=3490 and the grid size of 2400×1200 points with a resolution of 32 points per degree are kept constant. The convolution algorithm applied a 4-fold convolution.

Download Print Version | Download XLSX

https://gmd.copernicus.org/articles/16/1697/2023/gmd-16-1697-2023-f06

Figure 6Plot of execution times from Table 3 against Gaussian width parameter. Both axes use a logarithmic scale.

Download

5.4 Algorithm fine tuning

In the discretization step, we round off the window width parameter T∈ℕ₀ to the nearest integer, i.e., we set

\begin{matrix} (11 recap) & T = ⌊\sqrt{\frac{3}{n}} \frac{σ}{Δ s} + \frac{1}{2}⌋, \end{matrix}

and then repeatedly convolve the input fields with a rectangular pulse signal r_T[k] that has a length 2T+1. In the extreme case, if n exceeds $12 σ^{2} / Δ s^{2}$ and hence T=0, the underlying pulse signal degrades to r₀[k], which is identical to the discrete unit pulse, the neutral element with respect to convolution. Under these circumstances, the algorithm stops rendering a meaningful interpolation. Since normally σ>Δs, the respective bound for n can go into the hundreds, which is why the described extreme case is typically not encountered for real applications.

Nevertheless, a general consequence of the rounding operation is that the effectively obtained Gaussian width σ_eff for our algorithm only approximates the desired width σ. For the following considerations, let u_T[k] be the uniform probability distribution that corresponds to r_T[k], i.e.,

u_{T} [k] = \frac{1}{2 T + 1} r_{T} [k] .

By algorithmic design, $σ_{eff}^{2}$ is the variance of u_T[k], which is convolved n times with itself and which is employed on a grid with point spacing Δs. Therefore,

\begin{array}{l} σ_{eff}^{2} = n Var (u_{T}) = n \sum_{k = - T}^{T} \frac{1}{2 T + 1} (k Δ s)^{2} \\ (12) & = \frac{2 n Δ s^{2}}{2 T + 1} \sum_{k = 1}^{T} k^{2} = \frac{n}{3} T (T + 1) Δ s^{2} . \end{array}

For T≥1, the integer number T can be fixed by Eq. (11) to the unit interval

\sqrt{\frac{3}{n}} \frac{σ}{Δ s} - \frac{1}{2} < T \leq \sqrt{\frac{3}{n}} \frac{σ}{Δ s} + \frac{1}{2},

which in turn allows us to derive sharp bounds for σ_eff when substituted into Eq. (12):

σ^{2} - \frac{n}{12} Δ s^{2} < σ_{eff}^{2} \leq σ^{2} + 2 \sqrt{\frac{n}{3}} σ Δ s + \frac{n}{4} Δ s^{2} .

The last relation reveals that the range of possible values for σ_eff grows with increasing n until σ_eff collapses to 0 when $n > 12 σ^{2} / Δ s^{2}$ , as we have seen further above. In summary, there are two diametrically opposed effects that determine the result quality of the presented algorithm. From the perspective of the central limit theorem, the approximation performance is better for large n, while on the other hand, σ_eff tends to be closer to the target σ for small n.

If full accuracy is required for the Gaussian width and the grid spacing Δs is not strictly given in advance, one can tune the spacing such that the resulting width σ_eff corresponds exactly to the desired σ. For T≥1, Eq. (12) then allows us to determine the necessary grid step as

Δ s = \sqrt{\frac{3}{n T (T + 1)}} σ .

If, however, the grid is fixed a priori and cannot be modified, the only option to attain the requested σ exactly is to refrain from using a stringent rectangular pulse and to employ a slightly more complicated signal (Gwosdek et al., 2011). For this purpose, we set this time as follows:

T = ⌊\frac{1}{2} (\sqrt{1 + \frac{12}{n} \frac{σ^{2}}{Δ s^{2}}} - 1)⌋,

which is gained from taking the positive solution of the quadratic Eq. (12) for T. Now we have $n Var (u_{T}) \leq σ^{2} < n Var (u_{T + 1})$ and we define the linearly blended signal for $0 \leq α < 1$ :

\begin{array}{l} r_{T, α} [k] = (1 - α) r_{T} [k] + α r_{T + 1} [k] \\ (13) & = \{\begin{array}{ll} 1 & for | k | \leq T \\ α & for | k | = T + 1 \\ 0 & otherwise \end{array} k \in Z . \end{array}

This modified signal r_T,α[k] is basically the pure rectangular signal r_T[k] of unit elements with a trailing element α appended at both ends. Due to the continuity of the variance, there must be a specific $\tilde{α}$ for which $n Var (u_{T, \tilde{α}}) = σ^{2}$ , whereas $u_{T, α} [k] = \frac{1}{2 (T + α) + 1} r_{T, α} [k]$ designates the probability distribution of r_T,α[k]. With

\begin{array}{l} Var (u_{T, α}) = \frac{1}{2 (T + α) + 1} \sum_{k = - T - 1}^{T + 1} r_{T, α} [k] (k Δ s)^{2} \\ = \frac{Δ s^{2}}{2 (T + α) + 1} (2 α (T + 1)^{2} + 2 \sum_{k = 1}^{T} k^{2}) \\ = \frac{Δ s^{2}}{2 (T + α) + 1} (2 α (T + 1)^{2} + \frac{1}{3} T (T + 1) (2 T + 1)), \end{array}

we conclude that the wanted $\tilde{α}$ is given by

\begin{matrix} (14) & \tilde{α} = \frac{(2 T + 1) (σ^{2} - \frac{1}{3} T (T + 1) n Δ s^{2})}{2 ((T + 1)^{2} n Δ s^{2} - σ^{2})} . \end{matrix}

The respective expression for $\tilde{α}$ derived in (Gwosdek et al., 2011) corresponds to a special case of Eq. (14) when setting Δs=1.

Remember now that the central limit theorem is valid for any PDF, which means that the mathematical framework presented in the previous chapters is also valid for the modified signal $r_{T, \tilde{α}} [k]$ . Therefore, we receive an optimized approximate Barnes interpolation by marginally adapting Algorithm B.2 to compute the convolution with $r_{T, \tilde{α}} [k]$ instead of r_T[k]. To do so, steps 2 and 5 of it have to be rewritten to $h [k] = (w + \tilde{α} \cdot (g [k + T + 1] + g [k - T - 1])) \cdot Δ s$ . Although the optimized approach requires 2L additions and L multiplications more than the original one, the adapted Algorithm B.2 remains in the complexity class 𝒪(L). Measurements (refer to Table 4 and Fig. 7) show that the optimized interpolation in fact needs only about 10 % to 30 % more time for the depicted range of convolution rounds than the original one.

Table 4Signal width parameter T and the effective Gaussian width σ_eff for the original convolution algorithm and the tail value $\tilde{α}$ for the optimized convolution algorithm as a function of the number of performed convolutions n, where N=3490, grid size is 2400×1200, with a resolution of 32 points per degree and $σ = 1.0^{\circ}$ . The root mean square error RMSE is computed for the sub-area displayed in Fig. 9 and with respect to the exact results of the naive algorithm. The execution times t_exec are measured in seconds.

Download Print Version | Download XLSX

https://gmd.copernicus.org/articles/16/1697/2023/gmd-16-1697-2023-f07

Figure 7Plot of execution times from Table 4 against number of performed convolutions for both the original and the optimized convolution algorithms.

Download

The unstable behavior of the original convolution algorithm with respect to the number of performed convolutions can be best observed in the RMSE plot of Fig. 8, where the baseline is given by the exact interpolation of the naive algorithm. A corresponding unsteady feature is also visible in the upper row of the maps shown in Fig. 9a, more precisely in the fluctuating diameter of the small high-pressure area west of the Balearic Islands.

https://gmd.copernicus.org/articles/16/1697/2023/gmd-16-1697-2023-f08

Figure 8Root mean square error with respect to the exact Barnes interpolation in dependency of the number of performed convolutions for both the original and the optimized convolution algorithms.

Download

In contrast to that, the optimal convolution algorithm shows a stable convergence towards the exact interpolation obtained by the naive algorithm, which manifests in strictly monotonic decreasing RMSE values. These results suggest that three or four performed convolution rounds already achieve a very good approximation of Barnes interpolation when used to visualize data, as done in this paper. For other applications, which require a higher precision, a 10-fold convolution or even more might make sense.

https://gmd.copernicus.org/articles/16/1697/2023/gmd-16-1697-2023-f09

Figure 9Results for a different number of performed convolutions n. Upper row (a) with results for the original convolution algorithm and lower row (b) with results for the optimized convolution algorithm.

5.5 Application on sphere geometry

So far, we applied the convolution algorithm on sample points contained in the plane ℝ² and used the Euclidean distance measure. For geospatial applications, this simplification is acceptable as long as the considered area is sufficiently small enough. If we deal with a dataset, which is distributed over a larger region – as it is actually the case in our test setup – it becomes necessary to take the curvature of the earth into account.

The adaptation of the naive Barnes interpolation algorithm to the spherical geometry on 𝒮² merely comprises the exchange of the Euclidean distance with the spherical distance. Since the calculation of the spherical distance between two points involves several trigonometric function calls, the price of such a switchover is accordingly high and consequently, the exact Barnes interpolation for a spherical geometry is roughly a factor of 2.5 times slower in our tests than that with the Euclidean approach. In other words, an already costly algorithm becomes even more expensive.

https://gmd.copernicus.org/articles/16/1697/2023/gmd-16-1697-2023-f10

Figure 10Exact Barnes interpolation with the naive algorithm for the same setup as for Fig. 3 but using spherical distances from 𝒮² instead of Euclidean distances from ℝ². The generated isolines are notably different in the northern part, while they are quite similar in the south.

However, the distance calculation does not occur explicitly in the convolution algorithm, since the latter by virtue of the central limit theorem is inherently tied to the Euclidean distance measure. Therefore, the convolution algorithm has to be transferred to the spherical geometry with a different method.

The chosen approach is to first map the sample points to a map projection, which distorts the region of interest as minimal as possible and then to apply the convolution algorithm directly in that map space. The resulting field is finally mapped back to the original map projection and provides an approximation of Barnes interpolation there with respect to the spherical geometry of 𝒮².

Projection types that are considered suitable for this purpose are conformal map projections. Conformal maps preserve angles and shapes locally, while distances and areas underlie a certain distortion. Conformal maps used often are (Snyder, 1987)

Mercator projection for regions of interest that stretch in east–west direction around the Equator,
transverse Mercator projection for regions with north–south orientation,
Lambert conformal conic projection for regions in the mid-latitudes that stretch in east–west direction or
polar stereographic projection for polar regions.

In order to replicate Fig. 10 with our fast optimized convolution algorithm, we therefore use a Lambert conformal conic map projection for our test setup with sample points in the mid-latitudes. We choose the two standard parallels that define the exact projection at latitudes of 42.5 and 65.5^∘ N, such that our region of interest is evenly captured by them. By nature of Lambert conformal conic maps, the chosen map scale is exactly adopted along these two latitudes, while it is too small between them and too large beyond them. Similarly, for a grid with a formal resolution of 32 grid points per degree that is embedded into this map, the specified resolution only applies exactly along the standard parallels, while the effective resolution between them is smaller and larger beyond them.

https://gmd.copernicus.org/articles/16/1697/2023/gmd-16-1697-2023-f11

Figure 11The effective scale of a Lambert conformal conic map in dependency of the latitude if the scale at 42.5 and 65.5^∘ N is set to 1.0. The minimum scale of 0.98 is reached at a latitude of 54.5^∘ N.

Download

We now employ the optimized convolution algorithm with a nominal Gaussian width parameter $σ = 1.0^{\circ}$ on the Lambert conformal conic map grid postulated above, in which we injected the given sample points beforehand. The resulting field shown in Fig. 12 thereby experiences a 2-fold approximation, the first one caused by the distortion of the map and the second one due to the approximation property of the convolution algorithm.

https://gmd.copernicus.org/articles/16/1697/2023/gmd-16-1697-2023-f12

Figure 12Optimized Barnes interpolation algorithm applied on the Lambert conformal map on a 2048×1408 grid with a resolution of 32 grid points per degree along the standard parallels at 42.5 and 65.5^∘ N, $σ = 1.0^{\circ}$ and a 4-fold convolution.

In a last step, the result field is mapped back to the target map, which uses a plate carrée projection in our case. In order to do this, the location of a target field element is looked up in the Lambert map source field and subsequently its element value is determined by bilinear interpolation of the four neighboring source field element values. This last operation performs an averaging and thus adds a further small error to the final result shown in Fig. 13. Similar to the case for the Euclidean distance, the comparison with the exact Barnes interpolation on 𝒮² in Fig. 10 yields a very good correspondence. This perception is also supported by measurement of the RMSE, which adds up for the same sub-area as investigated in Table 4 to 0.0467, which is negligibly larger than the corresponding 0.0367 measured for the Euclidean case.

https://gmd.copernicus.org/articles/16/1697/2023/gmd-16-1697-2023-f13

Figure 13Resulting field from Fig. 12 projected back to a plate carrée map.

Table 5Execution times (in s) for the naive and the optimized convolution algorithms using spherical distances on 𝒮² for varying grid sizes. The number of sample points N=3490 and the Gaussian width $σ = 1.0^{\circ}$ are kept constant. The convolution algorithm applied a 4-fold convolution and was executed on a Lambert map grid of the indicated size. The two split time columns show the separated execution times for the actual convolution and the subsequent back projection to the plate carrée map. For the investigated scenarios, the optimized convolution algorithm is more than 1000 times faster than the naive one.

Download Print Version | Download XLSX

5.6 Round-off error issues

Computing the convolution of a rectangular pulse in floating-point arithmetic using the moving window technique as described in Algorithm B.2 of Sect. 4 is extremely efficient, but it is also prone to imprecisions since round-off errors are steadily accumulated during the progress. Different approaches are known to reduce this error. The Kahan summation algorithm (Kahan, 1965), for instance, implements an error-compensation scheme at the expense of requiring essentially more basic operations than used for ordinary addition.

Another error-reduction technique that is effective in the context of Barnes interpolation is to center the numbers to be added around 0.0, where the mesh density of representable floating-point numbers is highest. For this purpose, an offset

\bar{f} = \frac{1}{2} (min_{1 \leq k \leq N} f_{k} + max_{1 \leq k \leq N} f_{k})

is initially subtracted from the sample values f_k, such that their range is exactly located around 0.0. The presented convolution algorithm is then applied to the shifted sample values. In a final step, the elements of the resulting field F[i,j] are shifted back to the original range by adding $\bar{f}$ to each of them. This modification basically needs $N + W \cdot H$ extra additions, such that the computational complexity of the convolution algorithm is not harmed and stays unchanged.

https://gmd.copernicus.org/articles/16/1697/2023/gmd-16-1697-2023-f14

Figure 14Constellation of QFF values over Iceland with generated isolines for $σ = 0.75^{\circ}$ . Note the 980.0 hPa observation value in the northeast of Iceland, which triggers the creation of the faulty isoline for the same value. By increasing the quantization degree of the resulting field values, the visual appearance is gradually improved.

Minimal numerical round-off errors can generate surprisingly prominent artifacts when areas of constant or near-constant data are rendered with an isoline visualization. In such cases, the value obtained by the convolution algorithm fluctuates seemingly randomly around the true constant value, and if it happens that one of the rendered isolines represents exactly this value, the visualized result may appear like a fractal curve as shown in Fig. 14. A counter measure against this effect is to apply a quantization scheme to the resulting values, which basically suppresses a suitable number of least significant bits and rounds the value off to the nearest eligible floating point number. After this operation, the obtained values in areas of constant data are actually constant and thus result in a more pleasant isoline visualization. In areas with varying data values, the quantization of the result data has no harmful impact.

6 Summary and outlook

We presented a new technique for computing very good approximations of Barnes interpolation, which offers speed-up factors for realistic scenarios that can reach into the hundreds or even into the thousands when applied on a spherical geometry. The underlying convolutional algorithm exhibits a computational complexity of $O (N + W \cdot H)$ , in which the number of sample points N is decoupled from the grid size W×H. This is a major improvement over the computational complexity of the naive algorithm of $O (N \cdot W \cdot H)$ . Our tests suggest that four to six iteration rounds of convolutions lead to approximations of Barnes interpolation that achieve highly satisfactory results for most applications.

The usage of the algorithm is not restricted to ℝ² or 𝒮² and it can easily be extended to higher dimensional spaces ℝⁿ. The algorithm allows us to incorporate quality measures that assign each sample point x_k a weight of certainty c_k, which specifies how much this point shall contribute to the overall result. To achieve this, the terms in the two sums in Eq. (1) are simply multiplied by the additional factor c_k, and likewise the same factors then also appear in Approximation 3.

Barnes interpolation is often used in the context of successive correction methods (Cressman, 1959; Bratseth, 1986) with or without a first guess from a background field. In this technique, the interpolation is not performed just once, but applied several times with decreasing Gaussian width parameters to the residual errors in order to minimize them successively. Needless to say that instead of exact Barnes interpolation, the convolutional algorithm can equally be used for the method of successive correction.

Since the presented solution for spherical geometries is only suitable for the treatment of local maps, we plan to generalize the approach to global maps in a next step. This could for instance be done by smoothly merging local Barnes interpolation approximation patches into a global approximation.

Furthermore, we also want to provide a statement about the quality of the calculated approximation depending on the number of performed convolutions by deriving a theoretical upper bound for the maximum possible error. It will also be of interest, in a similar way to Getreuer (2013), to consider other distributions for the approximation and investigate their behavior in terms of computational speed and accuracy.

Appendix A: n-fold self-convolution

For an integrable function $f (x) \in L^{1} (R) = {f : R \to R ∣ \int_{- \infty}^{\infty} ∣ f (t) ∣ d t < \infty}$ , the n-fold convolution with itself f^∗n(x) is recursively defined by

\begin{array}{l} f^{* (n + 1)} (x) = f * f^{* n} (x) \\ (A1) & = \int_{- \infty}^{\infty} f (t) \cdot f^{* n} (x - t) d t for n = 1, 2, \dots, \end{array}

with $f^{* 1} (x) = f (x)$ . The equivalent closed form representation

\begin{array}{l} f^{* (n + 1)} (x) \\ = \int_{- \infty}^{\infty} \dots \int_{- \infty}^{\infty} f (t_{1}) \dots f (t_{n}) f (x - t_{1} - \dots - t_{n}) d t_{1} \dots d t_{n} \end{array}

is in most cases only of formal interest, since in practice the effective calculation of the multiple integral will lead to the recursive definition (A1) again.

Analogously, in the case of an integrable function of two variables $f (x, y) \in L^{1} (R^{2}) = {f : R^{2} \to R ∣ \int_{- \infty}^{\infty} \int_{- \infty}^{\infty} ∣ f (s, t) ∣ d s d t < \infty}$ , the n-fold convolution with itself $f^{* n} (x, y)$ is recursively given by

\begin{array}{l} f^{* (n + 1)} (x, y) = f * f^{* n} (x, y) \\ (A2) & = \int_{- \infty}^{\infty} \int_{- \infty}^{\infty} f (s, t) \cdot f^{* n} (x - s, y - t) d s d t \end{array}

for $n = 1, 2, \dots$ and with $f^{* 1} (x, y) = f (x, y)$ .

Appendix B: Separable functions

A function of two variables $g (x, y) \in L^{1} (R^{2})$ is called separable if two functions of one variable g₁(x) and g₂(y) exist, both in L¹(ℝ), such that the following equality holds:

\begin{matrix} (B1) & g (x, y) = g_{1} (x) \cdot g_{2} (y) . \end{matrix}

The convolution of f(x,y) with a separable function can be decomposed into two unidirectional convolutions that only act along one of the coordinate axis. In order to make this clear, we define two left-associative operators $\overset{x}{*}$ and $\overset{y}{*}$ that map from $L^{1} (R^{2}) \times L^{1} (R) \to L^{1} (R^{2})$ by setting

\begin{array}{l} (B2) & f \overset{x}{*} g_{1} (x, y) = \int_{- \infty}^{\infty} f (s, y) \cdot g_{1} (x - s) d s, \\ (B3) & f \overset{y}{*} g_{2} (x, y) = \int_{- \infty}^{\infty} f (x, t) \cdot g_{2} (y - t) d t, \end{array}

where $\overset{x}{*}$ convolves along the x axis and $\overset{y}{*}$ along the y axis. With these definitions, we then find

\begin{array}{l} f * g (x, y) = \int_{- \infty}^{\infty} \int_{- \infty}^{\infty} f (s, t) \cdot g (x - s, y - t) d s d t \\ = \int_{- \infty}^{\infty} (\int_{- \infty}^{\infty} f (s, t) \cdot g_{1} (x - s) d s) g_{2} (y - t) d t \\ = \int_{- \infty}^{\infty} (f \overset{x}{*} g_{1} (x, t)) \cdot g_{2} (y - t) d t = f \overset{x}{*} g_{1} \overset{y}{*} g_{2} (x, y) . \end{array}

From the fact that we can change the order of integration, we finally infer the following:

\begin{array}{l} f * g (x, y) = f * (g_{1} (x) \cdot g_{2} (y)) \\ (B4) & = f \overset{x}{*} g_{1} \overset{y}{*} g_{2} (x, y) = f \overset{y}{*} g_{2} \overset{x}{*} g_{1} (x, y) . \end{array}

Hence, the two operands g₁ and g₂ commute, but note here that the unidirectional operators to their left have to be swapped with them as well. The n-fold convolution with a separable function thus decomposes into

\begin{array}{l} f * g^{* n} (x, y) = f * (g_{1} (x) \cdot g_{2} (y))^{* n} \\ = f \overset{x}{*} g_{1} \overset{y}{*} g_{2} \dots \overset{x}{*} g_{1} \overset{y}{*} g_{2} (x, y) . \end{array}

Due to the commutation law (B4), we can basically write the operands on the RHS of f in any order, but we prefer to group them as

\begin{array}{l} f * g^{* n} (x, y) = f \overset{x}{*} g_{1} \dots \overset{x}{*} g_{1} \overset{y}{*} g_{2} \dots \overset{y}{*} g_{2} (x, y) \\ (B5) & = f \overset{x}{*} {g_{1}}^{\overset{x}{*} n} \overset{y}{*} {g_{2}}^{\overset{y}{*} n} (x, y) . \end{array}

Because the last formula looks a bit cumbersome, we again use Eq. (B4) to join the two n-fold self-convoluted operands to a separable function, which then reads as

\begin{matrix} (B6) & f * g^{* n} (x, y) = f * ({g_{1}}^{\overset{x}{*} n} (x) \cdot {g_{2}}^{\overset{y}{*} n} (y)) . \end{matrix}

Throughout the paper, we use the concise representation of Eq. (B6) but keep in mind that it is equivalent to Eq. (B5), where each convolution operand is expressed separately, and thus indicates clearly how this expression is to be calculated.

Appendix C: Generalized approximate Barnes interpolation

For the derivation of Approximation 3, we used a separable two-dimensional PDF that consists of two uniform one-dimensional distributions. This result can be broadened, if more general marginal distributions are employed. For this purpose, let p₁(x) and p₂(x) be two one-dimensional PDFs with mean 0 and variance $\frac{σ^{2}}{n}$ . Now, we define the separable two-dimensional PDF $p (x, y) = p_{1} (x) \cdot p_{2} (y)$ , which has a mean vector 0 and a covariance matrix $\frac{σ^{2}}{n} I$ . Consequently, the PDF p(x,y) constructed in this way satisfies the assumptions of the central limit theorem (6) and thus, after performing the same conversion steps as taken in Sect. 3, follows the generalized

Approximation 4. For sufficiently large n, Barnes interpolation on the Euclidean plane ℝ² can be approximated by

\begin{matrix} (C1) & f (x, y) \approx \frac{(\sum_{k = 1}^{N} f_{k} \cdot δ_{x_{k}}) * ({p_{1}}^{\overset{x}{*} n} (x) \cdot {p_{2}}^{\overset{y}{*} n} (y))}{(\sum_{k = 1}^{N} δ_{x_{k}}) * ({p_{1}}^{\overset{x}{*} n} (x) \cdot {p_{2}}^{\overset{y}{*} n} (y))}, \end{matrix}

provided that the quotient is defined.

In practice, p₁(x) and p₂(x) will most often be chosen to be identical. Using the normal distribution φ_0,σ(x) with mean value 0 and variance σ², i.e.,

φ_{0, σ} (x) = \frac{1}{\sqrt{2 π} σ} e^{- \frac{x^{2}}{2 σ^{2}}},

it is clear from Eq. (8) that we can formulate Barnes interpolation based on a convolutional expression where even equality holds.

Theorem 1.

Let φ_0,σ(x) be the normal distribution with mean value 0 and variance σ². For Barnes interpolation on the Euclidean plane ℝ², the following then holds:

\begin{matrix} (C2) & f (x, y) = \frac{(\sum_{k = 1}^{N} f_{k} \cdot δ_{x_{k}}) * (φ_{0, σ} (x) \cdot φ_{0, σ} (y))}{(\sum_{k = 1}^{N} δ_{x_{k}}) * (φ_{0, σ} (x) \cdot φ_{0, σ} (y))} . \end{matrix}

Note that in the case of the normal distribution, it is sufficient to apply the convolution just once.

Code and data availability

The formal algorithms introduced in this paper are provided as Python implementation on GitHub https://github.com/MeteoSwiss/fast-barnes-py (last access: 11 March 2023) under the terms of the BSD 3-clause license and are archived on Zenodo https://doi.org/10.5281/zenodo.7651530 (Zürcher, 2023 a). The sample dataset and the scripts are also included, which allow us to reproduce the figures and tables presented in this paper. The interpolation code is available as well as Python package fast-barnes-py on https://PyPI.org/project/fast-barnes-py (Zürcher, 2023 b).

Competing interests

The author has declared that there are no competing interests.

Disclaimer

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Acknowledgements

The author would like to thank the topical editor Sylwester Arabas and the two anonymous reviewers for their valuable comments and helpful suggestions. All map backgrounds were made with Natural Earth, which provides free vector and raster map data at https://naturalearthdata.com (last access: 11 March 2023).

Review statement

This paper was edited by Sylwester Arabas and reviewed by two anonymous referees.

References

Barnes, S. L.: A Technique for Maximizing Details in Numerical Weather Map Analysis, J. Appl. Meteorol., 3, 396–409, https://doi.org/10.1175/1520-0450(1964)003<0396:ATFMDI>2.0.CO;2, 1964. a

Bentley, J. L.: Multidimensional Binary Search Trees Used for Associative Searching, Commun. ACM, 18, 509–517, https://doi.org/10.1145/361002.361007, 1975. a

Bratseth, A. M.: Statistical interpolation by means of successive corrections, Tellus A, 38, 439–447, https://doi.org/10.3402/tellusa.v38i5.11730, 1986. a

Cressman, G. P.: An Operational Objective Analysis System, Mon. Weather Rev., 87, 367–374, https://doi.org/10.1175/1520-0493(1959)087<0367:AOOAS>2.0.CO;2, 1959. a

Daley, R.: Atmospheric Data Analysis, Cambridge University Press, Cambridge, https://doi.org/10.1002/joc.3370120708, 1991. a

Finkel, R. A. and Bentley, J. L.: Quad Trees, a Data Structure for Retrieval on Composite Keys, Acta Inform., 4, 1–9, https://doi.org/10.1007/BF00288933, 1974. a

Getreuer, P.: A Survey of Gaussian Convolution Algorithms, Image Processing On Line, 3, 286–310, https://doi.org/10.5201/ipol.2013.87, 2013. a

Gwosdek, P., Grewenig, S., Bruhn, A., and Weickert, J.: Theoretical foundations of Gaussian convolution by extended box filtering, International Conference on Scale Space and Variational Methods in Computer Vision, 447–458, https://doi.org/10.1007/978-3-642-24785-9_38, 2011. a, b

Kahan, W.: Further remarks on reducing truncation errors, Commun. ACM, 8, 40, https://doi.org/10.1145/363707.363723, 1965. a

Klenke, A.: Probability Theory: a Comprehensive Course, 3rd Edn., Universitext, Springer, https://doi.org/10.1007/978-3-030-56402-5, 2020. a, b

Koppert, H. J., Pedersen, T. S., Zürcher, B., and Joe, P.: How to make an international meteorological workstation project successful, Twentieth International Conference on Interactive Information and Processing Systems for Meteorology, Oceanography, and Hydrology, 84th AMS Annual Meeting, Seattle, WA, 11.1, https://ams.confex.com/ams/84Annual/techprogram/paper_71789.htm (last access: 11 March 2023), 2004. a

Lam, S. K., Pitrou, A., and Seibert, S.: Numba: A LLVM-based Python JIT Compiler, LLVM '15: Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC, 1–6, https://doi.org/10.1145/2833157.2833162, 2015. a

Muirhead, R. J.: Aspects of Multivariate Statistical Theory, Wiley Series in Probability and Mathematical Statistics, John Wiley & Sons, Inc., https://doi.org/10.1002/9780470316559, 1982. a

Patterson, D. A. and Hennessy, J. L.: Computer Organization and Design: the Hardware/Software Interface, 5th Edn., Elsevier, https://dl.acm.org/doi/book/10.5555/2568134 (last access: 11 March 2023), 2014. a

Snyder, J. P.: Map Projections: A Working Manual, US Geological Survey Professional Paper 1395, US Government Printing Office, Washington, https://doi.org/10.3133/pp1395, 1987. a

Wells, W. M.: Efficient Synthesis of Gaussian Filters by Cascaded Uniform Filters, IEEE T. Pattern Anal., 8, 2, 234–239, https://doi.org/10.1109/TPAMI.1986.4767776, 1986. a

Zürcher, B.: MeteoSwiss/fast-barnes-py: fast-barnes-py v1.0.0 (v1.0.0), Zenodo [code], https://doi.org/10.5281/zenodo.7651530, 2023a. a

Zürcher B.: fast-barnes-py 1.0.0, PyPI [code], https://PyPI.org/project/fast-barnes-py, last access: 11 March 2023b. a