Copula-based synthetic data augmentation for machine-learning emulators

Can we improve machine-learning (ML) emulators with synthetic data? If data are scarce or expensive to source and a physical model is available, statistically generated data may offer a cheap way to augment training sets. Here we explore the use of copula-based models for generating synthetically augmented datasets in weather and climate by testing the method on a toy physical model of downwelling longwave radiation and a corresponding neural network emulator. Results show that for copula-augmented datasets, predictions are improved by up to 62 % for the mean absolute error (from 1.17 to 0.44 W m−2).

When it comes to training ML models for weather and climate applications, two main strategies may be identified: one in which input and output pairs are directly provided (e.g. both come from observations) and a second in which inputs are provided but corresponding outputs are generated through a physical model (e.g. parameterization schemes or even a whole weather and climate model). Although the former may be considered the most common training strategy in use today, when the underlying physical processes are well understood (e.g. radiative transfer) and numerical codes are available, the latter may be of particular interest for developing one-to-one emulators (i.e. statistical surrogates of their physical counterparts), which can be used to improve computational performance for a trade-off in accuracy (e.g. Chevallier et al., 1998; Meyer et al., 2021; Ukkonen et al., 2020; Veerman et al., 2021). Here, for clarity, we focus only on the latter case and refer to such models as emulators.
In ML, the best way to make a model more generalizable is to train it on more data (Goodfellow et al., 2016). However, depending on the specific field and application, input data may be scarce, representative of only a subset of situations and domains, or, in the case of synthetically generated data, require large computational resources, bespoke infrastructures, and specific domain knowledge. For example, generating atmospheric profiles using a general circulation model (GCM) may require in-depth knowledge of the GCM and large computational resources (e.g. data used in Meyer et al., 2021).
A possible solution to these issues may be found by augmenting the available input dataset with more samples. Although this may be a straightforward task for classification problems (e.g. by translating or adding noise to an image), this may not be the case for parameterizations of physical processes used in weather and climate models. In this context, it is common to work with high-dimensional and strongly dependent data (e.g. between physical quantities such as air temperature, humidity, and pressure across grid points). Although this dependence may be well approximated by simple physical laws (e.g. the ideal gas law for conditions found in the Earth's atmosphere), the generation of representative data across multiple dimensions for most weather and climate applications is challenging (e.g. the nonlinear relationship between cloud properties, humidity, and temperature).
To serve a similar purpose as real data, synthetically generated data thus need to preserve the statistical properties of real data in terms of individual behaviour and (inter-)dependences. Several methods may be suitable for generating synthetic data, such as copulas (e.g. Patki et al., 2016), variational autoencoders (e.g. Wan et al., 2017), and, more recently, generative adversarial networks (GANs; e.g. Xu and Veeramachaneni, 2018). Although the use of GANs for data generation is becoming increasingly popular among the core ML community, these require multiple models to be trained, leading to difficulties and computational burden (Tagasovska et al., 2019). Variational approaches, on the other hand, make strong distributional assumptions that are potentially detrimental to generative models (Tagasovska et al., 2019). Compared to black-box deep-learning models, the training of vine copulas is relatively easy and robust, while taking away a lot of guesswork in specifying hyperparameters and network architecture. Furthermore, copula models give a direct representation of statistical distributions, making them easier to interpret and tweak after training. As such, copula-based models have been shown to be effective for generating synthetic data comparable to real data in the context of privacy protection (Patki et al., 2016).
The goal of this paper is to improve ML emulators by augmenting the physical model's inputs using copulas. We give a brief overview of methods in Sect. 2.1 with specific implementation details in Sect. 2.2-2.5. Results are shown in Sect. 3, with a focus on evaluating synthetically generated data in Sect. 3.1 and ML predictions in Sect. 3.2. We conclude with a discussion and prospects for future research in Sect. 4.

Overview
The general method for training an ML emulator on a set of N samples involves the use of paired inputs x = {x_1, . . ., x_N} and outputs y = {y_1, . . ., y_N} to find the best function approximation for a specific architecture and configuration. For inference, the trained ML emulator is then used to predict new outputs y* from inputs x*. Outputs y are generated through a physical model from x and fed to the ML emulator for training (Fig. 1a). In this paper we introduce an additional step: augmentation through copula-based synthetic data generation (Fig. 1b).

Figure 1. (a) Inputs x are fed to the physical model to generate corresponding outputs y; x and y are used to train the ML emulator. (b) A data generation model (here a copula) is fitted to inputs x to generate synthetic inputs x′; inputs x and x′ are fed to the physical model to generate corresponding outputs y and y′; both x, x′ and y, y′ are used to train the ML emulator. After training, the model (m; e.g. architecture and weights) is saved and used for inference on new data.

The method is demonstrated with a toy model of downwelling radiation as the physical model (Sect. 2.4) and a simple feed-forward neural network (FNN) as the ML emulator (Sect. 2.5). To evaluate the impact of copula-generated synthetic data on predictions we focus on predicting vertical profiles of longwave radiation from those of dry-bulb air temperature, atmospheric pressure, and cloud optical depth (other parameters affecting longwave radiative transfer, such as gas optical depth, are treated as constant in the simple model described in Sect. 2.4). This task is chosen as it allows us to (i) evaluate copula-based models for generating correlated multidimensional data (e.g. with dependence across several quantities and grid points), some of which (e.g. cloud optical depth) are highly non-Gaussian; (ii) develop a simple and fast toy physical model that may be representative of other physical parameterizations such as radiation, (urban) land surface, cloud, or convection schemes; and (iii) develop a fast and simple ML emulator used to compute representative statistics. Here we define case (a) as the baseline and generate six different subcases for case (b) using (i) three levels of data augmentation factors (i.e. either 1×, 5×, or 10× the number of profiles in the real dataset) and (ii) three different copula types. In the following sections we give background information and specific implementation details about the general method used for setting up the source data (Sect. 2.2), data generation (Sect. 2.3), target generation (Sect. 2.4), and estimator training (Sect. 2.5) as shown in Fig. 1b.

Table 1. Profiles of input and output quantities used in this study. Input quantities are dry-bulb air temperature T, atmospheric pressure p, and cloud layer optical depth τc. T and p are taken directly from the NWP-SAF dataset (Eresmaa and McNally, 2014), and τc is derived from other quantities as described in Sect. 2.4. The output quantity, downwelling longwave radiation L↓, is computed using the physical model described in Sect. 2.4. Atmospheric model levels are 137 for full levels (FLs) and 138 for half-levels (HLs).

Source data
Inputs are derived from the EUMETSAT Numerical Weather Prediction Satellite Application Facility (NWP-SAF; Eresmaa and McNally, 2014) dataset. This contains a representative collection of 25 000 atmospheric profiles previously used to evaluate the performance of radiation models (e.g. Hocking et al., 2021; Hogan and Matricardi, 2020). Profiles were derived from 137-vertical-level global operational short-range ECMWF forecasts, correlated in more than one dimension (between quantities and spatially across levels) and extending from the top of the atmosphere (TOA; 0.01 hPa; level 1) to the surface (bottom of the atmosphere; BOA; level 137). Inputs consist of profiles of dry-bulb air temperature (T in K; Fig. 2a), atmospheric pressure (p in Pa; Fig. 2b), and cloud layer optical depth (τc; Fig. 2c). τc is derived from other quantities to simplify the development of models as described in Sect. 2.4. Dry-bulb air temperature, atmospheric pressure, and cloud layer optical depth are then used as inputs to the physical model (Sect. 2.4) to compute outputs containing profiles of downwelling longwave radiation (L↓ in W m−2; Fig. 2d). As both the copula models and the ML emulator operate on two-dimensional data, data are reshaped to input X and output Y matrices, with each profile as a row (sample) and each flattened level and quantity as a column (feature), and reconstructed to their original shape where required. Prior to being used, source data are shuffled at random and split into three batches of 10 000 profiles (40 %) for training (X train, Y train), 5000 (20 %) for validation (X val, Y val), and 10 000 (40 %) for testing (X test, Y test).

Data generation
Data generation is used to generate additional input samples (here atmospheric profiles) to be fed to the physical model (Sect. 2.4) and ML (Sect. 2.5) emulator. Optimally, synthetically generated data should resemble the observed data as closely as possible with respect to (i) the individual behaviour of variables (e.g. the dry-bulb air temperature at a specific level) and (ii) the dependence across variables and dimensions (e.g. the dry-bulb air temperature across two levels). Copulas are statistical models that allow these two aims to be disentangled (Trivedi and Zimmer, 2006;Joe, 2014) and to generate new samples that are statistically similar to the original data in terms of their individual behaviour and dependence.

Background on copula models
Suppose we want to generate synthetic data from a probabilistic model for n variables Z_1, . . ., Z_n. To achieve the first aim, we need to find appropriate marginal cumulative distributions F_1, . . ., F_n. A simple approach is to approximate them by the corresponding empirical distribution functions. To achieve the second aim, however, we need to build a model for the joint distribution function F(z_1, . . ., z_n). The key result, Sklar's theorem (Sklar, 1959), states that any joint distribution function can be written as

F(z_1, . . ., z_n) = C(F_1(z_1), . . ., F_n(z_n)). (1)

The function C is called a copula and encodes the dependence between variables. Copulas are distribution functions themselves. More precisely, if all variables are continuous, C is the joint distribution of the variables U_1 = F_1(Z_1), . . ., U_n = F_n(Z_n). This fact facilitates estimation and simulation from the model. To estimate the copula function C, we (i) estimate the marginal distributions F_1, . . ., F_n, (ii) construct pseudo-observations U_1 = F_1(Z_1), . . ., U_n = F_n(Z_n), and (iii) estimate C from the pseudo-observations. Then, given estimated models C and F_1, . . ., F_n for the copula and marginal distributions, we can generate synthetic data as follows: simulate (U_1, . . ., U_n) from C and set Z_k = F_k^{-1}(U_k) for k = 1, . . ., n.
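The three estimation steps and the simulation step can be sketched for the simplest case, a Gaussian copula with empirical marginals. This is a minimal illustration, not the Synthia/pyvinecopulib implementation used later in the paper, and all function names are ours:

```python
import numpy as np
from statistics import NormalDist

def fit_gaussian_copula(X):
    """Estimate a Gaussian copula with empirical marginals from X (n_samples, n_vars)."""
    n = X.shape[0]
    # (i)-(ii) pseudo-observations U_k = rank / (n + 1), a simple empirical-CDF estimate
    U = (X.argsort(axis=0).argsort(axis=0) + 1) / (n + 1)
    # (iii) estimate C: correlation matrix of the normal scores of the pseudo-observations
    Z = np.vectorize(NormalDist().inv_cdf)(U)
    return np.corrcoef(Z, rowvar=False)

def sample_gaussian_copula(X, R, n_new, rng):
    """Simulate U from the fitted copula, then invert the empirical marginals."""
    L = np.linalg.cholesky(R)
    Z = rng.standard_normal((n_new, R.shape[0])) @ L.T   # correlated normals
    U = np.vectorize(NormalDist().cdf)(Z)                # map to copula scale
    # Z_k = F_k^{-1}(U_k), using empirical quantiles of the original data
    return np.column_stack([np.quantile(X[:, j], U[:, j]) for j in range(X.shape[1])])
```

Synthetic samples produced this way inherit both the marginal behaviour (via the empirical quantile inversion) and the Gaussian-like dependence (via the correlation matrix of the normal scores).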

Parametric copula families
In practice, it is common to only consider sub-families of copulas that are conveniently parameterized. There is a variety of such parametric copula families. Such families can be derived from existing models for multivariate distributions by inverting the equation of Sklar's theorem:

C(u_1, . . ., u_n) = F(F_1^{-1}(u_1), . . ., F_n^{-1}(u_n)). (2)

For example, we can take F as the joint distribution function of a multivariate Gaussian and F_1, . . ., F_n as the corresponding marginal distributions. Then Eq. (2) yields a model for the copula called the Gaussian copula, which is parameterized by a correlation matrix. The Gaussian copula model includes all possible dependence structures of a multivariate Gaussian distribution. The benefit comes from the fact that we can combine a given copula with any type of marginal distribution, not just the ones the copula was derived from. That way, we can build flexible models with arbitrary marginal distributions and Gaussian-like dependence. The same principle applies to other multivariate distributions, and many copula models have been derived, most prominently the Student's t copula and the Archimedean families. A comprehensive list can be found in Joe (2014).

Vine copula models
When there are more than two variables (n > 2) the type of dependence structure these models can generate is rather limited. Gaussian and Student copulas only allow for symmetric dependencies between variables. Quite often, dependence is asymmetric, however. For example, dependence between Z 1 and Z 2 may be stronger when both variables take large values. Many Archimedean families allow for such asymmetries but require all pairs of variables to have the same type and strength of dependence. Vine copula models (Aas et al., 2009;Czado, 2019) are a popular solution to this issue. The idea is to build a large dependence model from only two-dimensional building blocks. We can explain this with a simple example with just three variables: Z 1 , Z 2 , and Z 3 . We can model the dependence between Z 1 and Z 2 by a two-dimensional copula C 1,2 and the dependence between Z 2 and Z 3 by another, possibly different, copula C 2,3 . These two copulas already contain some information about the dependence between Z 1 and Z 3 , the part of the dependence that is induced by Z 2 . The missing piece is the dependence between Z 1 and Z 3 after the effect of Z 2 has been removed. Mathematically, this is the conditional dependence between Z 1 and Z 3 given Z 2 and can be modelled by yet another two-dimensional copula C 1,3|2 . The principle is easily extended to an arbitrary number of variables Z 1 , . . ., Z n . Algorithms for simulation and selection of the right conditioning order and parametric families for each (conditional) pair are given in Dißman et al. (2013).
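For the three-variable example, the decomposition can be written out explicitly (a standard identity from the vine copula literature, stated here in density form for completeness):

```latex
c(u_1, u_2, u_3) = c_{1,2}(u_1, u_2)\, c_{2,3}(u_2, u_3)\,
  c_{1,3|2}\big(F_{1|2}(u_1 \mid u_2),\, F_{3|2}(u_3 \mid u_2)\big),
```

where the c terms are copula densities and F_{1|2}, F_{3|2} are conditional distribution functions. Every factor on the right-hand side is a two-dimensional object, which is what makes the construction tractable for arbitrary n.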
Because all two-dimensional copulas can be specified independently, such models are extremely flexible and allow for highly heterogeneous dependence structures. Using parametric models for pairwise dependencies remains a limiting factor, however. If necessary, it is also possible to use nonparametric models for the two-dimensional building blocks. In that case, the joint distribution of pseudo-observations (U_1, U_2) is estimated by a suitable kernel density estimator (see Nagler et al., 2017).

Implementation
Here we use Synthia (Meyer and Nagler, 2021) version 0.3.0 (Meyer and Nagler, 2020) with pyvinecopulib 0.5.5 (Nagler and Vatter, 2020) to fit three different copula types: Gaussian, vine-parametric, and vine-nonparametric. The vine-parametric type fits a parametric model for each pair in the model, chosen from the catalogue of Gaussian, Student, Clayton, Gumbel, Frank, Joe, BB1, BB6, BB7, and BB8 copula families and their rotations (see Joe, 2014, for details on these families) using the Akaike information criterion (AIC). The vine-nonparametric type uses transformation local quadratic likelihood fitting as explained in Nagler et al. (2017). Copulas are fitted to X train to generate synthetic training sets X′ train using three augmentation factors (i.e. each containing either 1×, 5×, or 10× the number of profiles in X train). X train plus X′ train form augmented training sets containing 20 000 profiles (i.e. double the amount of training data) for the 1× augmentation factor and 60 000 and 110 000 profiles for the 5× and 10× augmentation factors, respectively. As the generation of new profiles with copula models is random, the generation is also repeated 10 times for each case to allow meaningful statistics to be computed.

Target generation
Target generation is used to generate outputs from corresponding inputs using a physical model. Here, outputs are computed using a simple toy model based on Schwarzschild's equation (e.g. Petty, 2006) to estimate the downwelling longwave radiation under the assumption that atmospheric absorption does not vary with wavelength:

dL↓(z)/dz = a(z) [L↓(z) − B(z)],

where z is the geometric height, B is the Planck function at level z (i.e. B = σ_SB T^4, where σ_SB is the Stefan-Boltzmann constant, giving the flux in W m−2 emitted from a horizontal black-body surface), and a is the rate at which radiation is intercepted and/or emitted. A common approximation is to treat longwave radiation travelling at all angles as if it were all travelling with a zenith angle of 53° (Elsasser, 1942): in this case a = Dβe, where βe is the extinction coefficient of the medium and D = 1/cos(53°) ≈ 1.66 is the diffusivity factor, which accounts for the fact that the effective path length of radiation passing through a layer of thickness Δz is on average 1.66 Δz due to the multiple different angles of propagation. In the context of ML, a(z) and B(z) are known and L↓(z) is to be predicted. Here we use the difference in two atmospheric pressures expressed in sigma coordinates (Δσ, where σ is the pressure p at a particular height divided by the surface pressure p0) instead of Δz. The layer optical depth Δτ = βe Δz is calculated from the total-column gas optical depth τg and the cloud layer optical depth τc as Δτ_i = τ_c,i + τ_g Δσ_i, since Δσ_i is the fraction of the mass of the full atmospheric column in layer i. Then, as the downwelling flux at the top of the atmosphere is 0, the equation is discretized as follows, assuming B and a are constant within a layer:

L↓_{i−1/2} = (1 − ε_i) L↓_{i+1/2} + ε_i B_i,

where B_i is the Planck function of layer i, ε_i = 1 − e^{−a_i Δz} = 1 − e^{−D Δτ_i} is the emissivity of layer i, L↓_{i+1/2} is the downwelling flux at the top of layer i, and L↓_{i−1/2} is the downwelling flux at the bottom of layer i.
We compute L↓ from T, p, and τc using the real NWP-SAF (X train) or augmented (X train plus X′ train) data. To reduce, and thus simplify, the number of quantities used in the physical model and ML emulator (Sect. 2.5), τc is pre-computed and used instead of vertical profiles of liquid and ice mixing ratios (q_l and q_i) and effective radii (r_l and r_i) as

τ_c = (3/2) (Δp/g) [q_l/(ρ_l r_l) + q_i/(ρ_i r_i)],

where ρl is the density of liquid water (1000 kg m−3), ρi is the density of ice (917 kg m−3), and g is the standard gravitational acceleration (9.81 m s−2). For τg we use a constant value of 1.7, determined by minimizing the absolute error between profiles computed with this simple model and the comprehensive atmospheric radiation scheme ecRad (Hogan and Bozzo, 2018).
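The layer-by-layer recurrence described above can be sketched as follows. This is an illustrative reimplementation of the toy model, not the archived code, and the function and variable names are ours:

```python
import numpy as np

SIGMA_SB = 5.670374419e-8  # Stefan-Boltzmann constant (W m-2 K-4)
D = 1.66                   # diffusivity factor, 1 / cos(53 deg)

def downwelling_lw(T, dsigma, tau_c, tau_g=1.7):
    """March the downwelling longwave flux from the TOA (L = 0) to the surface.

    T      : layer temperatures (K), ordered TOA -> surface
    dsigma : layer thicknesses in sigma coordinates (summing to 1)
    tau_c  : cloud optical depth per layer
    tau_g  : total-column gas optical depth (constant in the toy model)
    Returns half-level fluxes (W m-2), length len(T) + 1.
    """
    dtau = tau_c + tau_g * dsigma   # layer optical depth
    eps = 1.0 - np.exp(-D * dtau)   # layer emissivity
    B = SIGMA_SB * T**4             # black-body (Planck) flux per layer
    flux = np.empty(T.size + 1)
    flux[0] = 0.0                   # no downwelling flux at the TOA
    for i in range(T.size):
        # L(bottom of layer i) = (1 - eps_i) * L(top of layer i) + eps_i * B_i
        flux[i + 1] = (1.0 - eps[i]) * flux[i] + eps[i] * B[i]
    return flux
```

A useful sanity check: for an isothermal, optically thick atmosphere the surface flux approaches σ_SB T^4, the black-body limit.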

Estimator training
As the goal of this paper is to determine whether the use of synthetic data improves the predictions of ML emulators, here we implement a simple feed-forward neural network (FNN). FNNs are among the simplest and most common neural networks used in ML (Goodfellow et al., 2016) and have previously been used in similar weather and climate applications (e.g. Chevallier et al., 1998; Krasnopolsky et al., 2002). FNNs are composed of artificial neurons (conceptually derived from biological neurons) connected with each other; information moves forward from the input nodes, through the hidden nodes, to the output nodes. The multilayer perceptron (MLP) is a type of FNN composed of at least three layers of nodes: an input layer, a hidden layer, and an output layer, with all but the input nodes using a nonlinear activation function.
Here we implement a simple MLP consisting of three hidden layers with 512 neurons each. It is implemented in TensorFlow (Abadi et al., 2015) and configured with the Exponential Linear Unit activation function, the Adam optimizer, the Huber loss, a 1000-epoch limit, and early stopping with a patience of 25 epochs. The MLP is trained with profiles of dry-bulb air temperature (Fig. 2a), atmospheric pressure (Fig. 2b), and cloud layer optical depth (Fig. 2c) as inputs and profiles of downwelling longwave radiation (Fig. 2d) as outputs. Inputs are normalized, and both inputs and outputs are flattened into two-dimensional matrices as described in Sect. 2.2. The baseline case (Fig. 1a) uses 10 000 input profiles without data augmentation for training, and copula-based cases (Fig. 1b) use either 20 000, 60 000, or 110 000 profiles. The validation dataset (X val, Y val) of 5000 profiles is used for the early stopping mechanism, while the test dataset (X test, Y test) of 10 000 profiles is used to compute statistics (Sect. 3.2). Because of the stochastic nature of MLPs, training (and inference) is repeated 10 times for each case to allow meaningful statistics to be computed. Given that the generation of random profiles in the case of augmented datasets is also repeated 10 times (see Sect. 2.3.4), any case using data generation includes 100 iterations in total (i.e. for each data generation run, the estimator is trained 10 times).
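For reference, the Huber loss used here behaves quadratically for small residuals and linearly for large ones, making it less sensitive to outliers than the squared error. A plain NumPy sketch (TensorFlow provides this loss as tf.keras.losses.Huber; this standalone version is ours):

```python
import numpy as np

def huber(y_true, y_pred, delta=1.0):
    """Mean Huber loss: 0.5 r^2 for |r| <= delta, delta * (|r| - 0.5 delta) otherwise."""
    r = np.abs(np.asarray(y_true) - np.asarray(y_pred))
    quad = np.minimum(r, delta)   # quadratic part, capped at delta
    lin = r - quad                # remaining linear part
    return np.mean(0.5 * quad**2 + delta * lin)
```

For residuals well below delta the loss matches the familiar 0.5 r^2; beyond delta it grows only linearly, which damps the influence of occasional large prediction errors.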

Copula
The quality of the synthetic data is assessed in terms of summary statistics (e.g. Seitola et al., 2014) computed on the training X train and copula-simulated X′ train datasets. For each copula type we compute a vector of summary statistics s_i = f(p_i), where f is the statistic function and p_i = D w_i, with D a matrix of flattened source or simulated data and w_i a vector of random numbers for the ith iteration. Summary statistics are computed for the mean, variance, and quantiles, iterating 100 times to allow meaningful statistics to be computed. As we consider random linear combinations of variables in the source and copula-generated data, we expect these summaries to coincide only if both the marginal distributions and the dependence between variables are captured. Figure 3 shows scatterplots of the summary statistics s_i for (a) the mean, (b) the variance, (c) the standard deviation, and the (d) 10 %, (e) 50 %, and (f) 90 % quantiles. Real NWP-SAF data are shown on the x axis and copula-generated data on the y axis, with each point corresponding to a random projection as described earlier (100 points in total). For a perfect copula model, we expect all points to fall on the main diagonal, where x = y. Figure 3 shows that, for all copula models, the synthetically generated data are close to the real data, with larger errors in the variance and standard deviation.
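The random-projection check can be sketched as follows (an illustrative reimplementation with names of our choosing; note that the same projection vector w_i must be applied to both datasets in each iteration for the paired statistics to be comparable):

```python
import numpy as np

def projection_summaries(D_real, D_sim, stat=np.mean, n_iter=100, seed=0):
    """Paired summary statistics s_i = stat(D @ w_i) for real and simulated data."""
    rng = np.random.default_rng(seed)
    s_real = np.empty(n_iter)
    s_sim = np.empty(n_iter)
    for i in range(n_iter):
        w = rng.standard_normal(D_real.shape[1])  # shared random projection w_i
        s_real[i] = stat(D_real @ w)
        s_sim[i] = stat(D_sim @ w)
    return s_real, s_sim  # scatter s_real (x) against s_sim (y); a perfect model gives x = y
```

Swapping `stat` for `np.var`, `np.std`, or a quantile function reproduces the other panels of the comparison.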
Qualitatively, we can evaluate copula-generated profiles in terms of their overall shape and smoothness across multiple levels, as well as their range and density at each level. To this end we plot a side-by-side comparison of source (Fig. 4, left panel) and Gaussian-copula-generated (Fig. 4, right panel) profiles showing the median profile and a random selection of 90 profiles grouped in batches of 3 (i.e. each having 30 profiles) for the central 0 %-25 %, outer 25 %-50 %, and 50 %-100 % quantiles calculated with band depth statistics (López-Pintado and Romo, 2009). Simulated profiles of dry-bulb air temperature (Fig. 4b) appear less smooth across levels than the real ones (Fig. 4a); however, both density and range are simulated well at each level. Profiles of atmospheric pressure (Fig. 4d) are simulated well: they are smooth across all levels with a range and density similar to the real ones (Fig. 4c). The highly non-Gaussian and spiky profiles of cloud optical depth (Fig. 4e) make qualitative comparisons difficult; however, simulated profiles (Fig. 4f) have a similar range and density, with high density for low values and most of the range between levels 80 and 120.

Machine learning
To evaluate whether ML emulators trained on augmented datasets have lower prediction errors compared to the baseline, here we use the test dataset X test of 10 000 profiles defined in Sect. 2.2. Statistics are computed from a vector of differences d between the physically predicted Y test and the ML-predicted outputs, from which the mean bias (MB) and mean absolute error (MAE) are derived. Box plots of MB and MAE are shown in Fig. 5. Summary MB and MAE for the ML emulator with the lowest MAE using an augmentation factor of 10× are reported in Table 2. A qualitative side-by-side comparison of baseline and ML-predicted profiles using Gaussian-copula-generated profiles with an augmentation factor of 10× is shown in Fig. 6. MBs (Fig. 5a) across all copula types and augmentation factors are generally improved, with median MBs and their respective spreads decreasing with larger augmentation factors. Overall, the Gaussian copula model performs better than the vine-parametric and vine-nonparametric models. MAEs (Fig. 5b) show a net improvement from the baseline across all copula models and augmentation factors. When using an augmentation factor of 1× (i.e. with double the amount of training data), the median MAE is reduced to approximately 1.1 W m−2 from a baseline of approximately 1.4 W m−2 and is further reduced with increasing augmentation factors. In the best case, corresponding to an augmentation factor of 10× (i.e. with an additional 100 000 synthetic profiles), the copula and ML emulator combinations with the lowest MAE (Table 2) show that MBs are reduced from a baseline of 0.08 W m−2 to −0.02 and −0.05 W m−2 for Gaussian and vine-nonparametric, respectively, but increased to 0.10 W m−2 for vine-parametric. MAEs are reduced from a baseline of 1.17 W m−2 to 0.45, 0.56, and 0.44 W m−2 for the Gaussian, vine-parametric, and vine-nonparametric copula types, respectively.
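The two scores can be computed directly from the difference vector d (a minimal sketch; the sign convention, ML-predicted minus physically predicted, is our assumption):

```python
import numpy as np

def mb_mae(y_physical, y_ml):
    """Mean bias and mean absolute error of ML predictions against the physical model."""
    d = np.asarray(y_ml) - np.asarray(y_physical)  # vector of differences (assumed sign)
    return d.mean(), np.abs(d).mean()              # MB, MAE
```

The MB reveals systematic over- or under-prediction (errors of opposite sign cancel), while the MAE measures the typical magnitude of the error regardless of sign.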
The ML training configuration with the lowest overall MB and MAE combination during inference corresponds to a Gaussian copula and an augmentation factor of 10× (Table 2). Errors between the physically predicted Y test and the ML-predicted outputs are shown for the baseline (Fig. 6a) and the Gaussian copula case (Fig. 6b).

Discussion and conclusion
Results from a qualitative comparison of synthetically generated profiles (Fig. 4) show that synthetic profiles tend to be less smooth and noisier than the real NWP-SAF ones. Nevertheless, the machine-learning evaluation shows that errors for emulators trained with augmented datasets are cut by up to 75 % for the mean bias (from 0.08 to −0.02 W m−2; Table 2) and by up to 62 % for the mean absolute error (from 1.17 to 0.44 W m−2; Table 2).

Table 2. Mean bias (MB) and mean absolute error (MAE) for the baseline and copula cases. Statistics are shown for the ML emulator combination with the lowest MAE. The baseline was trained using 10 000 real NWP-SAF profiles. Copula cases were trained using 110 000 profiles (10 000 real and 100 000 synthetic), i.e. with an augmentation factor of 10×. Bold indicates the lowest error.
In this study, we show how copula-based models may be used to improve the prediction of ML emulators by generating augmented datasets containing statistically similar profiles in terms of their individual behaviour and dependence across variables (e.g. dry-bulb air temperature at a specific level and across several levels). Although the focus of this paper is to evaluate copula-based data generation models to improve predictions of ML emulators, we speculate that the same or similar methods of data generation have the potential to be used in several other ML-related applications, such as to (i) test architectures (e.g. instead of cross-validation, one may generate synthetic datasets of different size to test the effect of sample size on different ML architectures), (ii) generate data for un-encountered conditions (e.g. for climate change scenarios by extending data ranges or relaxing marginal distributions), and (iii) compress data (e.g. by storing reduced parameterized versions of the data if the number of samples is much larger than the number of features).
Although so far we have only highlighted the main benefits of copula-based models, several limiting factors should also be considered. A key factor for very high-dimensional data is that both Gaussian and vine copula models scale quadratically in the number of features, in terms of both memory and computational complexity. This can be alleviated by imposing structural constraints on the model, for example by using a structured covariance matrix or truncating the vine after a fixed number of trees. However, this limits their flexibility and adds some arbitrariness to the modelling process. A second drawback compared to GANs is that the model architecture cannot be tailored to a specific type of problem, such as images. For such cases, a preliminary data compression step as in Tagasovska et al. (2019) may be necessary.
As highlighted here, data augmentation for ML emulators may be of particular interest to scientists and practitioners looking to achieve a better generalization of their ML emulators (i.e. synthetic data may act as a regularizer to reduce overfitting; Shorten and Khoshgoftaar, 2019). Although a comprehensive analysis of prediction errors using different ML architectures is out of scope, our work is a first step towards further research in this area. Moreover, although we did not explore the generation of data for un-encountered conditions (e.g. by extending the range of air temperature profiles while keeping a meaningful dependency across other quantities and levels), the use of copula-based synthetic data generation may prove useful to make emulators more resistant to outliers (e.g. in climate change scenario settings) and should be investigated in future research.
Code and data availability. Software, data, and tools are archived with a Singularity (Kurtzer et al., 2017) image and deposited on Zenodo as described in the scientific reproducibility section of Meyer et al. (2020). Users may download the archive (Meyer, 2021) and optionally re-run experiments.
Author contributions. DM was responsible for conceptualization, data curation, investigation, software, resources, visualization, and validation. Formal analysis was conducted by DM and TN. DM, TN, and RH developed the methodology. DM and TN were responsible for original draft preparation, and DM, TN, and RH conducted review and editing of the paper.
Competing interests. The authors declare that they have no conflict of interest.
Disclaimer. Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.