Articles | Volume 14, issue 8
Geosci. Model Dev., 14, 5205–5215, 2021

Development and technical paper 18 Aug 2021

Copula-based synthetic data augmentation for machine-learning emulators

David Meyer et al.

Short summary
A lack of data is often a major limitation when training machine-learning emulators. This paper presents a cheap way to increase the size of training datasets using statistical techniques, thereby improving the performance of machine-learning emulators.