We introduce ClimateMachine, a new open-source atmosphere modeling framework written in the Julia language and designed to be scalable on central processing units (CPUs) and graphics processing units (GPUs). ClimateMachine uses a common framework both for coarser-resolution global simulations and for high-resolution, limited-area large-eddy simulations (LESs). Here, we demonstrate the LES configuration of the atmosphere model in canonical benchmark cases and atmospheric flows using a total energy-conserving nodal discontinuous Galerkin (DG) discretization of the governing equations. Resolution dependence, conservation characteristics, and scaling metrics are examined in comparison with existing LES codes. These comparisons demonstrate the utility of ClimateMachine as a modeling tool for limited-area LES flow configurations.

Hybrid computer architectures and the need to exploit the power of graphics processing units (GPUs) are increasingly driving developments in atmosphere and climate modeling (e.g.,

In this paper, we introduce ClimateMachine, a new open-source atmosphere model written in the Julia programming language

Since the pioneering work on turbulence in stratified flows by

One distinguishing aspect of the ClimateMachine LES is that it uses a nodal discontinuous Galerkin (DG) formulation to approximate the Navier–Stokes equations for compressible flow

In what follows, we describe the conceptual and numerical foundations and governing equations of ClimateMachine and demonstrate the model in a set of standard two- and three-dimensional benchmark simulations. Section

The working fluid of the atmosphere model is moist, potentially cloudy air, considered to be an ideal mixture of dry air, water vapor, and condensed water (liquid and ice) in clouds. Dry air and water vapor are taken to be ideal gases. The specific volume of the cloud condensate is neglected relative to that of the gas phases (it is a factor of

The density of the moist air is denoted by

Moist air mass satisfies the conservation equation:

Total water satisfies the balance equation:

The coordinate-independent form of the conservation law for momentum is

The tensor involving the diffusive flux

The specification of a thermodynamic or energy conservation equation closes the equations of motion for the working fluid. We use the total specific energy,

Total energy satisfies the conservation law

Thermodynamic constants in CLIMAParameters.

Furthermore, the flux

The terms involving

Pressure
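As an illustration of the assumptions stated above (dry air and water vapor treated as ideal gases, condensate volume neglected), the pressure of the moist mixture can be sketched as an ideal-gas law with a composition-dependent gas constant. The function names and the numerical constants below are assumptions for illustration only; the model itself takes its thermodynamic constants from CLIMAParameters.

```python
# Approximate constants, assumed here for illustration; the model reads
# its own values from CLIMAParameters.
R_D = 287.0  # specific gas constant of dry air, J kg^-1 K^-1
R_V = 461.5  # specific gas constant of water vapor, J kg^-1 K^-1


def gas_constant_moist(q_tot, q_liq=0.0, q_ice=0.0):
    """Gas constant of moist air.

    Condensate (liquid and ice) adds mass to the mixture but, with its
    specific volume neglected, exerts no pressure.
    """
    q_vap = q_tot - q_liq - q_ice           # vapor specific humidity
    return R_D * (1.0 - q_tot) + R_V * q_vap


def air_pressure(rho, temperature, q_tot, q_liq=0.0, q_ice=0.0):
    """Ideal-gas pressure p = rho * R_m * T of the moist mixture."""
    return rho * gas_constant_moist(q_tot, q_liq, q_ice) * temperature


# Hypothetical state: same density and temperature, different moisture.
p_dry = air_pressure(1.2, 290.0, q_tot=0.0)
p_moist = air_pressure(1.2, 290.0, q_tot=0.015)
p_cloudy = air_pressure(1.2, 290.0, q_tot=0.015, q_liq=0.015)
```

At fixed density and temperature, adding vapor raises the pressure because vapor has the larger gas constant, whereas water held entirely as condensate lowers it slightly.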

Gibbs' phase rule states that in thermodynamic equilibrium, the temperature

Obtaining the temperature and condensate specific humidities from the state variables

This procedure allows the use of total moisture

The governing equations are discretized in space via a nodal DG approximation. To describe the DG procedure, we recast the Eqs. (

The DG solution of Eq. (

From now on, the subscript or superscript e is omitted with the understanding that all operations are executed element-wise unless otherwise stated. Furthermore, the physical elements in the

The operators defined on the reference elements are mapped onto the physical space by means of the transformation

The DG approximation of the differential Eq. (

Because the second-order derivatives in

For algorithmic efficiency, inexact quadrature is used to calculate the integrals above. By virtue of inexact integration and of Eqs. (
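The effect of inexact quadrature can be illustrated with a short sketch, assuming Legendre–Gauss–Lobatto (LGL) points are used both as interpolation and as quadrature nodes, which is the usual choice in nodal DG methods. The helper name `lgl_nodes_weights` is hypothetical. With polynomial order N, the N + 1 LGL points integrate polynomials exactly only up to degree 2N - 1, while the product of two Lagrange basis functions has degree 2N; the mass-matrix integrals are therefore inexact, but the resulting mass matrix is diagonal.

```python
import numpy as np
from numpy.polynomial import legendre


def lgl_nodes_weights(order):
    """Legendre-Gauss-Lobatto nodes and weights on [-1, 1]."""
    p_n = legendre.Legendre.basis(order)
    # Interior nodes are the zeros of the derivative of P_N.
    interior = np.sort(np.real(p_n.deriv().roots()))
    nodes = np.concatenate(([-1.0], interior, [1.0]))
    weights = 2.0 / (order * (order + 1) * p_n(nodes) ** 2)
    return nodes, weights


x, w = lgl_nodes_weights(4)          # fourth-order polynomials
exact_deg6 = np.dot(w, x**6)         # degree 6 <= 2N - 1: exact (2/7)
inexact_deg8 = np.dot(w, x**8)       # degree 8 = 2N: not exact (2/9)

# Lagrange polynomials satisfy l_i(x_k) = delta_ik at the nodes, so the
# quadrature approximation of the mass matrix reduces to diag(w).
basis_at_nodes = np.eye(x.size)
mass_matrix = basis_at_nodes.T @ np.diag(w) @ basis_at_nodes
```

The diagonal mass matrix is what makes the element-local inversion trivial; the price is the quadrature error at degree 2N.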

In order to achieve good parallel scaling it is necessary to overlap communication and computation to the fullest extent possible. With DG (and all element-based Galerkin methods) this can be naturally achieved by splitting Eq. (

ClimateMachine provides a suite of time integrators consisting of explicit Runge–Kutta methods as well as low-storage

The benchmarks presented in this paper with isotropic grid spacing are run using the fourth-order 14-stage method of

The governing equations are resolved with the discretizations presented in Sect.

The diffusive turbulent stress tensor

The diffusive flux

The unresolved flux of total enthalpy

The turbulent eddy viscosity

The SGS model developed by

The turbulent eddy viscosity of this model depends on first-order derivatives of velocities and is given by
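For orientation, the following is a minimal sketch of an eddy viscosity that depends only on first-order velocity derivatives, written in the form published by Vreman (2004). It is assumed, not confirmed by the text above, that this matches the expression used in ClimateMachine; the function name, the gradient layout `grad_u[i, j] = du_j/dx_i`, and the default coefficient `c_s = 0.17` are illustrative assumptions.

```python
import numpy as np


def vreman_eddy_viscosity(grad_u, delta, c_s=0.17):
    """Eddy viscosity following the published Vreman (2004) form.

    grad_u : 3x3 array with grad_u[i, j] = d u_j / d x_i
    delta  : filter width, scalar or one value per direction
    c_s    : Smagorinsky-type coefficient (assumed value)
    """
    alpha = np.asarray(grad_u, dtype=float)
    delta_sq = np.broadcast_to(np.asarray(delta, dtype=float) ** 2, (3,))
    # beta_ij = sum_m delta_m^2 * alpha_mi * alpha_mj
    beta = np.einsum("m,mi,mj->ij", delta_sq, alpha, alpha)
    b_beta = (beta[0, 0] * beta[1, 1] - beta[0, 1] ** 2
              + beta[0, 0] * beta[2, 2] - beta[0, 2] ** 2
              + beta[1, 1] * beta[2, 2] - beta[1, 2] ** 2)
    alpha_sq = np.sum(alpha**2)
    if alpha_sq == 0.0:
        return 0.0                      # no velocity gradient, no viscosity
    # max() guards against tiny negative values from round-off.
    return float(2.5 * c_s**2 * np.sqrt(max(b_beta, 0.0) / alpha_sq))


# Pure shear u = (y, 0, 0): the model is designed to vanish here.
shear = np.zeros((3, 3))
shear[1, 0] = 1.0
nu_shear = vreman_eddy_viscosity(shear, delta=1.0)

# Solid-body rotation u = (-y, x, 0): the viscosity does not vanish.
rotation = np.zeros((3, 3))
rotation[0, 1] = 1.0
rotation[1, 0] = -1.0
nu_rotation = vreman_eddy_viscosity(rotation, delta=1.0)
```

Because only the velocity gradient tensor enters, no explicit filtering or averaging is required, which is the property the paragraph above refers to.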

When high-order Galerkin methods are used to solve nonlinear advection-dominated problems, spurious Gibbs oscillations may affect the solution and need to be addressed. ClimateMachine provides a set of spectral filters, cut-off filters, and artificial diffusion methods to remove these oscillations. While filters may be effective, we found that stabilizing the LES solution by means of the SGS eddy viscosity alone is effective and robust; this is in agreement with results shown by

We first demonstrate the convergence of the numerical solution to the Euler equations with an isentropic vortex advection problem. We then present results from ClimateMachine for standard benchmark problems: (1) a dry rising thermal bubble in a neutrally stratified atmosphere, (2) a dry density current, (3) hydrostatic and nonhydrostatic mountain-triggered linear gravity waves, (4) the Barbados Oceanographic and Meteorological Experiment (BOMEX), and (5) a decaying Taylor–Green vortex in a triply periodic domain.

To demonstrate convergence of the numerical solutions, we consider the two-dimensional dry isentropic vortex advection problem with fourth-order polynomials, consistent with the benchmark cases shown in the sections that follow. We consider the pure advection of a vortex in a domain with edge lengths

Euclidean distance between initial and final solution vectors of the prognostic variables generated using Rusanov (octagons), Roe (squares), and Harten–Lax–van Leer contact (HLLC) numerical fluxes. Final solutions are evaluated at
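The error measure used in such a convergence study, and the way an observed order of accuracy is read off from it, can be sketched as follows. The function names are hypothetical and the error values are synthetic (constructed to decay like the fifth power of the grid spacing); they are not results from this paper.

```python
import numpy as np


def euclidean_error(q_initial, q_final):
    """Euclidean distance between two solution vectors."""
    diff = np.asarray(q_final, dtype=float) - np.asarray(q_initial, dtype=float)
    return float(np.linalg.norm(diff))


def observed_order(grid_spacing, error):
    """Slope of log(error) against log(grid spacing) by least squares."""
    slope, _intercept = np.polyfit(np.log(grid_spacing), np.log(error), 1)
    return float(slope)


# Synthetic data: error = C * h^5 on four successively refined grids.
h = np.array([1.0, 0.5, 0.25, 0.125])
err = 3.0 * h**5
order = observed_order(h, err)
```

For a vortex advected through a periodic domain and returned to its starting position, the initial state doubles as the exact solution, so `euclidean_error` applied to the initial and final prognostic vectors measures the accumulated discretization error.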

Tests presented in Sect.

A neutrally stratified atmosphere with uniform background potential temperature

The SL and Vreman closures are used to model diffusive fluxes in this problem. The solutions show no discernible differences, and only the SL solution is shown. A visual comparison of the two becomes more meaningful when shear triggers mixing, which is shown for the density current test in Sect.

2.5D rising thermal bubble with effective resolution

The density current problem by

To reach solution grid convergence, this test is classically executed with a constant kinematic viscosity

The structure of potential temperature at the final time

2.5D density current. Potential temperature

To verify the correct behavior of the DG implementation in the presence of topographic features, the simple passive advection test described by

The initial scalar field

The topography is defined by the function

The contours of

Solution of the passive transport of scalar

To assess the correct implementation of a Rayleigh sponge layer to attenuate fast, upward-propagating gravity waves before they reach the top of the domain, two steady-state mountain-triggered gravity wave problems suggested by

These tests are affected by spurious oscillations that appear approximately

The linear hydrostatic case proposed by

The linear nonhydrostatic mountain waves are forced by a flow of uniform horizontal velocity

The steady-state solution at

Vertical velocity

The decaying Taylor–Green vortex (TGV) is a classical test to estimate the dissipative properties of turbulence models in the absence of solid boundaries. The gravity-free flow is initialized in a triply periodic cube of dimensions

We first consider the volume-averaged kinetic energy, which provides insight into the dissipation characteristics of the flow with respect to nondimensionalized time

By means of a three-dimensional fast Fourier transform (FFT) of the velocity field, the kinetic energy spectrum is calculated as
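A minimal sketch of an FFT-based, shell-summed kinetic energy spectrum is given below. It assumes a uniform grid on a periodic cube of side 2π, omits any density weighting, and bins modes by rounding the wavenumber magnitude to the nearest integer; these choices, the function name, and the commonly used Taylor–Green initial velocity field used as test input are assumptions rather than the exact procedure of this paper.

```python
import numpy as np


def kinetic_energy_spectrum(u, v, w):
    """Shell-summed kinetic energy spectrum on an N^3 periodic grid."""
    n = u.shape[0]
    k1d = np.fft.fftfreq(n, d=1.0 / n)            # integer wavenumbers
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    shell = np.rint(np.sqrt(kx**2 + ky**2 + kz**2)).astype(int)
    energy = np.zeros(u.shape)
    for component in (u, v, w):
        # Normalizing by N^3 makes the sum over modes equal the volume mean.
        energy += 0.5 * np.abs(np.fft.fftn(component) / n**3) ** 2
    return np.bincount(shell.ravel(), weights=energy.ravel())


# Commonly used Taylor-Green initial velocity field with unit amplitude.
n = 16
x = 2.0 * np.pi * np.arange(n) / n
X, Y, Z = np.meshgrid(x, x, x, indexing="ij")
u = np.sin(X) * np.cos(Y) * np.cos(Z)
v = -np.cos(X) * np.sin(Y) * np.cos(Z)
w = np.zeros_like(u)

spectrum = kinetic_energy_spectrum(u, v, w)
mean_kinetic_energy = 0.5 * np.mean(u**2 + v**2 + w**2)
```

By Parseval's theorem the spectrum sums to the volume-averaged kinetic energy, which gives a quick consistency check between the spectral and physical-space diagnostics.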

Taylor–Green vortex. Isosurfaces of zero Q-criterion (these identify surfaces where the vorticity norm is identical to the strain rate magnitude) on a

Results for the coarse-resolution simulations are presented in Fig.

Figure

Evolution of volumetrically averaged

Evolution of volumetrically averaged quantities in the numerical solution to the Taylor–Green vortex problem computed on

Figure

Figure

Kinetic energy spectra obtained using

BOMEX features a shallow-cumulus-topped boundary layer as described in

Figure

A large-domain simulation of BOMEX with effective horizontal resolution

BOMEX. Profiles of the mean state of liquid potential temperature, total specific humidity, cloud fraction, liquid water specific humidity, and variance of the vertical velocity fluctuations averaged over the last hour of the simulation. The solutions with PyCLES and ClimateMachine were calculated with effective grid resolution

BOMEX. From left to right, time series of horizontally averaged liquid water path (LWP), cloud cover, and turbulence kinetic energy diagnosed from ClimateMachine using the Vreman and SL closures. These results are consistent with ensemble results presented in Fig. 2 of the intercomparison study by

BOMEX. Instantaneous visualization of the shallow cumulus structures on a

We respectively define the time-dependent normalized total mass and energy changes as

Time evolution of the mass and total energy loss (relative change when compared against initial conditions) for a moist thermal bubble simulation. Blue line: relative change in energy; orange line: relative change in mass.
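The diagnostic plotted here can be sketched as a relative change with respect to the initial value, assuming the normalization is by the initial domain integral as the caption suggests. The function name and the time series are hypothetical.

```python
import numpy as np


def normalized_change(series):
    """Relative change of a domain-integrated quantity from its initial value."""
    values = np.asarray(series, dtype=float)
    return (values - values[0]) / values[0]


# Hypothetical domain-integrated mass at four output times.
total_mass = [2.0e9, 2.0e9, 2.0e9 * (1.0 - 1.0e-13), 2.0e9]
mass_change = normalized_change(total_mass)
```

For a conservative discretization without sources or sinks, this quantity should remain at the level of floating-point round-off throughout the simulation.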

Demonstration of favorable scaling capabilities across multiple hardware types is critical to the utility of ClimateMachine as a competitive tool for large-eddy simulations. To this end, we first examine strong scaling on CPU architectures. The rising thermal bubble problem described in Sect.

A single-rank GPU run of the test problem on a

This provides an estimate for a comparison between CPU and GPU hardware performance. However, the balance between memory bandwidth limits and computing operation limits guides the maximum scaling possible on the GPU hardware relative to its CPU counterpart, so this cannot be interpreted as a direct comparison across hardware types. Based on the present results, we conclude that it is more feasible to pursue strong scaling improvements on CPU hardware than on GPU hardware. Further optimization and exploration of scaling in ClimateMachine are ongoing work. Additional details on the hardware used for scaling tests can be found in Appendix

Speed-up of the time integration (solver) step relative to the time to solution for a single-rank simulation of the rising thermal bubble problem in an

To test the multi-GPU scalability of ClimateMachine, we first execute a BOMEX setup that is sufficiently large to saturate one GPU. The single-GPU execution represents the baseline from which we calculate the average time per time step denoted by

BOMEX weak scaling using 1D IMEX versus fully explicit time integration.
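The scaling metrics discussed in this section can be sketched as follows, assuming the usual definitions: speed-up as the ratio of reference to parallel time to solution, strong-scaling efficiency as speed-up per rank, and weak-scaling efficiency as the ratio of single-device to multi-device time per time step at fixed work per device. The function names and the timings in the last line are hypothetical; the speed-up of 19.7 on 32 MPI ranks is the value reported in this paper.

```python
def speedup(time_reference, time_parallel):
    """Speed-up relative to the reference (single-rank) run."""
    return time_reference / time_parallel


def strong_scaling_efficiency(speed_up, n_ranks):
    """Fraction of ideal speed-up achieved at fixed total problem size."""
    return speed_up / n_ranks


def weak_scaling_efficiency(time_single, time_multi):
    """Ratio of time per step on one device to that on many devices,
    with the work per device held fixed."""
    return time_single / time_multi


# Reported strong-scaling result: speed-up of 19.7 on 32 MPI ranks.
efficiency_32_ranks = strong_scaling_efficiency(19.7, 32)

# Hypothetical timings, in seconds per time step, for illustration only.
efficiency_weak = weak_scaling_efficiency(2.0, 2.5)
```

With these definitions, the reported strong-scaling speed-up corresponds to a parallel efficiency of roughly 62 % on 32 ranks.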

This paper introduced and assessed the LES configuration of ClimateMachine, a new Julia-language simulation framework designed for parallel CPU and GPU architectures. Notable features of this LES framework are the following:

conservative flux form model equations for mass, momentum, total energy, and total moisture to ensure global conservation of dynamical variables of interest (up to nonconservative source or sink processes);

discontinuous Galerkin discretization with element-wise evaluation of the approximations to volume and interface integrals, resulting in reduced time to solution due to MPI operations;

application of model equations to the solution of benchmark problems in typical LES codes, including atmospheric flows in the shallow cumulus regime (BOMEX); and

demonstration of strong scaling on CPUs with up to 32 MPI ranks (speed-up of 19.7 in time to solution) and weak scaling with up to 16 GPUs (95 %–98 %) in both dry and moist simulation configurations.

Rigid surfaces are considered impenetrable such that the wall-normal component of the velocity vanishes at rigid boundaries by imposing

In the case of free-slip conditions at a solid surface (indicated by subscript “sfc”), there is no viscous or SGS momentum transfer between the atmosphere and the surface, such that

As for momentum, the advective specific humidity fluxes normal to a rigid surface vanish, but the diffusive or SGS specific humidity fluxes normal to the surface may not vanish. Normal components of SGS fluxes of condensate,

As for momentum and humidity, the advective energy fluxes normal to a rigid surface vanish, but the diffusive or SGS flux of total enthalpy,

The value of

To prevent the reflection of fast, upward-propagating gravity waves at the top boundary, a Rayleigh damping sponge is added to the right-hand side of the momentum equation (see Sect.

For the boundary velocity corresponding to the impenetrable wall condition, we use the following reflecting condition:
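A reflecting condition of this kind is commonly written by mirroring the wall-normal component of the interior velocity while leaving the tangential components unchanged. The sketch below assumes that form; the function name and the sample vectors are hypothetical.

```python
import numpy as np


def reflect_velocity(u_interior, normal):
    """Exterior (ghost) velocity obtained by reversing the wall-normal part."""
    u = np.asarray(u_interior, dtype=float)
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)                 # ensure a unit normal
    return u - 2.0 * np.dot(u, n) * n


u_minus = np.array([1.0, 2.0, 3.0])           # interior state
n_wall = np.array([0.0, 0.0, 1.0])            # wall-normal direction
u_plus = reflect_velocity(u_minus, n_wall)    # exterior state

# The mean of the two states has no wall-normal component, so a central
# flux built from them carries no mass through the wall.
normal_velocity_at_wall = np.dot(0.5 * (u_minus + u_plus), n_wall)
```

The tangential components pass through unchanged, which is why additional conditions on the diffusive fluxes are needed to distinguish free-slip from no-slip walls.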

Diffusive fluxes are applied by a direct specification of the wall-normal fluxes, and over-specified boundary conditions are avoided by using only the interior (

Since the flow is compressible, we use density-weighted Favre averages following
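A density-weighted (Favre) average can be sketched as the mean of the density-weighted field divided by the mean density. The function name and the sample values below are hypothetical, and equal-weight sample points are assumed; on a nonuniform grid the means would carry quadrature weights.

```python
import numpy as np


def favre_average(rho, phi, axis=None):
    """Density-weighted average <rho * phi> / <rho> along the given axes."""
    rho = np.asarray(rho, dtype=float)
    phi = np.asarray(phi, dtype=float)
    return np.mean(rho * phi, axis=axis) / np.mean(rho, axis=axis)


# Hypothetical samples: the denser parcel carries the larger value of phi.
rho = np.array([1.0, 3.0])
phi = np.array([2.0, 6.0])
phi_favre = favre_average(rho, phi)      # (1*2 + 3*6) / (1 + 3)
phi_plain = np.mean(phi)                 # unweighted mean for comparison
```

For uniform density the two averages coincide; they differ only where density fluctuations correlate with the averaged field.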

This section summarizes the hardware characteristics for the primary computing resources used in tests throughout this paper. This is particularly relevant to the data presented in Sect.

This section provides additional information on the comparison of the density current benchmark in ClimateMachine with existing literature references in Table D1.

Summary of frontal locations for the density current test case from existing literature. Results tabulated are of the front location at

ClimateMachine is an open-source framework that is maintained on GitHub:

The test cases presented in the paper are available in the following directory locations in the source code.

AS contributed to analysis, methodology, software, and writing – review and editing. YT contributed to analysis, visualization, software, and writing – review and editing. SM contributed to conceptualization, methodology, software, and writing – original draft preparation, review, and editing. ZS contributed to software, analysis, visualization, and writing – review and editing. CK developed software. SB developed software. KP developed software and contributed to the analysis. MW was responsible for methodology and software. THG contributed to methodology, software, and writing – review and editing. JEK contributed to conceptualization, methodology, and software. VC developed software. LCW contributed to conceptualization, methodology, and software. FXG contributed to conceptualization, methodology, software, and writing – review and editing. TS contributed to conceptualization, methodology, software, project administration, and writing – original draft preparation, review, and editing.

At least one of the co-authors is a member of the editorial board of

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The computations presented here were conducted at the Resnick High-Performance Computing Center, a facility supported by the Resnick Sustainability Institute at the California Institute of Technology (formerly known as the Central HPC Cluster, with partial support by a grant from the Gordon and Betty Moore Foundation), and on the Google Cloud Platform with in-kind support by Google. We thank the Google team for their assistance with operations on the Google Cloud Platform.

This research was made possible by the generosity of Eric and Wendy Schmidt by recommendation of the Schmidt Futures program and by the Paul G. Allen Family Foundation, Charles Trimble, the Audi Environmental Foundation, the Heising-Simons Foundation, and the National Science Foundation (grants AGS-1835860 and AGS-1835881). Additionally, Valentin Churavy was supported by the Defense Advanced Research Projects Agency (DARPA, agreement HR0011-20-9-0016) and by the NSF (grant OAC-1835443). Part of this research was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration.

This paper was edited by Travis O'Brien and reviewed by two anonymous referees.