Traditional trial-and-error tuning of uncertain parameters in global atmospheric general circulation models (GCMs) is time consuming and subjective. This study explores the feasibility of automatic optimization of GCM parameters for fast physics by using short-term hindcasts. An automatic workflow is described and applied to the Community Atmosphere Model (CAM5) to optimize several parameters in its cloud and convective parameterizations. We show that the auto-optimization leads to a 10 % reduction of the overall bias in CAM5, which is already a well-calibrated model, based on a predefined metric that includes precipitation, temperature, humidity, and longwave/shortwave cloud forcing. The computational cost of the entire optimization procedure is roughly equivalent to that of a single 12-year atmospheric model simulation. The tuning reduces the large underestimation in the CAM5 longwave cloud forcing by decreasing the threshold relative humidity and the sedimentation velocity of ice crystals in the cloud schemes; it reduces the overestimation of precipitation by increasing the adjustment time in the convection scheme. The physical processes behind the tuned model performance for each targeted field are discussed. Limitations of the automatic tuning are described, including a slight deterioration in some targeted fields that reflects structural errors of the model. We conclude that automatic tuning can be a viable supplement to process-oriented model evaluation and improvement.

In general circulation models (GCMs), physical
parameterizations are used to describe the statistical characteristics of
various subgrid-scale physical processes

Recent studies take advantage of optimization algorithms to automatically and
more effectively tune the uncertain parameters
(

One approach to reducing the high computational burden is to replace the
expensive model simulations with a cheaper-to-run surrogate
model, which uses regression methods to describe the relationship between
input (i.e., the adjustable parameters of a model) and output (i.e., the
output variables of a GCM)

The purpose of this study is to describe a method that combines automatic
tuning with short-term hindcasts to optimize physical parameters, and
demonstrate its application by using CAM5. The tuning parameters are selected
based on previous CAM5 parameter sensitivity analysis works (i.e.,

The paper is organized as follows. Section 2 describes the model and experimental design. Section 3 describes the tuning parameters, the metrics, and the optimization algorithm. The optimized model and results are presented in Sect. 4. The last section contains the summary and discussion.

The comparison between short-term hindcasts and long-term
Atmospheric Model Intercomparison Project (AMIP). The

In this study, we use CAM5 as an example. The dynamical core uses the
finite-volume method of

Two types of model experiments are conducted. The first is a set of short-term hindcast
simulations for model tuning. The second is an Atmospheric Model Intercomparison Project (AMIP) simulation for verification
of the tuned model. The hindcasts are initialized with Year of Tropical
Convection (YOTC) data from the European Centre for Medium-Range Weather Forecasts
(ECMWF) reanalysis. The initialization uses the approach described in

The observational data are from the Global Precipitation Climatology Project
(GPCP;

For this study, we focus on tuning parameters that are associated with fast
physical processes so that short-term hindcasts can be used as an economical
way of tuning. The philosophy behind the hindcasts is to keep the model
dynamics as close to observation as possible while testing how the model
simulates the quantities associated with fast physical processes. In other
words, given the correct large-scale atmospheric conditions, errors in the
physical variables are used to calibrate the fast physics parameters. This is
different from calibration using AMIP simulations in which the circulation
responds to the physics. The feasibility of the hindcast approach is based on
the fact that errors in atmospheric models show up quickly in initialized
experiments (

A summary of parameters to be tuned in CAM5. The default and final tuned optimal values are shown, as well as the valid ranges of the corresponding parameters. ZM indicates the Zhang–McFarlane convection scheme. CAPE indicates convective available potential energy.

Parameter estimation for a complex model involves several choices, including (1) which parameters to optimize and what their ranges of uncertainty are; (2) how to select and construct a performance metric; (3) how to estimate/optimize the parameters in a high-dimensional space; and (4) how to embed the parameter estimation in the process-based evaluation and development of the model. This section addresses the first three questions. The last question is left to Sect. 4.

In our study, the tuning parameters are selected based on the CAM5
sensitivity results of

For the uncertainty ranges of the parameters to be used as bounds of optimal
tuning, ideally, they should be derived from the development process of the
parameterizations as part of the information from the empirical fitting to
observations or to process models. In practice, however, most
parameterizations do not contain this information. The uncertainty ranges of
the parameters in this study are based on

The selected output variables of CAM5 included in the performance metrics and the sources of the corresponding observations.

Several metrics have been used in the literature to quantitatively evaluate
and compare the performance of overall simulations of climate models

When the differences between model simulation and observation at different grid points are independent of each other and follow normal distributions, minimizing the MSE over all grid points is equivalent to the maximum likelihood estimation of the uncertain parameters. For our experimental design, however, the mismatch between the short-term forecasts and instantaneous observation could be caused by small spatial displacements due to errors in the model initial condition instead of the model parameters. In such cases, errors could be highly correlated between neighboring grid points, and the dependence of the metric on the control parameters may be weakened or obscured. This problem may be lessened in long-term climate simulations, but extra care is needed for short-term forecasts. We therefore choose to use zonally averaged fields from the model and observations in the metric calculation to focus on the effective response at global scale.
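
As a minimal sketch of how a metric built on zonal means might be computed (this is not the authors' code), the example below forms the zonal mean of each field, takes a latitude-weighted MSE against observations, and averages the tuned-to-default MSE ratios over the fields. The function names, the cosine-latitude weighting, the equal weighting of fields, and the toy data are all assumptions made for illustration; the exact metric is the one defined by the equations of this section.

```python
import math

def zonal_mean(field):
    """Average a lat x lon grid (list of rows) over longitude."""
    return [sum(row) / len(row) for row in field]

def zonal_mse(model, obs, lats_deg):
    """Latitude-weighted MSE between the zonal means of model and obs.

    The cos(latitude) weighting is an assumption of this sketch.
    """
    zm, zo = zonal_mean(model), zonal_mean(obs)
    weights = [math.cos(math.radians(lat)) for lat in lats_deg]
    num = sum(w * (m - o) ** 2 for w, m, o in zip(weights, zm, zo))
    return num / sum(weights)

def improvement_index(tuned, default, obs, lats_deg):
    """Mean over fields of MSE(tuned) / MSE(default); below 1 is an improvement."""
    ratios = [
        zonal_mse(tuned[k], obs[k], lats_deg) / zonal_mse(default[k], obs[k], lats_deg)
        for k in obs
    ]
    return sum(ratios) / len(ratios)

# Toy data (hypothetical): one field on a 3 x 4 grid.
lats = [-30.0, 0.0, 30.0]
obs = {"PRECT": [[1, 2, 3, 4], [2, 4, 6, 8], [1, 1, 1, 1]]}
default = {"PRECT": [[v + 1.0 for v in row] for row in obs["PRECT"]]}
tuned = {"PRECT": [[v + 0.5 for v in row] for row in obs["PRECT"]]}

# A field displaced by one grid point in longitude has the same zonal mean,
# so the displacement alone contributes nothing to the zonal-mean MSE.
displaced = [row[-1:] + row[:-1] for row in obs["PRECT"]]
displacement_error = zonal_mse(displaced, obs["PRECT"], lats)
index = improvement_index(tuned, default, obs, lats)
```

In this toy case a purely zonal displacement leaves the zonal-mean error at zero, which is the property the zonal averaging is meant to exploit; meridional displacements would still contribute.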

The optimization method is based on an improved downhill simplex optimization
algorithm to find a local minimum.

The optimization procedure takes two steps. First, the initial values of the
selected parameters are preprocessed to accelerate the
convergence of the optimization algorithm and to account for the ill conditioning
of the minimization problem. Next, the improved downhill simplex optimization
algorithm is used to solve the problem because of its fast convergence and
low computational cost in low-dimensional spaces. Meanwhile, an automatic workflow

The preprocessing uses a sampling strategy based on the single parameter
perturbation (SPP) method, which perturbs only one parameter at a time
while keeping the others fixed. The perturbed samples are uniformly distributed
across the parameter space. Equation (3) defines the improvement index for each
parameter sample. The distance between samples, defined as the difference between
the indexes obtained from two adjacent samples, is then calculated. We call this
step the first-level sampling. If the distance between two adjacent samples
is greater than a predefined threshold, additional, more refined samples are taken
between these two adjacent samples. This is the second-level sampling.
Finally, the candidate initial values for the optimization method choose the
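
The two sampling levels described above can be sketched as follows. This is an illustrative outline rather than the authors' implementation: the function and argument names, the number of first-level samples, the threshold value, and the choice of the midpoint as the refined sample are assumptions, and the final selection of candidate initial values is not reproduced here.

```python
def spp_samples(index_fn, defaults, bounds, name, n_first=5, threshold=0.05):
    """Two-level single parameter perturbation (SPP) for one parameter.

    index_fn: maps a dict of parameter values to the improvement index
              (in practice a set of hindcasts; here any callable).
    defaults: dict of default parameter values.
    bounds:   dict mapping parameter name to its (low, high) valid range.
    Returns a list of (value, index) pairs sorted by parameter value.
    """
    low, high = bounds[name]

    def evaluate(value):
        params = dict(defaults)
        params[name] = value  # perturb only this parameter
        return value, index_fn(params)

    # First level: uniformly spaced samples across the valid range.
    step = (high - low) / (n_first - 1)
    first = [evaluate(low + i * step) for i in range(n_first)]

    # Second level: add a sample between adjacent first-level samples whose
    # indexes differ by more than the threshold (midpoint is an assumption).
    samples = list(first)
    for (v0, i0), (v1, i1) in zip(first, first[1:]):
        if abs(i1 - i0) > threshold:
            samples.append(evaluate(0.5 * (v0 + v1)))
    return sorted(samples)

# Hypothetical stand-in for the model: the index depends on "tau" only.
def toy_index(params):
    return (params["tau"] - 0.3) ** 2

result = spp_samples(toy_index, {"tau": 0.5, "other": 1.0},
                     {"tau": (0.0, 1.0)}, "tau")
values = [v for v, _ in result]
```

Each call to `index_fn` stands for one expensive model evaluation, so the threshold controls how much extra cost is spent where the index changes rapidly.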

The downhill simplex algorithm calculates the parameter values and the corresponding improvement index, as defined in Eq. (3), in each step of the iterations. The optimum is approached by expanding or shrinking the simplex geometry in each optimization step. While searching for the minimum index, the best set of tuning parameter values up to the current iteration step is kept and used to determine the direction and magnitude of the increments. The iteration is terminated when the tuning parameters reach a quasi-steady state.
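
For readers unfamiliar with the method, a basic downhill simplex (Nelder-Mead) search is sketched below. It is the textbook algorithm with a simplified contraction step, not the improved variant used in this study; the function name, the box clipping, the coefficients, and the stopping tolerance are assumptions, and the quadratic test function is a hypothetical stand-in for the improvement index.

```python
def nelder_mead(f, x0, bounds, step=0.1, tol=1e-6, max_iter=500):
    """Minimize f inside a box with a basic downhill simplex search."""
    n = len(x0)

    def clip(x):
        # Keep every parameter inside its valid range.
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    # Initial simplex: x0 plus one vertex displaced along each axis.
    simplex = [clip(x0)]
    for i, (lo, hi) in enumerate(bounds):
        v = list(x0)
        delta = step * (hi - lo)
        v[i] = v[i] + delta if v[i] + delta <= hi else v[i] - delta
        simplex.append(clip(v))
    values = [f(v) for v in simplex]

    for _ in range(max_iter):
        # Sort vertices from best (lowest index) to worst.
        order = sorted(range(n + 1), key=lambda k: values[k])
        simplex = [simplex[k] for k in order]
        values = [values[k] for k in order]

        # Stop when the vertices have collapsed (quasi-steady parameters).
        spread = max(abs(a - b) for v in simplex[1:] for a, b in zip(v, simplex[0]))
        if spread < tol:
            break

        centroid = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]
        worst = simplex[-1]

        def along(t):
            # Point centroid + t * (centroid - worst), clipped to the box.
            return clip([c + t * (c - w) for c, w in zip(centroid, worst)])

        reflected = along(1.0)
        f_reflected = f(reflected)
        if f_reflected < values[0]:
            # Reflection beat the best vertex: try expanding further.
            expanded = along(2.0)
            f_expanded = f(expanded)
            if f_expanded < f_reflected:
                simplex[-1], values[-1] = expanded, f_expanded
            else:
                simplex[-1], values[-1] = reflected, f_reflected
        elif f_reflected < values[-2]:
            simplex[-1], values[-1] = reflected, f_reflected
        else:
            # Contract the worst vertex toward the centroid.
            contracted = along(-0.5)
            f_contracted = f(contracted)
            if f_contracted < values[-1]:
                simplex[-1], values[-1] = contracted, f_contracted
            else:
                # Shrink the whole simplex toward the best vertex.
                best = simplex[0]
                simplex = [best] + [
                    [b + 0.5 * (a - b) for a, b in zip(v, best)] for v in simplex[1:]
                ]
                values = [values[0]] + [f(v) for v in simplex[1:]]

    k_best = min(range(n + 1), key=lambda k: values[k])
    return simplex[k_best], values[k_best]

# Hypothetical smooth index with its minimum at (0.3, 0.7).
def toy_index(x):
    return (x[0] - 0.3) ** 2 + (x[1] - 0.7) ** 2

x_best, f_best = nelder_mead(toy_index, [0.5, 0.5], [(0.0, 1.0), (0.0, 1.0)])
```

Because the method needs only function values and no gradients, each call to `f` can wrap a full set of hindcasts, which is why the number of evaluations rather than the algebra dominates the cost.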

Figure 2 summarizes the workflow of the experiments. The workflow is automated. It has two components: model calibration and verification. The calibration uses the hindcasts, the predefined metric, and the optimization algorithm to derive the optimal parameter values. The verification uses the AMIP climate simulation to check how effective the auto-calibration is for the application goal, which is to improve the metric in the AMIP simulation.

Flow diagram of the automatic calibration of parameters via the short-term CAPT and the verification of optimized parameters through long-term AMIP simulations.

The change of the performance index as a function of iteration step is shown in Fig. 3. The blue line is the best performance index up to the current step. The red line is the actual performance index at each step. The latter has spikes during the iteration, especially near step 70, suggesting that the performance index in the parameter space has a complex geometry. Each iteration involves 31 days of hindcasts. The iteration is stopped at about the 142nd iteration step, when the searched parameters reach a quasi-steady state. With 180 computing cores on a Linux cluster, each iteration takes about 50 min. The computational time for an entire optimization is equivalent to about 12 years of an AMIP simulation, which is a tremendous reduction of computing time relative to traditional model tuning.
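
One way to see where the 12-year figure comes from is to multiply out the numbers quoted above. The short calculation below is only a consistency check under two assumptions: that each iteration costs exactly one 31-day hindcast set, and that a model year has 365 days.

```python
iterations = 142        # optimization steps reported in the text
hindcast_days = 31      # simulated days per iteration
minutes_per_iter = 50   # wall-clock minutes per iteration on 180 cores

# Total simulated time, expressed in 365-day model years (assumption).
simulated_years = iterations * hindcast_days / 365.0

# Total wall-clock time if iterations run one after another.
wall_clock_days = iterations * minutes_per_iter / (60 * 24)

print(f"{simulated_years:.1f} simulated years, {wall_clock_days:.1f} wall-clock days")
# prints "12.1 simulated years, 4.9 wall-clock days"
```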

The change of performance index in the optimization iterations. The

The tuned values of the parameters are given in the column of “tuned” in
Table 1. In the default model, the autoconversion parameter

The performance index of the tuned model in the hindcasts and the normalized
MSE of the individual fields in the metric are given in Table 3 under the
hindcasts column. The performance index is reduced by about 10 % in the
tuned model. This is a relatively significant reduction, considering that
CAM5 is already a well-tuned model and that even the major upgrade
from CAM4 to CAM5 changed most of the variables by no more than
10 % in terms of RMSE

The optimal improvement index of each variable and total comprehensive metric of the CAPT run and AMIP run.

The spatial distribution of high cloud amount in the

Same as Fig. 4 except for LWCF.

The next critical question is whether the improvements obtained by tuning in hindcasts carry over to the AMIP simulation. The last column in Table 3 under the heading of “AMIP” gives the performance index of the tuned model and the normalized MSE of the individual fields from the AMIP simulation. Three things are noted. First, the overall performance index is also improved by about 10 % in the AMIP simulation of the tuned model. Second, as in the hindcasts, the largest improvement is in the LWCF. Third, the fields that are improved in the AMIP simulation are the same as those in the hindcasts. We therefore conclude that the automatic tuning achieved the design goal of the algorithm.

We also examined a 10-variable metric that is used by the Atmospheric Model
Working Group (AMWG) of the Community Earth System Model (CESM)
(

The percentage biases of the 10 fields between the default and tuned models and their reference observations.

Next, we examine the physical processes behind the changed performance index in the tuned model. Figure 4a, b, and d show, respectively, the annually averaged high cloud amount from the CloudSat and CALIPSO satellite observations, the high cloud amount in the AMIP simulation of the default model, and the model bias. It is seen that CAM5 significantly underestimated high clouds in the tropics, including the western Pacific warm pool, central Africa, and the US, except in the narrow zonal band of the Intertropical Convergence Zone (ITCZ) in the Pacific. The model also underestimated high clouds in regions of middle-latitude storm tracks. Since high clouds have a large impact on the LWCF, these biases in the high clouds would cause an underestimation of LWCF. Figure 5a, b, and d show the LWCF in the observation, the default model, and the model bias. The bias field (Fig. 5d) clearly shows that the model significantly underestimates the LWCF. Its spatial pattern largely mirrors the bias field in high cloud amount in Fig. 4d.

In the model optimization, as described before, a smaller relative humidity threshold value for high clouds in the cloud scheme and a smaller sedimentation velocity of ice crystals were derived. These two parameter adjustments can both act to increase high cloud amount and thus longwave cloud forcing. The simulated high cloud and its bias relative to observation are shown in Fig. 4b to e. It can be seen that the overall bias in high cloud is significantly reduced in the tuned model. This leads to reduced negative bias in LWCF in the optimal model (Fig. 5b to e).

Changes in clouds are inevitably accompanied by changes in the SWCF,
which deteriorated slightly in the tuned model as
discussed previously. We find that while high clouds are increased in the
tuned model, clouds in the middle troposphere are reduced in middle and high
latitudes (Fig. 6). This reduction in middle clouds may have compensated
for the impact of increased high clouds on SWCF, since SWCF is also used in the
performance metric. This reduction of middle clouds is consistent with the
increased precipitation efficiency parameter

Pressure–latitude distributions of cloud fraction in

Meridional distribution of the AMIP difference between EXP/CNTL and
observations of LWCF

The impact of the tuning on other targeted fields is less dramatic than on LWCF. To see the impact clearly, we show in Fig. 7 the zonally averaged biases in the AMIP simulation from the default CAM5 as the blue lines and from the optimized model as the red lines. The two-dimensional map figures are given in the Supplement. In addition to the large improvement in the LWCF, the overall improvement in PRECT and T850 can be seen. The optimized model simulates slightly less precipitation (PRECT) and a warmer atmosphere (T850), both of which are closer to observations. The reduction in precipitation is consistent with the larger value of the convection adjustment timescale in the tuned model than in the default model. The convection scheme uses a quasi-equilibrium closure based on CAPE, in which the adjustment timescale is the denominator in the calculation of the cloud-base convective mass flux. When the timescale is longer, the mass flux is smaller and so is the convective precipitation. This reduction in precipitation is one likely cause of the larger SWCF (less cloud reflection) in the tuned model. In addition to the convection adjustment timescale, other parameters also impact precipitation. In particular, the impact of the increased precipitation efficiency over the ocean in the tuned model should partially offset the impact of the longer convective adjustment timescale. The change of PRECT is the net outcome of the multivariate dependence on all parameters, as found by the automatic optimization algorithm for the overall improvement of the performance index.
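
The inverse dependence of the mass flux on the adjustment timescale noted above can be made explicit. In the form in which the Zhang–McFarlane closure is commonly written (quoted here from the general literature on the scheme rather than from this paper, and ignoring any CAPE threshold), the cloud-base mass flux $M_b$ is

```latex
M_b = \frac{\mathrm{CAPE}}{\tau \, F}
```

where $\tau$ is the convective adjustment timescale and $F$ is the rate at which CAPE is consumed per unit cloud-base mass flux; the symbols $M_b$, $\tau$, and $F$ are notation assumed for this sketch. For fixed CAPE and $F$, doubling $\tau$ halves $M_b$, which is the sense in which a longer timescale weakens convective precipitation.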

The increase in LWCF and the reduced PRECT in the optimal model
are energetically consistent for the atmosphere. There is less atmospheric
longwave radiative cooling and less condensational heating in the tuned
model. The magnitude of the LWCF increase is large (2.42 W m

While consistent improvements in different fields are desired, this is not always possible. For example, a warmer atmosphere is often accompanied by a moister atmosphere. Since temperature in the tuned model is warmer than that in the default model, there is more moisture in the tuned model. The atmosphere in the default model is already too moist (Fig. 7d). As a result, the performance index in Q850 is slightly deteriorated. Since the optimization is based on a single combined metric of several target variables, the algorithm minimizes this combined metric even at the expense of individual variables, as long as the total metric is reduced. The fact that the default CAM5 overestimated water vapor and underestimated temperature, as shown in Fig. 7d and e, indicates structural errors in the model; improving temperature could lead to larger biases in water vapor in the current model.

In summary, the improved performance index in the LWCF is consistent with the dominant impact of the reduced values in the threshold relative humidity for high clouds and the sedimentation velocity of ice crystals. The improvement in PRECT is consistent with the increased convective adjustment timescale. The improvement in T850 is consistent with the large increase in LWCF and reduced radiative cooling of the atmosphere. The deterioration in SWCF is consistent with the impact of increased autoconversion rate, longer convective adjustment timescale, and increased threshold relative humidity of low clouds, all of which can lead to reduction of cloud water. The deterioration in Q850 is likely the result of larger T850 in the tuned model.

These results point to both the benefits and the limitations of the described model tuning. The benefit is the improvement in a predefined metric, which has led to improvements in several fields. The limitation is that not all fields can be improved. Some fields may get worse as a result of the algorithm achieving the largest improvement in the total predefined metric. One may use different weights for different fields in Eq. (1) or impose conditional limits on the normalized MSE for the individual fields. The benefits of such alternative approaches will depend on the specific application, but structural errors cannot be eliminated by the tuning.
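
The two alternatives mentioned above, field-dependent weights and conditional limits on individual fields, could be combined in a single objective along the following lines. This is a sketch of one possible design, not something tested in this study: the function name, the penalty formulation, the penalty size, and the ratio values in the example are all hypothetical.

```python
def weighted_index(ratios, weights=None, max_ratio=None, penalty=1000.0):
    """Combine per-field normalized MSEs into one scalar to minimize.

    ratios:    {field: MSE_tuned / MSE_default}
    weights:   {field: relative weight}; equal weights if omitted
    max_ratio: optional upper limit on any single field's ratio; exceeding
               it adds a penalty so the optimizer avoids such parameter sets
    """
    if weights is None:
        weights = {name: 1.0 for name in ratios}
    total_weight = sum(weights[name] for name in ratios)
    index = sum(weights[name] * ratios[name] for name in ratios) / total_weight
    if max_ratio is not None:
        excess = sum(max(0.0, r - max_ratio) for r in ratios.values())
        index += penalty * excess
    return index

# Hypothetical ratios, invented for illustration only.
example = {"LWCF": 0.60, "SWCF": 1.05, "Q850": 1.02}
plain = weighted_index(example)
limited = weighted_index(example, max_ratio=1.03)
emphasized = weighted_index(example, weights={"LWCF": 1.0, "SWCF": 2.0, "Q850": 1.0})
```

With a limit in place, a parameter set that improves the total metric but degrades one field beyond the limit scores worse than one that improves less while keeping every field within bounds.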

We have presented an economical method of automatic tuning that uses short-term hindcasts covering 1 month. It is used to optimize CAM5 by adjusting several empirical parameters in its cloud and convection parameterizations. The computational cost of the entire tuning procedure is roughly equivalent to that of a single 12-year AMIP simulation. We have demonstrated that the tuning accomplished the design goal of the algorithm. We show about 10 % improvement in our predefined metric for CAM5, which is already a well-calibrated model. Among the five targeted fields of LWCF, SWCF, PRECT, T850, and Q850, the largest improvement is in LWCF, which has about 40 % improvement in the zonal mean MSE. We have shown that while the improvements in LWCF, PRECT, and T850 are consistent with the improved atmospheric energy budget, they lead to a slight deterioration in the SWCF and Q850 that reflects structural errors of the model. The overall improvement is also seen in the 10-variable AMWG metrics.

The optimized model contains reduced values of the threshold relative humidity for high clouds and the sedimentation velocity of ice crystals, which act to increase the high cloud amount and the longwave cloud forcing, thereby reducing its significant underestimation in the default model. The optimization gave an increased convection adjustment time that can explain the reduced precipitation in the tuned model and the reduction of the precipitation biases. These two changes also help to reduce the temperature bias. The gains in these fields, however, are accompanied by a slight deterioration in shortwave cloud forcing that is consistent with the reduced precipitation, and a slight deterioration in humidity that is consistent with the increased temperature. The optimized results can help to understand the interactive effect of multiple parameters and to reveal systematic and structural errors by exploring the best performance attainable through parameter calibration.

While the benefits of the automatic tuning are clearly seen, there are several limitations of using the present workflow for automatic tuning of GCMs. First, not all fields can be simultaneously improved, since parameter tuning cannot eliminate structural errors in the model. Tuning is not an alternative to improving a model, but rather an economical way to calibrate some parameters within a candidate parameterization framework. Second, the improvement in the optimized model may result from compensating errors. Therefore, process-based model evaluation and physical explanation of the model improvements are always necessary. Third, tuning with hindcasts is only applicable to parameters affecting fast physics. For model bias that develops over long timescales, such as that in coupled ocean–atmosphere models, this approach cannot be used, although the conceptual approach may be applied with longer integrations. Finally, the choices of the model parameters, uncertainty ranges, and metrics are somewhat subjective. It would be much more satisfactory if their selection could be done automatically and more objectively.

Several improvements can be made to the presented method. Different weights can be used for the targeted fields. Sensitivity to different target metrics can be studied. Multiple target metrics may be designed to optimize different sets of parameters. Constraints such as energy balance at the top of the atmosphere may be imposed. It is also possible to use time-varying solutions as metrics to target variability such as the Madden–Julian Oscillation (MJO) in models. These could be subjects for future research.

The source code of CAM5 is available from

The supplement related to this article is available online at:

TZ, MZ, WX, JH, and WZ designed the tuning framework. MZ and YL evaluated the optimal results. WL, HYM, and SX generated the CAPT data. HYY provided the ISCCP data. MZ, TZ, and XX designed the tuning metrics. TZ, MZ, and WL wrote the paper.

The authors declare that they have no conflict of interest.

This work is partially supported by the National Key R&D Program of China (grant nos. 2017YFA0604500 and 2016YFA0602100) and the National Natural Science Foundation of China (grant nos. 91530323 and 41776010). Additional support is provided by the CMDV project of the CESD of the US Department of Energy to Stony Brook University. Tao Zhang (partially) and Wuyin Lin are supported by the CMDV project to BNL. Hsi-Yen Ma and Shaocheng Xie are funded by the Regional and Global Model Analysis and Atmospheric System Research Programs of the US Department of Energy, Office of Science, as part of the Cloud-Associated Parameterizations Testbed. Hsi-Yen Ma and Shaocheng Xie were supported under the auspices of the US Department of Energy by the Lawrence Livermore National Laboratory (LLNL) under contract DE-AC52-07NA27344.

Edited by: Patrick Jöckel

Reviewed by: two anonymous referees