Physical parameterizations in general circulation models (GCMs) contain many uncertain parameters that greatly affect model performance and climate sensitivity. Traditional manual, empirical tuning of these parameters is time-consuming and often ineffective. In this study, a “three-step” methodology is proposed to automatically and effectively obtain the optimum combination of key parameters in cloud and convective parameterizations according to a comprehensive objective evaluation metric. Unlike traditional optimization methods, two extra steps, one determining the model's sensitivity to the parameters and the other choosing optimum initial values for the sensitive parameters, are introduced before the downhill simplex method. This new method reduces the number of parameters to be tuned and accelerates the convergence of the downhill simplex method. Atmospheric GCM simulations show that the optimum parameter combination determined by this method improves the model's overall performance by 9 %. The proposed methodology and software framework can easily be applied to other GCMs to speed up model development, especially the unavoidable comprehensive parameter tuning during the development stage.

Due to their currently relatively low resolutions, general circulation
models (GCMs) must parameterize various sub-grid-scale processes. Physical
parameterizations aim to approximate the overall statistical effects of
various sub-grid-scale physics

Traditionally, the uncertain parameters are tuned manually through comprehensive
comparisons of model simulations with available observations. Such an
approach is subjective, labor-intensive, and hard to extend

For the PDF method, the confidence ranges of the parameters being optimized are
evaluated using likelihood and Bayesian estimation.

Optimization algorithms can be used to search for the maximum or minimum metric
value in a given parametric space.

Data assimilation methods are well established for state estimation
and are a potential solution for parameter estimation.

A climate system model is a strongly nonlinear system with a large number of uncertain parameters. As a result, its parametric space is high-dimensional, multi-modal, strongly nonlinear, and inseparable. Worse, a single run of a climate system model might require tens or even hundreds of simulated years to yield scientifically meaningful results.

To overcome these challenges, we propose a “three-step” strategy to
calibrate the uncertain parameters in climate system models effectively and
efficiently. First, the Morris method

The paper is organized as follows. Section 2 introduces the proposed automatic workflow. Section 3 describes the details of the example model, reference data, and calibration metrics. The three-step calibration strategy is presented in Sect. 4. Section 5 evaluates the calibration results, followed by a summary in Sect. 6.

We design a software framework for overall control of the tuning process. The framework can automatically execute any part of the proposed three-step calibration strategy, determine the optimal parameters, and produce the corresponding diagnostic results. It incorporates various tuning methods and facilitates model tuning with minimal manual intervention. It manages the dependences and calling sequences of the various procedures, including parameter sampling, sensitivity analysis and initial value selection, model configuration and running, and evaluation of model outputs using user-provided metrics. Users only need to specify the model to tune, the parameters to be tuned with their valid ranges, and the calibration method to use.
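For illustration, such a user specification might look like the following minimal sketch. The dictionary layout, parameter ranges, and the helper `n_tuned` are hypothetical, not the framework's actual interface:

```python
# Hypothetical user-facing specification (layout, names, and ranges
# are illustrative; they are not the framework's actual interface).
spec = {
    "model": "GAMIL2",
    "method": "downhill_3_steps",       # three-step calibration strategy
    "params": {                         # parameter name: (low, high)
        "c0": (1.0e-4, 6.0e-3),
        "rhminh": (0.5, 0.9),
    },
}

def n_tuned(s):
    """Number of parameters the optimizer will perturb."""
    return len(s["params"])
```

Everything else, including sampling, scheduling, and diagnostics, would then be driven automatically from such a declaration.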

The structure of the automatic calibration workflow. The input of the workflow is the parameter set of interest and the initial value ranges; the output is the optimal parameters and the corresponding diagnostic results after calibration. The preparation module provides the parameter sensitivity analysis. The tuning algorithm module offers local and global optimization algorithms, including the downhill simplex method, the genetic algorithm, particle swarm optimization, differential evolution, and simulated annealing. The scheduler module schedules as many cases as possible to run simultaneously and coordinates different tasks over a parallel system. The post-processing module is responsible for metrics diagnostics and for reanalysis and observational data management.

There are four main modules in the framework, as shown in Fig. 1. The
scheduler module manages model simulations, with the capability for
simultaneous runs, and coordinates different tasks to reduce
contention and improve throughput. Simulation diagnosis and evaluation are
handled by the post-processing module. The preparation module contains various
sensitivity analysis and sampling methods, such as the Morris

We use the Grid-point Atmospheric Model of IAP LASG version 2 (GAMIL2) as an
example to demonstrate the tuning workflow and our calibration
strategy. GAMIL2 is the atmospheric component of the Flexible
Global–Ocean–Atmosphere–Land System Model grid version 2 (FGOALS-g2),
which participated in the fifth phase of the Coupled Model
Intercomparison Project (CMIP5). The horizontal resolution is
2.8

To save computational cost, atmosphere-only simulations are run for
5 years using a prescribed seasonal climatology (no interannual variation) of
SST and sea ice. Previous studies have shown that 5-year simulations of this
type are enough to capture the basic characteristics of simulated mean climate
states

Model tuning results depend on the reference metrics used. For simplicity,
we use some conventional climate variables for the evaluation.
Wind, humidity, and geopotential height are taken from the European Centre for
Medium-Range Weather Forecasts (ECMWF) ERA-Interim reanalysis
from 1989 to 2004

The reference metrics, covering the variables in Table 2, are used to
quantitatively evaluate the overall simulation skill

A summary of the parameters to be tuned in GAMIL2, with the default values, valid ranges, and final tuned optimum values. Note that only the four sensitive parameters are tuned and therefore have optimum values.

Atmospheric fields included in the evaluation metrics and their sources.

In theory, parameter tuning for a climate system model is a global
optimization problem. Among the well-known global optimization
algorithms are traditional evolutionary algorithms, such as the genetic
algorithm

We choose the downhill simplex method for climate model tuning because of
its relatively low computational cost. The downhill simplex method searches
for the optimal solution by changing the shape of a simplex, which encodes
the search direction and step length. A simplex is a geometric object consisting of
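The mechanics of the search can be sketched with a minimal, self-contained Nelder–Mead implementation applied to a cheap stand-in objective. In a real run, the objective would execute the GCM and evaluate the improvement index against observations; the coefficients and iteration count below are textbook defaults, not GAMIL2 settings:

```python
import numpy as np

def downhill_simplex(f, x0, step=0.5, iters=200):
    """Minimal Nelder-Mead sketch: a simplex of n+1 vertices in n
    dimensions is reshaped by reflection, expansion, contraction,
    and shrinking until it collapses onto a (local) minimum."""
    n = len(x0)
    verts = [np.array(x0, float)]
    for i in range(n):                  # build the initial simplex
        v = np.array(x0, float)
        v[i] += step
        verts.append(v)
    for _ in range(iters):
        verts.sort(key=f)               # best vertex first, worst last
        best, worst = verts[0], verts[-1]
        centroid = np.mean(verts[:-1], axis=0)
        refl = centroid + (centroid - worst)            # reflection
        if f(refl) < f(best):
            exp = centroid + 2.0 * (centroid - worst)   # expansion
            verts[-1] = exp if f(exp) < f(refl) else refl
        elif f(refl) < f(verts[-2]):
            verts[-1] = refl
        else:
            contr = centroid - 0.5 * (centroid - worst) # contraction
            if f(contr) < f(worst):
                verts[-1] = contr
            else:                       # shrink toward the best vertex
                verts = [best] + [best + 0.5 * (v - best) for v in verts[1:]]
    return min(verts, key=f)

# Cheap stand-in objective with its minimum at (1.0, 0.5); a real
# objective would run the GCM and score it against observations.
opt = downhill_simplex(lambda p: (p[0] - 1.0) ** 2 + 10.0 * (p[1] - 0.5) ** 2,
                       [0.0, 0.0])
```

The low cost comes from the fact that each iteration needs only a handful of objective evaluations, which matters when every evaluation is a multi-year model run.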

Two performance criteria, effectiveness and efficiency, are used to evaluate
the optimization algorithms in this study. Selecting an optimization
algorithm for parameter calibration of climate system models is
a balance between model improvement (effectiveness) and computational cost
(efficiency). Model improvement is measured by the index
defined in Eq. (3); the lower this value, the better the tuning.
Computational cost is measured in “core hours”, which stands for the
computational efficiency. It is computed by
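The exact formula is not reproduced above; under the usual HPC accounting, core hours would simply accumulate cores multiplied by wall-clock time over all model runs, as in this hedged sketch (the helper name and run list are illustrative):

```python
def total_core_hours(runs):
    """Total core hours over all tuning runs, assuming the usual HPC
    accounting (cores x wall-clock hours, summed); the paper's exact
    formula is not reproduced here."""
    return sum(cores * hours for cores, hours in runs)

# e.g., one 32-core run of 2 h plus one 64-core run of 1.5 h
cost = total_core_hours([(32, 2.0), (64, 1.5)])
```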

Effectiveness and efficiency comparison between the original
downhill simplex method and the two global methods.

In tuning GAMIL2, the two global methods, PSO and DE, achieve better tuning effectiveness than the downhill simplex method, but at approximately 4 and 5 times its computational cost, respectively (Table 3).

We therefore propose two extra steps that significantly improve the performance of the downhill simplex method. In the first step, the number of tuning parameters is reduced by eliminating insensitive parameters. In the second step, fast convergence is achieved by pre-selecting proper initial values for the parameters before the downhill simplex method is applied.

The number of uncertain parameters in physical parameterizations of a climate
system model is quite large. Most optimization algorithms, such as PSO, the
downhill simplex method, and the simulated annealing algorithm

Parameter sensitivity analysis can be divided into local and global methods

The Morris method, based on the MOAT (Morris one-at-a-time) sampling strategy,
requires fewer samples than other global sensitivity methods
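As an illustration of one-at-a-time screening, the following is a simplified radial OAT sketch in the spirit of the Morris method, not the paper's exact trajectory design; the returned mu* (mean absolute elementary effect) is the quantity used to rank parameter sensitivity:

```python
import numpy as np

def morris_mu_star(f, bounds, r=10, delta=0.25, seed=0):
    """Simplified radial one-at-a-time screening in the spirit of the
    Morris/MOAT method: from r random base points, perturb each of the
    k parameters by `delta` (in normalized [0, 1] units) and average
    the absolute elementary effects. Larger mu* = more sensitive."""
    rng = np.random.default_rng(seed)
    lo = np.array([b[0] for b in bounds], float)
    hi = np.array([b[1] for b in bounds], float)
    k = len(bounds)
    ee = np.zeros((r, k))
    for t in range(r):
        x = rng.uniform(0.0, 1.0 - delta, size=k)   # normalized base point
        y0 = f(lo + x * (hi - lo))
        for i in range(k):
            xp = x.copy()
            xp[i] += delta                          # one-at-a-time step
            ee[t, i] = (f(lo + xp * (hi - lo)) - y0) / delta
    return np.abs(ee).mean(axis=0)
```

The cost is r × (k + 1) objective evaluations for k parameters, which is what makes this class of method attractive when each evaluation is a full model run.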

We define

Scatter diagram showing the parameter sensitivity using the Morris
sensitivity analysis. The

The parameter elimination step is critical for the final result of model
tuning. To validate the results obtained by the Morris method, we compare the
results with a benchmark method

Sensitivity analysis results from the Sobol method. The total sensitivity in Eq. (8) is denoted by the size of the colored area. The total sensitivities of ke, c0_shc, and capelmt are less than 0.5 for each variable.

The downhill simplex method is a local optimization algorithm, and its
convergence strongly depends on the quality of the initial values. We need
to find parameter values with small metric values near the final solution,
and the search must finish as fast as possible with minimal overhead. To meet
these two objectives, a hierarchical sampling strategy based on the
single-parameter perturbation (SPP) sampling method is used. SPP is similar
to local sensitivity methods in that only one parameter is perturbed at a
time with the other parameters fixed, and the perturbation samples are
uniformly distributed across the parametric space. First, the improvement
index, as defined in Eq. (3), of each parameter sample is computed. The
distance is defined as the difference between the improvement indexes of
two adjacent samples, i.e., the model response to a certain percentage
change of one parameter. We call this step the first-level sampling. The
perturbation size for a parameter can be set based on user experience; in
our implementation, the user needs to set only the number of samples. For
the first-level sampling, a larger perturbation size can be used to reduce
computational cost. If the distance between two adjacent samples exceeds a
predefined threshold, additional SPP samples are drawn between those two
samples; this is called the second-level sampling. Finally,
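For a single parameter, the two-level sampling can be sketched as follows, assuming the improvement index is available as a scalar function `f` of that parameter with all other parameters held at their defaults (the function name and refinement rule details are illustrative):

```python
import numpy as np

def spp_two_level(f, low, high, n1=5, threshold=0.1):
    """Two-level single-parameter-perturbation (SPP) sampling for one
    parameter, all others held at their defaults inside f.
    Level 1: n1 coarse, uniformly spaced samples.
    Level 2: refine between adjacent samples whose metric difference
    (the "distance") exceeds `threshold`.
    Returns the sampled value with the smallest improvement index."""
    xs = list(np.linspace(low, high, n1))
    vals = [f(x) for x in xs]
    for a, b, va, vb in list(zip(xs[:-1], xs[1:], vals[:-1], vals[1:])):
        if abs(vb - va) > threshold:      # strong response: refine here
            mid = 0.5 * (a + b)
            xs.append(mid)
            vals.append(f(mid))
    best = int(np.argmin(vals))
    return xs[best], vals[best]
```

The coarse first level keeps the number of model runs small; the second level spends extra runs only where the model responds strongly.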

At the same time, inappropriate initial values may lead to ill-conditioned simplex geometry, which we observed in the model tuning process. One issue we encountered is that some vertices in the downhill simplex optimization may share the same value for one or more parameters. As a result, those parameters remain invariant during the optimization, which may degrade both the quality of the final solution and the convergence speed. A simplex check is therefore performed while searching for initial values, to keep the parameter values across vertices as distinct as possible. A well-conditioned simplex geometry increases the parameter freedom available for optimization. In our implementation (Algorithm 1), a vertex leading to an ill-conditioned simplex is replaced by another parameter sample with the next-smallest improvement index value.
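Algorithm 1 is not reproduced here, but the degeneracy check it describes can be sketched as a hypothetical helper, assuming the spare samples are ordered by increasing improvement index (best first):

```python
import numpy as np

def fix_degenerate_simplex(vertices, spares, tol=1e-12):
    """Keep the simplex well conditioned: if a vertex shares a value
    with an already-accepted vertex in any parameter (coordinate),
    replace it with the next spare sample (spares are assumed sorted
    by increasing improvement index, best first)."""
    spares = [np.asarray(s, float) for s in spares]
    fixed = []
    for v in vertices:
        v = np.asarray(v, float)
        while any(np.any(np.abs(v - u) < tol) for u in fixed) and spares:
            v = spares.pop(0)              # degenerate: swap in a spare
        fixed.append(v)
    return fixed
```

With every coordinate varying across the simplex, no parameter is frozen at its initial value during the optimization.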

The same as Table 3 but showing the comparison among the three downhill simplex methods.

The methods above constitute the initial-value pre-processing of the downhill simplex algorithm. Sometimes the samples used during initial value selection are the same as those in the parameter sensitivity analysis step; in this case, one model run can serve both steps to further reduce the computational cost.

The effectiveness and efficiency of the three traditional algorithms are
compared in Table 3. “Downhill_1_step” denotes the original downhill
simplex method, one of the most widely used local optimization
algorithms, which has been successfully applied to the Speedy model

Taylor diagram of the climate mean state of each output variable from 2002 to 2004 of EXP and CNTL.

Two extra steps are added before the original downhill simplex method to overcome its limited effectiveness in improving model performance. The “Downhill_2_steps” method adds an initial value pre-processing step before the downhill simplex method, and the “Downhill_3_steps” method further adds a step that eliminates insensitive parameters through sensitivity analysis. The two steps incur additional overhead: 80 samples for the parameter sensitivity analysis with the Morris method and 25 samples for the initial value pre-processing. Tables 3 and 4 show that the proposed “Downhill_3_steps” achieves the best effectiveness, improving the model's overall performance by 9 %. It overcomes the inherent ineffectiveness of the original downhill simplex method at a much lower computational cost than the global methods.

Improvement indices over the global, tropical and mid–high latitudes of the Northern and Southern Hemisphere (MLN and MLS) for each variable of the EXP simulation.

Pressure–latitude distributions of relative humidity and
cloud fraction of EXP

This section compares the default simulation and the simulation tuned by the three-step method, with a focus on cloud and TOA radiation changes. Table 1 lists the values of the four sensitive parameters in the control (labeled CNTL) and optimized (labeled EXP) simulations. Significant changes are found for c0, the auto-conversion coefficient in the deep convection scheme, and rhminh, the threshold relative humidity for high cloud formation. The other two parameters change negligibly between the two simulations, so their impact on model performance is expected to be correspondingly small.

The overall improvement of the tuned simulation over the control can be
seen in the Taylor diagram (Fig. 4), with improvement for almost all
variables, especially the meridional winds and mid-tropospheric
(400

Meridional distributions of the annual mean difference between
EXP/CNTL and observations of TOA outgoing long-wave radiation

With a reduced RH threshold for high cloud (from 0.78 in CNTL to 0.63 in EXP,
Table 1), the stratiform condensation rate increases and the atmospheric
humidity decreases

Changes in the moisture and cloud fields affect the radiative fields. With
reference to ERBE, TOA outgoing long-wave radiation (OLR) is improved at
mid-latitudes in EXP but degraded over the tropics (Fig. 7a). Compared with
CNTL, middle and high clouds increase significantly in EXP (Fig. 6). This
enhances the blocking effect on the upward long-wave flux at TOA (FLUT),
reducing the FLUT at mid-latitudes of both the Southern and Northern
Hemisphere (Fig. 7a). Clear-sky OLR increases in EXP because of the drier
upper troposphere (Fig. 6): the decrease in atmospheric water vapor weakens
the greenhouse effect, so more outgoing long-wave radiation is emitted,
reducing the negative bias of the clear-sky upward long-wave flux at TOA
(FLUTC, Fig. 7b). Long-wave cloud forcing (LWCF) at middle and high
latitudes improves because of the improvement in the FLUT in these areas
(Fig. 7c), but the improvement in the tropics is negligible owing to the
cancellation between the FLUT and FLUTC. Overall, the tuned simulation has
a TOA radiation imbalance of 0.08 W m⁻²

The TOA clear-sky short-wave fluxes are identical in the control and tuned simulations, since both have the same surface albedo. With increased clouds, the tuned simulation absorbs less short-wave radiation at TOA than the control. Compared with ERBE, the absorbed TOA short-wave radiation in the tuned simulation is better at middle and high latitudes but slightly degraded over the tropics.

An effective and efficient three-step method for tuning GCM physical parameters is proposed. Compared with conventional methods, a parameter sensitivity analysis step and a proper initial value selection step are introduced before the low-cost downhill simplex method. This effectively reduces the computational cost while maintaining good overall performance. In addition, an automatic parameter calibration workflow is designed and implemented to enhance operational efficiency and support different uncertainty quantification analyses and calibration strategies. Evaluation of the method and workflow by calibrating the GAMIL2 model indicates that the three-step method outperforms the two global optimization methods (PSO and DE) in both effectiveness and efficiency, and achieves a better trade-off between accuracy and computational cost than the two-step method and the original downhill simplex method. The optimal results of the three-step method show that most variables are improved compared with the control simulation, especially the radiation-related ones, and a mechanism analysis explains why these variables improve overall. In future work, more analyses are needed to better understand model behavior as the physical parameters change. The choice of appropriate reference metrics and related observations is very important for the final tuned model performance; in future studies, we plan to use more reliable and accurate observations and to add constraint conditions to the parameter tuning in order to construct more comprehensive and reliable metrics.

The authors are grateful to the reviewers for valuable comments that have greatly improved the paper. This work is partially supported by the Ministry of Science and Technology of China under grant no. 2013CBA01805, the Information Technology Program of the Chinese Academy of Sciences under grant no. XXH12503-02-02-03, the China Special Fund for Meteorological Research in the Public Interest under grant no. GYHY201306062, and the Natural Science Foundation of China under grant nos. 91530103, 61361120098, and 51190101. Edited by: R. Neale