CondiDiag1.0: A flexible online diagnostic tool for conditional sampling and budget analysis in the E3SM atmosphere model (EAM)

Numerical models used in weather and climate prediction take into account a comprehensive set of atmospheric processes such as the resolved and unresolved fluid dynamics, radiative transfer, cloud and aerosol life cycles, and mass or energy exchanges with the Earth's surface. In order to identify model deficiencies and improve predictive skills, it is important to obtain process-level understanding of the interactions between different processes. Conditional sampling and budget analysis are powerful tools for process-oriented model evaluation, but they often require tedious ad hoc coding and large amounts of instantaneous model output, resulting in inefficient use of human and computing resources. This paper presents an online diagnostic tool that addresses this challenge by monitoring model variables in a generic manner as they evolve within the time integration cycle. The tool is convenient to use. It allows users to select sampling conditions and specify monitored variables at run time. Both the evolving values of the model variables and their increments caused by different atmospheric processes can be monitored and archived. Online calculation of vertical integrals is also supported. Multiple sampling conditions can be monitored in a single simulation in combination with unconditional sampling. The paper explains in detail the design and implementation of the tool in the Energy Exascale Earth System Model (E3SM) version 1. The usage is demonstrated through three examples: a global budget analysis of dust aerosol mass concentration, a composite analysis of sea salt emission and its dependency on surface wind speed, and a conditionally sampled relative humidity budget. The tool is expected to be easily portable to closely related atmospheric models that use the same or similar data structures and time integration methods.

Here, "host model" refers to the AGCM in which our new tool is embedded, in this case EAMv1. We summarize EAM's choice of method for coupling atmospheric processes in Sect. 2.1 and briefly describe how model variables are archived in output files in Sect. 2.2. These features of the host model are used by our tool.

Sequential process coupling
EAMv1 solves a set of integral-differential equations to simulate the spatial variation and temporal evolution of the state of the atmosphere. Distinct physical and chemical processes are represented by different model components, e.g., the dynamical core that describes the large-scale fluid dynamics and tracer transport processes resolved by the model's computational grid, and various parameterizations that describe smaller-scale fluid processes (e.g., turbulence and convection) and non-fluid phenomena (e.g., radiative transfer, cloud and aerosol microphysics).
For time integration, the primary method used in EAMv1 for component coupling is a method we refer to as isolated sequential splitting (Fig. 1). In this method, a model component produces an estimate of the rate of change of the atmospheric state by considering a single process or a set of closely related physical or chemical processes in isolation (i.e., ignoring all other processes represented by other model components). The estimated rate of change, often referred to as a tendency, is used to update the atmospheric state, and the updated state is then passed to the next model component. Since EAMv1 contains many components, the atmospheric state is updated multiple times within one full time step. Here, a full time step is defined as the smallest time integration cycle in which the effects of all physical processes considered in a simulation have been used to update the model state at least once in advancing the solution in time. This full time step is often loosely referred to as the "physics time step" in EAMv1 and its predecessors. In a discussion of time stepping and sub-cycling in EAMv1, Wan et al. (2021) referred to the full time steps as the "main process-coupling time steps" and denoted their length by ∆t CPLmain . The same notation is used in this paper for consistency and clarity. The so-called low-resolution configuration of EAMv1 (with 1-degree horizontal grid spacing) uses ∆t CPLmain = 30 min by default. Fig. 1 provides a schematic of a full time step consisting of five hypothetical processes labeled A to E.
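The sequential update cycle described above can be sketched in a few lines of Python. This is an illustrative sketch only, not EAM code; the processes, tendencies, and time step are all hypothetical:

```python
# Illustrative sketch of isolated sequential splitting (not EAM code).
# Each component estimates its tendency from the current state in isolation,
# the state is updated immediately, and the updated state is passed on.

def process_A(state):
    return -0.1 * state      # hypothetical tendency (rate of change)

def process_B(state):
    return 0.05 * state      # hypothetical tendency from another process

def full_time_step(state, processes, dt):
    """Advance the model state by one full time step of length dt (= dt_CPLmain)."""
    for tendency_of in processes:
        tendency = tendency_of(state)    # computed ignoring all other processes
        state = state + dt * tendency    # state updated before the next component
    return state

new_state = full_time_step(1.0, [process_A, process_B], dt=0.5)
```

Because each component sees the state already updated by its predecessors, the result depends on the order in which the components are called.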
A model component in EAMv1 might contain sub-components that are also connected using the isolated sequential splitting method, like process B depicted in Fig. 1. An example in EAMv1 is deep convection, which consists of the parameterization by Zhang and McFarlane (1995) that describes the impact of convective activity on temperature and humidity, and a parameterization of convective momentum transport from Richter and Rasch (2008). These convection-related atmospheric processes are sequentially split within the deep convection parameterization.
Another situation that can also be depicted by the hypothetical process B in Fig. 1 is sub-cycling. For example, in EAMv1, the parameterizations of turbulence, shallow convection, and stratiform cloud macrophysics and microphysics are sub-cycled 6 times within each 30 min full time step. In this case, each sub-cycle can be viewed as a sub-process depicted in Fig. 1 (i.e., sub-cycle 1 corresponds to process B1, sub-cycle 2 corresponds to process B2, etc.).

Figure 1. A schematic showing a time step of model simulation involving five hypothetical atmospheric processes, A to E, either resolved or unresolved by the model's computational mesh, that are numerically coupled using isolated sequential splitting (cf. Sect. 2.1). Also shown are various locations (referred to as checkpoints, cf. Sect. 3.1) within a time step that are introduced to facilitate diagnostics using the new tool. When the tool is used in a simulation, some checkpoints are activated (i.e., selected by the user) and indicated in green here, while others are inactive and indicated in gray. No information is monitored at inactive checkpoints. The green lines with a circle on one end and an arrowhead on the other depict how increments of model variables are defined. Further details can be found in Sects. 2.1 and 3.1.

History output
EAMv1 inherited from its predecessors a flexible mechanism for handling model output (see, e.g., Eaton, 2010). The data files that contain the temporal and spatial distribution of model-simulated physical quantities are called history files. The model can write multiple series of history files with different write frequencies; these series are referred to as history tapes in the source code. Different history tapes can contain different output variables (fields).
Whether the values written out should be instantaneous, time-averaged, maximum or minimum during the output time window can be specified for each tape on a field-by-field basis.
The software infrastructure for history output uses internal data types and functions that handle the storage of fields to be written out and perform the calculation of required statistics (e.g., time averages). Typically, researchers focusing on physical or computational aspects of the model do not need to care about the internal workings of this software infrastructure. Rather, they use a subroutine named outfld to transfer the values of a model variable to the infrastructure. To provide context for some descriptions in later sections, we note that while a model variable can change its value multiple times in a time step of ∆t CPLmain , the value being recorded for output is the snapshot made when the outfld subroutine is called. The location in the time integration cycle at which the outfld subroutine is called can differ from one model variable to another.
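The snapshot semantics of outfld can be illustrated with a minimal sketch. This is a Python stand-in for the Fortran infrastructure; all names other than outfld are hypothetical:

```python
# Illustrative sketch (not EAM code): the value recorded for output is the
# snapshot taken at the moment the outfld-like call is made.

history_buffer = {}   # stand-in for EAM's output-handling infrastructure

def outfld(name, value):
    history_buffer[name] = value   # copy the current value into the buffer

T = 270.0          # model variable early in the time step
outfld("T", T)     # snapshot transferred to the output infrastructure here
T = T + 5.0        # the variable keeps evolving within the same time step

# history_buffer["T"] still holds the value at the time of the call, not 275.0
```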

Nomenclature and design concepts for CondiDiag
We now introduce the key concepts and design features of the new tool. The description in this section is kept general, only referring to EAM when necessary, as the methodology can be applied to or revised for other AGCMs. Details of the implementation in EAMv1 are provided in Sect. 4.

Checkpoints, field values, and increments
In order to discuss the implementation of our tool in the context of the sequential process splitting described in Sect. 2.1, we introduce the following nomenclature:

- A checkpoint is a location in the time integration cycle where a snapshot of a model variable can be obtained (cf. Fig. 1). At a checkpoint, the value of a model variable can be retrieved from data structures in the standard EAMv1. Additional quantities can be computed from available variables. The variables retrieved or computed at that checkpoint can be saved in the data structure of our tool and transferred to the output-handling infrastructure of the standard EAM (cf. Sect. 2.2). If a process is sub-cycled with respect to ∆t CPLmain , then the end of each sub-cycle is considered to be a different checkpoint.
- The value of a model variable at a checkpoint is referred to as a field value. For example, the air temperature after process A in Fig. 1 is referred to as the field value of temperature at checkpoint A.
- A checkpoint can be either active or inactive in a simulation; the active checkpoints are those selected by the user at run time. No information is retrieved, calculated, or archived by our tool at inactive checkpoints. This flexibility allows a user to focus only on the checkpoints relevant to their specific study; it also saves memory and disk space, as inactive checkpoints do not consume memory or produce information in the model's output files.
- The difference between values of the same model variable at two different checkpoints is referred to as an increment. Since there can be inactive checkpoints, an increment calculated by our tool is the difference between the field value at the current checkpoint and the field value at the previous active checkpoint. For example, in Fig. 1, increment E is the difference between checkpoints E and C, with the inactive checkpoint D ignored.
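The increment bookkeeping across active and inactive checkpoints can be sketched as follows. This is illustrative Python, not the Fortran implementation; the checkpoint names and field values are hypothetical:

```python
# Illustrative sketch (not EAM code): an increment is the difference between
# the field value at the current checkpoint and the field value at the
# previous *active* checkpoint; inactive checkpoints are skipped entirely.

ACTIVE_CHECKPOINTS = ["A", "B1", "B2", "C", "E"]   # hypothetical selection; "D" is inactive

def record(records, checkpoint, value):
    """Store the field value and increment if the checkpoint is active."""
    if checkpoint not in ACTIVE_CHECKPOINTS:
        return                                     # nothing retrieved or archived
    prev = records[-1]["value"] if records else value
    records.append({"chkpt": checkpoint, "value": value, "inc": value - prev})

records = []
for chkpt, value in [("A", 1.0), ("B1", 1.2), ("B2", 1.1),
                     ("C", 1.5), ("D", 2.0), ("E", 1.8)]:
    record(records, chkpt, value)

# The increment at E is 1.8 - 1.5, i.e., relative to C, because D is inactive.
```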

Composite analysis
For a composite analysis, our tool expects the user to specify one or more conditional sampling criteria via run time input (e.g., namelist parameters). The handling of multiple conditions is described later in Sect. 3.3. Here we first explain the handling of a single sampling condition.
During each time integration cycle of length ∆t CPLmain , values of user-selected variables at active checkpoints are obtained and copied to a data structure internal to our tool. Increments and vertical integrals are calculated if requested. The sampling condition is evaluated at each grid cell in the global domain. Depending on whether the condition is met, the copy (including increments and integrals) is assigned either the model-computed values or a fill value, resulting in a conditionally sampled copy. This sampled copy, together with information about the sampling condition, is then transferred to the output handling infrastructure. In the next time step, the sampling condition is re-evaluated and selected model variables re-sampled. The details are explained below.

Defining a condition
A key element of a sampling strategy is the atmospheric condition to be used to categorize data. Necessary elements in the definition of a condition include (1) a metric (which can be any 2D or 3D field, e.g., air temperature or surface pressure), (2) a threshold (which is a number, e.g., −40 °C or 500 hPa), and (3) a comparison type (e.g., smaller than or equal to). In our tool, a metric can be any prognostic or diagnostic variable in the host model or a quantity that can be diagnosed from existing variables. Currently supported comparison types include (i) <, (ii) ≤, (iii) >, (iv) ≥, and (v) equal to within a tolerance. Type (v) can be used to select values within a range. For example, choosing a threshold of −20 °C and a tolerance of 20 °C would allow the user to sample grid cells with air temperature between −40 °C and 0 °C. The user's choices of metric, threshold, comparison type, and tolerance (if applicable) are expected to be specified through run time input.
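The comparison logic can be sketched as follows. This is illustrative Python, not the tool's Fortran code; the string codes for the comparison types are ours (the actual namelist uses integer codes, cf. Sect. 5.2):

```python
# Illustrative sketch (not EAM code) of the five supported comparison types;
# "equal to within a tolerance" selects values within a range.

def condition_met(value, cmpr_type, threshold, tolerance=0.0):
    if cmpr_type == "lt":  return value < threshold
    if cmpr_type == "le":  return value <= threshold
    if cmpr_type == "gt":  return value > threshold
    if cmpr_type == "ge":  return value >= threshold
    if cmpr_type == "eq":  return abs(value - threshold) <= tolerance
    raise ValueError(f"unknown comparison type: {cmpr_type}")

# A threshold of -20 degC with a tolerance of 20 degC samples [-40, 0] degC:
in_range     = condition_met(-35.0, "eq", -20.0, tolerance=20.0)   # True
out_of_range = condition_met(  5.0, "eq", -20.0, tolerance=20.0)   # False
```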
Another key element of the definition of the sampling condition is the location in the time integration cycle at which the sampling condition should be evaluated. As explained earlier in Sect. 2.1, the atmospheric state defined by the prognostic variables of EAM's governing equations is updated multiple times within one full time step of ∆t CPLmain due to the sequential method used for process coupling. While some diagnostic quantities (e.g., relative humidity) are calculated only once or a few times per ∆t CPLmain when they are needed by a calculation (e.g., a parameterization), their values that are consistent with the model's prognostic state effectively evolve within each time step. To avoid ambiguity, our tool requires the user to specify at which checkpoint (cf. Sect. 3.1) a sampling condition should be evaluated. The implementation of this aspect in EAMv1 is discussed in more detail in Sect. 5.2.1.

Condition metric and field of flags
In this first version of our tool, the metric used in defining a sampling condition can be one of the following types of model variables:

- a 2D field that covers the entire horizontal domain of the model, such as the surface pressure or total cloud cover;
- a 3D field defined at layer mid-points or as layer averages, e.g., air temperature, cloud fraction, or the mass mixing ratio of a tracer in EAMv1;
- a 3D field defined at the interfaces between adjacent layers, e.g., the convective mass flux predicted by the deep convection parameterization or the net longwave radiative flux calculated by the radiation scheme in EAMv1.
For each condition metric, a flag field with the same spatial dimensions is defined in the data structure internal to our tool. After a sampling condition is evaluated at a grid cell in the 2D or 3D domain, the flag field is assigned a value of 1 if the condition is met and a value of 0 otherwise. The flag field, when averaged over time, quantifies the frequency of occurrence of meeting the sampling condition at each individual grid cell. The flags at different grid cells can be averaged in space, either over the entire 2D or 3D domain or over a subdomain, to calculate the frequency of occurrence of the sampling condition in the corresponding domain, but the spatial averages are expected to be done during post-processing instead of during model integration. A use case example involving both temporal and spatial averaging can be found in Sect. 6.3.
After the sampling condition is evaluated over the entire 2D or 3D domain, the condition metric itself is sampled, meaning that the field of values transferred to the output-handling software contains the model-computed values where the condition is met and a fill value of zero where the condition is not met. In other words, the masking indicated by the flag field is applied to the condition metric as well. Recall that the output-handling infrastructure of EAM supports both instantaneous and time-averaged model output. Since EAM is a climate model, time-averaged output is expected to be used more often. Our tool uses a fill value of zero for archiving the condition metric and the other monitored model variables to make sure that time steps in which the sampling condition is not met make zero contributions to the time average. Later on, during post-processing, when the time average of a condition metric is divided by the time average of the corresponding flag, we get the composite mean, i.e., the average over the time steps when the condition is met.
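The role of the zero fill value can be checked with a small numerical example (illustrative Python with hypothetical values, not EAM code):

```python
# Illustrative sketch (not EAM code): with a fill value of zero, the composite
# mean is recovered in post-processing by dividing the time-averaged sampled
# field by the time-averaged flag.

values = [10.0, 20.0, 30.0, 40.0]    # model-computed values at one grid cell
flags  = [1.0,  0.0,  1.0,  0.0]     # 1 where the condition is met, 0 otherwise

sampled     = [v * f for v, f in zip(values, flags)]   # zeros where not met
avg_sampled = sum(sampled) / len(sampled)              # what the history file stores
avg_flag    = sum(flags) / len(flags)                  # frequency of occurrence

composite_mean = avg_sampled / avg_flag
# (10 + 30) / 2 = 20.0: the average over the time steps when the condition was met
```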

Monitored model variables
Our tool allows for multiple model variables to be monitored under the same sampling condition. To distinguish those monitored variables from the condition metric, the monitored variables are referred to as the quantities of interest (QoIs) in the remainder of this paper and in our code. QoIs monitored under the same condition can have different vertical dimension sizes:

- When the QoI has the same dimension size as the condition metric, the masking indicated by the flag field can be applied in a straightforward manner.
-If the metric is 2D and the QoI is 3D, then the same 2D masking is applied to all vertical layers or interfaces.
-If the metric and the QoI are both 3D but have different numbers of vertical layers (e.g., the metric is the air temperature defined at layer midpoints while the QoI is the net longwave radiative flux defined at layer interfaces), then masking will be skipped, meaning this specific QoI will be captured for output as if no conditional sampling had happened.
-If the metric is 3D and the QoI is 2D, then a grid cell in the 2D domain is selected if any layer midpoint or interface in that column is selected. For example, to quantify the shortwave cloud radiative effect (the QoI) in the presence of ice clouds, one can choose a sampling condition of non-zero ice crystal concentration. If ice crystals occur in any layer of a grid column, the shortwave cloud radiative effect of that grid column will be sampled.
Like the archiving of the condition metric explained in Sect. 3.2.2, a QoI gets a fill value of zero at grid cells where the condition is not met, so that the composite mean can be derived by dividing the time-averaged QoI by the time-averaged flag field.
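The dimension-matching rules above can be condensed into a sketch for a single grid column. This is illustrative Python, not the Fortran implementation; a 2D field is represented here by one number per column and a 3D field by a list of per-layer values:

```python
# Illustrative sketch (not EAM code) of how conditional-sampling flags are
# applied to a QoI, depending on the dimensions of the metric and the QoI.

def mask_qoi(qoi, flag):
    qoi_is_3d  = isinstance(qoi, list)
    flag_is_3d = isinstance(flag, list)
    if qoi_is_3d and flag_is_3d:
        if len(qoi) != len(flag):
            return qoi                    # different vertical sizes: masking skipped
        return [q * f for q, f in zip(qoi, flag)]   # same size: layer-by-layer mask
    if qoi_is_3d:                         # 2D metric, 3D QoI: same mask for all layers
        return [q * flag for q in qoi]
    if flag_is_3d:                        # 3D metric, 2D QoI: keep cell if any layer selected
        return qoi * (1.0 if any(flag) else 0.0)
    return qoi * flag                     # both 2D

# E.g., a column-integrated QoI is kept if the condition is met in any layer:
swcre_sampled = mask_qoi(-50.0, [0.0, 1.0, 0.0])
```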

Time window of validity of an evaluated condition
Our tool is designed to evaluate a sampling condition once per ∆t CPLmain at a user-specified checkpoint X and can monitor QoIs at multiple checkpoints within ∆t CPLmain . By default, the masking resulting from a condition evaluated at checkpoint X is applied retrospectively to all active checkpoints from X until just before the previous encounter of X (i.e., X in the previous time step). This is illustrated by condition (1) shown in purple in Fig. 2, where the sampling condition is evaluated at checkpoint C and the masking is applied retrospectively to checkpoints B2, B1, A, and E.
To provide more flexibility, our tool also allows the user to specify a different checkpoint as the end-of-validity mark for a sampling condition, which we indicate with double bars in Fig. 2. A hypothetical example is given as condition (2) in Fig. 2.

Figure 2. A schematic illustrating the time window of validity of evaluated sampling conditions; more details can be found in Sect. 3.2.4. Like in Fig. 1, green tags are active checkpoints being monitored by the tool. Gray tags with dashed borderlines are inactive checkpoints, which are ignored in the simulation.

Multiple sampling conditions in one simulation
A single sampling condition is defined by a combination of (i) a metric, (ii) a threshold, (iii) a comparison type, (iv) a tolerance if the comparison type is "equal to", (v) a condition-evaluation checkpoint, and (vi) an end-of-condition-validity checkpoint.
Changing any of the elements will result in a new sampling condition. Our tool allows for multiple conditions to be used in a single simulation (cf. Fig. 2).
For software simplicity, the current implementation only allows one and the same set of QoIs and checkpoints to be monitored under all sampling conditions. In the example illustrated in Fig. 2, where two conditions ((1) and (2)) are used and five checkpoints (A, B1, B2, C, and E) are active, let us assume the user has chosen to monitor four QoIs: T, q_v, u, and v. The same four QoIs and five checkpoints will be monitored for both sampling conditions. The current implementation does not allow, for example, monitoring only T and q_v at checkpoints A and C under condition (1) and only u and v at checkpoints A, B1, and B2 under condition (2), although this kind of flexibility can be considered for future versions of CondiDiag if needed.

Mass-weighted vertical integral of QoIs
For spatially 3D QoIs defined at layer midpoints or as cell averages, the vertical integral weighted by air mass can be calculated during the time integration and then conditionally sampled and written out as 2D variables. This applies to both field values and their increments.
One caveat is that in EAM's physics parameterizations, the mixing ratios of water species (vapor, cloud liquid and ice, rain and snow) are defined relative to the mass of moist air (i.e., dry air plus water vapor), while the mixing ratios of aerosols and chemical gases are defined with respect to dry air. Our tool expects the user to specify which kind of air mass (moist or dry) should be used for each QoI when vertical integrals are requested.
Furthermore, while the physics parameterization suite uses different mixing ratio definitions for different tracers, the numerical schemes for large-scale transport assume all tracer concentrations are "wet" mixing ratios (i.e., defined relative to moist air), and wet-to-dry and dry-to-wet conversions occur during each model time step of size ∆t CPLmain . Therefore, our tool also allows a user to (1) specify the type of air mass to be used for vertical integrals at each checkpoint, and (2) clarify whether the air mass type specified for the QoI or the checkpoint should take precedence for each QoI-checkpoint combination.
The corresponding namelist parameters are explained in Sect. 5.2.4.
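As a rough illustration of the calculation (not the EAM implementation), a mass-weighted vertical integral of a mixing ratio q can be written as a sum over layers of q times the layer air mass per unit area, dp/g, where dp is the pressure thickness of the layer; whether dp refers to moist or dry air corresponds to the moist/dry choice discussed above. All numbers below are hypothetical:

```python
# Illustrative sketch (not EAM code): mass-weighted vertical integral
# (column burden) of a tracer mixing ratio.

G = 9.80616   # gravitational acceleration (m s-2)

def column_burden(q, dp):
    """Sum over layers of q * dp / g; q in kg/kg, dp in Pa, result in kg m-2."""
    return sum(qk * dpk for qk, dpk in zip(q, dp)) / G

q  = [1.0e-6, 2.0e-6, 4.0e-6]     # hypothetical mixing ratios, top to bottom
dp = [5000.0, 10000.0, 20000.0]   # hypothetical layer pressure thicknesses (Pa)

burden = column_burden(q, dp)     # kg m-2
```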

Implementation in EAMv1
This section explains how the design features described in Sect. 3 are implemented in EAMv1. We start with an overview of the new Fortran modules added specifically for the tool (Sect. 4.1), a general-purpose diagnostics module (Sect. 4.2), and a summary of the changes made to the original EAMv1 code (Sect. 4.3). We keep these sections brief but provide two versions of the EAMv1 code on Zenodo corresponding to the GitHub commits before and after the implementation of CondiDiag1.0, so that readers can review the details of the code changes if needed.

CondiDiag-specific new modules
Four new modules are added to define data structures and support key functionalities of our diagnostic tool.

Data structure module
The module conditional_diag contains definitions of the basic data structures used by our tool and subroutines for initializing the corresponding Fortran variables.
A Fortran variable cnd_diag_info of the derived type cnd_diag_info_t contains the metadata that describes the user's conditional sampling strategy and/or budget analyses configuration. A namelist conditional_diag_nl (cf. Sect. 5.2) is also defined in this module, and a subroutine cnd_diag_readnl parses the user's namelist input and populates the information to cnd_diag_info.
A second derived type cnd_diag_t is defined for storing the values of the metrics, flags, as well as QoI field values and increments. The corresponding Fortran variable is an array named phys_diag; the array is defined in a different module (explained in Sect. 4.3.1). The subroutines that allocate memory for elements of phys_diag and their components are included in module conditional_diag.

Key algorithm module
The module conditional_diag_main contains the key subroutine of our tool, named cnd_diag_checkpoint, which is invoked at checkpoints to retrieve or compute the requested metric and QoI values (cf. Sect. 3.1). Metric and QoI names such as "T" and "RHI" need to be unique within the module conditional_diag_main; these are also the metric or QoI names that users refer to in the namelist conditional_diag_nl (cf. Sect. 5.2). The currently implemented metric and QoI names are listed in Table A1 in Appendix A. Additional metrics and QoIs can be added following the existing examples.

History output module
The module conditional_diag_output_utils is responsible for adding the following items to EAM's master list of history output variables:

- the conditionally sampled metric field, named with the pattern cnd<index>_<metric_name>, where <index> is a two-digit number (e.g., cnd01_T if the first sampling condition uses air temperature as the metric);
- the flag field (see Sect. 3.2.2), named cnd<index>_<metric_name>_flag;
- one output variable corresponding to each QoI at each active checkpoint under each sampling condition, named with the pattern cnd<index>_<QOI_name>_<checkpoint_name>. For example, cnd01_CLDLIQ_DYNEND is the stratiform cloud liquid mixing ratio monitored at checkpoint DYNEND under condition 1. If increments of the QoI are calculated and archived, these will be named like the QoIs but with a suffix _inc appended, e.g., cnd01_CLDLIQ_DYNEND_inc for the increment of CLDLIQ at checkpoint DYNEND under condition 1.
- If the mass-weighted vertical integral is requested for a QoI, then a suffix _v will be appended to the QoI name. For example, cnd01_CLDLIQ_v_DYNEND is the column burden of CLDLIQ at checkpoint DYNEND under condition 1 and cnd01_CLDLIQ_v_DYNEND_inc is the corresponding increment.
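The naming conventions above can be summarized in a small helper (illustrative Python; the function name is ours and is not part of the tool):

```python
# Illustrative sketch (not EAM code) of the history-variable naming pattern.

def output_name(cnd_index, qoi_name, checkpoint, increment=False, vintegral=False):
    name = f"cnd{cnd_index:02d}_{qoi_name}"
    if vintegral:
        name += "_v"        # mass-weighted vertical integral of the QoI
    name += f"_{checkpoint}"
    if increment:
        name += "_inc"      # increment rather than field value
    return name

assert output_name(1, "CLDLIQ", "DYNEND") == "cnd01_CLDLIQ_DYNEND"
assert output_name(1, "CLDLIQ", "DYNEND", increment=True) == "cnd01_CLDLIQ_DYNEND_inc"
assert output_name(1, "CLDLIQ", "DYNEND", vintegral=True) == "cnd01_CLDLIQ_v_DYNEND"
```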
We expect that users of our tool should not need to touch the conditional_diag_output_utils module unless they want to revise the naming conventions for variables in the history files.
It is worth noting that EAM's standard history output functionalities apply to any of the output variables added by our tool (cf. Sect. 2.2). For example, each variable can be added to or excluded from one or multiple history tapes and be written out at user-specified frequencies. For temporal statistics, both instantaneous values and time averages can be used in the current implementation. Maximum and minimum values, etc., need to be used with care, as unselected grid cells are filled with zeros. In future versions, we will consider allowing the user to specify what missing value should be assigned to each QoI.

Restart module
Because our diagnostic tool uses its own data structure, new subroutines have been included to add additional contents to EAM's restart files. These subroutines are placed in the module conditional_diag_restart. As long as users do not change the data structures defined in module conditional_diag, there should be no need to touch the restart module even if they add new metrics and QoIs to the key algorithm modules conditional_diag_main and misc_diagnostics.

General-purpose diagnostics module
In case a user provides their own subroutines to calculate new metrics or QoIs (e.g., a subroutine like relhum_ice_percent for diagnosing relative humidity with respect to ice), we recommend those subroutines be placed in the module misc_diagnostics rather than in conditional_diag_main, because we view those user-provided subroutines as general-purpose diagnostic utilities that could also be used by other parts of EAM (e.g., some parameterizations).

Other code changes in EAMv1
Other than the addition of the five modules explained in Sects. 4.1 and 4.2, the implementation of our tool in EAMv1 only involved a very small number of code changes, as described below.

The phys_diag array and its elements
Our tool uses the derived data type cnd_diag_t for storing values of the condition metrics, flags, and QoI field values as well as increments and vertical integrals. The data storage closely follows the handling of EAM's model state variable: a Fortran variable of type cnd_diag_t is declared as an array of rank one, with the different array elements corresponding to different grid-cell chunks handled by the same CPU. The subroutines tphysbc and tphysac that organize the invocation of individual parameterizations and their coupling operate on a single grid-cell chunk; the scalar variable of type cnd_diag_t in tphysbc and tphysac is named diag.

Checkpoints
Most of the checkpoints listed in Tables B1 and B2 are added to subroutines tphysbc and tphysac by inserting calls to the subroutine cnd_diag_checkpoint, in which diag is the variable of type cnd_diag_t explained in Sect. 4.3.1, "DYNEND" (for example) is the unique string identifying a checkpoint, and state, cam_in, and cam_out are scalar variables of derived types declared in the original EAM code.
Checkpoints have also been included in the stratiform cloud macrophysics driver subroutine clubb_tend_cam, where the character string char_macmic_it labels the sub-steps within a full time step ∆t CPLmain . It is worth emphasizing that state1 (instead of state) is passed to the checkpoint calls there, because state1 is the atmospheric state variable that is sequentially updated by the various subprocesses in clubb_tend_cam.
The new tool is expected to be useful for a wide range of simulations routinely performed by model developers and users, including debugging simulations that are a few time steps long, short few-day simulations for preliminary testing, weather-forecast-style simulations for comprehensive evaluations of the model physics following protocols like Transpose-AMIP (e.g., Phillips et al., 2004; Williams et al., 2013; Williamson et al., 2005; Martin et al., 2010; Xie et al., 2012; Ma et al., 2013, 2014), as well as more traditional multi-year to multi-decade simulations. To obtain process-level understanding of model behavior, it can be useful to use the new tool in an iterative manner. For example, for a study like Zhang et al. (2018) where one needs to identify model processes that result in negative values of specific humidity, we can start by carrying out a few-day or one-month simulation with unconditional sampling, choosing a large number of checkpoints to monitor model processes that are expected to affect humidity or might inadvertently do so because of computational artifacts or code bugs. We let the tool diagnose and archive time averages of the specific humidity and its increment at these checkpoints to get a sense of the typical values of the state variable and identify sources and sinks of moisture. In a second step of the investigation, we eliminate from the previous selection any checkpoints that have been confirmed to see no humidity change in any grid cell or time step in the few-day or one-month simulation. From the shorter list, we can pick one or multiple model processes as suspected culprits of negative specific humidity. If m suspects are selected for further investigation, then m sampling conditions can be specified in the next simulation, all using q_v < 0 as the sampling criterion but each evaluated after a different suspect.
We also select some QoIs (e.g., temperature, specific and relative humidity, wind, total cloud fraction, and cloud liquid and ice mixing ratios) to be monitored both right before and right after the model processes that are suspected to cause negative water vapor. We can request both the field values and increments of these QoIs to be archived, as time averages or instantaneous values (or both). This second step might provide useful clues about the typical meteorological conditions under which negative water vapor is predicted by the model. If pathological conditions are identified, then we can carry out additional simulations using relevant sampling conditions to further investigate the causes of those pathologies.
This section explains how investigations described above can be performed using our tool. We first present a typical workflow in Sect. 5.1 to illustrate the steps that a user needs to go through when designing an analysis and setting up an EAM simulation using our tool. We then explain the namelist parameters of our tool in Sect. 5.2.

User workflow
The schematic in Fig. 3 summarizes the steps to take when designing a composite or budget analysis using our tool. It also points to relevant concepts explained in earlier sections and namelist parameters explained below.

Namelist conditional_diag_nl
Users specify their conditional sampling and budget analysis strategy via the namelist conditional_diag_nl, which consists of five groups of parameters.

Figure 3. A schematic showing a typical workflow to illustrate the steps that a user needs to go through when setting up an EAM simulation using our tool. Dashed lines indicate places where code changes or additions are needed from the user.

Specifying sampling conditions
For specifying sampling conditions, we have:

- metric_name, a character string array containing the names of the condition metrics to be used in a simulation;
- metric_nver, an integer array specifying the number of vertical levels each metric is defined on. This is meant to help distinguish physical quantities that (1) have no vertical dimension, (2) are defined at layer mid-points, and (3) are defined at layer interfaces. Valid values for metric_nver are 1, pver (e.g., 72), and pverp (e.g., 73), where pver and pverp are EAM's variable names for the number of vertical layers and interfaces, respectively;
- metric_cmpr_type, an integer array specifying the type of comparison to be used for each condition (one entry per condition): 0 for "equal to within a tolerance", 1 for "greater than", 2 for "greater than or equal to", -1 for "less than", and -2 for "less than or equal to";
- metric_threshold, a double-precision floating-point array specifying the threshold values that the metrics will be compared to (one threshold for each condition);
- metric_tolerance, a double-precision floating-point array specifying the tolerances for conditions with comparison type 0 (one tolerance for each condition; the value has no effect for conditions with a non-zero comparison type);
- cnd_eval_chkpt, a character string array specifying the checkpoints at which the conditions will be evaluated (see Sect. 3.2.1; one checkpoint for each condition);
- cnd_end_chkpt, a character string array specifying the checkpoints marking the end of validity of an evaluated condition (see Sect. 3.2.4; one checkpoint per condition). If not specified by the user, cnd_end_chkpt will default to the condition-evaluation checkpoint (cnd_eval_chkpt).
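As a concrete sketch (the threshold value is illustrative, not taken from the paper's use cases; the metric and checkpoint names U10 and CHEMEMIS are borrowed from the sea salt example in Sect. 6.2), a single condition selecting grid columns where the 10 m wind speed exceeds 5 m s^-1 might be specified as:

```fortran
&conditional_diag_nl
 ! One hypothetical condition: U10 > 5 m/s
 metric_name      = 'U10'       ! condition metric
 metric_nver      = 1           ! 2D quantity, no vertical dimension
 metric_cmpr_type = 1           ! 1 = "greater than"
 metric_threshold = 5.0d0       ! illustrative threshold value
 cnd_eval_chkpt   = 'CHEMEMIS'  ! checkpoint at which to evaluate
/
```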

Specifying monitored model variables
The QoIs to be monitored are specified via a character string array qoi_name, and the number of vertical levels of each QoI is given by the integer array qoi_nver. If no QoIs are specified but some sampling conditions have been chosen, then conditional sampling will only be applied to the metrics.
The monitoring of QoI field values is turned on by the logical scalar l_output_state. A second logical scalar, l_output_incrm, turns the monitoring of QoI increments on or off. The user's choices for these two switches apply to all QoIs.
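For instance (a fragment only; the tracer names are EAM's dust tracers discussed in Sect. 6.1), monitoring the increments, but not the field values, of two 3D tracers could look like:

```fortran
&conditional_diag_nl
 ! Monitor two 3D tracers (dust mass mixing ratios, cf. Sect. 6.1)
 qoi_name       = 'dst_a1', 'dst_a3'
 qoi_nver       = 72, 72   ! pver: defined at layer mid-points
 l_output_state = .false.  ! do not archive the field values
 l_output_incrm = .true.   ! archive the increments
/
```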

Choosing checkpoints
The checkpoints at which the QoIs will be monitored are specified by a character string array qoi_chkpt. The sequence in which they are listed in the namelist has no significance. Note that the same checkpoints will be applied to all QoIs. Also note that if the user specifies a checkpoint name that does not match any checkpoint implemented in the code (e.g., because of a typographical error), then our tool will treat the erroneous checkpoint as an inactive one, in the sense that it will be ignored when the tool obtains QoI field values and calculates increments as the simulation proceeds; the history files will contain output variables corresponding to the incorrect checkpoint name, but those variables will contain only zeros.
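As an illustrative fragment (the checkpoint names are ones that appear in the use cases of Sect. 6; any valid checkpoint names could be used, in any order):

```fortran
&conditional_diag_nl
 ! Monitor the QoIs at three checkpoints; the order listed here has
 ! no significance. Names must match checkpoints implemented in the
 ! code (cf. Tables B1-B3), otherwise they are silently inactive.
 qoi_chkpt = 'CHEMEMIS', 'MCTCPL', 'PBCDIAG'
/
```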

Turning on vertical integral
The calculation of mass-weighted vertical integrals of QoIs is enabled by two integer arrays, qoi_x_dp and chkpt_x_dp:

- chkpt_x_dp is specified in relation to qoi_chkpt (one value of chkpt_x_dp for each checkpoint). A value of 1 tells our tool that the mass of moist air should be used, while a value of 2 indicates that dry air mass should be used. Any other value assigned to chkpt_x_dp is interpreted as no specification.
- qoi_x_dp is specified in relation to qoi_name, i.e., one value of qoi_x_dp for each QoI. A value of 0 is interpreted as no integral; the QoI will be sampled and written out as a 3D field. If 1 (moist) or 2 (dry) is selected, the corresponding (moist or dry) air mass will be used for the vertical integral of that QoI at all active checkpoints in the simulation. If a value of 101 (moist) or 102 (dry) is used, the corresponding air mass will be used for that QoI at all active checkpoints except where chkpt_x_dp indicates a different specification for a checkpoint. For example, assume we set qoi_x_dp = 102 for the coarse-mode dust mass mixing ratio, choose to monitor checkpoints A, B, and C, and set chkpt_x_dp = 0, 0, 1. Then, when our tool calculates the coarse-mode dust burden, the dry air mass will be used for checkpoints A and B, while the moist air mass will be used for checkpoint C. In other words, a value of qoi_x_dp larger than 100 means using mod(qoi_x_dp, 100) in general but giving chkpt_x_dp precedence where the latter is non-zero at a checkpoint.
If the user wishes to monitor both a 3D QoI and its vertical integral, they can specify the same QoI twice in qoi_name and then set one of the corresponding qoi_x_dp elements to 0 and the other to an appropriate value to request the vertical integral. A use case example is provided in Sect. 6.1.
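The worked example in the text above can be written as a namelist fragment (the checkpoint names A, B, and C are placeholders, as in the prose):

```fortran
&conditional_diag_nl
 ! Coarse-mode dust: request vertical integrals (burden)
 qoi_name   = 'dst_a3'
 qoi_x_dp   = 102            ! default to dry air mass: mod(102,100) = 2
 qoi_chkpt  = 'A', 'B', 'C'  ! placeholder checkpoint names
 chkpt_x_dp = 0, 0, 1        ! override: moist air mass at checkpoint C
/
```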

Turning on history output
A user might want to write out multiple copies of the conditional diagnostics or budget diagnostics to different history files that correspond to different output frequencies and/or temporal averaging. To facilitate such needs, the integer array hist_tape_with_all_output specifies which history files will contain the full set of output variables from our tool.
For example, hist_tape_with_all_output = 1, 3 will include the full set of output variables in the h0 and h2 files. Again, we note that the standard output functionalities of EAM explained in Sect. 2.2 still apply.
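In the use case of Sect. 6.1, for instance, the full set of output variables is written to the first two history files (a fragment; the group delimiters are included for completeness):

```fortran
&conditional_diag_nl
 ! Write the full set of CondiDiag output variables to the
 ! h0 (here: monthly) and h1 (here: daily) history files
 hist_tape_with_all_output = 1, 2
/
```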

Using unconditional sampling
One of the main motivations for creating our tool is to facilitate budget analysis. If an analysis is to be carried out for the entire computational domain and all time steps, then a special metric named ALL can be used. In such a case, the user can ignore (skip) the other namelist parameters in Sect. 5.2.1. When ALL is used, the condition evaluation will be skipped during the model's integration (see the example in Sect. 6.1). Another way to use unconditional sampling is to specify a condition that will always be fulfilled (e.g., relative humidity higher than -1 %); a use case example is provided in Sect. 6.3.
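A minimal sketch of an unconditional setup using the special metric ALL (cf. the use case in Sect. 6.1):

```fortran
&conditional_diag_nl
 ! The special metric ALL selects the entire model domain and all
 ! time steps; the other condition-related parameters of Sect. 5.2.1
 ! can be skipped, and condition evaluation is bypassed entirely
 metric_name = 'ALL'
/
```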
This section demonstrates the usage of the new tool with three concrete examples. The first example is a global budget analysis without conditional sampling. It demonstrates how to request unconditional sampling and how to request that the increments of model variables be calculated and archived as time averages. This first example also demonstrates that, with our tool, it is convenient to obtain both vertical profiles and vertical integrals of the budget terms.
The second example is a composite analysis without budget terms. It demonstrates how to use multiple sampling conditions in the same simulation and also shows that the tool can be used to perform a univariate probability distribution analysis.
In the third example, the increment diagnosis and conditional sampling capabilities are combined to perform a conditional budget analysis. The example demonstrates how metrics and monitored QoIs can be chosen to be physical quantities that need to be calculated from the host model's state variables using user-provided subroutines.
The examples shown here use 1-month simulations of October 2009 with monthly (or monthly and daily) output. All simulations were carried out with an active atmosphere and land surface as well as prescribed sea surface temperature and sea ice concentration, at 1° horizontal resolution with the out-of-the-box parameter and time integration configurations of EAMv1.

A global budget analysis of dust aerosol mass mixing ratio and burden
The first example is a global dust aerosol mass budget analysis without conditional sampling. The simulation is designed to provide insight into the atmospheric processes that change the burden (vertical integrals) of dust aerosols in two size ranges (accumulation mode and coarse mode). In particular, we are interested in dust emission, dry removal (i.e., sedimentation and dry deposition at the Earth's surface), resolved-scale transport, subgrid-scale turbulent transport and activation (i.e., nucleation scavenging), as well as the wet removal caused by precipitation collecting particles by impaction, and resuspension caused by evaporation of precipitation.

Simulation setup
The namelist setup for this study is shown in Table 1. Only one condition is specified: the special metric ALL is used to select the entire model domain and all time steps.
The QoI names dst_a1 and dst_a3 are EAM's tracer names for dust mass mixing ratio in the accumulation mode and coarse mode, respectively. Each tracer name is mentioned twice under qoi_name, with corresponding qoi_x_dp values of 0 and 2, meaning that both the vertical distribution of the tracer and its column burden are monitored. With l_output_state set to .false. and l_output_incrm set to .true., the tool captures the dust mass mixing ratio increments caused by the targeted atmospheric processes but not the mixing ratios themselves. Five checkpoints are chosen for monitoring the dust budget. The corresponding atmospheric processes are listed in Table 2. (We remind the reader that, as shown in Fig. 1, the model processes that contribute to the increments diagnosed at a checkpoint depend not only on where this checkpoint is located in the time integration cycle but also on where the previous active checkpoint is located.) The full set of fields tracked by our tool is sent to output files 1 (the h0 file) and 2 (the h1 file), with the h0 file containing monthly averages and the h1 file containing daily averages.

A composite analysis of sea salt emissions in relation to surface wind speed
This example demonstrates the use of composite analysis (without budget terms) to provide insight into wind speed impacts on the emission fluxes of sea salt aerosol in various size ranges. The intention is to examine the geographical distribution of sea salt emission fluxes under weak, medium, and strong wind conditions and to quantify their relative contributions to the total emission fluxes.

Table 3. Namelist setup used in the composite analysis presented in Sect. 6.2.

Simulation setup
In EAMv1, the emission of sea salt aerosol is parameterized with a scheme from Mårtensson et al. (2003) in which the emission flux is proportional to (U10)^3.41, with U10 being the wind speed (unit: m s^-1) at 10 m above sea level (Liu et al., 2012).
Four conditions are specified in the namelist setup shown in Table 3. The sea salt emission fluxes and U10 are passed to EAM during ∆t_CPLmain, and their values remain available as components of the derived-type Fortran variable called cam_in (cf. Table A1).
Therefore, as long as we select any checkpoint at or after MCTCPL for assessing U10, combined with any checkpoint at or after CHEMEMIS and before MCTCPL for monitoring the surface fluxes, the results will be equivalent. In Table 3, the same checkpoint CHEMEMIS is used for both namelist parameters cnd_eval_chkpt and qoi_chkpt, as this is the checkpoint right before the surface fluxes are used to update the aerosol tracer mixing ratios.
For output, variables from our tool are included in the h0 file as monthly averages. Here, for demonstration purposes, we chose only three wind speed bins and monitored sea salt mass fluxes. If one refines the wind speed ranges (e.g., uses 10 to 20 bins), adds aerosol number fluxes to the QoIs, and adds the calculation of global averages to the postprocessing, then diagrams like Fig. 5 in Zhang et al. (2012) can be created to investigate the simulated relationship between wind speed and the particle size distribution of the emissions, without having to write out a large amount of instantaneous model output.

A conditional budget analysis for RHI
The third example demonstrates a combined use of the budget analysis and conditional sampling capabilities using our tool.
The example also requires the calculation of a diagnosed quantity, the relative humidity with respect to ice (RHI), that is not a state variable, so additional routines are invoked to calculate it. This quantity varies before and after the various processes (e.g., atmospheric dynamics, cloud microphysics, radiation, etc.) that operate on the atmospheric state, so it is sensitive to how and where it is calculated in the model, and it is also affected by the model's sub-cycling.

The second condition effectively selects all grid cells and time steps, but we state the condition as RHI > -1 % instead of using the special metric ALL, and we select the same condition-evaluation checkpoint as in condition one, so that the conditionally sampled metric cnd01_RHI and the unconditionally sampled cnd02_RHI can be directly and conveniently compared. (Using the special metric ALL would result in a metric variable cnd02_ALL, a constant field of 1.0, being written to the output files.) The checkpoint before the radiation parameterization is considered the end of a full model time step, and hence cnd_end_chkpt is set to PBCDIAG. Both the field values and the increments of the QoIs are monitored and included in the model output. The full set of fields tracked by our tool is sent to output tape 1 (the h0 file), which contains the monthly averages. The archived increments can further help in understanding the physical mechanisms causing the RHI changes.
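A hedged sketch of how the always-fulfilled second condition might be specified (the metric name RHI is inferred from the output variable names cnd01_RHI and cnd02_RHI; the threshold units and the evaluation checkpoint are assumptions, since the text states only "RHI > -1 %" and "the same condition-evaluation checkpoint as in condition one"):

```fortran
&conditional_diag_nl
 ! Condition 2: RHI > -1 %, which every grid cell fulfills, giving
 ! unconditional sampling while still writing out cnd02_RHI
 metric_name(2)      = 'RHI'
 metric_nver(2)      = 72        ! defined at layer mid-points (pver)
 metric_cmpr_type(2) = 1         ! "greater than"
 metric_threshold(2) = -1.0d0    ! assumed unit: percent; always true
 cnd_end_chkpt(2)    = 'PBCDIAG' ! end of the full model time step
/
```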

Conclusions and outlook
An online diagnostic tool has been designed for and implemented in the global atmospheric circulation model EAMv1. The motivation is to introduce a systematic way to support conditional sampling and budget analysis in EAM simulations, so as to (1) minimize the need for tedious ad hoc coding and hence save code development time and avoid clutter and (2) reduce the need for writing out large amounts of instantaneous model output and hence make more efficient use of computing resources. Assuming the user-chosen condition metrics and QoIs, as well as the locations in the time integration cycle at which to monitor these quantities (referred to as checkpoints), are known to the tool, carrying out a composite or budget analysis using the new tool only requires setting a small number of namelist parameters. The addition of new condition metrics, QoIs, and checkpoints is straightforward if the data to be sampled can be accessed through EAM's existing data structures.
The new tool has been designed for and implemented in EAMv1 and can be easily ported to EAMv1's descendants (e.g., EAMv2) or predecessors (e.g., CAM5) that use similar Fortran data structures and time integration strategies. Details of the design concepts and implementation in EAMv1 are explained in the paper together with three use case examples that demonstrate the usage of the tool.
The development of the new tool was motivated by the need to carry out conditional budget analysis to understand sources of time-step sensitivities and time-stepping errors related to EAMv1's physics parameterizations. While the current version of the tool, CondiDiag1.0, fulfills the authors' initial needs in those investigations, we are aware of several aspects in which the tool can be further extended or improved to benefit a wider range of EAM users. First, if the desired condition metric or QoI is calculated by a lower-level (in the software sense) subroutine and is not saved in EAM's derived-type data structures (e.g., physics state, physics buffer, etc.), the most convenient way to pass data to CondiDiag would be to add the desired physical quantity to the physics buffer. Such cases will be further assessed and alternative methods will be explored. It is worth noting, however, that the E3SM project has been developing a brand new code base for its version 4 release. The new code uses a single "field manager" for information exchange between the host model and any resolved or parameterized atmospheric processes. An implementation of our tool in that new code base would make use of (and benefit from) this new "field manager".
Second, the specification of a sampling condition in CondiDiag1.0 takes the form of a logical expression involving the comparison of a single metric with a threshold value. Section 6.2 demonstrated how the tool can be used for a univariate probability distribution analysis. It would be useful to further extend the tool to support sampling conditions involving multiple metrics and a series of threshold values for each metric, hence facilitating multivariate probability distribution analysis.
Along that line, it might be useful to support sampling conditions involving multiple metrics evaluated at different checkpoints.
This would be useful for investigating forcing-response relationships of multiple atmospheric processes and for evaluating the behavior of sub-stepped model components.
Third, for simulations that involve multiple sampling conditions, the current tool monitors the same set of QoIs and checkpoints under all conditions. While this will not be difficult to change, we will assess the trade-off between more flexibility and the potential risk of causing confusion for model developers and users.
Beyond the three aspects discussed above, there are some desirable extensions of the tool that will require more substantial revisions of the current design. For example, in CondiDiag1.0, the sampling conditions are re-evaluated (and the QoIs are re-sampled) every model time step. We can, however, imagine cases in which a user might want to evaluate a condition at some point of a simulation and then monitor the evolution of the atmospheric state in the selected grid cells for a longer time period, such as a few hours or a few days. Supporting such use cases will require introducing an additional mechanism to specify for how long the evaluated sampling condition remains valid. Furthermore, anticipating possible modifications to the sequential splitting of atmospheric processes in future versions of the host model, the tool's design may need to be revised accordingly.

Tables B1 and B2 list the checkpoints currently implemented in EAM's physics driver subroutines tphysbc and tphysac. Table B3 lists the current checkpoints in the interface subroutine clubb_tend_cam.