Incorporation of inline warm rain diagnostics into the COSP2 satellite simulator for process-oriented model evaluation


Abstract. The Cloud Feedback Model Intercomparison Project Observational Simulator Package (COSP) is used to diagnose model performance and physical processes via an apples-to-apples comparison with satellite measurements. Although COSP provides useful information about clouds and their climatic impact, outputs that have a subcolumn dimension require large amounts of data. This can cause a bottleneck when conducting sets of sensitivity experiments or multiple model intercomparisons. Here, we incorporate two diagnostics for warm rain microphysical processes into the latest version of the simulator (COSP2). The first one is the occurrence frequency of warm rain regimes (i.e., non-precipitating, drizzling, and precipitating) classified according to CloudSat radar reflectivity, putting the warm rain process diagnostics into the context of the geographical distributions of precipitation. The second diagnostic is the probability density function of radar reflectivity profiles normalized by the in-cloud optical depth, the so-called contoured frequency by optical depth diagram (CFODD), which illustrates how the warm rain processes occur in the vertical dimension using statistics constructed from CloudSat and MODIS simulators. The new diagnostics are designed to produce statistics online along with subcolumn information during the COSP execution, eliminating the need to output subcolumn variables. Users can also readily conduct regional analysis tailored to their particular research interest (e.g., land-ocean differences) using an auxiliary post-process package after the COSP calculation. The inline diagnostics are applied to the MIROC6 general circulation model (GCM) to demonstrate how known biases common among multiple GCMs relative to satellite observations are revealed. The inline multi-sensor diagnostics are intended to serve as a tool that facilitates process-oriented model evaluations in a manner that reduces the burden on modelers for their diagnostics effort.
The A-Train global observations (Stephens et al., 2002; L'Ecuyer and Jiang, 2010), consisting of the sun-synchronous and polar-orbiting multi-satellite constellation, are a powerful tool (e.g., Stephens et al., 2018) that can be used to improve GCM parameterizations by constraining aerosol-cloud relationships (Wang et al., 2012; Suzuki et al., 2013). However, direct comparisons between native model output and satellite-retrieved data are not always straightforward because satellite retrievals are inverse estimates from observed radiance or the radar reflectivity factor (e.g., Masunaga et al., 2010). Therefore, native model values must be converted by solving the "forward problem" using the same algorithms applied to each satellite sensor for consistent ("definition-aware") comparisons. Furthermore, the process evaluation among models and observations should be done at the same spatiotemporal scale for consistent ("scale-aware") comparison. To this end, the Cloud Feedback Model Intercomparison Project (CFMIP) community has developed the CFMIP Observation Simulator Package (COSP; Bodas-Salcedo et al., 2011), which provides "a common language for clouds" (Swales et al., 2018). With this capability, COSP has been used widely, not only in the CFMIP community, but also by many climate modelers to evaluate model uncertainties through model intercomparisons, including CMIP6 (Eyring et al., 2016; Webb et al., 2017).
The recent and significant redesign of COSP aimed to provide more robust and efficient code (Swales et al., 2018). The updated package (COSP2) enhances flexibility by allowing native model subgrid cloud representations to be used as input for the COSP2 interface. Using inputs from a host model, simulators in COSP2 perform two main tasks (Fig. 1): (1) translating the native model variables to subcolumn-scale (pixel) synthetic retrievals and (2) aggregating the subcolumn retrievals to column-scale (grid) statistics (see Fig. 1 of Swales et al., 2018, for details). This substantial revision of COSP has extended its functionality, enabling the introduction of diagnostics constructed from multiple instrument simulators in a definition-aware and scale-aware framework (Kay et al., 2018).
To investigate microphysics at a fundamental process level, it is best to analyze the instantaneous output for the variables of interest rather than their monthly means (e.g., Konsta et al., 2016). This is because these processes typically occur over short timescales ("fast processes") and contribute to the regime dependency of important phenomena including aerosol-cloud-precipitation interactions (Michibata et al., 2016; Patel et al., 2019). This requires high-frequency data output (approximately 6-hourly) from COSP (see also Table 1 of Tsushima et al., 2017), which results in large amounts of data, particularly when subcolumn (pixel-scale) variables, such as those from a radar or lidar simulator, are involved. The CFMIP recommendation to COSP users is to assume approximately 100 subcolumns per 1° of model grid spacing (cfmip2/cosp_input_cfmip2_long_inline.txt) to enable comparison to satellite sampling at the kilometer scale. This leads to bottlenecks, in terms of both data transfer and analysis, for fast-process diagnostics that analyze instantaneous output.
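To give a sense of the scale of this bottleneck, the following back-of-the-envelope sketch estimates the storage needed for a single subcolumn-resolved 3-D variable. All numbers passed in the usage note (a 256 × 128 grid, 40 levels, 140 subcolumns, four samples per day, 4-byte values) are illustrative assumptions and are not prescribed by COSP; the function name is hypothetical.

```python
def subcolumn_output_bytes(nlon, nlat, nlev, ncolumns,
                           samples_per_day, ndays, bytes_per_value=4):
    """Approximate uncompressed size (bytes) of one subcolumn-resolved
    3-D variable written at every output time.

    This is plain arithmetic: number of values per snapshot multiplied by
    the number of snapshots and the size of each value.
    """
    values_per_snapshot = nlon * nlat * nlev * ncolumns
    n_snapshots = samples_per_day * ndays
    return values_per_snapshot * n_snapshots * bytes_per_value


if __name__ == "__main__":
    # Hypothetical configuration, chosen only for illustration.
    size = subcolumn_output_bytes(nlon=256, nlat=128, nlev=40, ncolumns=140,
                                  samples_per_day=4, ndays=365)
    print(f"{size / 1e12:.2f} TB per variable per year")
```

Under these assumptions a single variable already occupies on the order of 1 TB per simulated year, which is why aggregating statistics inline, before anything is written to disk, is attractive.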
To address this challenge in COSP, this work incorporates an inline diagnostic tool into COSP2 to facilitate process-oriented model evaluations targeted at warm rain. By introducing joint statistics from multiple satellite simulators, detailed information related to cloud microphysics is now readily available from model diagnostics without the need to output subcolumn variables. Although this tool is applied here to warm rain diagnostics, it can be extended to other microphysical processes to facilitate the efficient evaluation of models with subgrid cloud schemes of various complexity (Turner et al., 2012; Thayer-Calder et al., 2015).
This technical paper is organized as follows: the diagnostic tool that is based on the joint satellite simulators and its application to model evaluations are described in Sect. 2; the scientific perspectives using the warm rain diagnostic tool and A-Train satellite data are provided in Sect. 3; and a summary and future work are presented in Sect. 4. The source codes and reference satellite data are all available from public repositories (see "Code and data availability").

Concept and design
The objective of this work is to provide a specific process-oriented metric that is also compatible with scale-aware and definition-aware diagnostics (Kay et al., 2018), in the manner implemented in COSP, for a fair comparison of warm clouds between GCMs and satellite retrievals. Here the main concept is to use conditional statistics that "fingerprint" the process of interest by combining multiple satellite observables. One of the transformative advances recently made possible by combining active and passive satellite measurements is the ability to generate observational diagnostics of how the microphysical vertical structure of clouds varies with the surrounding environment (Marchand et al., 2009; Sorooshian et al., 2013), such as aerosol concentration (Ma et al., 2018; Rosenfeld et al., 2019) and dynamical regimes (Nam et al., 2014; Christensen et al., 2016).
As a default diagnostic from the CloudSat radar simulator alone in COSP (Bodas-Salcedo et al., 2011), the so-called contoured frequency by altitude diagram (CFAD) is prepared to provide the macrophysical vertical structure including all types of hydrometeors (i.e., liquid droplets, ice crystals, raindrops, and snowflakes). In this regard, more specific statistics are useful when investigating a particular process, including the warm rain microphysical processes that are the focus of this work as described below.

Warm rain diagnostics
For this study, we incorporated two such diagnostics based on the CloudSat and MODIS satellite simulators into COSP2 to evaluate cloud-to-rain microphysical transition processes represented in GCMs using satellite observations. Both diagnostics are applied only to single-layer warm clouds (SLWCs), and their results are constructed with the aid of column simulators, as illustrated in Fig. 1.
The first diagnostic provides the fractional occurrence of warm rain regimes, which are classified according to the CloudSat column maximum radar reflectivity ($Z_{\max}$) as non-precipitating ($Z_{\max} < -15$ dBZe), drizzling ($-15$ dBZe $< Z_{\max} < 0$ dBZe), and precipitating ($0$ dBZe $< Z_{\max}$). This threshold of $Z_{\max}$ is often used to separate non-precipitating and precipitating clouds for warm rain studies (Wood et al., 2009; Kubar et al., 2009). Since this study extracts only SLWCs, ocean-specific (Haynes et al., 2009) and land-specific (Smalley et al., 2014) thresholds that originate from radar attenuation and/or phase partitioning are not used in our diagnostics (see also Kay et al., 2018). This enables us to assess global clouds uniformly. The occurrence frequencies of the non-precipitating, drizzling, and precipitating regimes are defined at the pixel scale as

$$ f_i(\lambda, \phi) = \frac{n_i(\lambda, \phi)}{n_{\mathrm{slwc}}(\lambda, \phi)}, \qquad (1) $$

where $i \in \{\mathrm{cloud}, \mathrm{drizzle}, \mathrm{rain}\}$, $n_i$ is the number of SLWC samples classified into regime $i$, and $n_{\mathrm{slwc}}$ is the total sample number of the SLWCs detected by CloudSat and MODIS retrievals within the grid box at longitude $\lambda$ and latitude $\phi$.
This metric provides information about where and how the warm rain occurrence frequency and intensity are biased in the model relative to the satellite observations (Jing et al., 2017;Kay et al., 2018).
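The regime classification described above can be illustrated with a short sketch. It is not the COSP2 implementation: the function name and input layout are hypothetical, and because the text states the thresholds with strict inequalities, assigning values that fall exactly on a threshold to the higher regime is an assumption made here.

```python
import numpy as np

# Thresholds (dBZe) on the column-maximum reflectivity, as quoted in the text.
Z_DRIZZLE = -15.0
Z_RAIN = 0.0


def warm_rain_regime_fractions(zmax_dbze, is_slwc):
    """Fractional occurrence of each warm rain regime among SLWC subcolumns.

    zmax_dbze : 1-D array of column-maximum radar reflectivity per subcolumn
    is_slwc   : boolean array, True where the subcolumn is a single-layer
                warm cloud (how this mask is built is outside this sketch)
    """
    zmax = np.asarray(zmax_dbze, dtype=float)[np.asarray(is_slwc, dtype=bool)]
    n_slwc = zmax.size
    if n_slwc == 0:
        # No SLWC samples in this grid box: the fractions are undefined.
        return {"cloud": np.nan, "drizzle": np.nan, "rain": np.nan}
    counts = {
        "cloud": np.count_nonzero(zmax < Z_DRIZZLE),
        # Assumption: values exactly on a threshold go to the higher regime.
        "drizzle": np.count_nonzero((zmax >= Z_DRIZZLE) & (zmax < Z_RAIN)),
        "rain": np.count_nonzero(zmax >= Z_RAIN),
    }
    return {name: n / n_slwc for name, n in counts.items()}
```

In practice the counts, rather than the fractions, would be accumulated over time so that the ratio is taken once over the full sample.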
The second diagnostic is the probability density function (PDF) of radar reflectivity profiles scaled as a function of the vertically sliced in-cloud optical depth (ICOD) and is commonly referred to as the contoured frequency by optical depth diagram (CFODD), as proposed by Nakajima et al. (2010) and Suzuki et al. (2010). The diagnostic reveals how the vertical microphysical structures of SLWCs tend to transition from non-precipitating to precipitating regimes as a fairly monotonic function of the cloud-top particle size. In this method, the MODIS-retrieved columnar cloud optical depth ($\tau_c$) is redistributed into a layered ICOD, $\tau_d(h)$, at each radar height ($h$) bin according to the adiabatic condensation-growth model (Brenguier et al., 2000; Szczodrak et al., 2001) as

$$ \tau_d(h) = \tau_c \left[ 1 - \left( \frac{h}{H} \right)^{5/3} \right], \qquad (2) $$

where $H$ is the cloud geometric thickness. After scaling by ICOD (optical depth from the cloud top), the CFODD reveals particle coalescence processes (Suzuki et al., 2010) and offers a direct way to evaluate and constrain these processes in global models (Suzuki et al., 2011, 2015).
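As a rough illustration of the redistribution step, the sketch below assumes the adiabatic form in which extinction grows as the 2/3 power of height above cloud base, so that the optical depth measured downward from cloud top is $\tau_c[1 - (h/H)^{5/3}]$ with $h$ measured upward from cloud base. The function name and the clipping of $h$ to the cloud layer are choices made here for illustration, not taken from the COSP2 source.

```python
import numpy as np


def layered_icod(tau_c, h, cloud_thickness):
    """In-cloud optical depth measured from cloud top down to height h.

    tau_c           : columnar cloud optical depth (e.g. from the MODIS simulator)
    h               : height above cloud base, scalar or array (same unit as H)
    cloud_thickness : cloud geometric thickness H (must be > 0)

    Assumes an adiabatic profile, which integrates to the (h/H)**(5/3) law.
    """
    frac = np.clip(np.asarray(h, dtype=float) / cloud_thickness, 0.0, 1.0)
    return tau_c * (1.0 - frac ** (5.0 / 3.0))
```

With this convention the ICOD is zero at cloud top (h = H) and equals the full column value at cloud base (h = 0), so each radar bin can be assigned to an optical-depth bin before the reflectivity histogram is accumulated.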
The A-Train analysis compared with the model statistics is also restricted to SLWCs, which are defined as having cloud-top temperatures ($T_{\mathrm{top}}$) above 273.15 K, extracted using the CloudSat radar reflectivity and a cloud mask described by Michibata et al. (2014, 2016). Convective deep clouds are thus excluded from the analysis. To ensure consistency with A-Train observations, both diagnostics for GCMs with COSP2 use only subcolumn pixels with a scene type of stratiform clouds (fracout = 1), as shown in Fig. 1.

Computational procedure and outputs
The warm rain diagnostics (occurrence frequency of warm rain regimes and CFODD) are activated by setting the logical flags "Lwr_occfreq" and "Lcfodd" to true in the output namelist (cosp_output_nl_v2.0.txt). Both the CloudSat and MODIS simulators are included automatically in the calculations if either flag is set to true, and the specified diagnostics are generated (see Fig. 1) during COSP execution.
The generated outputs are the total number of samples in each GCM grid box, aggregated from the subcolumn retrievals. Sample counts were chosen as the output because the diagnosed PDFs should be constructed from the total samples collected over the course of the simulation. Because constructing the statistics therefore requires post-processing of the output, a post-processing package is also provided to support this procedure. The post-processing package also facilitates regional analysis tailored to a user's particular research purpose, as discussed later. Users are recommended to output the diagnostics as accumulated values (e.g., for each month) rather than instantaneous values to reduce the volume of output data.
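The reason for writing counts rather than PDFs can be sketched as follows: counts from different months can simply be added, and the normalization is applied once to the total. This is not the post-processing package itself; the function name, the array layout (ICOD bins by reflectivity bins), and the choice to normalize within each ICOD bin are assumptions made for illustration.

```python
import numpy as np


def cfodd_pdf_from_counts(monthly_counts):
    """Build a CFODD-like PDF from accumulated sample counts.

    monthly_counts : iterable of arrays shaped (n_icod_bins, n_dbze_bins),
                     each holding sample counts accumulated over one month.

    Counts are summed first and normalized afterwards, so months with many
    samples carry proportionally more weight than months with few.
    """
    total = np.sum([np.asarray(c, dtype=float) for c in monthly_counts], axis=0)
    per_icod = total.sum(axis=1, keepdims=True)  # samples in each ICOD bin
    with np.errstate(invalid="ignore", divide="ignore"):
        # Empty ICOD bins have no defined PDF and are returned as NaN.
        pdf = np.where(per_icod > 0, total / per_icod, np.nan)
    return pdf
```

Averaging monthly PDFs instead would weight every month equally regardless of how many SLWC samples it contains, which is generally not the same statistic.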

Examples of model-observation intercomparisons
We used the MIROC6-SPRINTARS global aerosol-climate model (Tatebe et al., 2019; Michibata et al., 2019a) to demonstrate the warm rain analysis of the diagnostic tool. MIROC6 applies a PDF-based large-scale condensation parameterization (Watanabe et al., 2009) with Berry (1968) warm rain microphysics and an entrainment plume model for convective precipitation (Chikira and Sugiyama, 2010), including a shallow cumulus scheme (Park and Bretherton, 2009). The host model resolution was 1.4° × 1.4° with 40 vertical levels (T85L40). Although the number of subcolumns (NCOLUMNS) was set to 140, the resulting warm rain diagnostics were insensitive to the choice of NCOLUMNS, at least in MIROC6 (not shown). The model time step was 12 min, and COSP was called every 3 h. The COSP simulator was operated for one full year after a 1-year spin-up. Simulations were conducted under climatological sea-surface temperature and sea ice, present-day aerosol emissions, and greenhouse gases with monthly mean annual cycles. A benchmark test indicated that the inline warm rain diagnostic tool increases the computational cost by only about 0.8 % when using the SX-ACE supercomputer system of the National Institute for Environmental Studies, Japan.
As a reference, we also calculated the target metrics (i.e., the occurrence frequency of SLWCs and CFODDs) using CloudSat and MODIS satellite data products (e.g., Stephens et al., 2008; the data are available at http://www.cloudsat.cira.colostate.edu, last access: 8 October 2019) for the period June 2006-April 2011. The visible cloud optical depth and 2.1 µm cloud droplet effective radius were derived from the MODIS level 2B-TAU R04 product (Polonsky, 2008), the radar reflectivity profile was obtained from the CloudSat-derived level 2B-GEOPROF R04 product (Mace et al., 2007; Marchand et al., 2008), and the pressure and temperature profiles were derived from the ECMWF-AUX R04 product (Partain, 2007). Detailed descriptions of the model configuration and the analysis procedure to detect SLWCs are provided elsewhere (Michibata and Takemura, 2015; Michibata et al., 2016).
It should be noted that although only the stratiform subcolumns were analyzed in the model (defined as fracout = 1 in COSP; see also Fig. 1), the A-Train analysis includes both convective and stratiform clouds. Strictly speaking, the model-observation comparisons are in this regard not equivalent. However, given that the sampling criteria of SLWCs exclude deep convective clouds, the inconsistency in cloud type between the model and observations is minimized.

Occurrence frequency of warm clouds
Figure 2 shows the geographical distributions of the fractional occurrences of SLWCs for non-precipitating, drizzling, and precipitating regimes obtained from the MIROC6 simulation and A-Train satellite observations. Note that although the reference A-Train statistics are shown at 1.5° × 1.5° resolution, which is close to that of MIROC6-SPRINTARS, the statistics are constructed from the native CloudSat resolution (1.4 × 2.5 km) and subcolumns in the host model prepared by COSP (kilometer scale) to achieve a scale-aware model-satellite comparison.
We obtained 74.6 million SLWCs from the model and 7.8 million SLWCs from observations. The model generated more SLWCs than were present in the A-Train observations. This suggests that one full year of simulation with 3-hourly diagnosis is long enough to provide sufficient samples, but note that this does not negate the possibility of the generation of SLWCs being too frequent in the model. In the A-Train satellite retrievals, many SLWCs are located over the typical stratocumulus (Sc) regions off the west coasts of California, Peru, Australia, Namibia, and the Canary Islands (not shown), where the non-precipitating regime is dominant (Fig. 2d). MIROC6 places 48.5 % of SLWCs in the drizzling regime, versus 33.3 % in the A-Train retrievals (Fig. 2b and e). For the precipitating regime, although the global mean values of occurrence frequency are consistent with each other (15.9 % in MIROC6 and 17.4 % in A-Train), the geographical pattern is quite different, particularly over tropical oceans and continents (Fig. 2c and f), implying that the model has biases in the warm rain formation process (e.g., Jing et al., 2019) and/or the representation of cloud types (e.g., Huang et al., 2015).
These biases in MIROC6 can be interpreted in the context of the aerosol-cloud interactions parameterized in the model. In bulk microphysics models, the onset of rain is represented by the so-called autoconversion scheme, which is generally expressed as (e.g., Berry, 1968; Beheng, 1994; Khairoutdinov and Kogan, 2000)

$$ \left( \frac{\partial q_r}{\partial t} \right)_{\mathrm{aut}} = C_{\mathrm{aut}} \, q_c^{\alpha} \, N_c^{-\beta}, \qquad (3) $$

where $q_c$ and $q_r$ are the liquid cloud water and rainwater mixing ratios, respectively; $N_c$ is the cloud droplet number concentration; and $C_{\mathrm{aut}}$, $\alpha$, and $\beta$ are the prescribed (uncertain) constants. This formulation describes how the model forms rain in terms of uncertain parameters. Given that the CloudSat cloud profiling radar is sensitive to both cloud droplets and raindrops (Stephens and Haynes, 2007; Haynes et al., 2009), model-satellite comparisons (Fig. 2) offer useful evaluations of cloud-to-rain transition processes represented by Eq. (3), as also proposed by Kay et al. (2018).
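The role of the exponent on the droplet number can be seen in a minimal sketch of a power-law autoconversion rate of the form $C_{\mathrm{aut}} q_c^{\alpha} N_c^{-\beta}$. The default constants below are those of the Khairoutdinov and Kogan (2000) scheme ($q_c$ in kg kg$^{-1}$, $N_c$ in cm$^{-3}$) and are used only as an example; they are not the values used in MIROC6, which applies Berry (1968) microphysics. The function name is hypothetical.

```python
def autoconversion_rate(q_c, n_c, c_aut=1350.0, alpha=2.47, beta=1.79):
    """Power-law autoconversion rate C_aut * q_c**alpha * n_c**(-beta).

    q_c : cloud water mixing ratio (kg/kg)
    n_c : cloud droplet number concentration (cm^-3)

    Defaults follow Khairoutdinov and Kogan (2000) and are illustrative only.
    Returns the rain water tendency in kg/kg/s for those units.
    """
    return c_aut * q_c ** alpha * n_c ** (-beta)


if __name__ == "__main__":
    clean = autoconversion_rate(5e-4, 50.0)      # fewer, larger droplets
    polluted = autoconversion_rate(5e-4, 100.0)  # same water, twice the droplets
    # The ratio depends only on beta: doubling n_c scales the rate by 2**(-beta).
    print(polluted / clean)
```

Because the ratio for a given change in droplet number depends only on $\beta$, this parameter directly controls how strongly aerosol loading suppresses rain onset in such a scheme.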
Here, we demonstrate that CFODDs deduced from satellite observations illustrate systematic transitions from non-precipitating through drizzling to precipitating regimes as a function of $R_e$ and are consistent with previous observational findings showing the strong dependence of the onset of precipitation upon $R_e$ (Lebsock et al., 2008; Rosenfeld et al., 2012). On the other hand, MIROC6 simulates higher radar reflectivity even in the smallest $R_e$ category, revealing that rain formation is too early and too frequent (Suzuki et al., 2015). We attribute this discrepancy between the model and observations primarily to the following two factors: the bias in the updraft velocity (Nakajima et al., 2010; Takahashi et al., 2017a) at the subgrid scale and the uncertainty associated with the dependence of rain formation on aerosols (Wood, 2005; Suzuki et al., 2013) as characterized by $\beta$ in Eq. (3). To evaluate this regime dependence of aerosol-cloud interactions (Sorooshian et al., 2009; Michibata et al., 2016; Chen et al., 2016, 2018), it is useful to investigate the differences in CFODDs from various environmental regimes (e.g., updraft and aerosol loading).
Thus, we defined 13 regions (Fig. 4) to examine the detailed aerosol-cloud interactions. This regional classification is based on previous warm rain studies with various research aims (e.g., Leon et al., 2008; Terai et al., 2015) and is summarized in Table 1. Statistics can also be examined separately over land and ocean (not shown) to investigate the differences in the CFODD transition in dynamic regimes (e.g., Takahashi et al., 2017b). Alternatively, users can define specific regions to suit their research purposes.
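A user-defined regional analysis of this kind amounts to summing the gridded sample counts inside a latitude-longitude box before normalizing. The following is a minimal sketch and not the post-processing package: the function name, argument layout, and the bounds used in any call are hypothetical and do not reproduce the regions of Table 1.

```python
import numpy as np


def regional_counts(counts, lat, lon, lat_bounds, lon_bounds):
    """Sum gridded sample counts over a latitude-longitude box.

    counts     : array shaped (nlat, nlon, ...) of accumulated sample counts
    lat, lon   : 1-D grid coordinates in degrees (lon is wrapped to [0, 360))
    lat_bounds : (south, north) in degrees
    lon_bounds : (west, east) in degrees east within [0, 360); west > east
                 means the box crosses the Greenwich meridian
    """
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float) % 360.0
    lat_mask = (lat >= lat_bounds[0]) & (lat <= lat_bounds[1])
    west, east = lon_bounds
    if west <= east:
        lon_mask = (lon >= west) & (lon <= east)
    else:
        lon_mask = (lon >= west) | (lon <= east)
    # Select rows (latitudes) first, then columns (longitudes).
    subset = np.asarray(counts)[lat_mask][:, lon_mask]
    return subset.sum(axis=(0, 1))
```

Any trailing dimensions of the input (for example ICOD and reflectivity bins) are preserved, so the regional sum can be normalized into a PDF in the same way as the global one.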
Figure 4 shows results from a regional CFODD analysis over five regions: Eastern Asia, Tropical Warm Pool, Equatorial Cold Tongue, North Atlantic, and Australia. CFODDs for the smallest $R_e$ range ($5 < R_e < 12$ µm) are shown. This regional analysis reveals that the model does not show the too-early and/or too-frequent warm rain bias in every region. For example, the CFODDs over the Eastern Asian, Australian, and Equatorial Cold Tongue regions simulated by MIROC6 are in good agreement with those derived from the A-Train observations. The model accurately captures the non-precipitating regime in the smaller $R_e$ categories, suggesting that the model partially captures slower cloud-to-rain conversions in aerosol-abundant environments (Eastern Asia) and under calm stable conditions (Australia and Equatorial Cold Tongue). These results emphasize the importance of understanding the link between microphysics and dynamics (Chen et al., 2014; Zhang et al., 2016) if we wish to develop a more reliable representation of aerosol-cloud-precipitation interactions, but this is beyond the scope of this technical paper.
As discussed above, CFODDs provide valuable information on cloud-to-rain microphysical transitions associated with aerosol-cloud interactions and microphysics-dynamics interactions. Our new warm rain diagnostic tool will assist in process-oriented model evaluations with the synergistic use of A-Train multi-satellite observations.

Summary
This technical paper describes a new warm rain diagnostic tool implemented in the COSP2 satellite simulator package that extends its process-oriented diagnostic capabilities (Michibata et al., 2019b). We have introduced two new diagnostics: (1) the occurrence frequencies of non-precipitating clouds ($Z_{\max} < -15$ dBZe), drizzling clouds ($-15$ dBZe $< Z_{\max} < 0$ dBZe), and precipitating clouds ($0$ dBZe $< Z_{\max}$) and (2) the PDFs of radar reflectivity profiles normalized by ICOD, the so-called contoured frequency by optical depth diagram (CFODD). These diagnostics make synergistic use of the CloudSat and MODIS simulators (Michibata et al., 2019c).
The diagnostic tool is controlled by the logical flags "Lwr_occfreq" and "Lcfodd" in the namelist for COSP outputs. Users are now not required to output subcolumn parameters, such as the radar or lidar signals from the simulators of active sensors, which significantly increases the efficiency of model evaluation. Adding the inline warm rain diagnostics into COSP increases the computational cost only slightly (by around 0.8 %) when using the SX-ACE supercomputer system of the National Institute for Environmental Studies, Japan.
The inline warm rain diagnostic tool is intended to facilitate model evaluations that are efficient enough to be conducted within the model development loop, specifically by providing both "performance constraints" and "process-level fingerprints" (Fig. 1). The diagnostic tool has been designed to reveal potential uncertainties in modeled warm rain processes in GCMs more effectively and simply. The multiplatform products can also be extended to include other diagnostics for mixed-phase and ice clouds (e.g., Mülmenstädt et al., 2015; Kikuchi et al., 2017) in future work. Requests for specific diagnostics, particularly those requiring COSP subcolumn output for fast-process evaluations, are welcome.

Figure 1. Schematic flowchart of COSP2 (see also Swales et al., 2018, for details) and additional processes for warm rain diagnostics introduced in this work.

Figure 2. Geographical maps of the fractional occurrence of (a, d) non-precipitating clouds ($Z_{\max} < -15$ dBZe), (b, e) drizzling clouds ($-15$ dBZe $< Z_{\max} < 0$ dBZe), and (c, f) precipitating clouds ($0$ dBZe $< Z_{\max}$) obtained from (a-c) the MIROC6-COSP2 simulation of one full year and (d-f) the A-Train satellite observations for the period June 2006-April 2011. Global means of the occurrence frequency are shown at the top right of each panel.

Figure 4. Definition of the 13 regions used in the post-process package. An example of the regional CFODD analysis over the (red) Eastern Asian, (purple) Tropical Warm Pool, (yellow) Australian, (green) North Atlantic, and (orange) Equatorial Cold Tongue regions obtained from the MIROC6-COSP2 and the A-Train observations for the $R_e$ range $5 < R_e < 12$ µm. The color scale is the same as in Fig. 3.

Table 1. Definition of the 13 regions used in the CFODD regional analysis, corresponding to the boxes in Fig. 4.

and the Collaborative Research Program of the Research Institute for Applied Mechanics, Kyushu University. Kentaroh Suzuki was supported by the NOAA Climate Program Office's Modeling, Analysis, Predictions, and Projections program with grant number NA15OAR4310153. The authors are grateful to Johannes Mülmenstädt (Universität Leipzig) and two anonymous reviewers for providing constructive suggestions and comments that helped to improve the paper.

Financial support. This research has been supported by the Japan Society for the Promotion of Science (grant nos. JP18J00301, JP19K14795, and JP19H05669), the Ministry of Education, Culture, Sports, Science and Technology (Integrated Research Program for Advancing Climate Models (TOUGOU program)), Kyushu University (Collaborative Research Program of the Research Institute for Applied Mechanics), and the NOAA Climate Program Office (grant no. NA15OAR4310153).

Review statement. This paper was edited by Axel Lauer and reviewed by Johannes Mülmenstädt and two anonymous referees.