Evaluation of ice and snow content in the global numerical weather prediction model GME with CloudSat

Introduction Conclusions References


Introduction
Ice clouds have a large impact on the Earth's climate system due to their effects on the global radiation budget.A good description of ice clouds is therefore a major challenge for both climate and numerical weather prediction (NWP) models.The CloudSat Cloud Profiling Radar (CPR) (Stephens et al., 2002) offers the opportunity to vertically resolve even thick ice clouds from space -in contrast to the numerous passive satellite-based sensors.Due to its high resolution and the near-global coverage (compared to ground-based radars) it is predestined for the evaluation of global models because it is able to penetrate clouds and to assess the occurrence of multilevel clouds (Mace et al., 2009).CloudSat also has its limitations since it does not determine ice water content (IWC) directly and the observed radar reflectivity factor is weighted towards larger particles.Though smaller ice particles can be detected well by the Cloud Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) which flies in formation with CloudSat (Winker et al., 2007), the use of these data is deliberately avoided because of the limitation to optically thin clouds and complications relating the observed backscatter coefficient to IWC.In particular, new aircraft measurements (McFarquhar et al., 2007) indicate that S. Reitter et al.: Evaluation of ice and snow content past measurements overestimate small ice particle concentrations by a factor of 2, resulting in questionable particle size distributions on which all observations, the radar-lidar retrieval included, are based.
As shown by Waliser et al. (2009), climate models today vary in annual mean ice water path (IWP) by up to two orders of magnitude and most climate models underestimate IWP in comparison to CloudSat.A likely explanation for the underestimation is that while CloudSat can not distinguish between cloud ice and snow, precipitating ice is diagnostic in many climate models and therefore does not contribute to IWP.This is typically not the case for high resolution ( x < 10 km) models (e.g., Inoue et al., 2010), that treat snow as a prognostic variable, meaning it can interact with cloudy and dry environment during sedimentation.The comparison between model cloud ice and snow with retrieved total ice (observation-to-model approach) still has limitations due to assumptions in the retrieval, e.g., particle size distribution.For this reason the model-to-observation approach can be used to perform model evaluations in the observation space.With this approach it is also possible to take the detection limit of CloudSat into account, as pointed out by Marchand et al. (2009), who use the QuickBeam simulator to investigate global hydrometeor occurrence as represented by the Multiscale Modeling Framework.Validations of operational global NWP models with CloudSat are rare; Bodas-Salcedo et al. (2008), for example, evaluate the Met Office Unified Model (MetUM) global forecast model at 40 km horizontal resolution using a radar operator.As a result, they identify an inconsistency in the parameterization of ice cloud fraction.
In order to support ongoing model development, the present study aims at evaluating the global NWP model of the German Weather Service (Deutscher Wetterdienst, DWD) GME (Majewski et al., 2002), with special focus on the performance of a diagnostic versus a prognostic precipitation scheme.The representation of grid-scale ice in the two model versions is evaluated with CloudSat CPR data and the individual contributions of cloud ice and snow to the total frozen phase are analysed.The overall goal is to develop a technique with which continuous model evaluation is enabled.For this, the two possible approachesobservation-to-model and model-to-observation -are undertaken and compared.
The structure of the paper is as follows: the GME model and its parameterizations are described in detail in Sect.2, followed by an overview on the CloudSat data in Sect.3. In Sect.4, the methodology applied in this paper is introduced, and the results are discussed in Sect. 5. Finally, summary and conclusions are presented in Sect.6.

GME model
The global NWP model of DWD, GME (Majewski et al., 2002), is a hydrostatic model of the atmosphere.The primitive equations are solved using a finite-difference method on a hexagonal icosahedral A-grid.The model has a horizontal resolution of 40 km and 40 hybrid levels in the vertical.Level thickness ranges approximately from 20 m at the Earth's surface, 500 m in 5 km height, to 750 m in 10 km height.Forecasts are available in hourly resolution, starting at 00:00, 06:00, 12:00, and 18:00 UTC.Operationally, the model is initialized using a three-dimensional variational data assimilation system.Four hydrometeor classes are implemented in the model: Cloud ice and water, snow, and rain.They are available as grid-scale parameters.In the operational version of the GME, cloud ice and cloud water are prognostic variables, whereas snow and rain are diagnostic variables.
GME, being the first part of the model chain at DWD, delivers the boundary conditions for the regional scale models COSMO-EU ( x = 7 km) and COSMO-DE ( x = 2.8 km).Therefore, efforts are in progress to adjust the parameterizations of GME to those of COSMO-EU.As an experiment (GME1007), a prognostic precipitation scheme, but still without the advection of precipitation, was implemented and run for a four-month period from 1 July 2009 to 31 October 2009.This new prognostic scheme follows Rutledge and Hobbs (1983), Lin et al. (1983), and Doms and Schättler (2004).It applies a non-equilibrium treatment of the depositional growth of cloud ice and snow (i.e., allows supersaturation with respect to ice).For cloud ice a monodisperse size distribution is assumed for hexagonal plates with a massdiameter relation of m = 130D 3 (with D in m, m in kg) and ice nucleation is parameterized by a simple temperaturedependent diagnostic relation.Snowflakes are aggregates with a mass-diameter relation of m = 0.069D 2 and a terminal fall velocity of v = 15 D 0.5 (with v in ms −1 ).Based on measurements by Field et al. (2005), a parameterization of the intercept parameter N 0,s of the exponential snow size distribution f (D) = N 0,s exp(−λD) is used, with slope parameter λ.The intercept parameter N 0,s is proportional to the number concentration of snow flakes and is described as a function of temperature T and snow mixing ratio q s : The functions a(3,T ) and b(3,T ) are given in Table 2 of Field et al. (2005).For the autoconversion of cloud ice and the aggregation of cloud ice by snow a temperature dependent sticking efficiency is assumed similar to Lin et al. (1983): with T 0 = 273.15K.The warm rain part of the scheme applies the autoconversion/accretion scheme of Seifert and Beheng (2001) with a constant cloud droplet number concentration of N c = 5 × 10 8 m −3 and a constant shape parameter ν = 2.For details on the scheme see Doms and Schättler (2004).
As a first step, routine precipitation verification at the midlatitudes, where most precipitation is generated via the ice phase, is undertaken.The frequency bias (FBI, Eq. 3), the ratio of the frequency of forecasted to the frequency of observed events, indicates whether the model over-or underforecasts precipitation events.The equitable threat score (ETS, Eq. 4), the fraction of hits adjusted for hits expected by chance, indicates how well forecasted events correspond to observed events.Figure 1 shows an improvement in FBI (Eq. 3) for GME1007 relative to GME.For lower thresholds GME1007 is very close to the results of the regional COSMO-EU, which shares the same microphysical scheme.Also in terms of ETS (Eq.3), GME1007 shows a clear improvement compared to GME for precipitation events up to 5 mm in 24 h and almost reaches the skill of COSMO-EU.Whether this improvement in terms of surface precipitation is connected with improved representation of grid-scale ice is investigated in this study.
with hits a, false alarms b, misses c, correct negatives d, and hits expected by chance a r = (a + b)

CloudSat CPR observations
As a part of the polar-orbiting, sun-synchronous A-Train (Stephens et al., 2002), CloudSat (in operational mode since June 2006) has an orbiting time of 1.5 h, with a constant local solar time overpass at a given latitude band.The payload of CloudSat -the CPR -is a nadir-looking 94 GHz (3.2 mm) radar measuring the backscattering signal of the Earth's surface and of particles in the atmospheric column as a function of distance.The backscattering signal is calibrated to give the equivalent radar reflectivity factor using the dielectric factor for liquid water and assuming Rayleigh scattering.
The equivalent radar reflectivity factor z e is then converted to the radar reflectivity factor z to account for solid phase.The CloudSat CPR features a detection limit of −27 dBz with a dynamic range up to +29 dBz.
Due to the motion of the CloudSat CPR relative to the Earth's surface, its footprint is approximately 1.4 km (acrosstrack) × 1.8 km (along track) (Tanelli et al., 2008).The data are averaged every 0.16 s along track which corresponds to a horizontal resolution of approximately 1.1 km.Vertically, the CPR's pulses sample a volume of 480 m and the data are digitized into 125 bins, each of approximately 240 m height.
The determination of ice water content (IWC) from z e is not trivial as it depends on hydrometeor size, shape, and density distribution, with the largest particles being dominant.
In the present study data from the version 5.1 IWC retrieval (contained in release R04 of the level 2B products) are utilized, which is based on the optimal estimation approach by Rodgers (1976) and assumes a lognormal size distribution.The a priori profiles, dependent on temperature (provided by European Centre for Medium-Range Weather Forecasts, ECMWF) and reflectivity factor, are used as initial values and help constrain the retrieval.The minimum detectable IWC is estimated to be approximately 0.001 g m −3 .Since the radar is not able to determine the cloud phase in a radar  profile, both a liquid and an ice retrieval are run, assuming liquid-only and ice-only conditions.Finally, the two profiles are combined, with a linear scaling between −20 and 0 • C. For details on the retrieval see Austin et al. (2009).
The quality of the radar reflectivity factor measured by the CloudSat CPR and the retrieved IWC has been comprehensively validated by several studies (e.g., Protat et al., 2009, Austin et al., 2009).The radar reflectivity factor performs well in comparison to ground-based measurements, with the weighted mean difference ranging from −0.35 dBz to +0.5 dBz for a ±1 h time lag around the overpass (Protat et al., 2009).Austin et al. (2009) find IWCs above 1 g m −3 not to be trustworthy, however, suitable reference data for IWC validation are still lacking.
Since the CloudSat CPR is not able to distinguish between snow and cloud ice, we apply the following naming convention from now on: ice water content (IWC, referring to the sum of cloud ice and snow), snow water content (SWC), cloud ice water content (CIWC), and analogously for the vertically integrated quantities IWP, SWP, and CIWP.

Matching
Model output sampling is essential when comparing model with satellite data.Temporally, for each CloudSat orbit the model output (of the 00:00 UTC run) closest to the mean time of the CloudSat orbit is chosen.Thus, forecast age varies between 1 and 24 h.Since model resolution is hourly and the duration of a CloudSat orbit is approximately 1.5 h, the maximum time mismatch between model and satellite profile is 1.25 h.To match the spatial domain of model and observation, the GME data are horizontally interpolated onto the CloudSat orbital track with the nearest neighbour technique.Due to the coarser resolution of the model, this means that one model profile is assigned to several adjacent CloudSat profiles.Vertically, as an intermediate choice, both data sets are linearly interpolated onto regular bins with 500 m height each.IWC, SWC, and CIWC are vertically redistributed onto the new bins, with regard to the conservation of the respective water path.Additionally, an along-track 37-profile moving average is applied to all CloudSat data to take the coarser horizontal model resolution into consideration.The original horizontal resolution of the CloudSat data is maintained, but by applying the running mean clouds in the observations become broader and in-cloud reflectivity maxima are attenuated.
After this pre-processing, in order to account for instrument and retrieval algorithm sensitivities, only data (from model and observation) which are firstly within the CloudSat CPR sensitivity range and secondly deemed trustworthy are included in the investigations, i.e., −26 dBz < Z < +29 dBz (no reflectivity factors below −26 dBz due to increased influence of noise) for CloudSat observations and QuickBeam simulations and 0.001 g m −3 < IWC < 1 g m −3 (cf.Sect.3) for CloudSat retrieval and model output.

Evaluation approaches
Since the radar reflectivity factor is not a direct model parameter, two principal approaches are undertaken to validate the GME: observation-to-model and model-to-observation.For the first approach the version 5.1 IWC retrieval (cf.Sect.3) is utilized.For the second approach the radar simulator Quick-Beam v1.1a developed by Haynes et al. (2007) is applied, into which a temperature-dependent exponential hydrometeor distribution shape is implemented to match the snow distribution of GME1007.This version of QuickBeam neither accounts for multiple scattering effects, nor does it simulate the bright band.However, gaseous and hydrometeor attenuation is accounted for and both Rayleigh and Mie scattering are simulated.The difference in the Z-IWC relationships resulting from the two approaches becomes clear in Fig. 2. As a reference, the commonly used Z-IWC relationship from Hogan et al. (2006) is included for two temperatures.The slope of the two approaches match well.However, the model relationship is expectedly tighter than the observational relationship, because it contains no noise.With increasing temperature (Fig. 2, bottom row) the Z-IWC relationship is shifted towards higher reflectivity factors.Note that while the Z-IWC relationships from both approaches aggree in their general behaviour, differences exist in particular concerning large IWCs and cold temperatures.Note also that the Z-IWC relationship from (Hogan et al., 2006) is based on measurements in the mid-latitudes, whereas the Z-IWC relationships from CloudSat and GME1007 in Fig. 2 are based on nearglobal data.
The two approaches have advantages and disadvantages and are therefore both applied in the present study.The observation-to-model approach has the advantage of its easy computation, and the actual model parameters are compared.However, the retrieval can introduce additional uncertainties; three parameters are retrieved out of one measurement and several assumptions (e.g., phase discrimination, size distributions, a priori profiles) are included (cf.Sect. 3 and Austin et al., 2009).The linear scaling with temperature between liquid and frozen solution, for example, may lead to a false estimation of IWC.The model-to-observation approach to some extent avoids the problem of retrieval uncertainties and is closer to the actual physics by simulating the reflectivity factor the radar would have measured in the presence of a given amount of hydrometeors.However, ice crystals are modelled as soft spheres (Haynes et al., 2007) which Liu (2004) finds to be a questionable approximation for the actual particle habit, in this case of the model.

Criteria
Since the model is not capable of assessing all regimes equally well, an effort is made to improve the comparability of model and observations by excluding those regimes, which the model can not sufficiently assess.Four criteria, based on model and/or observational parameters, are applied as a filter to both data sets, model and observations.If a threshold is not met, both model and observations of a matching pixel are discarded.
The four criteria are: (1) only temperatures lower than −10 • C to avoid liquid and most mixed phase, (2) top of convection below 1 km height to reduce subgrid and mixed phase effects, (3) cloud cover larger than 50 % to ensure homogeneous conditions, and (4) total column attenuation not larger than 3 dBz to avoid large particles and the large attenuation associated with these.
The last criterion is only applicable in the model-toobservation approach, which offers a better control.Criteria (1) and ( 4) are diagnosed from both observations and model and applied to the respective data set.Criteria ( 2) and (3), though diagnosed from model output, are assumed to be true for the observations.This is feasible for a NWP model analysis, because with a forecast age of less than or equal to 24 h (cf.Sect.4.1), the model is -in most cases -able to predict the large-scale environment in a deterministic sense.For example, the model is able to predict the large-scale occurrence of deep convection well on this time scale (possibly overestimating the convective area), though its ability to predict the related IWC correctly may be poor.Note that this conditional sampling approach, though applicable in the evaluation of a short-range NWP model, can not be applied for a climate model.
Depending on the investigated parameter, these four criteria reduce the number of included pixels to approximately 20-25 %.Especially concerning the warmer temperature regime, these criteria improve the comparability of model and satellite data distinctly.

Results
In a first evaluation step global frequency distributions are investigated (Fig. 3).For both CloudSat (Fig. 3a) and GME1007 (Fig. 3b) the occurring IWCs cover the full range of values up to the upper sensitivity threshold of the Cloud-Sat CPR.Contrary, the largest IWCs for GME (Fig. 3c) are merely 0.06 g m −3 (−1.2 in log 10 (IWC)).This is primarily due to the missing snow which -being diagnosedfalls out instantaneously after generation.Note, the diagnostic scheme of GME does assume an equilibrium precipitation profile, enabling the estimation of SWC for that profile.Though this might be the most consistent evaluation of the diagnostic scheme, this route is not pursued because the hydrological cycle of the model can not store mass in this profile.When considering GME1007 CIWC (Fig. 3d) a similiar shape of the frequency distribution as for GME IWC (Fig. 3c), but with a shift towards larger values, is notable.In general, GME1007 (Fig. 3b) captures the enhanced occurrence of smaller IWCs with decreasing temperatures which CloudSat (Fig. 3a) features well.However, the observationto-model approach also reveals a distinct difference between CloudSat and GME1007: the GME1007 maximum of the frequency of occurrence reaches up to lower temperature regimes than for CloudSat, most likely because the CloudSat retrieval does not produce a clear IWC maximum from the measured reflectivity factors in these heights, but rather produces a broad range of IWC values.
The model-to-observation approach (Fig. 3e and f), too, shows how well GME1007 reproduces the frequency distribution of CloudSat.Here, another difference between CloudSat and GME1007 is revealed: The frequency distribution is more narrow for GME1007 than for CloudSat; it spans a smaller reflectivity factor range at a given temperature level, indicating a tighter temperature-reflectivity factor relationship (and therewith tighter temperature-IWC relationship) in the model parameterizations.Also, the slope of the maximum is steeper for GME1007.Contrary to GME1007, GME hardly shows any reflectivity factors above −26 dBz (not shown).
In a next step, analyses are refined to resolve meridional variation in IWP (Fig. 4).At this point it is appropriate to demonstrate the individual influence of the applied criteria on IWP.The temperature criterion (1) alone (Fig. 4c) slightly reduces the IWP of all data sets at all latitudes, but does not change the general meridional variation in comparison to without any criteria (Fig. 4a).The convection criterion (2) alone (Fig. 4d) reduces IWP distinctly in the tropics, underlining the importance of convectively induced IWC in this region, but IWP is also reduced in the mid-latitudes.The cloud cover criterion (3) alone (Fig. 4e) appears to affect only the tropics; IWP in mid-latitudinal and polar regions remains overall the same.This emphasizes the fact that IWC in the tropics is largely connected to small scale events, which the microphysical scheme is not able to capture due to the model's resolution; subgrid-scale processes are not represented in the hydrometeor output.When applying all criteria (Fig. 4b), GME1007 realizes the zonally averaged IWP pattern of CloudSat rather well.CIWP in GME1007 is small in comparison to its IWP, underlining again the importance of SWP as a contribution to IWP, yet it remains distinctly larger than GME IWP, as shown above.However, GME1007 consequently overestimates IWP and considerably overestimates mid-latitudinal IWP by a factor of 4. This strong overestimation is not discernible in the frequency distributions shown above, because they are normalized for each data set separately to the number of included pixels.Checks with mass distributions instead of frequency distributions (not shown) confirm the overestimation of IWP in GME1007 as revealed by Fig. 4b.
The zonally averaged IWC (Fig. 5) shows that the meridional position of the IWC peaks of CloudSat is captured well by GME1007 (Fig. 5b), though these peaks are positioned at lower heights in GME1007 than in CloudSat (Fig. 5a).GME IWC and GME1007 CWIC are positioned at exactly the same heights, but the peaks are larger in GME1007 than in GME, which fits to the global frequency distributions in Fig. 3.
Further refinement -separation into three temperature regimes for three zonal regions -is applied to specify the differences in zonally averaged IWP between GME1007 and CloudSat.Contrary to the frequency distributions above, the histograms in Fig. 6 do reflect the above mentioned over-/underestimation of IWC, because they are normalized to the total number of pixels, whether cloudy or not.GME   consequently underestimates the higher IWC values, as discussed above.In general, GME1007 reproduces the shape of the distribution of CloudSat better than GME, especially in the mid-latitudes and polar regions.Here, the peak of maximum frequency of occurrence is located at roughly the same IWC, shifted by 2 bins (i.e., 0.02 g m −3 ) at the most.Yet, the peak is highly overestimated; in the warmest temperature regime by a factor of 3 in the tropics, by a factor of 1.5 in the mid-latitudes, and by a factor of 2 in the polar regions.With decreasing temperature the overestimation increases.This points to an overlong residence time of snow in the air, i.e., an underestimation of the fall speed of snow, leading to the overestimation of zonally averaged IWC and IWP seen above (Fig 4).As for the upper IWC range, this is not reproduced (or underrepresented) in the tropics in GME1007, partly compensating the overestimation of IWP in this region (Fig. 4).This might be attributed to the fact that deep convective events which produce the largest particles in the tropics are not resolved by the model and are eliminated by the criteria.Finally, small IWCs are underrepresented in GME1007 in comparison to CloudSat, which might be due to several reasons, e.g., a too fast depositional growth or the missing homogeneous nucleation of aerosols.
These features are robust, also in reflectivity factor (Fig. 7).Additionally, two further features are discernible.First, with decreasing temperature, the peak of maximum frequency of occurrence of GME1007 shifts more and more to higher reflectivity factors than for CloudSat.Second, the frequency distribution is more narrow for GME1007 than for CloudSat.These findings agree with the steeper and more narrow global frequency distribution for GME1007 seen above in Fig. 3.As in Fig. 3, GME produces small reflectivity factors which are outside the displayed range and therewith outside the detection limit of CloudSat.The same applies for GME1007 reflectivity factors calculated from CIWC only.
In order to test the hypothesis of a too small fall speed of snow being responsible for the IWC/IWP overestimation, a sensitivity study is conducted (Fig. 8): The same configuration as GME1007 is run as control simulation Exp1.Exp2 takes into account the density correction of the fall speed of precipitating hydrometeors, and Exp3 additionally applies an increased and more realistic fall speed of snow, compared to a reference fall speed based on Khvorostyanov and Curry (2005), with v = 25D 0.5 .For each experiment a 30-day simulation is performed, and only the last 25 days are analysed to exclude effects of model spin-up.As expected, the faster falling snow leads to a reduction of SWC (Fig. 8) while largescale surface precipitation is only marginally affected (not shown).Globally averaged, this amounts to a reduction of mean SWP from 81 g m −2 to 62 g m −2 for Exp2 and a further reduction to 40 g m −2 for Exp3.CIWC and CIWP, respectively, increase slightly with increased snow fall speed.Therefore, the unrealistically small fall speed of snow in GME1007 can explain most of the positive bias in IWC and IWP, respectively, which is found compared to CloudSat.
To explain the remaining IWC bias, we note that a further increase of snow fall speeds might occur in regions of heavy riming or graupel formation, however, both is currently not taken into account for grid-scale clouds in GME.Furthermore, other model errors than cloud microphysics might also contribute to the remaining unexplained IWC bias.

Summary and conclusions
This study evaluates the global NWP model GME with respect to frozen particles, and in doing so focuses on the performance of a prognostic versus a diagnostic precipitation scheme.As a reference, CloudSat CPR observations are utilized, which offer the so far unique opportunity of vertically resolving clouds at a near-global scale.
The prognostic scheme is found to capture the shape and magnitude of the CloudSat CPR frequency distributions of IWC and reflectivity factor well.In contrast, the diagnostic scheme considerably underestimates the larger IWC and reflectivity factor values, a result of the fact that snow falls out instantaneously.As a consequence of the improved overall performance, the prognostic scheme presented here went operational on 2 February 2010.
Furthermore, the height-resolving CloudSat CPR enables the continuous assessment of processes within clouds.It is shown that the prognostic scheme still requires improvements, especially concerning the overestimation of IWP.One source of error, the too small fall speed of snow, is identified: With the introduction of a -currently neglected -densitydependency the fall speed increases with height, thereby reducing IWP.Due to this further improvement in performance, the microphysical choices of Exp2 went operational on 1 December 2010.
The presented multi-parameter validation enables the comparison of the two approaches: The general features are robust and captured by both approaches.However, details are captured by merely one or the other approach, in which case both approaches together deliver the largest informational content.Having to decide for one approach only, the choice would be the model-to-observation approach, since its uncertainties are easier to assess and it ensures a better control over the comparison, notably through the attenuation criterion.The developed criteria successfully filter out situations the model is not able to capture (e. g., subgrid-scale processes contributing to IWC) and thereby improve the comparability between model and observations distinctly.
Finally, the present evaluation shows that snow is the dominant contributor to IWC and IWP.This finding agrees well with the aircraft measurements of Field et al. (2005), which reveal that snow (aggregates) contributes up to 90 % to IWC in frontal clouds.This might help to explain why most climate models, which do not resolve snow and rain explicitly, tend to underestimate IWC (Waliser et al., 2009).

Fig. 1 .
Fig. 1.Frequency bias (left) and equitable threat score (right) of 24 h accumulated precipitation.Model results are verified against a gridded precipitation data set for Germany based on more than 600 rain gauges from 1 July 2009 to 31 October 2009.