A radar reflectivity operator with ice-phase hydrometeors for variational data assimilation (version 1.0) and its evaluation with real radar data

A reflectivity forward operator and its associated tangent linear and adjoint operators (together named RadarVar) were developed for variational data assimilation (DA). RadarVar can analyze both rainwater and ice-phase species (snow and graupel) by directly assimilating radar reflectivity observations. The results of three-dimensional variational (3D-Var) DA experiments with a 3 km grid mesh setting of the Weather Research and Forecasting (WRF) model showed that RadarVar was effective at producing an analysis of reflectivity pattern and intensity similar to the observed data. Two to three outer loops with 50–100 iterations in each loop were needed to obtain a converged 3-D analysis of reflectivity, rainwater, snow, and graupel, including the melting layers with mixed-phase hydrometeors. It is shown that the deficiencies in the analysis using this operator, caused by the poor quality of the background fields and the use of the static background error covariance, can be partially resolved by using radar-retrieved hydrometeors in a preprocessing step and tuning the spatial correlation length scales of the background errors. The direct radar reflectivity assimilation using RadarVar also improved the short-term (2–5 h) precipitation forecasts compared to those of the experiment without DA.


Introduction
Over the past several decades, radar reflectivity observations have been used in many data assimilation (DA) studies (Borderies et al., 2018;Caumont et al., 2010;Gao and Stensrud, 2012;Hu et al., 2006;Jung et al., 2010Jung et al., , 2008a;;Liu et al., 2019;Putnam et al., 2014;Snook et al., 2012Snook et al., , 2015;;Sun and Crook, 1997;Sun and Wang, 2013;Tong and Xue, 2005;Wang et al., 2013b;Wang and Wang, 2017;Wattrelot et al., 2014;Xiao et al., 2007;Xue et al., 2006) and they have demonstrated that assimilating this radar reflectivity improves the initial conditions of the convective scale and benefits the subsequent forecasts.To assimilate the reflectivity, it is necessary to transform the model's prognostic variables (e.g., rainwater, snow, and graupel) to the observed radar reflectivity.To perform this transformation, early studies (e.g., Sun and Crook, 1997;Xiao et al., 2007) used the Marshall-Palmer distribution of raindrop size (Z-R relationship).However, this relationship is only valid in precipitation areas without ice-phase species; thus, its applications (e.g., Schwitalla and Wulfmeyer, 2018) are often limited to layers lower than 4 km or 8 km above ground level (a.g.l.).To overcome this deficiency, more comprehensive operators that involve snow and graupel have been developed (Gao and Stensrud, 2012;Tong and Xue, 2005).Several studies (e.g., Gao and Stensrud, 2012;Wang and Wang, 2017) have demonstrated that involving these ice species in the reflectivity operator improves the analysis of hydrometeors in terms of their spatial distribution, especially in the vertical direction.
Although these operators have been successfully applied in many convective-scale DA studies, they were developed for a specific band of radar (e.g., S-band radar; Gao and Stensrud, 2012;Sun and Crook, 1997) and specific microphysics characteristics (e.g., with a fixed intercept parameter; Sun and Crook, 1997).However, mixed-phase species Published by Copernicus Publications on behalf of the European Geosciences Union.such as wet snow and wet graupel have not been considered in these operators.Recently, the contributions from mixedphase species have been studied (Jung et al., 2008a, hereafter, J08;Posselt et al., 2015).To compute the mixed-phase species' contributions, J08 proposed an operator that was based on the expressions given by Zhang et al. (2001).Their expressions were derived according to the scattering amplitudes that were estimated through the T-matrix method and the Rayleigh scattering approximation (J08).For computational efficiency, these expressions were rewritten in the polynomial form that was only valid for S-band radar in which the Rayleigh assumption was satisfied.Later, a more general and exact operator that fully used the T-matrix scattering method was proposed (Jung et al., 2010).This operator was given as the integral of the complex backscattering amplitudes over the size distribution of the precipitation particles (i.e., rainwater, snow, and graupel).In addition to these operators, several complex reflectivity operators in the integral form have also been proposed (Borderies et al., 2018;Caumont et al., 2006;Pfeifer et al., 2008;Ryzhkov et al., 2011;Wattrelot et al., 2014).Some were designed for a specific band of radar (e.g., W band; Borderies et al., 2018), whereas some were designed for the bin microphysics scheme (e.g., Ryzhkov et al., 2011).
Despite the difficulty, some efforts have been undertaken for reflectivity assimilation with the TL/AD operators (Gao and Stensrud, 2012;Kawabata et al., 2018;Liu et al., 2019;Xiao et al., 2007), and reasonable results have been obtained in terms of hydrometeor analysis and precipitation forecasts.However, none of these studies employed operators as complex as those proposed by J08.Kawabata et al. (2018) adopted the expressions of Zhang et al. (2001) and developed the TL/AD operator for C-band radar but without taking into account the contributions from ice-phase species.
The main purpose of this study is to develop a TL/AD operator based on Jung et al. (2008a) with the contributions of ice-phase precipitation and apply it in a variational DA framework.For convenience, the operator implemented in this study is called RadarVar to represent that it was developed for variational DA and contains ice-phase species.The original J08 operator is called J08orig.The reminder of this paper is organized as follows.In Sect.2, the J08 operator is reviewed, and its TL and AD operators are derived.The experimental design is given in Sect.3, and the new operators are verified in Sect. 4. The performance of RadarVar is discussed in Sect.5, and the conclusions are presented in Sect.6.

Review of the J08 operator
The radar-observed reflectivity, Z, is given in logarithmic form as where Z e is the equivalent reflectivity factor, which is the sum of the contributions from pure rainwater (Z r ), dry snow (Z ds ), dry graupel (Z dg ), wet snow (Z ws ), and wet graupel (Z wg ) as follows: To compute Eq. ( 2), the mixing ratios of mixed-phase species (wet snow and wet graupel) are required.However, many widely used microphysics schemes, such as the Lin, WSM6, and Morrison schemes, do not predict or diagnose the mixedphase species; thus, the amount of rainwater in wet snow or graupel cannot be directly extracted from the model output.To solve this issue, J08 modeled the rain-snow (raingraupel) mixture using a fraction that is given by F = [min(q r /q x , q x /q r )] 0.3 F max , where F max is the maximum fraction, which is 0.5 (0.3) for rain-graupel (rain-snow) mixtures; q r is the mixing ratio of rainwater; and q x is the general form of the mixing ratio of ice-phase species.The subscript "x" can be either "s" for snow or "g" for graupel.With this fraction, the mixing ratios of pure rainwater, dry snow, dry graupel, and mixed-phase species are given by q pr = (1 − F ws − F wg )q r q ds = (1 − F ws )q s q dg = (1 − F wg )q g q ws = F ws (q s + q r ) q wg = F wg (q g + q r ), where the subscripts "ws" and "wg" are added to F to represent the fractions of wet snow and wet graupel, respectively, Geosci.Model Dev., 12, 4031-4051, 2019 www.geosci-model-dev.net/12/4031/2019/ and the subscripts "pr", "ds", and "dg" represent pure water, dry snow, and dry graupel, respectively.The mixed-phase density, ρ wx , is not a constant and is parameterized by with The subscript "x" in ρ wx , ρ x , and f wx represents either snow (s) or graupel (g), and f wx is called the water fraction.

Contribution from rainwater
In accordance with J08, all of the contributions are computed by integrations over the drop size distribution (DSD) weighted by the scatter cross section determined by the density, shape, and DSD.The DSD is modeled by an exponential distribution.After performing the integration, the contribution from pure rainwater, Z r , is written in a simple form (Zhang et al., 2001;Posselt et al., 2015;Kawabata et al., 2018) as follows: where λ is the wavelength of the radar, which is 107 mm for S-band radar, and N 0r is the intercept parameter of rainwater, which is 8 × 10 6 m −4 in this study.N 0r values are typically fixed (or constant) in single-moment microphysics schemes; for a two-moment scheme, this value should be determined using the predicted number concentration.|K w | 2 is the dielectric factor for rainwater and is equal to 0.93, and α ra and β ra are 4.28 × 10 −4 and 3.04, respectively.The complete gamma function is written as (. . .).The slope parameter of rain, r , is where ρ r = 1000 kg m −3 is the rain density, q pr is given by Eq. ( 4), and ρ a is the density of air.By substituting Eq. ( 8) and the constant parameters into Eq.( 7), we can rewrite Eq. ( 7) as a function of q pr as follows: Z r (q pr ) = P r (q pr ) 1.77 , where The value of P r is approximately 4.8 × 10 9 with an air density of 1.0 kg m −3 .This value has the same magnitude as those proposed by Sun and Crook (1997) and Gao and Stensrud (2012).

Contribution from dry snow/graupel
The contributions from both dry and mixed-phase ice species after integration have the same form but differ in their coefficients.For dry ice species, the contribution is given by where the subscript x represents either snow (s) or graupel (g).The intercept parameters of these species are denoted by N 0x , the values of which are 3 × 10 6 and 4 × 10 5 m −4 for snow and graupel, respectively.Both values are consistent with the default values of Advanced Regional Prediction System (ARPS) EnKF where the J08 operator was implemented.The slope parameter in Eq. ( 11) for either dry snow or dry graupel is written as where ρ x is either the density of snow (100 kg m −3 ) or graupel (400 kg m −3 ).The parameters A, B, and C in Eq. ( 11) are functions of the mean ( φ) and the standard deviation (σ ) of the canting angle and are given by According to J08, φ is zero for all hydrometeors, and σ differs for snow (20 • ) and hail (60 • ).Here, we assume that σ for graupel is also 60 • .The horizontal reflectivity that is regarded in this study is not sensitive to the canting angle (will be demonstrated in Sect.2.4), although the differential reflectivity is sensitive to canting angle (Aydin and Seliga, 1984).In Eq. ( 11), A, B, and C are multiplied by coefficients (α dxa and α dxb ) that describe the backscattering amplitudes.
For dry ice-phase species, these coefficients are precalculated constants and are listed in Table 1.The coefficients of graupel are calculated using the backscattering amplitudes (for particle size < 10 mm) in the pyCAPS-PRS v1.1 software (Dawson et al., 2014;Johnson et al., 2016;Jung et al., 2010;Jung et al., 2008a) provided by the Center for Analysis and Prediction of Storms (CAPS), and fitted to the polynomial function of f wg .α dga and α dgb are the coefficients when f wg is zero, which indicates no rainwater.For brevity, Eq. ( 11) is rewritten as a function of q dx for dry snow and dry graupel and is given by where

Contribution from wet snow/graupel
The equation for the contribution from mixed-phase species has the same form as Eq. ( 11), except that the subscript "d" is replaced by "w" to represent wet species.The slope parameter for mixed-phase species also has the same form as Eq. ( 12), except that the subscript "d" is replaced by "w", and ρ wx is substituted for ρ x .For wet species, σ in A, B, and C is a function of f w and q wg .Additional details are given in Sect.3c of J08.The coefficients that are multiplied by A, B, and C are functions of f w and are written as where "x" is "s" (g) for snow (graupel), ε x is 10 −4 (10 −3 ) for snow (graupel), P wxak and P wxbk are precalculated constants for S-band radar, the value of n is 6, and the superscript k denotes the index of these constants.All these values are based on J08, except for those of graupel, which are computed using the same method mentioned in Sect.2.1.2.The values of P wxak and P wxbk are listed in Table 2.
To simplify the derivation of the TL/AD operators, we rewrite Eq. ( 11) as a function of the mixed-phase mixing ratio (q wx ) and the water fraction (f w ) as follows: where the coefficients P wx and P xk are given by and respectively.P Axk , P Bxk , and P Cxk in Eq. ( 19) are given by where P wxai and P wxbi are precalculated constants in Eq. ( 16) and are listed in Table 2.The subscript "x" in these constant coefficients represents either snow (s) or graupel (g).The derivations of Eqs. ( 17) to ( 20) are given in the Appendix.

Tangent linear operator
Because RadarVar is highly nonlinear and complex, performing the derivation for every nonconstant variable in this operator is difficult.In addition, some components of Radar-Var (e.g., the fraction F in Eq. 3) are discontinuous and may cause serious convergence problems in the minimization (Janisková and Lopez, 2013;Janisková et al., 1999).
Although this issue can be addressed by performing regularization for the discontinuous components in RadarVar, it is beyond the scope of this study.Here, we assumed that five variables in RadarVar are not changed in the minimization: (i) the air density, (ii) the fraction F , (iii) the intercept parameter, (iv) the standard deviation of the canting angle σ , and (v) the density of the mixed-phase species.The air density is held constant for simplification and to focus on the impact of the changes in the hydrometeors.The fraction F is assumed to remain unchanged because of its secondorder discontinuity at the point that q x is equal to q r .The intercept parameter is constant because RadarVar was currently designed for single-moment microphysics schemes.
Although multi-moment microphysics schemes are more realistic because they predict the number density of hydrometeors, single-moment microphysics schemes are still widely used to compute reflectivity and the polarization variables (e.g., Posselt et al., 2015).For simplification, the standard deviation of the canting angle and the density of the mixedphase species remain unchanged in the minimization.These assumptions are discussed in Sect. 4.

Linearization for rain
Considering the assumptions presented above, P r in Eq. ( 9) becomes a constant in the minimization.Therefore, the linearized form of Eq. ( 9) for q pr is given by δZ r (q pr ) = ∂Z r (q pr ) ∂q pr

Linearization for dry and wet snow/graupel
For dry snow and graupel, the linearized form of Eq. ( 14) is given by The linearization of Eq. ( 17) can be categorized into two parts, which represent the variations in Z x caused by changes in q wx and f wx , respectively.The linear equation of Eq. ( 17) is written as where the subscript "x" represents either snow (s) or graupel (g).Using s (g) to replace the subscript "x" in Eq. ( 23), we can obtain the tangent linear operator of the contribution from wet snow (wet graupel).
The linearization of Eq. ( 1) is given by where δZ e is the sum of δZ r , δZ ds , δZ dg , δZ ws , and δZ wg .
Note that Z e cannot be zero.Moreover, a too-small Z e may result in an extremely large gradient during the minimization, which is undesirable.Therefore, the accepted minimum Z e is set to 1.0 in the current RadarVar.When the background value is smaller than this minimum value, DA system will discard the corresponding observation to prevent a large gradient.This minimum value can be smaller to ingest more observation and will be tuned in future work.

Adjoint operator
The adjoint operator is the transpose of the tangent linear operator.Because the tangent linear operator is applied to the model variables q r , q s , and q g , the adjoint operator is written for these variables.First, the adjoint operator is written for Eq. ( 24).This operator has the following form: where the superscript "A" means adjoint.
For rainwater in Eq. ( 21), the adjoint operator is given by The parameter q A r on the right-hand side of Eq. ( 26) is the accumulated q A r before computing Eq. ( 26).This rule is also valid for q x .
The adjoint operator of Eq. ( 22) is given by Because Eq. ( 23) is the derivation with respect to both q r and q x , the adjoint operator of Eq. ( 23) contains two parts: one for rainwater and the other for ice species.For rainwater involved in Eq. ( 23), the adjoint operator is given by For ice species in Eq. ( 23), the adjoint operator is given by www.geosci-model-dev.net/12/4031/2019/Geosci.Model Dev., 12, 4031-4051, 2019 The reflectivity as (a) a function of q r (red), q s (green), and q g (blue) for pure water and dry snow/graupel and (b) a function of q r -q s (colors) and q r -q g (contours) for wet snow/graupel.

Sensitivity of RadarVar to changes in hydrometeors
Since RadarVar is highly nonlinear, understanding its response to changes in q r , q s , and q g assists in the analysis of DA results using this operator.The response functions for Eqs. ( 9), ( 14), and ( 17) are plotted in Fig. 1. Figure 1b is a two-dimensional plot because Eq. ( 17) involves two kinds of hydrometeors.Figure 1a shows that the reflectivity changes more rapidly for all three hydrometeors when the mixing ratios are less than 0.5 g kg −1 .As the mixing ratios increase to 2.0 g kg −1 or greater, the relationship between the reflectivity and the mixing ratio is approximately linear.This feature indicates that the reflectivity is more sensitive at small mixing ratios, and it also implies that the tangent linear approximation may give larger errors when the background reflectivity is small.In addition, the reflectivity contribution from wet snow increases more substantially than that from dry snow when q s or q r are in the range of 0-0.5 g kg −1 .Figure 1b shows that the reflectivity reaches 35 dBZ when both q s and q r are approximately 0.2 g kg −1 , while this reflectivity value requires q s of ∼ 1.2 g kg −1 for dry snow.This result is expected because many observation studies (e.g., Zhang et al., 2008) have shown that wet snow causes a bright band (large reflectivity) in the melting layer.This result also implies that the approximation error in the melting layer could be large.
3 Data and experimental design

Case review
This study focuses on a precipitation case that occurred in the northern US.The precipitation initiated at approximately 21:00 Z on 1 June 2018, when convective cells formed near the border between South Dakota and Nebraska.By 00:00 Z on 2 June 2018, these cells had developed into a linear convective system that was approximately 300 km long in the northeast-southwest direction and stretched from south of South Dakota to northern Nebraska, as shown in Fig. 2. Note that there is also a weaker precipitation system near the north boundary of the domain.The top of the convective system at this time, identified by reflectivity greater than 5 dBZ, reached 16 km a.g.l.This convective line developed further and moved to southeast Nebraska from 00:00 to 04:00 Z.During this period, a bow echo was observed on the radar mosaic provided by the National Centers for Environmental Information (NCEI).By 08:00 Z, the convective line had moved to the northern border of Kansas, followed by a large area of stratiform clouds that covered eastern Ne-  (d-f) the horizontal correlation length scale as a function of EOF mode, and (g-i) the vertical correlation coefficients for q r , q s , and q g .braska.The precipitation caused by this convective system lasted for nearly 20 h and ended at approximately 18:00 Z on 2 June 2018.

Settings of the forecast model
Version 3.9.1 of the Weather Research and Forecasting (WRF; Skamarock et al., 2018) model was employed.All of the experiments were performed on a 450×450×42 domain centered at 41 • N, 96 • W, in eastern Nebraska (Fig. 2).The horizontal grid spacing was 3 km.The terrain-following vertical grid was employed with the model top at 50 hPa.All of the experiments used the same physical parameterizations: no cumulus parameterization, the Thompson microphysics scheme (Thompson et al., 2004), the Rapid Radiative Transfer Model for general circulation models (RRTMG) longwave and shortwave radiation scheme (Iacono et al., 2008;Mlawer et al., 1997), and the Unified Noah land-surface model (Chen and Dudhia, 2001).Note that in the current RadarVar implementation, the intercept parameters are fixed, while they spatiotemporally vary in the Thompson scheme.This inconsistency may increase the adjustment time for model initialization.However, this issue is secondary in the present work because no improvement in terms of forecast skill which will be introduced was found in our early tests using a single-moment microphysics scheme with the density and intercept parameters being identical to RadarVar.The primary DA issue in this study is the poor background quality (due to no hydrometeor in Global Forecast System (GFS) analysis or the precipitation displacement).The inconsistency between the operator and microphysics scheme will be considered in the future.The initial conditions and the lateral boundary conditions were generated with the GFS data at 00:00 Z on 2 June 2018.

Generation of the background error covariance
RadarVar was implemented in the WRF data assimilation (WRFDA) (Barker et al., 2012) system.To perform the variational DA with this newly developed radar operator, it is necessary to generate the background error covariance matrix with q r , q s , and q g as part of the control variables.The generalized software package for the background error covariance statistics (GEN_BE) developed by Descombes et al. (2015) was used.The GEN_BE package can generate the univariate background error statistics for 11 variables, including these three hydrometeors.The background error statistics were computed using the National Meteorological Center (NMC) method (Parrish and Derber, 1992), which uses pairs of the differences between the 12 and 24 h forecasts.In total, 27 d forecasts from 20 May to 15 June 2018 were employed to generate the background error covariance.
The background error statistics for q r , q s , and q g are shown in Fig. 3.The vertical distributions of the background error's standard deviation (Fig. 3a, b, c) are consistent with those of the corresponding hydrometeor profiles: the error of the rainwater mainly appears in the lower levels, while those of snow and graupel mainly exist in the upper levels.The graupel may fall from the upper levels into the lower levels, so the graupel error has a broad vertical distribution.The horizontal correlation length scales of the background errors are often less than four grids (< 12 km), and the vertical correlation of each hydrometeor can be large at the associated precipitation levels.These spatial correlations of the hydrometeor errors determine the remote horizontal and vertical influences of the observed reflectivity.

Observation data and verification data
Radar data at 00:00 and 01:00 Z on 2 June 2018 were selected to evaluate the radar operator in a variational analysis framework because the convection was sufficiently deep to contain ice-phase species.To fully cover the convective system at 00:00 Z, data from KABR, KFSD, KLNX, KOAX, KUDX, and KUEX were used, and KLNX was the closest radar from the convective line (Fig. 2).These radars were located in Nebraska and South Dakota.The radar data were stored in level-II format and converted to WRFDA format using a modified 88D2ARPS package, which is widely used in radar DA studies (Putnam et al., 2014;Snook et al., 2011Snook et al., , 2012)).During this conversion, the radar data were also horizontally remapped to the model grids but remained at the radar elevations in the vertical direction; in other words, the horizontal resolution of the radar data after the conversion was consistent with that of the model.To be consistent with the work using the J08 operator (Jung et al., 2008b), the observation error of the reflectivity was set to 2.0 dBZ.Our early tests using different observation errors indicated that the errors of the analysis reflectivity were comparable when using reflectivity errors ranging from 0.5 to 2.0 dBZ.For computational efficiency, we selected the remapped data every two grids in both the x and y directions for the DA.
The NCEP gridded stage IV (ST4) dataset (Lin and Mitchell, 2005) was used for the precipitation forecast verification.ST4 data with a horizontal resolution of 4 km were interpolated into the 3 km model grid mesh to evaluate the precipitation prediction.At each model grid, the interpolated value was the weighted average of the ST4 data within 10 km of the grid; these data were weighted by the square of the inverse of the distance between the model grid and the ST4 data location.

Experimental design
As the first attempt to implement and apply RadarVar in WRFDA, this study focused on the quality of the analysis using the univariate three-dimensional variational (3D-Var) DA method in terms of the root mean square difference (RMSD) against the observed reflectivity and the similarity between the observed reflectivity distributions and the analysis.The forecast performance is the secondary concern and will be explored more thoroughly in a future study with multivariate analysis using more advanced DA techniques.
All of the DA experiments analyzed only rainwater, snow, and graupel by assimilating only the radar reflectivity observations.The first DA experiment, called Exp_ref, is considered the benchmark experiment; it mostly used the default configurations of WRFDA-3D-Var, except the number of outer loops was set to six, and the number of maximum iterations in each outer loop was set to 150.More outer loops were necessary due to inaccurate background hydrometeors and the high nonlinearity of the radar operator.To determine the tradeoff between the analysis quality and computational cost, two variants of Exp_ref were conducted with 50 and Figure 5. (a) Vertical distributions of the maximum absolute values of the perturbed q r (red), q s (green), and q g (blue).Reflectivity scatter plots of all_prt (x axis) versus (b) F_unprt (y axis), (c) rhom_unprt (y axis), and (d) SD_unprt (y axis) at model levels 1-12 (black, lower than 3 km a.g.l.), 12-20 (red, between 3 and 7 km a.g.l.), and 21-30 (green, above 7 km a.g.l.). 100 inner iterations.In each experiment, the radar DA analyses were performed at 00:00 and 01:00 Z.The background for the 00:00 Z analysis was interpolated from the GFS analysis with zeros for the hydrometeor fields, and the background for the 01:00 Z analysis was the 1 h WRF forecast from the 00:00 Z analysis with more realistic hydrometeor fields.
Note that TL/AD of RadarVar will not be able to create reflectivity increments with the zero-hydrometeor background (serves as the base state of TL/AD in the first outer loop), which is the case for the 00:00 Z analysis.An approach to address this issue is to reset the zero background values of q r , q s , and q g to small values that can range from 10 −9 to 10 −6 kg kg −1 (e.g., Wang and Wang, 2017).However, this approach will result in the fraction F being a constant, while it is expected that F will peak near the middle of the melting layer.Therefore, we introduced a hydrometeor preprocessing step before performing the analysis, which constructs a new background with the weighted sum of the radar-retrieved hydrometeor mixing ratios and their background counterparts.The hydrometeor retrieval followed the procedure that is available in WRFDA, which is based on Gao and Stensrud (2012).The weight coefficients are arbitrarily set to 0.1 for the retrieval part and to 0.9 for the background.The small weight for the retrieval part was used to minimize the impact of the retrieval, mainly to ensure a nonzero background.In addition, current RadarVar cannot work with the weight coefficient of the retrieval part being smaller than 6×10 −4 if the background contains no hydrometeor.A larger weight coefficient of the retrieval part (e.g., 0.5) reduces the difference between the background and the observation, which weakens the impact of direct DA using RadarVar and is contradictory to the purpose of this study.The hydrometeor preprocessing was performed at both 00:00 and 01:00 Z in Exp_ref as a reference.
To examine the analysis performance with a bad background, the Exp_ref analysis at 00:00 Z was also run with a very small retrieval weight of 6×10 −4 .In addition, the impact of the hydrometeor preprocessing on a relatively "good" background (01:00 Z) was examined by comparing it with an experiment without hydrometeor preprocessing.
Table 3. Perturbation samples for the verification of the tangent linear operator.q r (kg kg −1 ) q s (kg kg −1 ) q g (kg kg In several previous studies (Ban et al., 2017;Choi et al., 2017;e.g., Shen and Min, 2015), the horizontal correlation length scale factor of the background error had a large impact on the analysis.Therefore, two additional experiments, Exp_ls0.5 and Exp_ls0.125, were performed at 00:00 Z with the length scale factors of 0.5 and 0.125, respectively.
Short-term forecasts from some of these analyses and from the GFS analysis (referred to as the "noDA" experiment) were also carried out and evaluated.

Radar operator validation
Before evaluating the performance of RadarVar in WRFDA-3D-Var, we first examined the consistency between J08orig and RadarVar.Because RadarVar follows the J08 operator, the operator implemented in ARPS EnKF (Jung et al., 2008a) was used to serve as the J08orig operator.To be comparable to J08orig operator, the coefficients of graupel listed in Tables 1 and 2, as well as the intercept parameter and hydrometeor density, are replaced by those listed in J08, which were designed for hail.These hail-associated coefficients are used only in Sect. 4 for comparison.The noDA 4 h forecast initialized at 00:00 Z was used as the input of the radar operators.The results, which are shown in Fig. 4, indicate that the difference between J08orig and RadarVar is small and acceptable.The small difference is likely caused by the subtle programming differences between the two software packages.
To verify whether RadarVar is sensitive to the unchanged variables during the minimization (see Sect. 2b), four tests were conducted.The same noDA 4 h forecast was used.In these tests, q r , q s , and q g were randomly perturbed with standard deviations proportional to their input values; in other words, a large background mixing ratio caused a large perturbation.The tests called F_unprt, rhom_unprt, and SD_unprt represent that the fraction F , the mixed form density, and the standard deviation of canting angle, respectively, were unperturbed (i.e., fixed input calculated from the forecast input).The test called all_prt denotes that all of the variables in RadarVar were calculated using perturbed mixing ratios.The standard deviation of the perturbation was set to 10 % of the background value; thus, the maximum perturbation could be large, as shown in Fig. 5a. Figure 5 shows that the reflectivities computed in F_unprt, rhom_unprt, and SD_unprt do not differ significantly from those computed in all_prt.This result indicates that keeping these variables unchanged during the minimization is an acceptable approximation.The most noticeable difference occurs in the middle layer, which is marked in red circles that plot off the diagonal line by several dBZ (Fig. 5c), but there are few of these samples.
The tangent linear operator of RadarVar was verified through a ratio, which is given by where H and H represent the nonlinear operator and the corresponding TL version, respectively, x is the vector of the model state variables (q r , q s , q g ), whose perturbation is denoted by δx, and ε is the perturbation magnitude and is greater than zero; δx used the perturbations generated for all_prt.Table 3 shows that the tangent linear operator is sufficiently accurate with a ratio close to 1.0.A correct adjoint operator must satisfy the relationship that is given by The meanings of all of the symbols in this equation are consistent with those in Eq. ( 30), and H T is the adjoint operator.All of the perturbations used for the tangent linear test were adopted in the adjoint test.In the double precision test, there were 14 identical digits on both sides of Eq. ( 31); in the single precision test, there were seven identical digits.c) 00:00 Z and (d, e, f) 01:00 Z analyses of q r (red), q s (green), and q g (blue) at model levels 5 (left column), 15 (middle column), and 20 (right column) for outer loop 6 in Exp_ref.Model levels 5, 15, and 20 approximately correspond to 0.7, 4.0, and 8.0 km a.g.l., respectively.

Analyses in the observation space and the model space
The Exp_ref analyses (six outer loops, 150 inner iterations, and 0.1 weight for the retrieval part) are first examined in terms of the radar reflectivity and the mixing ratios of rain, snow, and graupel.As expected, the analyzed reflectivity agrees much more closely with the observed reflectivity than the background reflectivity for both analysis times (00:00 and 01:00 Z) (Fig. 6).In addition, considering that the 00:00 Z background bias is much larger than that at 01:00 Z, the comparable analysis error for both times indicates that the analysis is generally relatively insensitive to the initial background.However, a small number of points in the analysis have zero reflectivity, while the corresponding observations can be greater than 10 dBZ (Fig. 6b, d).Further examination indicates that these failed points are related to the locations with very small background values of q r , q s , and q g , where the nonlinearity of the radar operator and the deficiency of the TL/AD operator are more pronounced.
Figure 7 shows the horizontal distributions of the observed and analyzed reflectivities at 2 km, 4 km (melting layer), and 6 km a.g.l.In general, they match well in observed areas with a weaker analysis at 00:00 Z, which is likely caused by the bad background.Spurious echoes appear over unobserved areas in the analyses at both 00:00 and 01:00 Z and most likely resulted from the spatial correlations in the background error covariance; these correlations allow the propagation of information from observed areas to unobserved areas both horizontally and vertically.This result implies that the statistical correlation length scales obtained from a 1-month forecast sample may be too large for this squall line case.The results of tuning the horizontal correlation length scales will be given in Sect.5.4.Another cause of these spurious echoes is that non-precipitation echoes were not assigned in the observation data in this study such that DA has no impact outside the observed convective area.An approach to suppress the spurious echoes is to determine the non-precipitation points, assign a specific value like 0 dBZ to these points, and assimilate these non-precipitation echoes.The non-precipitation echoes will be considered in the future.The 00:00 and 01:00 Z analyses of q r , q s , and q g at three model levels are shown in Fig. 8.The direct assimilation of the reflectivity data using RadarVar successfully retrieved the lower level rain, the upper level snow or graupel, and mixed rain/snow and/or rain/graupel in the melting layer (model level 15).Note that the analysis increment can be created only at levels where the SD of the background error is larger than zero (see Fig. 3).

Convergence of the minimization
Figure 9 shows the cost function and the norm of its gradient as a function of the number of inner iterations for the first, third, and sixth outer loops.For the 00:00 Z Exp_ref analysis, both the cost function and the gradient norm decrease rapidly in the first outer loop due to the large adjustment from the bad initial background.Using the improved guess after two outer loops, the third outer loop starts from an ∼ 85 % smaller cost function, which is then reduced more gradually with increasing iterations.Performing more outer loops does not further reduce the cost function substantially, and outer loop 6 even has a slightly larger cost function than that of outer loop 3. Similar behavior can also be observed in the 01:00 Z analysis, but performing more than three outer loops can further reduce the cost function to a small extent.This indicates that performing six outer loops may not be necessary and that two or three outer loops may be sufficient.
To more quantitatively measure the impacts of the number of outer loops and inner iterations, the correlation coefficients between the Exp_ref analysis (i.e., six outer loops and 150 iterations in each loop) and the analyses with one to five outer loops and 50/100/150 inner iterations in each loop are calculated and shown in Fig. 10.In the 00:00 Z analysis, the correlation coefficients of q r and q g are already greater than 0.9 after the first outer loop, and using 50 or 100 inner iterations also results in good agreement with the Exp_ref analysis for q r and q g .However, this is not the case for q s , which is likely an indication of a bad q s background (0.1 of the retrieved values).Given that the correlation coefficients of reflectivity reach 0.9 after the second outer loop, the low correlation coefficients of q s also imply that it is difficult to distinguish the contributions from ice-phase species.Even with a more realistic background at 01:00 Z, the issue of the q s analysis with less iterations and outer loops (Fig. 10b) still exists, but the higher correlation coefficient of the run with 150 iterations after four outer loops, compared to the counterparts in the 00:00 Z analysis, indicates the importance of the background quality.Note that the hydrometeor's phase information could be better determined by assimilating polarization measurements from dual-polarization radar, which will be explored in a future study.
Figure 10.The correlation coefficients of q r (red), q s (green), q g (blue), and reflectivity (gray) between the Exp_ref analysis (with six outer loops and 150 inner iterations each loop) and that with different numbers of outer loops and inner iterations at (a) 00:00 Z and (b) 01:00 Z.

Importance of hydrometeor preprocessing
To further investigate the sensitivity of the analysis to the background quality, additional analyses without the hydrometeor preprocessing step were performed at 00:00 and 01:00 Z.At 00:00 Z, the nonzero background was constructed by using a very small weight (7 × 10 −4 ) for the re-trievals.At 01:00 Z, the background is simply taken from the 1 h WRF forecast.Compared to Fig. 6b, the analysis at 00:00 Z (Fig. 11b) has many more failed points (i.e., those with zero analyzed reflectivity) due to the excessively small background hydrometeor values.In contrast, the 01:00 Z analysis (Fig. 11d) without preprocessing has a comparable error bias and SD to those with the preprocessing (Fig. 6d).The analysis reflectivity at 4 km a.g.l. and (b, d) the analyses of q r (red), q s (green), and q g (blue) at model level 15 at 00:00 and 01:00 Z for the experiments without hydrometeor preprocessing.
These results indicate that the preprocessing step is mostly needed to address the bad background.
Figure 12a shows that the analyzed reflectivity in the melting layer has a reasonable fit to the observations even though it started from very small values of the background reflectivity.The analysis in the model space (Fig. 12b) is also similar to that with the preprocessing (Fig. 8b).However, the reflectivity coverage (> 35 dBZ) in Fig. 12a is smaller than those in the observation and Exp_ref, which corresponds to the horizontally distributed points on the bottom of Fig. 11b.In contrast, the 01Z analysis without the preprocessing step (Fig. 12c, d) closely resembles that with the preprocessing (Figs.7h and 8e) in terms of the radar reflectivity and hydrometeor mixing ratios, especially over the strong convective line (dashed black line in Fig. 12d).However, removing this preprocessing step results in a large spurious echo (marked by "A" in Fig. 12c) that is smaller in Fig. 7h.This implies that the preprocessing step will still be helpful for some "bad" locations (e.g., mismatched convective cells be-tween the model background and observation) for a generally "good" background.

Impact of the spatial correlation scale
Figure 13 shows the 00:00 Z analysis reflectivity at 4 km a.g.l. with the horizontal correlation length scales of the background errors reduced by factors of 2 and 4. The spurious echoes weaken as the length scale decreases, especially for the spurious echoes marked by "A".In addition, reducing the length scale improves the intensity analysis at certain locations (e.g., the convective cell marked by "B"; also see Fig. 7e, f).Note that the background error statistics used for these 3D-Var analyses were obtained from forecast samples over a month-long period and are likely not optimal for this particular squall line case.A better solution would be the use of the flow-dependent background error covariance, which will be investigated in the future using more advanced DA methods, such as 4D-Var and hybrid-3D/4DEnVar that are available in WRFDA.

Impact on the precipitation forecast
The performance of the precipitation forecasts was quantitatively evaluated using the stage IV dataset with the fractions skill score (FSS;Roberts and Lean, 2008).Hourly precipitation forecasts between 02:00 and 05:00 Z were evaluated because assimilating only the reflectivity is expected to have an impact mostly on the short-term forecast.Similar to Schwartz et al. (2014), the aggregated FSS over the period from 02:00 to 05:00 Z is shown in Fig. 14 for the forecasts initialized at 00:00 and 01:00 Z and for different rainfall thresholds.For the forecast from 00:00 Z (Fig. 15a), Exp_ref obtained greater FSSs than those of noDA for thresholds greater than 5 mm h −1 with a radius of influence smaller than 40 km.With a larger radius, this difference remains for the lighter rain (< 20 mm h −1 ) but disappears for the heavy rain (≥ 20 mm h −1 ).Similar behavior can be observed for the forecast at 01:00 Z (Fig. 15b) but with a more positive im-pact from the radar DA for heavier rainfall and for a radius of influence smaller than 50 km.Further examination (not shown) indicated that the improvement in the moderate rain (> 5 mm h −1 ) prediction was associated with the larger snow area in the analysis, so the light rain missed by noDA was better captured in Exp_ref.This examination also showed that the higher FSSs obtained by Exp_ref for the heavier rainfall were mostly associated with the smaller displacement error.

Conclusions
This study developed tangent linear and adjoint operators based on the J08 reflectivity operator and implemented them in WRFDA.This new operator can compute the reflectivity contributed by ice-phase species and is called RadarVar.
RadarVar is effective at analyzing rainwater, snow, and graupel by directly assimilating US WSR-88D S-band radar reflectivity with WRFDA's 3D-Var.The analysis accuracy is somewhat sensitive to the numbers of outer loops and in- ner iterations.The results indicated that two to three outer loops with 50-100 iterations in each loop are needed to obtain a sufficiently accurate analysis.Two problems of Radar-Var were found in our test.One issue is the analysis failures at locations with observed radar echoes but with zero or excessively small model background values of the hydrometeors.This issue can be partially resolved using a preprocessing step with radar-retrieved hydrometeors to improve the "bad" background before the analysis.Another issue is the spurious radar echoes (i.e., precipitation) in the analysis caused by the spatial correlations in the background error covariance.These can be reduced by decreasing the correlation length scales.In addition, the short-term (2-5 h) precipitation forecast is improved by the direct reflectivity DA even though the inexpensive univariate 3D-Var technique is used in this first attempt of applying RadarVar.
A more thorough evaluation of reflectivity DA with Radar-Var will be examined in a future study using a more advanced hybrid ensemble-variational DA technique, which allows flow-dependent background errors with multivariate correlations and is expected to further reduce the aforementioned deficiencies.Moreover, RadarVar will be extended to include the computation of polarimetric quantities to better determine the phases of the hydrometeors, especially in the melting layers.

Figure 1 .
Figure 1.(a)The reflectivity as (a) a function of q r (red), q s (green), and q g (blue) for pure water and dry snow/graupel and (b) a function of q r -q s (colors) and q r -q g (contours) for wet snow/graupel.

Figure 2 .
Figure 2. The simulation domain with radar sites marked by radar icons and names.The areas of precipitation greater than 5 mm h −1 are plotted for 00:00 Z (red), 02:00 Z (green), and 04:00 Z (blue) on 2 June 2018.

Figure 3 .
Figure 3. (a-c) The standard deviations of the background errors at different vertical levels,(d-f) the horizontal correlation length scale as a function of EOF mode, and (g-i) the vertical correlation coefficients for q r , q s , and q g .

Figure 4 .
Figure 4. Scatter plot of the reflectivity for J08orig (x axis) versus RadarVar (y axis).The bias, standard deviation (SD), and number of samples are listed in the plot.

Figure 6 .
Figure 6.Scatter plots of the observed (x axis) versus (a, c) background and (b, d) analysis reflectivity at (a, b) 00:00 Z and (c, d) 01:00 Z in Exp_ref.The mean bias and SD between the observations and the background (or analysis) are listed in each plot.

Figure 9 .
Figure 9.The cost function and normalized gradient norm as functions of the iteration step during the minimization at (a, c) 00:00 Z and (b, d) 01:00 Z in Exp_ref.

Figure 12 .
Figure12.(a, c) The analysis reflectivity at 4 km a.g.l. and (b, d) the analyses of q r (red), q s (green), and q g (blue) at model level 15 at 00:00 and 01:00 Z for the experiments without hydrometeor preprocessing.

Figure 14 .
Figure14.The FSSs of the hourly precipitation forecasts aggregated over the period from 02:00 to 05:00 Z as functions of the radius of influence for forecasts launched from the (a) 00:00 and (b) 01:00 Z analyses for Exp_ref (solid lines) and noDA (dashed lines).The hourly precipitation thresholds are denoted by green (1 mm h −1 ), blue (5 mm h −1 ), orange (10 mm h −1 ), and red (20 mm h −1 ) lines.The straight dashed line represents the skillful FSS at 1 mm h −1 .