Comparison of the ensemble Kalman filter and 4 D-Var assimilation methods using a stratospheric tracer transport model

An ensemble Kalman filter (EnKF) assimilation method is applied to the tracer transport using the same stratospheric transport model as in the four-dimensional variational (4D-Var) assimilation system BASCOE (Belgian Assimilation System for Chemical ObsErvations). This EnKF version of BASCOE was built primarily to avoid the large costs associated with the maintenance of an adjoint model. The EnKF developed in BASCOE accounts for two adjustable parameters: a parameter α controlling the model error term and a parameter r controlling the observational error. The EnKF system is shown to be markedly sensitive to these two parameters, which are adjusted based on the monitoring of aχ2 test measuring the misfit between the control variable and the observations. The performance of the EnKF and 4D-Var versions was estimated through the assimilation of Aura-MLS (microwave limb sounder) ozone observations during an 8-month period which includes the formation of the 2008 Antarctic ozone hole. To ensure a proper comparison, despite the fundamental differences between the two assimilation methods, both systems use identical and carefully calibrated input error statistics. We provide the detailed procedure for these calibrations, and compare the two sets of analyses with a focus on the lower and middle stratosphere where the ozone lifetime is much larger than the observational update frequency. Based on the observation-minusforecast statistics, we show that the analyses provided by the two systems are markedly similar, with biases less than 5 % and standard deviation errors less than 10 % in most of the stratosphere. Since the biases are markedly similar, they most probably have the same causes: these can be deficiencies in the model and in the observation data set, but not in the assimilation algorithm nor in the error calibration. The remarkably similar performance also shows that in the context of stratospheric transport, the choice of the assimilation method can be based on application-dependent factors, such as CPU cost or the ability to generate an ensemble of forecasts.


Introduction
Two of the most important and widely used data assimilation methods are the four-dimensional variational method (4D-Var: Talagrand and Courtier, 1987) and the ensemble Kalman filter (EnKF: Evensen, 1994;Houtekamer and Mitchell, 1998;Evensen, 2003).Although they solve similar estimation problems, they are built around different constraints and thus have different strengths and weaknesses.The BASCOE (Belgian Assimilation System for Chemical ObsErvations) system was originally developed with the 4D-Var assimilation method applied to a stratospheric chemical transport model (CTM) (Errera et al., 2008;Errera and Ménard, 2012).This variational method determines the initial conditions which optimize the fit between model forecast and observations over a period, i.e. an assimilation window.In atmospheric chemistry, an assimilation window of 12 h (Flemming et al., 2009) or 24 h (Errera et al., 2008;Elbern et al., 2010) is typically used.The 4D-Var provides an accurate solution, but requires the development and maintenance of an adjoint model, which may be a time consuming task in the CTM context.
The most popular alternative to the 4D-Var is the EnKF which consists in a Monte Carlo method (Evensen, 1994).As the 4D-Var, the EnKF is built on the assumption of S. Skachko et al.: EnKF and 4D-Var using a stratospheric tracer transport model Gaussian-distributed observation errors to estimate the minimum variance in the misfit between model forecast and observations.But the EnKF computes this minimum variance estimate at each time step of the model by explicitly computing its error covariances.It does not require an adjoint model but assumes that the forecast errors are Gaussian-distributed.In the 4D-Var scheme, the evolution of forecast error within the assimilation window is computed by the model (whether it is accurate and appropriate or not) and is generally used as a strong constraint.By contrast, the EnKF relaxes this assumption into a weak constraint by adding a model error covariance to the analysis error covariance which becomes dynamically propagated (for more details, see Lorenc, 2003;Ménard and Daley, 1996).Hence, the model error covariance is of great importance for the filter performance.Moreover, the uncertainty of the EnKF analysis is directly provided by the spread of the ensemble of analyses.
The 4D-Var and the EnKF have comparable computational costs.The advantages and disadvantages of each method in the meteorological context have been discussed in several papers (e.g.Hamill, 2006;Kalnay et al., 2007).A rigorous intercomparison was also presented by Buehner et al. (2010b) in the context of global NWP (numerical weather prediction) system with real observations.In this context, it was shown that the EnKF error variance is larger than with the 4D-Var.In their intercomparison paper, Buehner et al. (2010a) also conducted different variational experiments using static covariances with horizontally homogeneous and isotropic correlations as well as flow-dependent EnKF covariances with spatial localization.The authors went further and made a hybrid system called ensemble 4D-Var using flow-dependent EnKF covariances without the need of the tangent-linear or adjoint versions of the model.An overall conclusion obtained by Miyoshi et al. (2010) with the Japanese weather prediction system is that both systems have essentially comparable performance.
In the context of chemical modelling, Lahoz and Errera (2010) and Sandu and Chai (2011) reviewed different assimilation methods and challenges in chemical data assimilation.Data assimilation systems based on a CTM are often developed within the variational approach (Khattatov et al., 1999;Errera et al., 2008), but also with sequential filtering (Khattatov et al., 2000;Ménard et al., 2000;Miyazaki et al., 2012).Recently Sekiyama et al. (2011) constructed a total ozone assimilation system on the basis of a four-dimensional local ensemble transform Kalman filter (LETKF).Nakamura et al. (2013) applied the EnKF to stratospheric ozone data assimilation using a multi-model approach.Meanwhile some developments based on the EnKF have begun to address tropospheric composition (Constantinescu et al., 2007a;Liu et al., 2012).
In the context of chemical data assimilation, few studies have been devoted to the comparison of the 4D-Var and EnKF methods.Constantinescu et al. (2007a) have compared the EnKF with an operational-like 4D-Var setting using common background errors modelled by autoregressive processes applied to a tropospheric chemistry model.Wu et al. (2008) presented an intercomparison of four assimilation methods including the 4D-Var and the EnKF.The paper was organized as a sensitivity study with respect to different model and assimilation parameters.The experiments were conducted over short periods of typically one or two days.One of their conclusions was that the EnKF is superior to the 4D-Var, but also that optimum interpolation is superior to the EnKF.However from the study, it was unclear whether each assimilation system was tuned to provide its best performance.Hence, these conclusions were not entirely convincing as the individual systems may have performed differently with different parameter values.
In the present study, the EnKF and the 4D-Var are both tuned to provide their best performance while using the same spectral formulation for the prescribed background error covariance.First of all, the background error covariance is calibrated within the 4D-Var using the National Meteorological Center (NMC) method (Parrish and Derber, 1992).The calibrated errors are passed to the EnKF to generate the initial ensemble and the model error term.The EnKF is then tuned to provide its best results with χ 2 diagnostics close to one (Ménard et al., 2000) by calibration of the observation and model error covariance.The 4D-Var uses the observation covariance error calibrated within the EnKF experiments.We have not attempted to introduce a localization of error covariances in the 4D-Var because the localizations in a 4D-Var and EnKF are not strictly equivalent (Buehner et al., 2010a).However, the prescribed correlation length-scales in the EnKF were adjusted to match, after localization, those prescribed in the 4D-Var.
The next section describes the configurations of the EnKF and the 4D-Var data assimilation systems used in this study.Section 3 describes the experimental set-up and specifically the calibration of the error variances in the two systems.Section 4 compares their results.Finally, some conclusions are given in Sect. 5.

Configuration of the 3-D CTM
The comparison of EnKF and 4D-Var is performed using a tracer version of the BASCOE CTM.The model in its usual configuration includes 57 chemical species with a full description of stratospheric chemistry (Errera et al., 2008).All species are advected via the flux-form semi-Lagrangian scheme (Lin and Rood, 1996).For the purposes of this study as well as to reduce the CPU time, the chemistry is turned off as in Errera and Ménard (2012), and only the advection of ozone (O 3 ) is considered.The CTM is driven by winds and temperatures obtained from the European Centre for Medium-Range Weather Forecasting (ECMWF) ERA-Interim reanalysis (Dee et al., 2011).The horizontal resolution of the model grid is 3.75 • longitude by 2.5 • latitude.Vertically, the model uses a subset of 37 levels of the ERA-Interim 60 levels which excludes most tropospheric levels.The vertical domain extends from 0.1 hPa down to the surface.Hence, the model state is described by the vector x ∈ R n of length n = 96 × 73 × 37 ≈ 2.6 × 10 5 .Finally, the model time step is set to 30 min.

The 4D-Var system
The detailed description of the BASCOE 4D-Var data assimilation system is provided in Errera and Ménard (2012).Here we give only the features relevant to the aims of this study.
The evolution of the model state vector between the time step k − 1 and k is computed by the model operator: where k is the time index, M k−1,k is the model operator between t k−1 to t k and K is the number of time steps within the assimilation window.In the 4D-Var experiments performed in this study, the assimilation window is set to 24 h such that, considering the model time step of 30 min, K = 48.4D-Var data assimilation is carried out by minimizing the so-called cost function (Talagrand and Courtier, 1987): where x b (t 0 ) ∈ R n is the background model state; B 0 ∈ R n×n is the background error covariance matrix; H k is the observation operator at time t k ; the vector is the first-guess innovation vector at time t k ; the y(t k ) ∈ R m k and R k ∈ R m k ×m k represent the observational vector and its associated error covariance matrix at time t k , respectively; m k is the number of observations assimilated during time step k.
The BASCOE system has been, up to now, designed to assimilate observational profiles delivered by limb-scanning instruments.Hence, the observation operator H k simply consists in a linear interpolation of the model value at the observation tangent point.We assume that the observation errors are uncorrelated both horizontally and vertically.The observation error covariance matrix R k is thus defined diagonal: where r is an adjustable observation error parameter and σ y (i) t k is the measurement error at level i and time t k .The observations and their errors are described in Sect.3.1 while the adjustment of r is described in Sect.2.5.Note that the parameter r governing the observational error matrix R k is introduced into the BASCOE 4D-Var system to allow for comparison with the EnKF.
The dimension n of the matrix B 0 makes the computation of the background term of Eq. (2) unfeasible by current computers.To avoid the inversion of B 0 , a control variable transform is introduced: where ξ is a new control variable, δx 0 is the analysis increment and L is the square root of B 0 : Hence, the cost function is then re-written as The method used to formulate the operator L is discussed in Sect.2.4 and additional information of this incremental form of 4D-Var may be found in Errera and Ménard (2012).The present study used BASCOE 4D-Var version b07.27.

The EnKF system
In this section, we describe a specific variant of the EnKF algorithm as implemented into the BASCOE system (BAS-COE EnKF version b08.06).The general algorithm follows the theoretical formulation of the EnKF with perturbed observations (Houtekamer and Mitchell, 1998;Evensen, 2003).
Here, we provide only details that are essential to understand the performed experiments.An ensemble of initial states is produced by adding, to a model state, a set of spatially correlated perturbations according to the prescribed initial error covariance.The details of the procedure are described in details in Sect.2.4.The ensemble of model states is propagated forward in time using the same tracer version of the BASCOE CTM as used in the 4D-Var system (see Sect. 2.1).In a practical implementation, the model error covariance is represented by the addition of a stochastic noise η i to each ensemble member at each model time step: where N is the size of the ensemble and the superscripts f and a stand for model forecast and analysis, respectively.All other symbols have the same meaning as in the previous section, and the procedure to simulate the model noise η i is discussed in the next section.
To derive the analysis equation, we define first the matrix holding the ensemble members at time t k , x i (t k ) ∈ R n : In practice, the ensemble size N is much smaller than the dimension of the model state vector n.The ensemble mean is stored in the vector x(t k ) ∈ R n : Let us note the perturbation of an ensemble member as The ensemble perturbation matrix X (t k ) is then written as The ensemble forecast error covariance matrix B e (t k ) ∈ R n×n is obtained from this ensemble perturbation matrix: The matrix B e (t k ) can be also rewritten in terms of individual perturbations as Using the same notation as in the previous section, we define the matrix of perturbed observations as where j (t k ) ∈ R m k are observation perturbation vectors at time t k generated by random Gaussian numbers characterized by a zero mean distribution and a standard deviation equal to the observational error The observation error covariance matrix R k is defined as in the 4D-Var version by Eq. ( 3).The analysis equation in the ensemble Kalman filter stochastic formulation, i.e. with perturbed observations (Houtekamer and Mitchell, 2001;Evensen, 2003), is written as where X a (t k ) is the analysis ensemble matrix, X f (t k ) is the forecast ensemble matrix and is the ensemble innovation matrix at time t k .
A widely known issue with the EnKF method is its tendency to produce analyses with noisy spatial correlations at large distances in the analysis covariance.This is due to the finite and relatively small size of the ensemble compared to the size of the model state vector (Houtekamer and Mitchell, 2001).To filter out this noise, we follow the method proposed by S. E. Cohn and R. Ménard in 1997 and applied in many EnKF systems (e.g.Houtekamer and Mitchell, 2001;Hamill et al., 2001;Constantinescu et al., 2007b;Fertig et al., 2007;Sakov et al., 2010).The method consists of using the Schur (element-wise) product of the ensemble covariance matrix with a compact support correlation function, here denoted ρ.The function ρ used in this study is the fifth-order piecewise rational function of Gaspari and Cohn (1999) which is isotropic and decreases monotonically with distance depending on the correlation length scale L loc .The function ρ is positive only for distances that are less than 2L loc and zero otherwise.We applied this procedure to both horizontal and vertical correlations, using the compact support correlation functions ρ h and ρ v , with correlation length scales L h loc and L v loc , respectively.The choice of these parameters is discussed in Sect.3.2.2.
The actual implementation of the analysis equation is thus written as follows (omitting the time index): where the notation A • B denotes the Schur product between two matrices A and B. The indexes m and o are introduced to show that the dimension of the matrix ρ corresponds to the model and observation space dimensions, when the Schur product is applied to the matrix B e H T and HB e H T , respectively.The observational operator H involves the vertical and horizontal interpolations on the model grid.And the function ρ has a length scale which is much broader than the interpolation distances.Hence, the order of the interpolation and the Schur product can be interchanged without significant loss of accuracy.So, Eq. ( 17) is written approximately as The application of the Schur product to the ensemble covariances has several advantages.First, the correlation function filters out small and noisy correlations related to observations at large distances.Second, it allows the EnKF to perform reasonably well even with a small number of ensemble members.Houtekamer and Mitchell (2001) stated that the use of the Schur product improves the conditioning of the matrices B e H T and HB e H T .They also argued that the Schur product tends to reduce and smooth the effect of observations at intermediate distances.
In practice, the forecast error covariance matrix B e is never computed explicitly.The ensemble representation (Eq.12) is used instead: In our system the number of observations per model time step is rather small, allowing the inversion of the innovation matrix [HB e H T + R] for a reasonable CPU cost.

Ensemble initialization and model error generation
Several authors have reported the problem of the EnKF divergence: the decreasing ability of the filter to correct the ensemble state towards the observations after a certain number of assimilation cycles (Houtekamer and Mitchell, 1998;Hamill, 2006).The exact cause of this filter divergence is not entirely clear, but two main reasons have been raised: (i) the variance of the ensemble forecast error becomes too small when the effect of model error in the prediction is not considered (Lorenc, 2003); and (ii) the finite sample size causes a mismatch between estimated and true error variance (Houtekamer and Mitchell, 1998).A common method preventing the filter divergence is to increase artificially the ensemble covariance.In our system, the error covariance is increased by adding a state-wide model error η i (Eq.7) at every model time step to each ensemble forecast.
Let us first provide a short description of the method to formulate the variational background error covariance matrix, as proposed by Courtier et al. (1998) and adopted to the 4D-Var version by Errera and Ménard (2012).In this study, the method is used not only to compute the matrix B 0 in the 4D-Var system (Eq.5), but also to compute the initial ensemble and the model error in the EnKF system.The EnKF uses flow-dependent ensemble forecast error covariance (Eq.12) evolving in time with the ensemble.On the contrary, 4D-Var reinitializes the background error covariance every 24 h.
As stated in Errera and Ménard (2012), the formulation of the background error covariance matrix is crucial for any variational data assimilation system.The matrix B 0 should be sufficiently compact to be implemented numerically and sufficiently complex to represent adequately realistic error covariances of the first guess field.To achieve this goal, there are several approaches.The proposed method expresses the spatial correlations on a spherical harmonic basis (Courtier et al., 1998).It is based on the fact that on such basis, homogeneous and isotropic horizontal correlations are represented by a diagonal matrix with repeating values on the diagonal (for the same zonal wave number).
In this case, the operator L introduced in Eq. ( 4) is defined by: where is the (diagonal) background error standard deviation matrix; 1/2 is the spatial correlation matrix defined on a spherical harmonic basis hence diagonal; S is the spectral transform operator from the spectral space to the model space.
In the present study, the spatial correlation matrix considers Gaussian correlations in the horizontal and in the vertical directions with correlation length scales fixed to L h 0 = 800 km horizontally and L v 0 = 1 level vertically.The vertical profile of the background standard deviation matrix is estimated using the NMC method (Parrish and Derber, 1992) and is shown in Fig. 1.This profile is used for every point of the horizontal grid.
The operator L is also used to generate the initial deviation xi (t 0 ) and the model error η i (t k ) of the EnKF system.In the case of the initial deviation, this ensures that at the initial time, both EnKF and 4D-Var systems have the same error statistics.For an initial deviation, we have: while for a model error, we have where ζ i (t 0 ) and ψ i (t k ) are normally distributed random numbers with zero mean and variance equal to 1, defined in the spectral space; and where α is a model error parameter smaller than 1.Normalizing to a normal distributed random deviate is exactly what should also be done for the simulation of the model error term.In the theory of the Kalman filter (Kalman, 1960), the model error η i is uncorrelated with the observation error and with the initial condition error.The model error and the analysis error must remain uncorrelated at later times.Hence, the perturbations ψ i should be different at each model time step.
The algorithm to generate EnKF state perturbations is then identical to the algorithm of the 4D-Var background error covariance generation.However, the operator L is applied to the normally distributed random deviate ζ i (Eq.21) rather than to the control variable ξ (Eq.4).

Methodology of tuning the parameters r and α
As stated before, the EnKF system has two adjustable parameters: r and α.The observation error parameter r and www.geosci-model-dev.net/7/1451/2014/ the model error parameter α are adjusted statistically using a χ 2 diagnostic introduced by Ménard and Chang (2000) for the Kalman filter.This diagnostic compares the innovation vector d with the innovation covariance matrix S = HB e H T + R (Eq.16) using a Mahalanobis norm (Talagrand, 2010).Specifically, at every analysis time step k, the value of χ 2 k is computed as follows: An assimilation system is said to be optimal when χ 2 k is equal to the number of observations m k at time t k where denote the statistical expectation.Since the number of observations per time step is relatively large, i.e. about 1100 in our case, we can approximate χ 2 k by a realization of χ 2 k for a given set of observed values (i.e. for a realization of the observation error).
As shown by Ménard and Chang (2000), modifying the model error parameter changes the trend (or slope) of χ 2 k over time, while modifying the observational error parameter r changes the mean value of χ 2 k .Since these two parameters have distinguishable effects on the time series of χ 2 k (mean and trend), they can be tuned separately, as summarized by Khattatov et al. (2000): 1. Run the assimilation system and monitor χ 2 k /m k .If its value increases (decreases) consistently with time, increase (decrease) α.This procedure is repeated until the mean value of χ 2 k /m k does not show a trend in its time series.

If the average value of χ 2
k /m k is larger (smaller) than 1, increase (decrease) the observation error scaling factor r.
3 Experimental set-up

Observations
The data used in this study are ozone profiles given by EOS (Earth Observing System) Aura-MLS (microwave limb sounder) version 2.2 (Froidevaux et al., 2008).The observations of ozone cover the latitude range between 82 • S and 82 • N with an along-track separation of around 165 km between consecutive scans.Around 3500 vertical scans are performed every day.Ozone profiles have a vertical resolution of around 3 km in the stratosphere and they are valid for scientific studies between 215 and 0.02 hPa.However, ozone data are not assimilated above 1 hPa because the tracer assumption is not valid above this pressure level.The observational error σ y (Eq. 3) is set from the instrumental error provided with each observation and increased if necessary to represent at least 5 % of the observation value.This accounts for the representativeness error because smaller errors would give too large a weight to observations.

Calibration of the systems
To perform a proper comparison between the 4D-Var and EnKF, we must calibrate both systems in such a way that they use the same error statistics.Our starting point is the calibration of the error covariance matrix B 0 used by the 4D-Var system.This is realized through a calibration of the spatial correlation operator L, i.e. the background error spatial correlation matrix and the background error standard deviation matrix (Eq.20).The calibrated operator L is then used in the EnKF system, where the model error parameter α and the observation error parameter r are estimated using the χ 2 diagnostic.Once the parameter r is estimated, its value is passed to the 4D-Var system for a final test of performance.

4D-Var
The matrix of the 4D-Var system has been calibrated using the NMC method (Parrish and Derber, 1992;Rabier et al., 1998;Bannister, 2008).For this purpose, a 6-month assimilation experiment (May-October 2008) has been performed assuming a matrix set-up as 30 % of the background field and a matrix assuming Gaussian correlations with correlation length scales L h 0 = 800 km horizontally and L v 0 = 1 level vertically.The NMC method assumes that the B 0 matrix may be estimated by the difference between pairs of forecasts of different lead times but same validity times.In meteorology, the forecast pairs have typically 24 and 48 h lead times.In our case the forecast pairs have 0 and 24 h lead times, i.e. the difference between the forecast pairs is equivalent to the analysis increments of the 4D-Var system.Indeed contrary to the meteorological case, there is no need in chemistry to perform a 24 h forecast to balance the model fields.
The calibration of B 0 with the NMC method has been computed for several periods in 2008: May-July, August-October and May-October.No significant differences in the estimated and have been found.So the period May-October is used in this study.Moreover, to parameterize the diagonal values of the matrix , two variants of it have been tested using the NMC method.They assume that the background error standard deviations are defined by (1) a one-dimensional pressure profile and (2) a two-dimensional latitude-pressure field.The 4D-Var assimilation experiments using these two parameterizations of have not shown important differences in results.So the one-dimensional profile of (see Fig. 1) has been used to compare the 4D-Var and EnKF systems.We have also estimated the correlation matrix with the NMC method.But the differences between the 4D-Var assimilation considering the NMC and the Gaussian (where both experiments use the NMC ) have not shown an important difference in results.So the Gaussian has been kept to ease its implementation in the EnKF system -specifically its explicit formulation of compact support correlation length scales.

EnKF
Once the matrices and are calibrated in the 4D-Var system, the resulting operator L is passed to the EnKF system.As explained in Sect.2.3, the EnKF uses a Schur product with a compact support correlation function as a localization method.The use of Schur product reduces the resulting correlation length scales.In order to maintain the correlations of the EnKF analysis comparable to those of the 4D-Var system, a different setting of the correlation length scales is adopted to generate the model error (Eq.22).Let C be a matrix resulting from the Schur product of two matrices A and B: C = A • B. If the correlation length scales of A and B are, respectively L A and L B , the correlation length scale of C is given by (Gaspari and Cohn, 1999) 1 (24) In our case, L A corresponds to the correlation length scale L loc of the compact support correlation function ρ and L B corresponds to the correlation length scale of the forecast ensemble covariance matrix B e , denoted in the following by L e .
Similarly, L C corresponds to the correlation length scale of the analysis ensemble covariance matrix, denoted in the following by the effective correlation length scale L eff .As we would like to maintain the L eff equal to the Gaussian correlation length scales used in the 4D-Var (i.e.L h 0 = 800 km and L v 0 = 1 level), we need to set L loc and L e such that L eff = L 0 .First, reasonable values for the localized correlation length scales were chosen: L h loc = 2000 km and L v loc = 1.5.In this configuration, the correlation length scales used to generate the model error in the EnKF are defined by L h e = 872 km and L v e = 1.3 model level.The next step in the calibration of the EnKF is the tuning of α and r using a χ 2 diagnostic (see Sect. 2.5). Figure 2 shows the time evolution of χ 2 k /m k for three EnKF runs.The first run assumed r = 1 and α = 0, resulting in χ 2 k /m k at ∼ 2.8 initially and growing quickly during the following days.The model error parameter α was then adjusted by trial and error until the time series of χ 2 k /m k displayed no trend, a condition met by a run using α = 0.025.This second run still resulted in a too-large χ 2 k /m k , around 3. A second series of trial and error adjustments for the observation error parameter r led to the final run for the EnKF calibration: setting r = 1.65 resulted in analyses with χ 2 k /m k close to 1. Figure 3 displays the observation-minus-forecast (OmF) statistics, biases and standard deviations with respect to the assimilated MLS data, for these three EnKF experiments with [α = 0, r = 1], [α = 0.025, r = 1] and [α = 0.025, r = 1.65].A clear improvement is found after the tuning of α, i.e. the presence of the model error term is essential for the EnKF to function properly.The impact of the tuning of r is not so visibly marked; however, the final EnKF experiment using α and r parameters both tuned shows systematically better results than the experiment where only α is tuned.Overall,  this illustrates an important sensitivity of the EnKF to the adjustable error parameters.The value of r = 1.65 has been passed to the 4D-Var system for a final experiment to ensure the use of common observation error statistics.Note that no significant differences in the OmF statistics have been found between the 4D-Var analysis with r = 1 and r = 1.65 (not shown).Thus, while the 4D-Var requires important work to develop an adjoint operator, the tuning of error parameters does not require large efforts in the context of stratospheric chemistry.

Numerical performance
We tried to configure the EnKF and 4D-Var systems to allow comparable total CPU costs.Preliminary experiments with the 4D-Var system show that the 4D-Var performs reasonably well using about 20 iterations.Accounting for the adjoint model integration in the 4D-Var, we have chosen for the EnKF an ensemble size of 40 members.
In terms of numerical performance, the 4D-Var requires about 750 s on a single processor to integrate one assimilation window of 24 h.The EnKF algorithm consists of two separate phases: the ensemble propagation and the analysis (Eq.19).The analysis phase of the EnKF requires 550 s to perform 48 analyses, covering the period of 24 h, (48 analyses correspond to the model time step of 0.5 h) and 500 s to propagate the ensemble during the same period on a single processor.The actual EnKF configuration allows solving Eq. 19 on multiple processors, which helps to gain an important wall clock time: the analysis phase requires 100 s on 16 processors.Note that the computation of the Kalman gain in our EnKF is performed using Cholesky decomposition where the full observation vector is considered at a given time step.No simplification is used to compute the inversion of the innovation matrix [HB e H T + R] or the matrix B e H T .The CPU cost of the EnKF can be improved using local domain decomposition and integration of the ensemble members on different processors in parallel.These two tasks will be a subject of our future work.

Results and discussion
In this section, we discuss the performance of the EnKF and 4D-Var after calibration as described in the previous section.The performance is evaluated using standard OmF statistics, i.e. the average of the differences (bias) between observations and forecasts, as well as their standard deviation.Here we use 24 h forecasts and the assimilated MLS O 3 profiles as observation data.This statistical diagnostic is performed in five different latitudinal bands covering the globe.Figure 4 shows the OmF statistics for the period of September and October 2008.The OmF errors are computed in percents from the observation values.In order to evaluate if the difference between OmF errors provided by EnKF and 4D-Var are statistically significant for a confidence interval of 95 %, the Student's (t test) and the Fisher's test (F test) are computed for biases and standard deviation errors, respectively.The application of these statistical tests are explained in the Appendix.Their results are depicted in Fig. 4 by green/red stars, which correspond to significant/not significant differences for the chosen confidence interval.
As seen from the figure, both systems exhibit a small OmF bias, generally less than 2 % in the pressure range 1-100 hPa.The OmF standard deviations are generally less than 10 % in the range 1-50 hPa and increase to a maximum of ∼ 50 % in the tropical upper troposphere (200 hPa).In the upper troposphere-lower stratosphere (UTLS), the limb sounding observations by Aura-MLS are less precise and less accurate than in the mid-stratosphere (Froidevaux et al., 2008) which explains the increase in the bias and OmF standard deviation below 50 hPa.According to the t and F tests, the OmF of the EnKF and 4D-Var 24 h forecasts are statistically equivalent at almost all levels and in all latitude bands.We note however that decreasing the value of the confidence interval from 95 to 90 %, for example, reveals that following the t and F tests, the OmF of the two systems are different at pressure levels around 4 and 70 hPa in the South Pole region.We note also that in this same region the OmF bias is much larger for both systems in the lower and upper stratosphere (but still smaller than 6 %).The biases of two systems reach agreement with the same sign and roughly of the same magnitude.This indicates that the origin of the bias is common for both systems.
Figure 5 shows a time series of the OmF bias and standard deviation in the upper stratosphere (1-10 hPa) above the South Pole region.The bias and standard deviations of both systems are very similar and remain stable between June and September, with a negligible bias and standard deviations around 6-7 %.From September to November, these values increase up to 7 % for the bias and 12 % for the standard deviation.During that period, the dynamics becomes relatively active in the upper stratosphere above the South Pole due to the breakup of the vortex.The ERA-Interim wind fields used to drive the transport model may be either insufficiently accurate or their update period (6 h) may be too long to allow the systems to provide accurate 24 h forecasts.
Figure 6 shows the time series of the OmF bias and standard deviation in the lower stratosphere (10-100 hPa) above the South Pole region.Again, both systems have similar OmF departures.Between May and September, the biases are generally less than 2 % from the observation values, and   the standard deviations are less than 10 % until the end of August.The standard deviations increase quickly during the first days of September and reach maximum values -around 15-17 % -in mid-September, the 4D-Var providing values slightly lower than those from the EnKF.At the beginning of November, the standard deviations have decreased back to pre-September levels.In September, the bias of both experiments also slightly increases up to 4 % where 4D-Var shows again values slightly lower that those by the EnKF.In mid-October, the bias becomes negative but with values lower than −2 %.Since the months of September and October are precisely the period of photochemically driven ozone destruction, we attribute these degradations in the OmF series to the absence of chemistry in our simplified model: during the ozone hole period, the tracer transport approximation is clearly not adequate.In this situation, the 4D-Var delivered a slightly better performance than the EnKF.This may be due to the assimilation window of 24 h used by the 4D-Var, compared with the sequential assimilation of the EnKF.
Outside the ozone hole period/region and based on the OmF statistics between MLS and the analysis, no significant differences have been found between the systems.Observation-minus-forecast statistics have also been computed against independent observations by Envisat/MIPAS (Michelson Interferometer for Passive Atmospheric Sounding) (Raspollini et al., 2013).Although the OmF statistics differ slightly, no statistical differences between the EnKF and 4D-Var systems have been found either (not shown).Hence, the slightly different OmF statistics are only due to the differences between the MLS and MIPAS data sets.We thus conclude that both systems are statistically equivalent in the observation space except during the ozone hole period, where the 4D-Var delivers analyses with slightly smaller OmF departures than the EnKF.
However, individual analyses provided by the two systems can exhibit larger differences than those given by OmF statistics.Figure 7 (upper row) shows the ozone distribution at 54.6 hPa on 15 September 2008 by both systems and their differences.Although both analyses display similar patterns, they are clearly not identical.But looking at the monthly averaged maps (Fig. 7, lower row) these difference become small.Hence, while there are some differences between the analyses by the two systems, these are not systematic and can be attributed to noise in the analyses.
Finally, let us come back to the time series of the χ 2 diagnostic for the EnKF system (Fig. 2).A small but sharp increase occurs during the first days of September.We attribute this jump to the onset of photochemically driven ozone depletion.Figure 8 shows the formation of the ozone hole through ozone analyses delivered by the EnKF system at 54.6 hPa (model level 22) in the Southern Hemisphere from 29 August to 20 September (one snapshot every two days).While it takes several days to see a clear ozone hole above Antarctica (even on 20 September ozone depletion is not complete yet), ozone depletion started during the first days of September, i.e. exactly during the sudden growth of χ 2 .If this growth is really due to a missing process in our model (in this case the ozone polar chemistry), then χ 2 may be used as a tool to monitor the model error.Note that although the time series of the standard deviation in the OmF also increases during the formation of the ozone hole (see Fig. 6), the growth in the standard deviation is smoother than displayed by the χ 2 and thus provides a less clear signal.Future work will extend the comparison to the full BASCOE CTM including the ozone polar chemistry, and if our explanation is correct this sharp increase should disappear from the χ 2 time series.

Conclusions
The first aim of this paper was to present the implementation of the EnKF method in the BASCOE system.This system was originally based on 4D-Var, and our motivation was to bypass the development and maintenance of an adjoint model.The new EnKF version of BASCOE was developed accounting for two adjustable parameters: the parameter α controlling the model error term of the EnKF and the parameter r controlling the observational error.These two parameters have been adjusted based on the monitoring of a χ 2 test measuring the misfit between the control variable and the observations.In this study, we have turned off the chemistry in the CTM of BASCOE and have considered only ozone transport.This configuration allowed considerably faster execution of both systems and a large number of assimilation experiments.The second aim of this paper was to properly compare the EnKF and 4D-Var despite the fundamental differences between the two methods.To this end, we have used the same numerical model, an identical set of observations, an identical observation operator H and the same observation error covariance matrix R. Furthermore, the background error covariances have been carefully calibrated in both systems.First, the background error covariance matrix B 0 of the 4D-Var system has been calibrated using the NMC method.Two components of B 0 , i.e. the background error spatial correlation matrix and the background error standard deviation matrix , have been transferred to the EnKF system to generate the initial ensemble and the model error term.The background error statistics were then carefully designed to use the same correlation models and equivalent length-scales in both systems.The EnKF parameters α and r have been calibrated using the χ 2 test.The value of r has been passed to the 4D-Var system to ensure that both systems use identical observation error covariances.Despite a straightforward implementation of the EnKF numerical algorithm, the resulting EnKF version of BASCOE was shown to be extremely sensitive to the parameter values.Thus, an accurate adjusting procedure of these parameters is of great importance to the performance of the system.
We note that the EnKF has a model error term whereas the 4D-Var is considered as a strong constraint problem.It may thus be argued that their comparison cannot be totally fair.Note however that the EnKF is not able to work properly without a model error, and the implementation of the model error term was relatively fast.On the other hand, the effort required to implement a weak constraint in the 4D-Var seems quite large.We thus have considered both methods in their original form.Consequently, we have not attempted to introduce the background error covariance localization within 4D-Var.Such localization would have resulted in a comparison with a form of 4D-Var that is never used in practice.
The two systems were compared through the assimilation of Aura-MLS ozone observations for the period May-December 2008, thus comprising the polar night and ozone hole periods.We focused our attention on the lower and middle stratosphere where the ozone lifetime is much larger than the observational update frequency.
We have assessed the performance primarily in terms of observation-minus-24 h-forecast statistics and found that the analyses provided by the two systems are significantly similar for confidence interval of 95 %, with biases smaller than 5 % and standard deviation errors smaller than 10 % in most of the stratosphere.In September and October, the two systems display an increase in their OmF above the South Pole due to (1) the vortex breakup in the upper stratosphere and (2) the ozone hole formation in the lower stratosphere.The degradation of the OmF is these cases are attributed to (1) inaccuracies in the modelling of the dynamics and (2) the omission of ozone hole chemistry in the model.
Since the biases are markedly similar, they most probably have the same causes: these can be deficiencies in the model and in the observation data set.The remarkably similar performance also shows that in the context of stratospheric transport, the choice of the assimilation method can be based on application-dependent factors, such as CPU cost or the ability to generate an ensemble of forecasts.
The BASCOE 4D-Var system can provide analyses taking stratospheric chemistry explicitly into account, in the forward as well as the adjoint model.The EnKF system presented here accounts only for stratospheric transport, not chemistry.The application of the EnKF method to the fullchemistry model may require a careful tuning procedure for each chemical species, a task that can be time consuming.Hence, an adaptive calibration procedure of the error covariances (similar to Anderson, 2009, or Li et al., 2009) should be implemented.The implementation of EnKF with chemistry is ongoing and will be reported in future studies.

Fig. 1 :Figure 1 .
Fig. 1: Vertical profile of the background error as used for the standard deviation matrix Σ.

Fig. 4 :
Fig. 4: OmF statistics for the EnKF (red lines) and the 4D-Var (blue lines) with respect to the assimilated EOS Aura-MLS data for the period September-October 2008.Bias (top row) and standard deviation (bottom row) for 5 different latitudinal bands.The green or red stars show the result of the Student-and Fisher-tests of significance on the 95% level (see text). 21

Figure 4 .Fig. 5 :Figure 5 .
Figure 4. Observation-minus-forecast statistics for the EnKF (red lines) and the 4D-Var (blue lines) with respect to the assimilated EOS (earth observing system) Aura-MLS data for the period of September and October 2008.Bias (top row) and standard deviation (bottom row) for five different latitudinal bands.The green or red stars show the result of the Student's and Fisher's tests of significance on the 95 % level (see text).

Figure 7 .
Figure 7. Ozone at 54.6 hPa (model level 22) from 4D-Var (left), EnKF (middle) and their absolute differences (right).The upper row corresponds to a snapshot on 15 September 2008 while the lower row corresponds to a monthly mean for the month of September 2008.

Figure 8 .
Figure 8. Ozone distributions at 54.6 hPa (model level 22) for the Antarctic region.The snapshots are taken for the period between 29 August and 20 September every two days.Contours are plotted for every 0.25 ppmv.