An ensemble Kalman filter data assimilation system for the whole neutral atmosphere

Koshin, Dai; Sato, Kaoru; Miyazaki, Kazuyuki; Watanabe, Shingo

doi:https://doi.org/10.5194/gmd-13-3145-2020

Articles | Volume 13, issue 7


Development and technical paper

13 Jul 2020


Abstract

A data assimilation system with a four-dimensional local ensemble transform Kalman filter (4D-LETKF) is developed to make a new analysis dataset for the atmosphere up to the lower thermosphere using the Japanese Atmospheric General Circulation model for Upper Atmosphere Research. The analysis focuses on the period from 10 January to 20 February 2017, when an international radar network observation campaign was performed. The model resolution is T42L124, which can resolve phenomena at synoptic and larger scales. A conventional observation dataset provided by the National Centers for Environmental Prediction, PREPBUFR, and satellite temperature data from the Aura Microwave Limb Sounder (MLS) for the stratosphere and mesosphere are assimilated. First, the performance of the forecast model is improved by modifying the vertical profile of the horizontal diffusion coefficient and modifying the source intensity in the non-orographic gravity wave parameterization by comparing it with radar wind observations in the mesosphere. Second, the MLS observational bias is estimated as a function of the month and latitude and removed before the data assimilation. Third, data assimilation parameters, such as the degree of gross error check, localization length, inflation factor, and assimilation window, are optimized based on a series of sensitivity tests. The effect of increasing the ensemble member size is also examined. The obtained global data are evaluated by comparison with the Modern-Era Retrospective analysis for Research and Applications version 2 (MERRA-2) reanalysis data covering pressure levels up to 0.1 hPa and by the radar mesospheric observations, which are not assimilated.


Received: 09 Sep 2019 – Discussion started: 07 Nov 2019 – Revised: 10 May 2020 – Accepted: 14 May 2020 – Published: 13 Jul 2020

1 Introduction

It is well known that the earth's climate is remotely coupled: for example, when El Niño occurs, convective activity in the tropics strongly affects midlatitude climate with the appearance of the Pacific–North American pattern (Horel and Wallace, 1981). Convective activity in maritime continents also modulates midlatitude climates by generating the Pacific–Japan pattern (Nitta, 1987). Most of these climate couplings between the tropics and midlatitude regions are caused by the horizontal propagation of stationary Rossby waves (Holton and Hakim, 2013). Teleconnection through stratospheric processes has also been known. For example, the sea-level pressure in the Arctic rises during El Niño. It was shown that this teleconnection occurs by modulation of planetary wave intensity and propagation in the stratosphere (Cagnazzo and Manzini, 2009). It is also well known that the occurrence frequency of stratospheric sudden warming (SSW), which exerts a strong influence on the Arctic oscillation of sea-level pressure (Baldwin and Dunkerton, 2001), is high during the easterly phase of the quasi-biennial oscillation (QBO) in the equatorial stratosphere (Holton and Tan, 1980). This is also due to the modulation of the propagation of planetary waves in the stratosphere. Thus, the stratosphere is an important area that brings about the remote coupling of climate.

Recently, the presence of interhemispheric coupling through the mesosphere has been reported as well. When the temperature in the polar winter stratosphere is high, the temperature in the polar summer upper mesosphere is also high, with a slight delay (Karlsson et al., 2009). This coupling is clear for at least a 1-month average (Gumbel and Karlsson, 2011). The interhemispheric coupling, which is initiated by SSW in the winter hemisphere, occurs at shorter timescales (Körnich and Becker, 2010). When SSW occurs in association with the breaking of strong planetary waves originating from the troposphere, the westerly wind of the polar night jet significantly weakens or, in strong cases, even turns easterly. The critical-level filtering of the gravity waves toward the mesosphere is then modulated, and the gravity wave forcing that drives the mesospheric meridional circulation with an upward (downward) branch on the equatorial (polar) side becomes weak. Thus, the temperature in the equatorial region increases and the poleward temperature gradient in the summer hemisphere weakens. The weak wind layer above the easterly jet in the summer hemisphere lowers so as to satisfy the thermal wind relation. The eastward gravity wave forcing region near the weak wind layer also descends and the upward branch of the meridional circulation, which maintains extremely low temperature in the summer polar upper mesosphere, weakens.

However, there is little observational evidence of gravity wave modulation in the mesosphere. The Interhemispheric Coupling Study by Observations and Modeling (ICSOM: http://pansy.eps.s.u-tokyo.ac.jp/icsom/, last access: 26 June 2020) is a project to understand mesospheric gravity wave modulation associated with SSWs on a global scale through a comprehensive international observation campaign with a network of mesosphere–stratosphere–troposphere (MST), meteor, and medium-frequency (MF) radars as well as complementary optical and satellite-borne instruments. Since 2016, four campaigns have been successfully performed.

In the ICSOM project, we are also proceeding with a model study using a gravity-wave-permitting high-top general atmospheric circulation model (GCM) that covers the entire troposphere and middle atmosphere (up to the lower thermosphere) simultaneously. However, this is not easy because the GCMs including the entire middle atmosphere are not yet sufficiently mature even for relatively low resolutions that do not allow explicit gravity wave simulation (e.g., Smith et al., 2017). Therefore, verification of the GCMs by high-resolution observations is necessary. In the ICSOM project, by validating the high-top GCM using data from the comprehensive international radar observation campaigns, it is expected to reproduce high-resolution global data with high reliability. Using these global data, we plan to confirm the regional representation of gravity wave characteristics detected by each radar and deepen the understanding of interhemispheric coupling quantitatively with a resolution of gravity wave scales.

Gravity wave simulation research using high-resolution GCMs has been performed in the past (e.g., Hamilton et al., 1999; Sato et al., 1999, 2009, 2012; Watanabe et al., 2008; Holt et al., 2016). However, reproducing gravity wave fields in the global atmosphere at a specific date and time requires significant effort (Eckermann et al., 2018; Becker et al., 2004). Data assimilation up to the scale of gravity waves is ideal to create global high-resolution grid data sequentially. However, current data assimilation schemes work well for geostrophic motions such as Rossby waves but not necessarily for ageostrophic motions such as gravity waves. Recent studies (Jewtoukoff et al., 2015; Ehard et al., 2018) reported that gravity waves observed in the European Centre for Medium-Range Weather Forecasts (ECMWF) operational data are partly realistic in the lower and middle stratosphere, but more validation with observation data is necessary. It has also been shown that the difference in horizontal winds between reanalysis datasets is quite large in the equatorial region where the Coriolis parameter becomes zero (Kawatani et al., 2016). The reasons for this problem may be the insufficient maturity of the models to accurately express ageostrophic motions and/or the shortage of observation data, including gravity waves, to be assimilated.

Data assimilation for the mesosphere is particularly not easy, partly because the energy ratio of Rossby waves and gravity waves is reversed there (Shepherd et al., 2000) and partly because observational data for the mesosphere are significantly limited compared to those for the lower atmosphere. In addition, it has been shown that, in the upper stratosphere and the mesosphere, Rossby waves are generated in situ due to baroclinic–barotropic instability caused by wave forcing associated with breaking or critical-level absorption of gravity waves propagating from the troposphere (Watanabe et al., 2009; Ern et al., 2013; Sato and Nomoto, 2015; Sato et al., 2018). It has been found that gravity waves are spontaneously generated in the middle atmosphere from the imbalance of the polar night jet (Sato and Yoshiki, 2008; Snyder et al., 2007; Shibuya et al., 2017), from an imbalance caused by the wave forcing due to primary gravity waves (Vadas and Becker, 2018; Hayashi and Sato, 2018), and also by shear instability caused by primary gravity wave forcing (Yasui et al., 2018). The Rossby wave generation in the middle atmosphere due to primary gravity wave forcing is regarded as a compensation problem, which makes it difficult to understand the change in the Brewer–Dobson circulation in terms of the relative roles of Rossby waves and gravity waves for climate projection with the models (Cohen et al., 2013). However, these instabilities and the in situ generation of waves in the middle atmosphere could significantly affect the momentum and energy budget in the middle atmosphere and above (Sato et al., 2018; Becker, 2017). Hence, it is necessary to understand the roles of these waves as accurately as possible based on credible, high-resolution model simulations validated by high-resolution observations.

In view of the situation described above, the following method may be one of the best existing ways to create high-resolution data for the entire middle atmosphere including gravity waves to understand the teleconnection through the mesosphere. First, a data assimilation is performed using a high-top but relatively low-resolution model to create grid data for the real atmosphere from the ground to the lower thermosphere including only larger-scale phenomena such as Rossby waves. Second, the analysis data obtained by the assimilation are used as initial values for a free run of high-resolution GCMs to simulate gravity waves. Eckermann et al. (2018) and Becker and Vadas (2018) have performed pioneering studies on the effectiveness of such free runs.

Reanalysis data over a long time period are produced using modern data assimilation schemes and released by meteorological organizations for climate analysis. These include the following: the ECMWF interim reanalysis (ERA-Interim; Dee et al., 2011) and the fifth reanalysis (ERA5; Hersbach et al., 2018) produced by a four-dimensional (4D) variational assimilation scheme (Var); MERRA (Rienecker et al., 2011) and the following version 2 (MERRA-2; Gelaro et al., 2017) by the National Aeronautics and Space Administration (NASA) by a three-dimensional Var (3D-Var); the National Centers for Environmental Prediction (NCEP) Climate Forecast System Reanalysis (CFSR; Saha et al., 2010) and the Climate Forecast System version 2 (CFSv2; Saha et al., 2014); and the Japanese 55-year reanalysis (JRA-55; Kobayashi et al., 2015) by a 4D-Var. ERA-Interim and JRA-55 cover up to a pressure of 0.1 hPa, NCEP/CFSR and NCEP/CFSv2 up to 0.266 hPa, and MERRA, MERRA-2, and ERA5 up to 0.01 hPa. However, global data for the middle and upper mesosphere to the lower thermosphere are not created regularly. As stated above, considering the importance of ageostrophic motions in the mesosphere and lower thermosphere (MLT), the data assimilation used for such meteorological organizations may not work very well for the middle stratosphere and above (Polavarapu et al., 2005). Therefore, in recent years, significant efforts have been made to assimilate data using GCMs that include the MLT region. Currently, the data available for studying the MLT region come from the Aura Microwave Limb Sounder (Aura MLS; beginning in 2004), Thermosphere Ionosphere Mesosphere Energetics and Dynamics (TIMED) Sounding of the Atmosphere using Broadband Emission Radiometry (SABER; beginning in 2002), and the Defense Meteorological Satellite Program (DMSP) Special Sensor Microwave Imager/Sounder (SSMIS; Swadley et al., 2008).

Global data for the atmosphere including the MLT region are valuable from the following viewpoints. First, they can improve prediction of the polar stratosphere (e.g., Hoppel et al., 2008, 2013; Polavarapu et al., 2005). It seems that anomalies in the MLT region start about 1 week earlier than stratospheric anomalies such as SSWs, propagating down to the troposphere. Thus, better understanding the MLT physics and chemistry has the potential to improve long-range weather forecasts. Second, it is possible to quantitatively understand the transport of minor species from the MLT region (e.g., Hoppel et al., 2008; Polavarapu et al., 2005). For example, high-energy particles originating from the upper atmosphere contribute to the production of NO_x, which modulates the ozone chemistry in the stratosphere. Thus, the quantitative evaluation of the transport of such species is important for the prediction of the ozone layer. Third, they contribute to space-weather prediction, particularly for the prediction of the near-space environment (e.g., Hoppel et al., 2013). Atmospheric waves excited in the lower and middle atmosphere, including gravity waves, Rossby waves, and tides, are main drivers of the general circulation in the height range of 100–150 km in the lower thermosphere (e.g., Akmaev, 2011; Miyoshi and Yigit, 2019). Thus, it is important to examine the properties of these waves in the mesosphere. Last but not least, it is interesting to understand middle atmosphere processes as a pure science (e.g., Hoppel et al., 2008).

The first attempt to create analysis data for the whole middle atmosphere using data assimilation was made by a Canadian group. They employed 3D-Var using the Canadian Middle Atmosphere Model (CMAM) with full interactive chemistry and nonlocal thermodynamic equilibrium (non-LTE) radiation (Polavarapu et al., 2005; Nezlin et al., 2009). The assimilation of the data in the troposphere and stratosphere has been shown to improve the analysis of large-scale phenomena (zonal wavenumber s<10) in the mesosphere (Nezlin et al., 2009). The daily mean time series from their data assimilation are validated by radar observations (Xu et al., 2011). Sankey et al. (2007) used the CMAM to carefully discuss the effectiveness of digital filters in the data assimilation. A series of studies at the Naval Research Laboratory (NRL) is remarkable. Hoppel et al. (2008) performed the first mesospheric data assimilation at the Advanced Level Physics and High-Altitude (ALPHA) prototype of the Navy Operational Global Atmospheric Prediction System (NOGAPS) using a 3D-Var assimilation system (NAVDAS). After that, they introduced a 4D-Var to assimilate data using the NRL Navy Global Environmental Model (NAVGEM), a successor of NOGAPS (Hoppel et al., 2013). In this system, the SSMIS data were also assimilated along with the SABER and Aura MLS data. The calculation of the background error covariance matrix was accelerated by introducing ensemble forecasts, and assimilation shocks to the model were reduced by using digital filters (McCormack et al., 2017; Eckermann et al., 2018). Global data with short time intervals were made by combining model forecasts with the assimilation products, and both short-term and annual variations of diurnal migrating tides were successfully captured (McCormack et al., 2017; Dhadly et al., 2018; Eckermann et al., 2018).
These assimilation data products are utilized for the study of quasi-2 d waves and 5 d waves, as well as tides (Eckermann et al., 2009, 2018; Pancheva et al., 2016), and for observation projects such as the Deep Propagating Gravity Wave Experiment (DEEPWAVE; Fritts et al., 2016). A data assimilation study using the Whole Atmosphere Community Climate Model (WACCM) at the National Center for Atmospheric Research (NCAR) has been also conducted. Pedatella et al. (2014b) applied a Data Analysis Research Testbed (DART) ensemble adjustment Kalman filter (EAKF), which is a 3D-Var combined with a statistical scheme, to the WACCM and made analysis data for the largest recorded SSW event, which occurred in 2009. They indicated that better analysis of the mesosphere requires assimilation of the mesospheric observational data. A similar discussion was made by Sassi et al. (2018) using the Specified Dynamics WACCM (SD-WACCM), in which a nudging method was implemented. The reality of the analysis highly depends on the model's performance in the MLT region. One of the critical components to determine the MLT region in the model is gravity wave parameterizations (Pedatella et al., 2014a; Smith et al., 2017). According to Pedatella et al. (2018), the analysis of the SSW in 2009 by the WACCM using DART showed that the expression of the downward transport of chemical components by the data assimilation is better than by the nudging method.

Nowadays whole-atmosphere models covering the surface to the exosphere have been developed (Akmaev, 2011). Data assimilation and data nudging studies using a whole-atmosphere model were performed focusing on the SSW in 2009. These include studies using the whole-atmosphere data assimilation system (WDAS), which includes the whole-atmosphere model and a 3D-Var analysis system (Wang et al., 2011), the Ground-to-topside model of Atmosphere and Ionosphere for Aeronomy (GAIA) with a nudging method (Jin et al., 2012), and SD-WACCM (Chandran et al., 2013; Sassi et al., 2013). Outputs from a long-term run using GAIA, which was nudged to the reanalysis data up to the lower stratosphere, were used for a momentum budget analysis in the whole middle atmosphere, and the importance of the in situ generation of gravity waves and Rossby waves in the middle atmosphere was suggested (Sato et al., 2018; Yasui et al., 2018).

Although most 4D data assimilation studies described above used 4D-Var, the method using an ensemble Kalman filter is also possible. The 4D-Var codes need to be developed for each model. In contrast, the four-dimensional local ensemble transform Kalman filter (4D-LETKF) developed by Miyoshi and Yamane (2007), which is a statistical assimilation method, is versatile and can thus be implemented in any model relatively easily. This study develops an assimilation system using the 4D-LETKF with a GCM with a top in the lower thermosphere. As the first step of the ICSOM project, we used a low-resolution version of the GCM and examined the best parameters of the assimilation system for the middle atmosphere (i.e., the atmosphere up to the turbopause, ∼ 100 km), as no studies employ the 4D-LETKF to assimilate data for such a high atmospheric region. The observation datasets used for the data assimilation are Aura MLS (v.4.2) temperature, which covers the whole stratosphere and mesosphere, and NCEP PREPBUFR, which is a standard dataset for the troposphere and lower stratosphere. The target time period is from January to February 2017, which includes the second ICSOM observation campaign. On 1 February 2017, the criteria of the major SSW were satisfied. The structure of this paper is as follows. Section 2 describes the forecast model, observation data, and data assimilation system. Section 3 presents the results of the parameter assessment. Section 4 presents the results of analysis regarding fields in the middle atmosphere in ICSOM-2 using data from the best parameter setting. Section 5 gives the summary and concluding remarks.

2 Methodology

2.1 Forecast model

We used the Japanese Atmospheric GCM for Upper Atmosphere Research (Watanabe and Miyahara, 2009) as a forecast model, which we refer to as JAGUAR in this paper. This model has a high model top of approximately 150 km and is based on the T213L256 middle atmosphere GCM developed for the Kanto project (Watanabe et al., 2008) and the Kyushu-GCM (e.g., Yoshikawa and Miyahara, 2005). This model uses important physical parameterizations for the MLT region such as radiative transfer processes, including non-LTE and solar radiative heating due to molecular oxygen and ozone. The effects of ion drag, chemical heating, dissipation heating, and molecular diffusion are also parameterized in the model. In this study, a standard-resolution JAGUAR with a triangularly truncated spectral resolution of T42 corresponding to a horizontal resolution of about 300 km (a latitudinal interval of 2.8125°) is used for the assimilation. The model has 124 vertical layers with a uniform vertical spacing of approximately 1 km in the middle atmosphere and 100–800 m in the troposphere (see Fig. A1 of Watanabe et al., 2015, for the vertical layers). Unlike a high-resolution JAGUAR, which resolves a certain portion of gravity waves (Watanabe and Miyahara, 2009), gravity waves are sub-grid-scale phenomena for a standard-resolution JAGUAR. For this reason, both orographic (McFarlane, 1987) and non-orographic (Hines, 1997) gravity wave parameterizations are used. The wave source distribution of non-orographic parameterization is given based on the results of a gravity-wave-resolving high-resolution GCM (Watanabe, 2008), and the intensity of the source is treated as one of the tuning parameters. Horizontal diffusion is set as an e-folding time of 0.9 d for the minimum resolved wavelength in the troposphere and stratosphere, and it exponentially increases with increasing height over the MLT region.
In this study, the vertical profile of horizontal diffusion above the stratopause is also treated as one of the tuning parameters. The monthly ozone mixing ratio from United Kingdom Universities Global Atmospheric Modeling Programme (UGAMP; Li and Shine, 1999) and monthly sea surface temperature and sea ice concentration from the Met Office Hadley Centre sea ice and sea surface temperature dataset (HadISST; Rayner et al., 2003) are linearly interpolated in time and used as boundary conditions.

2.2 Measurements used in the assimilation

2.2.1 PREPBUFR

Observation data used for the assimilation include the PREPBUFR global observation dataset compiled by the National Centers for Environmental Prediction and archived at the University Corporation for Atmospheric Research (https://rda.ucar.edu/datasets/ds337.0/, last access: 26 June 2020), which includes surface pressure as a function of longitude and latitude, as well as temperature, wind, and humidity as functions of longitude, latitude, and pressure (or height) from radiosondes, aircraft, wind profilers, and satellites. Ground-based observations are mainly distributed in the height range from the ground to the lower stratosphere, and approximately 70 % of the data are taken at stations located in the Northern Hemisphere. Since May 1997, daily data have been uploaded with a delay of several days. The number of data points per assimilation step (every 6 h) is 1000–20 000 for balloon-borne radiosonde measurements, ∼ 1000 for aircraft measurements, ∼ 40 000 for satellite wind measurements, ∼ 10 000 for meteorological radar measurements, ∼ 50 000 for measurements at the ground, and ∼ 500 000 for sea scatterometer measurements.

The observation errors provided in the PREPBUFR dataset as a function of the type of measurements and altitude¹ were used in the data assimilation. For example, the observation errors in radiosonde temperature data are 1.2 K at 1000 hPa, 0.8 K at 100 hPa, and 1.5 K at 10 hPa. The horizontal resolution of the GCM used in this study is not sufficient to represent the fine structure captured by these observations. Representativeness errors, which come from the difference in resolutions between individual measurements and the model, could degrade the data assimilation performance. If representativeness errors are random and large numbers of observations are assimilated, their impact could be negligible. Because substantial numbers of observations are available within a model grid cell in a data assimilation cycle in our analysis, the observation data were thinned before assimilation to reduce the computational cost of the data assimilation analysis. Original data from aircraft and satellite winds are trimmed by taking one of every four consecutive data points. Radiosonde data at the standard pressure levels of 1000, 925, 850, 700, 500, 400, 300, 250, 200, 150, 100, 70, 50, and 10 hPa were used for the data assimilation. These settings are the same as those of ALERA2 (Enomoto et al., 2013).
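The thinning and level selection described above can be sketched as follows. This is only an illustration: the function names, the record layout (a list of `(pressure_hPa, value)` pairs), and the matching tolerance `tol_hpa` are hypothetical and are not taken from the PREPBUFR tools or from ALERA2.

```python
# Standard pressure levels (hPa) retained for radiosonde data, as listed in the text.
STANDARD_LEVELS_HPA = (1000, 925, 850, 700, 500, 400, 300,
                       250, 200, 150, 100, 70, 50, 10)


def thin_every_nth(records, step=4):
    """Keep the first of every `step` consecutive records (step=4 in the text)."""
    return records[::step]


def select_standard_levels(profile, tol_hpa=0.5):
    """Keep (pressure_hPa, value) pairs that lie on a standard level.

    tol_hpa is a hypothetical matching tolerance, not a value from the paper.
    """
    return [(p, v) for p, v in profile
            if any(abs(p - level) <= tol_hpa for level in STANDARD_LEVELS_HPA)]
```

For example, `thin_every_nth(list(range(10)))` keeps the records with indices 0, 4, and 8.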

2.2.2 Aura MLS

The MLS instrument onboard the Aura satellite was launched in 2004. The satellite takes the polar orbit 14 times a day. Vertical profiles of several atmospheric parameters are retrieved from a limb sounding of the thermal emissions of the atmosphere. We used temperature data (v.4.2) retrieved from the radiation of oxygen (O₂; 118 GHz) and the oxygen isotope (O¹⁸O; 239 GHz) of Aura MLS (Livesey et al., 2018) for the assimilation. The data are distributed at 55 vertical layers from 261 to 0.001 hPa at ∼ 2 km intervals. The estimated retrieval errors are ∼ 0.5 K at 261–10 hPa, ∼ 1 K at 10–0.3 hPa, ∼ 2 K for 0.3–0.04 hPa, and ∼ 3 K for 0.04–0.001 hPa. For the observation operator, we included weighting functions (called “averaging kernels”) to consider the vertical sensitivity of the measurements. The weighting functions at the Equator and at 70° N are available on the Aura MLS mission website (https://mls.jpl.nasa.gov/data/ak/, last access: 26 June 2020). Assuming that the measurement vertical sensitivity is invariant for a wide area, the averaging kernel for the Equator and that for 70° N are respectively applied to the latitudinal range of 40° S–40° N and the remaining high-latitude regions (i.e., 40–90° N and 40–90° S).
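A minimal sketch of the latitude-based kernel choice and the vertical smoothing step is given below. It rests on several assumptions that are not stated in the text: the function names are hypothetical, the 40° boundary itself is assigned to the equatorial kernel, the model profile is taken to be already interpolated to the retrieval levels, and any a priori term in the retrieval is ignored.

```python
def kernel_region(lat_deg):
    """Return which published averaging kernel to use at a latitude (degrees north)."""
    if not -90.0 <= lat_deg <= 90.0:
        raise ValueError("latitude must be within [-90, 90]")
    # Assumption: the 40 degree boundary itself uses the equatorial kernel.
    return "equator" if abs(lat_deg) <= 40.0 else "70N"


def apply_kernel(kernel_rows, model_profile):
    """Vertically smooth a model profile: y_k = sum_j A[k][j] * x[j].

    kernel_rows is a list of rows of the averaging kernel matrix A;
    model_profile is the model temperature on the same levels.
    """
    return [sum(a * x for a, x in zip(row, model_profile))
            for row in kernel_rows]
```

With a toy two-row kernel, `apply_kernel([[0.5, 0.5, 0.0], [0.0, 0.25, 0.75]], [200.0, 210.0, 220.0])` returns `[205.0, 217.5]`.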

The horizontal intervals of the Aura MLS observation data along the track, which is almost parallel to the meridional direction, are approximately 2°, so two or three profiles are included in the area represented by a grid point of the forecast model. Note that the horizontal intervals of the Aura MLS observation data between subsequent orbits are approximately 30°, which is much coarser than the model resolution. To reduce the computational cost of the data assimilation, the observations are horizontally averaged in the along-track direction before the assimilation, which reduces their resolution to a level comparable to that of the forecast model, without considering any correlation between individual observation errors. Errors in the retrievals in some parameters can be correlated in space, but their quantitative estimates are difficult. The measurement error is used as the diagonal component of the observation error covariance matrix. Moreover, this average is effective to remove gravity waves that cannot be resolved by the current model. We have confirmed the importance of the averaging by comparing the results with and without the averaging (not shown).
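The along-track averaging can be illustrated with a simple latitude binning. The sketch assumes regular latitude bins of 2.8125° (the model's actual grid latitudes may differ) and a single pressure level per call; the function name and return layout are hypothetical.

```python
from collections import defaultdict


def average_along_track(lats_deg, temps_k, bin_width_deg=2.8125):
    """Average profiles of one orbit that fall into the same latitude bin.

    Returns a list of (mean_latitude, mean_temperature, count) tuples,
    ordered from south to north. Regular bins are an assumption of this sketch.
    """
    bins = defaultdict(list)
    for lat, temp in zip(lats_deg, temps_k):
        index = int((lat + 90.0) // bin_width_deg)
        bins[index].append((lat, temp))

    averaged = []
    for index in sorted(bins):
        members = bins[index]
        n = len(members)
        mean_lat = sum(lat for lat, _ in members) / n
        mean_temp = sum(temp for _, temp in members) / n
        averaged.append((mean_lat, mean_temp, n))
    return averaged
```

The observation error attached to each averaged value is left unchanged here, in line with the statement that the measurement error is used as the diagonal component of the observation error covariance matrix.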

It has been suggested that the Aura MLS data include observation bias (e.g., Randel et al., 2016). In this study, a bias correction is performed, and the effect of the bias correction on the analysis data is examined. In addition to the retrieval quality flag information, a gross error check was applied in the quality control to exclude observations that are far from the first guess. The best settings for the gross error check are considered to be different between the mesosphere and lower atmosphere because of the different growth rates of model error in a specific period of time (e.g., a data assimilation window). Thus, the appropriate degrees of the gross error check are also examined.
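A gross error check of this kind can be sketched as below. The text does not give the rejection criterion, so the sketch assumes a common form in which an observation is rejected when the absolute innovation exceeds a chosen multiple of its observation error; the function name and the `threshold` parameter are hypothetical.

```python
def gross_error_check(innovations, obs_errors, threshold):
    """Return a list of booleans: True where the observation is kept.

    Assumed criterion: keep if |y_o - H x_f| <= threshold * sigma_o.
    Separate thresholds for the mesosphere and the lower atmosphere can be
    applied by calling the function per layer with different `threshold` values.
    """
    return [abs(d) <= threshold * sigma
            for d, sigma in zip(innovations, obs_errors)]
```

For instance, with unit observation errors and `threshold=5.0`, innovations of 1.0, -6.0, and 4.9 give `[True, False, True]`.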

2.3 Data assimilation system

The 4D-LETKF (Miyoshi and Yamane, 2007) is used as a data assimilation method. This method is an extension of the 3D-LETKF (Hunt et al., 2007) that includes the time dimension (4D ensemble Kalman filter; Hunt et al., 2004). The base of the program used in this study has already been applied to many types of forecast models, such as the Global Spectral Model (GSM; Miyoshi and Sato, 2007), the Atmospheric GCM for Earth Simulator (AFES; Miyoshi et al., 2007), and the Non-hydrostatic Icosahedral Atmospheric Model (NICAM; Terasaki et al., 2015).

This section introduces the formulas used in the 4D-LETKF. The analyses, forecasts, and observations are denoted by x^a, x^f, and y^o, respectively. The optimal value of x^a is derived from x^f and y^o by the following equation:

\begin{matrix} (1) & x^{a} = x^{f} + K (y^{o} - H x^{f}) = x^{f} + K d, \end{matrix}

where K is a weight matrix, $d (\equiv y^{o} - H x^{f})$ is the innovation, and H is an observation operator that converts the model space variables into observational space variables. For assimilating MLS retrievals, the observation operator includes the averaging kernel and the spatial operator. The second term on the right-hand side represents data assimilation corrections (i.e., increments). Using the differences from the true value (x^t), $δ x^{a} = x^{a} - x^{t}$ , $δ x^{f} = x^{f} - x^{t}$ , and $δ y^{o} = y^{o} - H x^{t}$ , Eq. (1) can be rewritten as follows:

\begin{matrix} (2) & δ x^{a} = δ x^{f} + K (δ y^{o} - H δ x^{f}) = (I - KH) δ x^{f} + K δ y^{o}, \end{matrix}

where I is an identity matrix. The analysis error covariance is defined as

\begin{matrix} (3) & P^{a} \equiv 〈 δ x^{a} (δ x^{a})^{T} 〉 = (I - KH) P^{f} (I - KH)^{T} + K R K^{T}, \end{matrix}

where P^f≡〈δx^f(δx^f)^T〉 is the forecast error covariance and R≡〈δy^o(δy^o)^T〉 is the observation error covariance. The correlation between the forecast error and the observation error is assumed to be zero (〈δx^f(δy^o)^T〉=0). The optimal x^a should minimize the trace of the analysis error covariance, tr(P^a), i.e., the total analysis error variance. This means that

\begin{matrix} (4) & \frac{\partial}{\partial K} \mathrm{tr} (P^{a}) = 0 . \end{matrix}

Solving Eq. (4) with respect to the weight matrix K yields

\begin{matrix} (5) & K = P^{f} H^{T} (H P^{f} H^{T} + R)^{- 1}, \end{matrix}

and the analysis x^a is derived by Eq. (1). The weight matrix K is called the “Kalman gain”. Using K, the analysis error covariance P^a is rewritten as

\begin{matrix} (6) & P^{a} = (I - KH) P^{f}, \end{matrix}

which gives the relationship $K = P^{a} H^{T} R^{- 1}$ .
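As a numerical illustration of Eqs. (1)–(6), the following NumPy sketch builds a small random linear problem and checks that the general covariance expression of Eq. (3) reduces to Eq. (6) when the gain of Eq. (5) is used. The dimensions, the random matrices, and all variable names are arbitrary choices for the example, not values from this study.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 4, 2                        # toy state and observation dimensions

A = rng.standard_normal((n, n))
Pf = A @ A.T + n * np.eye(n)       # symmetric positive-definite forecast covariance
H = rng.standard_normal((p, n))    # linear observation operator
R = np.diag([0.5, 2.0])            # diagonal observation error covariance

# Eq. (5): Kalman gain
K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)

# Eq. (3) holds for any K; Eq. (6) holds for the optimal K of Eq. (5)
I = np.eye(n)
Pa_general = (I - K @ H) @ Pf @ (I - K @ H).T + K @ R @ K.T
Pa_short = (I - K @ H) @ Pf

# Eq. (1): analysis from a forecast and an observation vector
xf = rng.standard_normal(n)
yo = rng.standard_normal(p)
xa = xf + K @ (yo - H @ xf)
```

In this setup `Pa_general` and `Pa_short` agree to rounding error, `K` equals `Pa_short @ H.T @ inv(R)`, and the trace of the analysis covariance is smaller than that of the forecast covariance.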

The size of P^f and P^a is the square of the degree of freedom in the model. Thus, for systems with huge degrees of freedom, such as GCMs, the calculation of P^f and P^a requires a large computational cost. This problem is avoided by replacing the forecast, analysis, and each error with the mean and variance for m members of an ensemble. This is called the ensemble Kalman filter (EnKF; Evensen, 2003). The ensemble mean $\overline{x}$ and background error covariance matrix P are written as follows:

\begin{matrix} (7) & \overline{x} = \frac{1}{m} \sum_{i = 1}^{m} x_{i}, \end{matrix}

\begin{matrix} (8) & \begin{aligned} P & = 〈 δ x (δ x)^{T} 〉 \approx \frac{1}{m - 1} \sum_{i = 1}^{m} (x_{i} - \overline{x}) (x_{i} - \overline{x})^{T} \\ & = \frac{1}{m - 1} \sum_{i = 1}^{m} δ x_{i} (δ x_{i})^{T} . \end{aligned} \end{matrix}

However, with a limited number of ensemble members, the forecast error tends to be underestimated in a system with a large number of degrees of freedom. A variety of methods have been proposed to overcome this problem (e.g., Whitaker et al., 2008). In our study, the forecast ensemble perturbation (δx^f) is multiplied by a factor F that is slightly larger than 1 ( $F = 1 + Δ$ ; Δ is called an “inflation factor”),

\begin{matrix} (9) & δ x_{i}^{f} \leftarrow (1 + Δ) δ x_{i}^{f} \end{matrix}

and the Δ value is optimized.
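Eqs. (7) to (9) can be illustrated with a short NumPy sketch. The function names (`ensemble_stats`, `inflate`), the toy ensemble, and the value of `delta` are hypothetical choices for illustration only and do not represent the tuned setting of this study.

```python
import numpy as np


def ensemble_stats(X):
    """Return mean, perturbations, and sample covariance of an (N, m) ensemble."""
    m = X.shape[1]
    mean = X.mean(axis=1, keepdims=True)   # Eq. (7)
    dX = X - mean                          # perturbations dx_i
    P = dX @ dX.T / (m - 1)                # Eq. (8)
    return mean, dX, P


def inflate(X, delta):
    """Eq. (9): scale the perturbations by (1 + delta) and keep the mean unchanged."""
    mean, dX, _ = ensemble_stats(X)
    return mean + (1.0 + delta) * dX


rng = np.random.default_rng(1)
X = rng.standard_normal((5, 30))   # toy ensemble: 5 variables, 30 members
delta = 0.15                       # illustrative value only
X_inflated = inflate(X, delta)
```

Because the perturbations are scaled by (1 + Δ), the sample covariance of the inflated ensemble is (1 + Δ)² times the original one, while the ensemble mean is unchanged.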

The Kalman gain can be written compactly by using E, which is the square root of P. Using

\begin{matrix} (10) & \sqrt{m - 1} E \equiv [δ x_{1} | \dots | δ x_{m}], \end{matrix}

\begin{matrix} (11) & \begin{aligned} K & = P^{f} H^{T} {({HP}^{f} H^{T} + R)}^{- 1} \\ & = E^{f} {({HE}^{f})}^{T} {[{HE}^{f} {({HE}^{f})}^{T} + (m - 1) R]}^{- 1} \end{aligned} \end{matrix}

is derived. Further manipulation yields another expression of K:

\begin{matrix} (12) & K = E^{f} {[(m - 1) I + {({HE}^{f})}^{T} R^{- 1} {HE}^{f}]}^{- 1} {({HE}^{f})}^{T} R^{- 1} . \end{matrix}

To reduce the computational cost, whichever of Eq. (11) and Eq. (12) requires the inversion of the smaller matrix is chosen. Usually, as the number of ensemble members is much smaller than the number of observations, Eq. (12) is used.
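The equivalence of Eqs. (11) and (12), and the difference in the size of the matrix to be inverted, can be checked with a minimal NumPy sketch. The dimensions and random inputs are hypothetical, and the columns of `E` are taken here to be the raw ensemble perturbations, which is an assumption about the normalization (the identity between the two expressions holds for any E).

```python
import numpy as np

rng = np.random.default_rng(2)
N, p, m = 8, 12, 5                       # state size, observations, ensemble members

X = rng.standard_normal((N, m))
E = X - X.mean(axis=1, keepdims=True)    # perturbation matrix [dx_1 | ... | dx_m]
H = rng.standard_normal((p, N))          # linear observation operator
r_var = rng.uniform(0.5, 2.0, p)         # diagonal observation error variances
R = np.diag(r_var)
R_inv = np.diag(1.0 / r_var)
HE = H @ E                               # perturbations in observation space, (p, m)

# Eq. (11): requires the inverse of a p x p matrix (observation space).
obs_space = HE @ HE.T + (m - 1) * R
K_obs = E @ HE.T @ np.linalg.inv(obs_space)

# Eq. (12): requires the inverse of an m x m matrix (ensemble space).
ens_space = (m - 1) * np.eye(m) + HE.T @ R_inv @ HE
K_ens = E @ np.linalg.inv(ens_space) @ HE.T @ R_inv
```

Here `obs_space` is 12 × 12 and `ens_space` is 5 × 5, while `K_obs` and `K_ens` agree to rounding error.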

The LETKF treats the analysis error covariance matrix,

\begin{matrix} (13) & {\tilde{P}}^{a} \equiv {[(m - 1) I + {({HE}^{f})}^{T} R^{- 1} {HE}^{f}]}^{- 1}, \end{matrix}

in the ensemble space. The relationship between this matrix in the ensemble space and the analysis error covariance matrix in the model space is expressed as $P^{a} = E^{f} {\tilde{P}}^{a} (E^{f})^{T}$ , and $E^{a} = E^{f} ({\tilde{P}}^{a})^{\frac{1}{2}}$ is the ensemble update. In this way, the analysis $x_{i}^{a}$ is obtained as

\begin{matrix} (14) & x_{i}^{a} = {\overline{x}}^{a} + δ x_{i}^{a} = {\overline{x}}^{f} + δ x_{i}^{a} + E^{f} {\tilde{P}}^{a} {({HE}^{f})}^{T} R^{- 1} d . \end{matrix}

In the form of an N×m matrix, where N is the number of variables, Eq. (14) is written as

\begin{matrix} (15) & [x_{1}^{a} | \dots | x_{m}^{a}] = [{\overline{x}}^{f} | \dots | {\overline{x}}^{f}] + E^{f} W, \end{matrix}

where

\begin{matrix} (16) & W = ({\tilde{P}}^{a})^{\frac{1}{2}} + [{\tilde{P}}^{a} {({HE}^{f})}^{T} R^{- 1} d | \dots | {\tilde{P}}^{a} {({HE}^{f})}^{T} R^{- 1} d] . \end{matrix}
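The structure of Eqs. (13) to (16) can be sketched in NumPy as follows. This is a global, unlocalized toy version for a linear observation operator, not the implementation used in this study. It follows the common formulation in which the perturbation matrix is not normalized and the symmetric square root is taken of (m − 1) times the ensemble-space covariance; this normalization, the function name `letkf_update`, and all inputs are assumptions made for the example.

```python
import numpy as np


def letkf_update(Xf, H, R, y):
    """Ensemble transform update: Xf is (N, m), y is (p,), R is (p, p) and symmetric."""
    m = Xf.shape[1]
    xf_mean = Xf.mean(axis=1)
    dXf = Xf - xf_mean[:, None]              # forecast perturbations
    Y = H @ dXf                              # perturbations in observation space
    d = y - H @ xf_mean                      # innovation
    Rinv_Y = np.linalg.solve(R, Y)           # R^{-1} Y
    # Analysis error covariance in ensemble space, cf. Eq. (13).
    Pa_tilde = np.linalg.inv((m - 1) * np.eye(m) + Y.T @ Rinv_Y)
    w_mean = Pa_tilde @ (Rinv_Y.T @ d)       # weights for the mean update
    # Symmetric square root via eigendecomposition (assumed normalization).
    evals, evecs = np.linalg.eigh((m - 1) * Pa_tilde)
    W_pert = evecs @ np.diag(np.sqrt(evals)) @ evecs.T
    W = W_pert + w_mean[:, None]             # same mean weights added to every column
    return xf_mean[:, None] + dXf @ W        # cf. Eq. (15)


rng = np.random.default_rng(3)
N, m, p = 6, 5, 4
Xf = rng.standard_normal((N, m))
H = rng.standard_normal((p, N))
R = np.diag(rng.uniform(0.5, 2.0, p))
y = rng.standard_normal(p)
Xa = letkf_update(Xf, H, R, y)
```

With these choices, the mean of `Xa` equals the Kalman-filter analysis computed from the sample covariance of `Xf`, and the sample covariance of `Xa` equals (I − KH)P^f, consistent with Eq. (6).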

To avoid unrealistic corrections caused by remote observations when the ensemble size is limited, the observation error is multiplied by a weighting function based on the distance from the analysis point. This method is called “localization”. The calculation is performed independently at each grid point, so it can be performed in parallel with high computational efficiency. The localization length is also a setting parameter of the data assimilation system, and the sensitivity of the assimilation performance to this parameter is examined in Sect. 3.3.2.
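One common way to realize such a distance-dependent weighting is sketched below. The Gaussian shape of the weight, the choice to divide the observation error variance by the weight, the function names, and the distances and length scale in the example are all assumptions made for illustration; the functional form used in this system is not specified in this section.

```python
import numpy as np


def localization_weight(dist, length):
    """Gaussian weight: 1 at the analysis point, decaying with distance (assumed form)."""
    return np.exp(-0.5 * (np.asarray(dist, dtype=float) / length) ** 2)


def localized_obs_error_var(obs_var, dist, length):
    """Increase the error variance of distant observations so that they receive less weight."""
    weight = localization_weight(dist, length)
    return np.asarray(obs_var, dtype=float) / weight


# Three observations with the same nominal error variance at increasing
# distances from the analysis point (same arbitrary units as `length`).
obs_var = [1.0, 1.0, 1.0]
dist = [0.0, 600.0, 1800.0]
length = 600.0                     # illustrative localization length
var_loc = localized_obs_error_var(obs_var, dist, length)
```

An observation at the analysis point keeps its nominal error variance, whereas the effective error variance grows with distance, so remote observations contribute little to the local analysis.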

Once an analysis ensemble is derived, each ensemble member is integrated forward in time by the forecast model, and the forecast at the next step is derived by

\begin{matrix} (17) & x_{i, t + 1}^{f} = M (x_{i, t}^{a}) . \end{matrix}

In this way, the forecast and analysis steps are repeated through the data assimilation cycles.

Here we extend the method to a 4D analysis. By modifying the observation operator, an observation at any time (j2) can be assimilated as information on the time development from the target time (j1). Such an assimilation is called the 4D-EnKF (Hunt et al., 2004).

The forecast at the time step j1 is written as a weighted mean of forecast ensembles:

\begin{matrix} (18) & x_{j 1}^{f} = [x_{1, j 1}^{f} | \dots | x_{m, j 1}^{f}] w \equiv X_{j 1}^{f} w . \end{matrix}

The weight vector w is unknown but is calculated using the pseudo-inverse matrix:

\begin{matrix} (19) & w = {((X_{j 1}^{f})^{T} X_{j 1}^{f})}^{- 1} {(X_{j 1}^{f})}^{T} x_{j 1}^{f} . \end{matrix}

On the other hand, the (unknown) forecast at the time step j2 is also written as a weighted mean of forecast ensembles:

\begin{matrix} (20) & x_{j 2}^{f} = X_{j 2}^{f} w . \end{matrix}

Substituting Eq. (19) into this equation, the following formula is obtained:

\begin{matrix} (21) & x_{j 2}^{f} = X_{j 2}^{f} w = X_{j 2}^{f} {({(X_{j 1}^{f})}^{T} X_{j 1}^{f})}^{- 1} {(X_{j 1}^{f})}^{T} x_{j 1}^{f} . \end{matrix}

Finally, the modified observation operator to assimilate the observation at time step j2 to the forecast at time step j1 is written as follows:

\begin{matrix} (22) & H^{'} = {HX}_{j 2}^{f} {({(X_{j 1}^{f})}^{T} X_{j 1}^{f})}^{- 1} {(X_{j 1}^{f})}^{T} . \end{matrix}

Appendix A explains that directly assimilating the observation at a certain time step by the modified observation operator is the same as assimilating at the time of observation and then calculating the time evolution after the assimilation. Thus, this method is regarded as a kind of 4D assimilation including the information on the time development. Another advantage of this method is that future observations can be assimilated, as it is similar to the Kalman smoother. In this study, this extended LETKF with 4D assimilation is used. The time interval (called the “assimilation window”) between the observations and the analysis is one of the setting parameters.
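The construction of the modified observation operator in Eqs. (18) to (22) can be illustrated with a minimal NumPy sketch. The dimensions and random ensembles are hypothetical, and the sketch assumes that the columns of the ensemble matrix at time j1 are linearly independent so that the inverse in Eq. (19) exists.

```python
import numpy as np

rng = np.random.default_rng(4)
N, m, p = 10, 4, 3                    # state size, ensemble members, observations

X_j1 = rng.standard_normal((N, m))    # forecast ensemble at the analysis time j1
X_j2 = rng.standard_normal((N, m))    # the same members at the observation time j2
H = rng.standard_normal((p, N))       # linear observation operator

# Left pseudo-inverse (X_j1^T X_j1)^{-1} X_j1^T appearing in Eq. (19).
X_j1_pinv = np.linalg.solve(X_j1.T @ X_j1, X_j1.T)

# Eq. (22): maps a state at j1 to observation space at j2.
H_prime = H @ X_j2 @ X_j1_pinv

# A state at j1 that is a weighted mean of the ensemble members, Eq. (18).
w_true = rng.standard_normal(m)
x_j1 = X_j1 @ w_true

w = X_j1_pinv @ x_j1                  # Eq. (19) recovers the weights
y_j2 = H_prime @ x_j1                 # equals H applied to Eq. (20)
```

For any state at j1 that lies in the span of the ensemble, `w` reproduces `w_true` and `y_j2` equals `H @ X_j2 @ w_true`, i.e., the observation operator applied to the corresponding state at j2.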

The EnKF initial condition is obtained using the time-lagged method as follows. First, a 6-month free run is performed from a climatological restart file for 1 June. The results from the free run over about 10 d centered on 1 January are used as the initial conditions for the ensemble members on 1 January. For the runs with 30 ensemble members, 30 initial conditions at a time interval of 6 h are used. For runs with 90 and 200 ensemble members, the time intervals for the initial conditions are taken as 4 and 2 h, respectively. The analysis data for the first 10 d of the assimilation are regarded as a spin-up and are hence not used to examine the assimilation performance.

2.4 The method of parameter validation in the data assimilation system

As already mentioned, the parameter set of data assimilation usually made for the troposphere and stratosphere is not necessarily appropriate for the analysis when the MLT region is included. This is because the dominant physical processes and scales of motions could be different (e.g., Shepherd et al., 2000; Watanabe et al., 2008). This section describes the parameters that should be optimized for the data assimilation system for the whole neutral atmosphere from the troposphere up to the lower thermosphere. The relevance criteria of the data assimilation for each parameter are also described.

Table 1. Parameter settings for sensitivity tests. Boldface shows the difference from the control (the first line). The control setting is equivalent to DB, P0.7, G20, L600, I15, W6, and M30.


The parameters included in the data assimilation system are divided into two categories. The first category includes two parameters describing the GCM: the horizontal diffusion coefficient and the factor of gravity wave source intensity in the gravity wave parameterization. The second category includes five parameters related to the data assimilation: the degree of gross error check, the localization length, the inflation factor, the length of the assimilation window, and the number of ensemble members. The sensitivity of the assimilation performance is tested by changing one parameter at a time from the standard parameter set, as shown in Table 1. Finally, the performance of the assimilation with the best set of parameters is confirmed.

The criteria used for the evaluation of the data assimilation for each parameter setting are observation minus forecast (OmF) and observation minus analysis (OmA) in the observational space. One more criterion for examining the quality of data assimilation is χ², which was introduced by Ménard and Chang (2000):

\begin{aligned} χ^{2} = t r ({YY}^{T}), \\ Y = \frac{1}{\sqrt{m}} (y - H ({\overline{x}}^{f})) {({HE}^{f} {({HE}^{f})}^{T} + R)}^{- \frac{1}{2}} . \end{aligned}

The parameter χ² describes the consistency of the innovation with the covariance matrices for the model forecast and the observations. The χ² values should be close to 1 if the background and observation errors are properly specified in the assimilation system. χ² values higher (lower) than 1 mean that the background or observation error has been underestimated (overestimated) relative to the innovation in the observational space.
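The behavior of the χ² diagnostic can be illustrated with a short NumPy sketch. The sketch assumes that the normalization is by the number of observations, which is needed for χ² to be close to 1 when the errors are well specified, and that the forecast error covariance in observation space is supplied as a matrix `HPHt`. The function name `chi_square` and the toy inputs are hypothetical.

```python
import numpy as np


def chi_square(d, HPHt, R):
    """chi^2 = d^T (H P^f H^T + R)^{-1} d / p for an innovation vector d of length p."""
    S = HPHt + R                         # innovation covariance implied by the system
    return float(d @ np.linalg.solve(S, d)) / d.size


rng = np.random.default_rng(5)
p = 6
A = rng.standard_normal((p, p))
HPHt = A @ A.T                           # toy forecast error covariance in observation space
R = np.eye(p)                            # toy observation error covariance

# Build an innovation whose magnitude is exactly consistent with S = L L^T.
L = np.linalg.cholesky(HPHt + R)
d = L @ np.ones(p)

chi2_consistent = chi_square(d, HPHt, R)             # close to 1
chi2_underestimated = chi_square(2.0 * d, HPHt, R)   # innovation twice as large: close to 4
```

Doubling the innovation while keeping the specified covariances fixed raises χ² from 1 to 4, which corresponds to the case in which the background or observation error is underestimated.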

3 Results

In this section, two types of parameter sensitivity experiments are performed. One is a parameter tuning of the forecast model to reduce the systematic biases of the model in the MLT region. The other is an optimization of parameters related to the data assimilation module. Table 1 summarizes the experiments that we performed, and the best parameter set among them is shown as “Ctrl”. The grounds for regarding this parameter set as the best are described in detail in the following subsections. It is also worth noting that we tested many parameter sets other than those shown in Table 1 that did not work due to computational instability.

3.1 Forecast model improvement

To reduce the model bias in the mesosphere, the vertical profile of the horizontal diffusion coefficient and the gravity wave source intensity in the non-orographic gravity wave parameterization are examined through comparison with observations in the summertime Antarctic mesosphere. Here, the zonal wind observed by an MST radar called the PANSY radar in the Antarctic (Sato et al., 2014) is used as a reference for the mesospheric wind. Note that the temporal and longitudinal variation of the dynamical field is relatively small in January and February in the summertime Antarctic mesosphere. The model performance may depend on the parameters describing the MLT processes, although we used default values of the model for this study. For example, climatological concentrations of chemical species are used for the calculation of the radiative heating rate, although the O₃ and NO concentrations are affected by solar activity on short timescales. The effects of ion drag are neglected because they are important mainly above a height of ∼ 200 km. The chemical heating caused by the recombination of atomic oxygen is incorporated using a global mean vertical profile of its density, and we neglected spatial and temporal changes.

3.1.1 Horizontal diffusion coefficient

The downscale energy cascade from resolved motions to unresolved turbulent motions is represented by numerical diffusion in most atmospheric models. A fourth-order horizontal diffusion scheme is used in the present version of the JAGUAR to prevent the accumulation of energy at the minimum wavelength. However, it is difficult to directly constrain the horizontal diffusion coefficient with observational data. In the present study, the horizontal diffusion coefficient is set to be constant up to the lower mesosphere and to increase exponentially above that level to reproduce realistic temperature and wind structures. As the horizontal diffusion at the model top is sufficiently strong to damp small-scale disturbances including (resolved) gravity waves, a sponge layer, which is usually included at the uppermost layers of GCMs, is not used in the model.

To optimize the tuning parameters of the forecast model, a series of free-run experiments with three different profiles of horizontal diffusion coefficients are performed. The impact of the difference in the horizontal diffusion coefficient is examined, focusing on the zonal mean zonal wind field. All experiments are started with the same initial conditions, which are obtained from a free-run simulation with climatological external conditions (hereafter referred to as “the climatological simulation”).


Figure 1. The vertical profiles of the horizontal diffusion coefficients given in the forecast model. Profile B was used for the data assimilation.
