A Local Particle Filter and Its Gaussian Mixture Extension Implemented with Minor Modifications to the LETKF
 ^{1}RIKEN Center for Computational Science, Kobe, Japan
 ^{2}Center for Environmental Remote Sensing, Chiba University, Chiba, Japan
 ^{3}PRESTO, Japan Science and Technology Agency, Chiba, Japan
 ^{4}RIKEN Interdisciplinary Theoretical and Mathematical Sciences Program, Kobe, Japan
 ^{5}RIKEN Cluster for Pioneering Research, Kobe, Japan
 ^{6}Japan Agency for MarineEarth Science and Technology, Yokohama, Japan
 ^{7}Department of Atmospheric and Oceanic Science, University of Maryland, College Park, Maryland, USA
 ^{8}Meteorological Research Institute, Japan Meteorological Agency, Tsukuba, Japan
 ^{9}Deutscher Wetterdienst, Offenbach, Germany
 ^{10}Applied Mathematics, University of Reading, UK
 ^{1}RIKEN Center for Computational Science, Kobe, Japan
 ^{2}Center for Environmental Remote Sensing, Chiba University, Chiba, Japan
 ^{3}PRESTO, Japan Science and Technology Agency, Chiba, Japan
 ^{4}RIKEN Interdisciplinary Theoretical and Mathematical Sciences Program, Kobe, Japan
 ^{5}RIKEN Cluster for Pioneering Research, Kobe, Japan
 ^{6}Japan Agency for MarineEarth Science and Technology, Yokohama, Japan
 ^{7}Department of Atmospheric and Oceanic Science, University of Maryland, College Park, Maryland, USA
 ^{8}Meteorological Research Institute, Japan Meteorological Agency, Tsukuba, Japan
 ^{9}Deutscher Wetterdienst, Offenbach, Germany
 ^{10}Applied Mathematics, University of Reading, UK
Abstract. A particle filter (PF) is an ensemble data assimilation method that does not assume Gaussian error distributions. Recent studies proposed local PFs (LPFs), which use localization as in the ensemble Kalman filter, to apply the PF for highdimensional dynamics efficiently. Among others, Penny and Miyoshi developed an LPF in the form of the ensemble transform matrix of the Local Ensemble Transform Kalman Filter (LETKF). The LETKF has been widely accepted for various geophysical systems including numerical weather prediction (NWP) models. Therefore, implementing consistently with an existing LETKF code is useful.
This study developed a software platform for the LPF and its Gaussian mixture extension (LPFGM) by making slight modifications to the LETKF code with a simplified global climate model known as Simplified Parameterizations, Primitive Equation Dynamics (SPEEDY). A series of idealized twin experiments were accomplished under the ideal model assumption. With large inflation by the relaxation to prior spread, the LPF showed stable filter performance with dense observations but became unstable with sparse observations. The LPFGM showed more accurate and stable performances than the LPF with both dense and sparse observations. In addition to the relaxation parameter, regulating the resampling frequency and the amplitude of Gaussian kernels was important for the LPFGM. With a spatially inhomogeneous observing network, the LPFGM was superior to the LETKF in sparsely observed regions where the background ensemble spread and nonGaussianity are larger. The SPEEDYbased LETKF, LPF, and LPFGM systems were available as opensource software on Github (https://github.com/skotsuki/speedylpf) and can be adapted to various models relatively easily like the LETKF.
Shunji Kotsuki et al.
Status: final response (author comments only)

RC1: 'Comment on gmd202269', Anonymous Referee #1, 05 Jul 2022
I think this is an interesting article, which shows the potential of particle filters (and particularly two flavours of localised particle filters) in large scale atmospheric applications. The authors build on an existing LETKF computational frameword to implement and experiment with these localised particle filters. The implementation is very nicely detailed with schematics and direct comparisons of the algorithms, including stepbystep contrasts, and quantifying the computational costs.
I have some minor corrections and questions I would like to have answered before I can recommend this article for publication.
Short summary:
 Quantifying millions of observations does not mean much if one does not quantify the number of state variables. I would eliminate 'millions'.Introduction:
 Cite van Leeuwen 2021 (a consistent interpretation of the stochastic EnKF) when discussing perturbed observations.
 Cite Wang et al 2004 next to Bishop 2001. After all, it is the symmetric transform variant which is used in practice.
 Line 62. Even PFs with proposal densities collapse, although at a slower rate.
 Line 69. How do local PFs ensure continuity between particles in different regions? Maybe mention something brief on this regard.Methodology
 Lines 9697. The sentence 'Transform matrix...' could be rephrased to avoid word repetition.
 Line 147. Why write H(x)y in the Gaussian likelihood. You had used yH(x) when describing the EnKF. It is better to keep consistency in the paper.
 Line 205. Can there be a similar 'adaptive inflation' as Miyoshi's (and previously Anderson's) for the weights of the LPF?
 Line 225. Does the LPFGM reduce to the LPF when \gamma>0?
 Figure 2 is really nice an illustrative of the differences between the LPF and the LPFGM.Experimental settings
 Line 298. There is an incomplete sentence starting with 'Although... '
 Line 805. Are these identical twin or fraternal twin experiments? I.e., is there model error?
 Line 825. Why was this specific variable chosen to tune the localisation scales?
 Line 830. Why was T chosen? I would think that, if you want to exploit the PF advantage, you would choose a variable like relative humidity.
 Line 880. In the regular PF, the number of particles to avoid filter collapse has to be of the order exp(Neff^2), where Neff is the effective size of the problem, according to Snyder et al 2008. Then van Leeuwen and Ades (2013) showed that Neff is proportional to the number of independent observations. I therefore thought that filter collapse would happen with the sparse observation network. It is clear that filter collapse and filter divergence are different then. Could you explain the difference and the mechanisms leading to them?
 Line 890. Why does frequent resampling lead to filter divergence? Does this have to do with the spacewise continuity of the fields not being ensured?
AC1: 'Reply on RC1', Shunji Kotsuki, 07 Sep 2022
We are very grateful to the reviewer for her/his careful reviews and kindly giving us valuable and constructive comments and suggestions that we have generally accepted. Here, we provide our pointbypoint responses to each of the comments. The revisions are highlighted by red in the revised manuscript. Supplemental PDF file would be useful to check revisions and their corresponding comments

AC1: 'Reply on RC1', Shunji Kotsuki, 07 Sep 2022

RC2: 'Comment on gmd202269', Anonymous Referee #2, 13 Aug 2022
General comments:
The authors present a useful incremental step in developing a practical particle filter based methodology that can be scaled to dimensions required for realistic operational forecasting. The results show a narrow range of parameters for the augmented LPF (i.e. the LPFGM) that can outperform a tuned LETKF implementation using a lowresolution global primitive equations model.
I think the manuscript would be improved if the authors could highlight a specific example case/scenario in which the dynamics require a nonGaussian method. At its simplest, this could simple be to focus in on a regional assessment of the Southern Hemisphere East Pacific Ocean RAOB case (or the Arctic, using stenographic project map), where large differences were found between LETKF and LPFGM. It would be useful to see RMSE and ensemble spread statistics calculated for this region alone, and perhaps a bit more detailed assessment of this region as a ‘case study’. This could provide further motivation for finding cases in which the LPFGM enhancement to the LETKF software architecture could provide significant value.
More strength could be given to the argument for using this method if it could be made clear that such scenarios are common occurrences in operational systems (or perhaps in reanalysis systems that focus on less observed periods in history). Are there any scenarios for a coupled Earth system forecasting system in which particular regions or physical quantities are particularly poorly observed that might benefit from this method? Do the authors see applications in reanalysis such as CERA20C and the NOAA 20th Century Reanalysis that might improve historical reconstructions during relatively sparsely observed periods? Are there any applications in the modern satellite era where this approach could still be advantageous?
Specific comments:
L 19:
“Therefore, implementing [the LPF] consistently with an existing LETKF code is useful.”
L 21:
“This study develop[s]”
L 25:
“ The LPFGM showed more accurate and stable performances than the LPF with”
Change to:
“ The LPFGM showed more accurate and stable performance than the LPF with”
L 29:
“The SPEEDYbased LETKF, LPF, and LPFGM systems [are] available”
L 37:
“Ensemble[based] data assimilation (DA)”
L 37:
“such as weather and ocean predictions.”
Change to:
“such as weather and ocean prediction.”
L 44:
“models shows local low dimensionality ”
Change to:
“models show local low dimensionality ”
L 44:
“and practical EnKF with these systems uses”
Change to:
“and practical EnKF implementations use”
L 45:
“that limits the impact of distant observations within a local area and reduces the effective degrees of”
Change to:
“that limits the impact of distant observations while also reducing the effective degrees of”
L 52:
“issues in [the basic assumptions of the] EnKF [by permitting] nonlinear observation operator[s] and”
L 57:
“particles or the ensemble size [must] be increased”
L 60:
“equivalentweights particle filter (ETPF”
Is this the correct acronym? It seems that EWPF would be more appropriate.
L 64:
“method [for the] EnKF”
L 7677:
“, and the source code is accessible at https://github.com/takemasamiyoshi/letkf.”
It seems to me that this belongs in the data availability section, or something similar, toward the end of the document.
L 100:
It looks a bit odd to ‘divide’ the matrix by beta  I think it would be more clear to write ((m1)/beta) before the matrix “I”.
L 103, equation (5):
The second equality of equation (5) is perhaps not obvious. It would be helpful to add a step or two showing this relationship. (For example, showing how the relationship can be derived from the first equality in eqn. 5 and the definition in equation 6.)
It seems a bit odd the way it is written here because Pa is defined using K.
L 123:
“and relaxation to prior”
Do you use “relaxation to prior spread”, or “relaxation to prior perturbations”?
I would suggest:
“and a relaxation to prior scheme (Zhang et al., 2004) as implemented by Whitaker and Hamill (2012)”
L 145:
It might be worth mentioning that this implies that the observation error distribution is assumed Gaussian.
L 146, equation 12:
It looks like the “T” transpose operator is bolded.
L 167, equation 19:
Could you provide some explanation of condition #1. The “m” isn’t defined, nor are the indices i,j explained. The meaning of the superscripts is not explained.
L 185:
“ In addition, this stochastic approach approximates Eq. (19) using the MonteCarlo approach.”
Nice approach.
L 185186:
“The generated transform matrix with 200 samples (Fig. 1 c) is close to that with 10,000 samples (Fig. 1 d) in the case of 40 particles. ”
I’m curious how this Monte Carlo approach scales with the ensemble size. How does the performance (e.g. accuracy/stability of the method and the computational costs) change as the ensemble size gets large?
L 189:
“The effective particle size N_eff”
This terminology is a bit confusing, as a particle is analogous to an ensemble member. I would either say “the effective ensemble size” or “the effective number of particles”.
L 195, equation 23:
Is this effectively a relaxation back to equal weights? I think it would be helpful to explain this more, especially since later it is indicated that only values of tau as 0 and 1 are used. What then is the implication of these two choices?
L 203:
“Therefore, the LPF usually applies inflation to the posterior particles (e.g., Farchi and Bocquet 2018).”
Change to:
“Therefore, the LPF usually applies inflation to the posterior particles (e.g., Penny and Miyoshi, 2016; Farchi and Bocquet 2018).”
Penny and Miyoshi (2016) also applied a form of additive inflation to the posterior:
“at each cycle we add Gaussian noise with variance scaled locally to a magnitude matching the analysis error variance and apply this to each analysis member prior to the subsequent ensemble forecast. The amplitude of the additive noise was chosen to conform to the dynamics of the growing error subspace, as estimated by the analysis ensemble spread.” (Penny and Miyoshi, 2016)
L 206:
“based on [the] authors’ preliminary experiments”
L 214:
“The Gaussian mixture extension of the LPF is [one] such hybrid algorithm”
L 218:
“hut”
Change to:
“hat”
L 223:
How does increasing gamma reduce the amplitude? Is there a relationship between gamma and the number of ensemble members?
L 226, equation 26:
Am I interpreting correctly that the Kalman filter is applied to each participle independently, using the forecast error covariance estimated from the entire ensemble? Or, is it applied to the adjust the mean of the individual particles like an ETKF?
L 228:
I see that K^ is defined here, but I do not see a definition for Pa^. It seems that Pb^ is a scaled version of the ensemble forecast error covariance, but I’d like to see a more precise discussion of how K^ is formed  is it the same for every particle, or does each particle have a different K^?
If all of the particles have the same K^, then how is this different than applying the standard EnKF update to the mean and perturbations? Please provide more discussion on these points.
L 235:
Do you mean T_{t,GM}?
L 241:
What is WP22? Can you just provide the full reference?
L 247, equation 33:
In practice, would there be any benefit to forming and applying the T_{t,LPFGM} matrix directly?
L 249:
“may appear [to use] the same observations twice”
L 290:
You might reiterate here that typically N_MC << p_L.
L 298:
“Although running the SPEEDY model is, the model contains The SPEEDY model”
This looks like some kind of typo or grammatical mistake.
L 307:
“Gaussian noises were added”
Change to:
“Gaussian noise was added”
L 308:
“data whose interval is 6 h”
Change to:
“data at 6 h intervals”
L 311:
“every seven layers”
Change to:
“all seven layers”
L 311312:
“The ensemble size is 40 and their initial conditions were taken from an independent nature run.”
This is not clear  was a freerunning ensemble used as the nature run? Or was the initial ensemble sampled from a single deterministic nature run? Please be more precise.
L 320:
“ LETKF were optimized manually”
Change to:
“ LETKF were tuned manually”
If this was done manually, then it is likely not optimized.
L 417418:
“An alternative method of inflation is adding random noise to the transform matrix (Potthast et al., 2019). However, regulating the amplitude of the random noise was not trivial in the authors’ preliminary experiments with L96 (not shown).”
Have the authors tried inflation method as detailed by Penny and Miyoshi (2016). There the amplitude was specified by the local analysis error variance, and could be interpreted as noise applied to the transform matrix (it is applied to the analysis perturbations).
L 446:
“This study considered two experimental settings with and without weight succession”
It seems like at least one case in between (e.g. tau = 0.5) should be tried to give some idea of the benefit of this weighting method.
L 456:
“optimal localization scale ”
Again, if not mathematically optimized, then I would change this to:
“best performing localization scale”
L 461:
“forecast in a sparsely observed regions”
Change to:
“forecast in sparsely observed regions”
L 700, figure 9:
Please use white instead of yellow for the “near zero” values in panels (d) and (e).
(Same comment for Figure 12)

AC2: 'Reply on RC2', Shunji Kotsuki, 07 Sep 2022
We are very grateful to the reviewer for her/his careful reviews and kindly giving us valuable and constructive comments and suggestions that we have generally accepted. Here, we provide our pointbypoint responses to each of the comments. The revisions are highlighted by red in the revised manuscript. Supplemental PDF file would be useful to check revisions and their corresponding comments

AC2: 'Reply on RC2', Shunji Kotsuki, 07 Sep 2022
Shunji Kotsuki et al.
Shunji Kotsuki et al.
Viewed
HTML  XML  Total  BibTeX  EndNote  

349  78  12  439  2  1 
 HTML: 349
 PDF: 78
 XML: 12
 Total: 439
 BibTeX: 2
 EndNote: 1
Viewed (geographical distribution)
Country  #  Views  % 

Total:  0 
HTML:  0 
PDF:  0 
XML:  0 
 1