Submitted as: development and technical paper 23 Feb 2021
Efficient ensemble generation for uncertain correlated parameters in atmospheric chemical models
 ^{1}Institute for Energy and Climate Research - Troposphere (IEK-8), Forschungszentrum Jülich, Germany
 ^{2}Rhenish Institute for Environmental Research at the University of Cologne, Germany
 ^{3}Institute of Geophysics and Meteorology, University of Cologne, Germany
Abstract. Atmospheric chemical forecasts rely heavily on various model parameters that are often insufficiently known, such as emission rates and deposition velocities. However, a reliable estimation of the resulting uncertainties by an ensemble of forecasts is impaired by the high dimensionality of the system. This study presents a novel approach to efficiently perturb atmospheric-chemical model parameters according to their leading coupled uncertainties. The algorithm is based on the idea that the forecast model acts as a dynamical system inducing multivariational correlations of model uncertainties. The specific algorithm presented in this study is designed for parameters which depend on local environmental conditions and consists of three major steps: (1) an efficient assessment of various sources of model uncertainties spanned by independent sensitivities, (2) an efficient extraction of leading coupled uncertainties using eigenmode decomposition, and (3) an efficient generation of perturbations for high-dimensional parameter fields by the Karhunen-Loéve expansion. Due to their perceived simulation challenge, the method has been applied to biogenic emissions of five trace gases, considering state-dependent sensitivities to local atmospheric and terrestrial conditions. Rapidly decreasing eigenvalues indicate high spatial and cross-correlations of regional biogenic emissions, which are represented by a low number of dominating components. Consequently, leading uncertainties can be covered by a low number of perturbations, enabling ensemble sizes of the order of 10 members. This demonstrates the suitability of the algorithm for efficient ensemble generation for high-dimensional atmospheric chemical parameters.
Annika Vogel and Hendrik Elbern
Status: final response (author comments only)

CEC1: 'Comment on gmd-2021-26', Astrid Kerkweg, 29 Mar 2021
Dear authors,
in my role as Executive editor of GMD, I would like to bring to your attention our Editorial version 1.2:
https://www.geoscimodeldev.net/12/2215/2019/
This highlights some requirements of papers published in GMD, which is also available on the GMD website in the ‘Manuscript Types’ section:
http://www.geoscientificmodeldevelopment.net/submission/manuscript_types.html
In particular, please note that for your paper, the following requirements have not been met in the Discussions paper:
 "The main paper must give the model name and version number (or other unique identifier) in the title."
 "If the model development relates to a single model then the model name and the version number must be included in the title of the paper. If the main intention of an article is to make a general (i.e. model independent) statement about the usefulness of new development, but the usefulness is shown with the help of one specific model, the model name and version number must be stated in the title. The title could have a form such as, “Title outlining amazing generic advance: a case study with Model XXX (version Y)”."
As you are using EURAD-IM to show the performance of the ensemble, please add something like “a case study using EURAD-IM version x.y” to the title of your manuscript.
Additionally, please note that, as you are using EURAD-IM to produce the results shown in your article, information on how to access the EURAD-IM code is also required, including the permanent archiving of the exact EURAD-IM version with which the results of this article were created.
Yours, Astrid Kerkweg

RC1: 'Comment on gmd-2021-26', Anonymous Referee #1, 06 Apr 2021
The comment was uploaded in the form of a supplement: https://gmd.copernicus.org/preprints/gmd202126/gmd202126RC1supplement.pdf

RC2: 'Comment on gmd-2021-26', Anonymous Referee #2, 12 Apr 2021
Overview:
The paper presents a method to construct ensembles with a small number (10) of members that still represent the uncertainty in ensemble applications of CTMs. The method consists of the following three steps: a) sensitivity estimation, b) eigenmode decomposition of the sensitivity information and c) generation of the ensemble based on the Karhunen-Loéve expansion. The method is illustrated with the construction of an isoprene emission ensemble derived for one day.
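For orientation, the three steps summarised above can be sketched in a few lines of NumPy. This is only a minimal illustration under stated assumptions, not the authors' EURAD-IM implementation: the function name `kl_perturbations`, the array layout, the use of log-scale sensitivities and the standard normal expansion coefficients are all hypothetical choices made for this sketch.

```python
import numpy as np

def kl_perturbations(sensitivities, n_members, n_modes, seed=0):
    """Sketch of ensemble generation by a truncated Karhunen-Loeve expansion.

    sensitivities: array of shape (n_samples, n_params) holding log-scale
    sensitivity fields, one row per differently configured model run
    (layout assumed for this sketch).
    Returns multiplicative perturbation factors of shape (n_members, n_params)
    and the eigenvalues of the sample covariance in descending order.
    """
    rng = np.random.default_rng(seed)
    n_samples = sensitivities.shape[0]

    # Step a) anomalies of the sampled sensitivities around their mean
    anomalies = sensitivities - sensitivities.mean(axis=0)

    # Step b) eigenmodes of the sample covariance, obtained from an SVD of
    # the anomalies so the n_params x n_params matrix is never formed
    _, singular_values, vt = np.linalg.svd(anomalies, full_matrices=False)
    eigvals = singular_values**2 / (n_samples - 1)
    n_modes = min(n_modes, eigvals.size)
    modes = vt[:n_modes]                      # shape (n_modes, n_params)

    # Step c) truncated KL expansion: sum_k sqrt(lambda_k) * xi_k * v_k
    coeffs = rng.standard_normal((n_members, n_modes))
    log_perturbations = (coeffs * np.sqrt(eigvals[:n_modes])) @ modes

    # Exponentiate so the factors are positive (log-normal assumption)
    return np.exp(log_perturbations), eigvals
```

A call such as `kl_perturbations(sens, n_members=10, n_modes=3)` returns ten positive factor fields. How quickly `eigvals` decays indicates how many modes are needed to cover most of the sampled variance.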
General remarks:
The paper contains a derivation of the method in a generalised way and a discussion of its application to create an ensemble of biogenic isoprene emissions. However, it is a considerable weakness of the paper that not enough evidence is given that the ensembles generated by the method would indeed capture the main component of the uncertainty. Without this evidence, the paper has little relevance. Giving such evidence is not trivial. I can only suggest two aspects, but there might be other options: 1) use the derived ensemble of isoprene emissions in an actual CTM ensemble application; 2) provide more evidence that the ensembles created using either the combined or the independent sensitivities approach lead to similar results.
Of the three steps a), b) and c), a) seems to be the most interesting because it addresses the question of how many model runs are required to capture the sensitivity of the model result to variations in model configuration choices. By assuming linearity, the authors derive that only a reference model run and one model run per modified aspect, changed one at a time (i.e. alternative land use map, alternative deposition scheme etc.), are required, and that cross-combinations of model configurations (i.e. alternative land use map AND alternative deposition scheme) are not required. Hence, the number of required runs is not the number of all possible permutations of configuration options but only the number of tested configurations itself. Intuitively speaking, that seems obvious because the sensitivities are assumed to be linear. This means that the method may not be suited for strong nonlinearities, as are typical for NWP dynamics or atmospheric chemistry. A better discussion of the limitations of the linearity assumption is required.
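The run-count argument in the preceding paragraph can be made concrete with a small sketch. The option counts used here are hypothetical and not taken from the manuscript, and the two helper functions are named only for this illustration.

```python
from math import prod

def runs_combined(options_per_argument):
    """Runs needed when every cross-combination of options is simulated."""
    return prod(options_per_argument)

def runs_independent(options_per_argument):
    """Runs needed under the linearity assumption: one reference run plus
    one run per alternative option, changed one at a time."""
    return 1 + sum(n - 1 for n in options_per_argument)

# Hypothetical setup: four model arguments with 2, 2, 3 and 2 options each
options = [2, 2, 3, 2]
print(runs_combined(options), runs_independent(options))
```

The combined count grows multiplicatively with each added argument, whereas the independent count grows only additively. That saving holds only as far as the linearity assumption does.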
Section 2, which describes the method (a, b, c), uses a sophisticated mathematical nomenclature, which may be more confusing than enlightening for the typical GMD reader. The chapter is difficult to understand, and it should be made clearer what the main novel ideas are and what are just mathematical definitions. For example, formulas 3a and b and 4a and b are in my opinion simple repetitions of the well-known formulas for covariance and mean. I acknowledge that it is a matter of taste how “mathematical” a paper should be, but the mathematical formulae need to convey a specific message relevant to the objective of the paper.
The applied terms of “model argument”, “model parameter”, “model input” and “model implementation” are confusing. For example, I think that “model configuration” is a better term for what is meant by “model argument”.
Section 3 requires more clarity on what its objective and scientific content are. For example, it should be made much clearer that one of its main purposes is to compare the combined and the independent sensitivities approach. Further, concrete evidence of the success of the method (see above) should be much more the focus of that section. Finally, it would be beneficial to provide a more science-based discussion of the soundness of the derived isoprene emissions ensemble.
The discussion and conclusion section is very long. I would strongly recommend introducing a separate and more concise conclusion section.
Figure captions need to provide more detail to make sure the figures can be understood without a lot of additional information. Acronyms in the figures should be spelled out.
Specific remarks:
L 3 replace “by” with “with”
L 5 make clearer what is novel about this approach (w.r.t. application and method)
L 12 High spatial correlation of simulated biogenic emissions is not a surprise
L 14 please compare 10 to the “uncondensed” number of ensemble members.
L 15 please provide more evidence that the 10-member ensemble indeed captures the uncertainty.
L 20-40: Most of the discussion here is on NWP ensemble construction. The aim for NWP is to get realistic error growth with increasing lead time. That is not the primary objective for CTM applications. Perhaps you could refer here to Reduced Rank Kalman Filter approaches for CTMs.
L 44 it is not just the larger state vector which distinguishes NWP from CTM ensembles but the error growth characteristics.
L 65 Please clarify what differentiates KL from singular vector decomposition methods.
L 78 Please clarify in more detail the terms “model parameters” and “model arguments” and “model inputs”
L 85 The choice of the “differently configured model simulations” is in my opinion the most interesting aspect of the paper. Please expand if this choice is guided by the algorithm or by the (human) users.
L 81 “model inputs and configurations”: these are very vague terms. Please provide more detail.
L 93 is it better to say the leading “uncoupled” ?
L 98 please explain here how C is related to ensemble generation
L 99 clarify “model parameters” (all model grid points, a set of 20 land use classes ? )
L 108  is model parameter a state vector element?
L 112 It is not quite clear, why any model parameters can be considered stochastic
L 114 “i” was defined as model argument. parameters are Q (?)
L 125 is this triviality of importance later?
L 126 the log distribution may need a bit more motivation. It may be justified for positive-definite variables such as concentration but not for all possible parameters.
L 146 I do not understand the subsample argument here. Is it not the same as choosing fewer arguments and implementations? If this part is not referred to further down, I would simply omit it for the sake of clarity.
L 159 The assumption of linearity is an important one. It contradicts the notion that sensitivities are nonlinear. This mathematical choice needs to be discussed for its realism for any specific application and choice of model arguments.
L 180 I am not sure if I understand the need for all the mathematical derivations here. Is it not obvious that the assumption of linearity makes it unnecessary to investigate the nonlinear combined sensitivities? It seems to be not a result but a direct consequence of the choice of assuming linearity.
L 187 The assumption of tangent-linearity might also be a strong limitation for many atmospheric applications
L 220 Please clarify what makes KL superior to more standard methods to generate ensembles using singular vectors.
L 240 Was the exp transformation prone to producing unrealistically high sensitivity factors?
L 255 A bit more detail on the generalisation of the emission factors to model uncertainty parameter would be interesting.
L 255 “KL ensemble approach”: I think the method is used to generate an ensemble. It is not yet an ensemble approach, which would mean the model simulation using the ensemble (emissions) itself. Please clarify what you mean.
L 265 It is not clear what you mean by “evaluation”.
L 282 Please explain here that the reference is the default EURAD configuration and (2) the other configuration option.
L 285 Please discuss the plausibility of the shown sensitivity factors.
L 288 It is not clear what you mean by panel 1, 2
L 297 The additional MEGAN uncertainty might be useful for an application, but for the paper it may complicate the comparison of the combined and independent sensitivities
L 324 Figure 7 should be referenced after Fig 6
L 339 The sentence is not clear. Does it mean that the resulting ensemble members differ a lot between the independent and combined methods? That is not evidence that the reduction in sensitivity runs works, quite the opposite.
L 361 It should be made clearer that the assumption of linearity is the main reason for reduction of the sampling space, which is no surprise.
L 366 Please clarify what you mean by “performance of the ensemble”
L 370 Please clarify, if this has been demonstrated by the presented practical results, or if this is only an assumption.
L 379 Please explain the context of the assumed number of considered arguments and realisations
Model code and software
Karhunen-Loéve (KL) Ensemble Routines of the EURAD-IM modeling system, Annika Vogel and Hendrik Elbern, https://doi.org/10.5281/zenodo.4468571