Solid Earth Solid Earth Open Access Discussions

Abstract. Simulating pollen concentrations with numerical weather prediction (NWP) systems requires a parameterization for pollen emission. We have developed a parameterization that is adaptable for different plant species. Both biological and physical processes of pollen emission are taken into account by parameterizing emission as a two-step process: (1) the release of the pollen from the flowers, and (2) their entrainment into the atmosphere. Key factors influencing emission are temperature, relative humidity, the turbulent kinetic energy and precipitation. We have simulated the birch pollen season of 2012 using the NWP system COSMO-ART (Consortium for Small-scale Modelling – Aerosols and Reactive Trace Gases), both with a parameterization already present in the model and with our new parameterization EMPOL. The statistical results show that the performance of the model can be enhanced by using EMPOL.


Introduction
A relatively high proportion of the population in industrialized countries suffer from pollen allergies.Different allergenic plant species shed their pollen at different times of the year, leading to several pollen seasons (e.g., birch pollen season, grass pollen season).In central Europe, the most important periods for patients are the tree pollen season in spring, the grass pollen season in late spring and early summer, and the ragweed pollen season in late summer and autumn (only in regions where this invasive plant is present).Even though medication is possible, the best way to reduce allergic symptoms remains a complete avoidance of the allergens (van Moerbeke, 1997).Therefore, it is of great importance to forecast the distribution of airborne pollen a few days in advance.Currently, pollen forecasts are mainly based on pollen monitoring, weather forecasts, climatological information about the pollen seasons, and the experience of the forecaster.The available technology (Hirst type traps, Hirst, 1952) for pollen monitoring demands a great deal of manpower since the pollen grains have to be counted manually.Therefore, the density of the pollen sites is low compared to that of meteorological sites.Thus, the spatial resolution of pollen forecasts is very low (essentially, forecasts are only available at the observational sites themselves).
Recently, pollen dispersion has been integrated into numerical weather prediction (NWP) models.The knowledge of the simulated spatial and temporal evolution of pollen concentrations enables the forecaster to make nationwide predictions instead of forecasts for the pollen sites only.The prerequisite for reliable numerical simulations of pollen concentrations are (i) an up-to-date distribution map of the plant that indicates the pollen sources, (ii) a NWP system that can deal with particles (including transport, deposition, washout of the particles etc.), and (iii) a parameterization of the emission process.
Assuming that the first two prerequisites -the distribution map of the plant and the NWP system including particle dispersion -are available, this publication is focused on the emission of pollen grains, which is a combination of physical and biological processes that are characteristic for each plant species.Thus, the emission parameterization has to address two questions: (i) When does emission take place?(ii) How many pollen grains are emitted?To answer these questions, we have to look more closely at the biological and physical processes leading to pollen ripening and release.Two time scales can be distinguished that require the use of two submodels within the emission parameterization: 1.The seasonal cycle of pollen emission depends on the number of mature flowers or inflorescences, and hence on the development of the plants.It is driven by the weather conditions during the preceding weeks or months and usually is described via phenological models (e.g.Sarvas, 1974;García-Mozo et al., 2009).
2. The diurnal cycle of pollen emission is driven by the current meteorological conditions that lead to a rupture of the anthers (thus, the release of pollen grains from the flowers) and to the entrainment of the pollen grains into the atmosphere.This process happens in a time frame of seconds to hours.
Although the processes leading to pollen emission have been described in the literature, the information that can be found is mainly of a qualitative nature (e.g.Bianchi et al., 1959).Additionally, both the development of the plants and the release of the pollen are processes that are specific to each plant species.Hence, an emission parameterization that is supposed to work for different plant species has to be very flexible regarding these biological and physical processes.
In Sect.2, we briefly describe the meteorological model that is used in our study.A comparison between different pollen emission parameterizations is given in Sect.3. In Sect.4, a new parameterization for pollen emission is developed, followed by a description of a basic tuning.The application of this parameterization and a comparison of the results with an existing parameterization can be found in Sects.5 and 6.Finally, Sect.7 contains the summary and conclusions.
Throughout the publication, we have used the following terms to describe the different processes and parameters in the context of pollen emission: -anthesis: opening of the anthers.
-pollen release: release of the pollen from the anthers, either into a reservoir or directly into the atmosphere.
-pollen presentation: making the pollen grains available for entrainment into the atmosphere.Usually, this is the result of the combined processes of anthesis and pollen release.
-entrainment: uplifting of the pollen from the reservoir or from the anthers into the atmosphere.
-pollen emission: the combined processes of pollen release and entrainment into the atmosphere.This term is used when the different steps during the emission process are not distinguished.This formulation is necessary since many emission parameterizations treat pollen release and entrainment as a single process.
-pollen production: the amount of pollen that is produced per square meter and per year.
-pollen concentration: the concentration of airborne pollen, given in number of pollen grains per cubic meter of air.

The NWP model system COSMO-ART
COSMO is a non-hydrostatic regional NWP model that is used for operational weather forecasts in various European countries (Steppeler et al., 2002).Vogel et al. (2009) have developed an extension ART (Aerosols and Reactive Trace Gases) to COSMO in order to study the interaction between aerosols and the atmosphere.Information about the code availability can be found in the Supplement.Physical processes that are incorporated into COSMO-ART include transport by the mean wind, turbulent diffusion, dry and wet deposition, coagulation, condensation, washout, and sedimentation of the aerosols and reactive trace gases.ART includes, amongst others, a module to simulate the emission and dispersion of pollen grains (Vogel et al., 2008;Zink et al., 2012).The default parameterization of pollen emission follows the suggestions of Helbig et al. (2004) and Vogel et al. (2008).
At MeteoSwiss, COSMO-ART has been used as an operational tool for birch pollen forecasting since 2011, using a modified version of the Helbig et al. (2004) emission parameterization.The modification includes the description of the pollen season as well as the influences of temperature, humidity, and wind speed on the pollen emission.It is designed to better model the physiological processes in the plants.Since the quantitative relationships between the meteorological parameters and pollen emission are largely unknown, the implementation of the qualitative knowledge about plant physiology is somewhat subjective.The error between modeled and observed pollen concentrations was evaluated as a function of temperature, relative humidity and wind speed.Using these results, the meteorological functions in the emission parameterization of Helbig et al. (2004) were adapted.This was done based on observational data from Switzerland.

Available emission parameterizations
The approaches to parameterize pollen emission that can be found in the literature differ greatly in complexity.Very simple solutions use spatially and temporally uniform emission fluxes (e.g.Pasken and Pietrowicz, 2005).More sophisticated versions include current meteorological conditions and/or a curve representing the typical pollinating season (e.g.Helbig et al., 2004;Schueler and Schlünzen, 2006;Marceau et al., 2011;Sofiev et al., 2013).In addition to the current meteorological conditions, the model of Martin et al. (2010) includes previous values of relative humidity.
In this section, we describe the differences and similarities of three different emission parameterizations in some detail.These parameterizations are: -The parameterization of Helbig et al. (2004), in the following referred to as "H orig ".It was used to simulate both birch (Vogel et al., 2008) and ragweed (Zink et al., 2012) pollen concentrations.
-The parameterization that has been used for operational numerical birch pollen forecasts at the Federal Office of Meteorology and Climatology in Switzerland (MeteoSwiss), an optimized formulation of H orig .In the following referred to as "H opt " (compare Sect. 2).
-The parameterization of Sofiev et al. (2013).In the following referred to as "S13".It has been implemented into the System for Integrated modeLling of Atmospheric coMposition (SILAM) and was applied for birch pollen emission.
These are the only comprehensive parameterizations of which we are aware that are incorporated into NWP systems for the application to wider regions.The comparison is divided into three parts reflecting the nature of the parameterizations: the first part describes the seasonal cycle, the second part the daily cycle of pollen emission.Part three explains additional features of the parameterizations.A table summarizing the differences between the parameterizations can be found in the Supplement.

Description of the pollen season
The amount of pollen that is ripe and available for emission depends greatly upon the state of the pollen season.At the beginning and the end of the season only a few plants flower, hence, the amount of available pollen grains is small, regardless of the meteorological conditions.It is therefore essential to introduce a mathematical model that describes the course of the pollen season on a daily basis.H orig assumes the birch pollen season to last for 30 days and to have a fixed shape of a parabola.In H opt , a more sophisticated description of the pollen season is used.Based on measured pollen data gathered in Switzerland, the shape of the pollen curve is chosen to be positively skewed.A similarly skewed shape of the pollen curve has been reported for central Europe by Grewling et al. (2012).Additionally, the length of the season is variable depending on the temperature during the pollen season.This takes into account the fact that a warm spring season results in a short and intense birch pollen season, whereas a cool spring tends to lengthen the birch pollen season because not all birch trees flower at the same time.S13 introduces an internal model that reflects the seasonal pollen curve.It uses temperature sums to predict the start and course of the pollen season.The season ends when a certain amount of pollen has been emitted.The upswing and downswing of the pollen curve is parameterized using relaxation functions describing the probability for a single tree to bloom at a certain day.

Meteorological influences
Naturally, emission has to be linked to a velocity scale since the pollen grains have to be lifted into the air by wind currents.In H orig and H opt , the friction velocity u * is taken as the parameter influencing the amount of emitted pollen.
A threshold friction velocity u * t has to be reached in order to allow emission.The value of u * t is derived using a parameterization for dust entrainment and a meteorological correction factor (see below).Emission by free convection is not taken into account.S13 uses both the 10 m wind speed and the convective velocity scale w * to take into account both ways of pollen entrainment into the atmosphere.Theoretically, if unlimited amounts of pollen were available, higher wind speeds would yield stronger entrainment, and hence more airborne pollen.In reality, this is limited by the fact that at a certain point, the flowers will run out of pollen grains.This is not considered in H orig .H opt and S13 take this into account by introducing a threshold wind speed or a function converging to a maximal value that stops further increase of emission.
Short-term pollen emission is driven by current meteorological conditions.All three parameterizations consider these effects using correction terms.H orig and H opt use meteorological correction terms for temperature, relative humidity and wind speed that influence the value of the threshold friction velocity u * t .However, the individual terms are different for both parameterizations (cutoffs at different thresholds etc.).Both H orig and H opt , consider precipitation through washout.S13 considers relative humidity and precipitation as hindrances for emission if their values are in a certain range.In S13, temperature plays an important role through the seasonal description, short-term effects on emission are not taken into account.

Other features
H orig and H opt use plant-dependent parameters such as the leaf area index (LAI), and the height of the plants that influence the amount of emitted pollen.Their use in the model is reciprocal, leading to higher emissions for small plants and plants with less leaves.The idea behind this is that leaves keep the released pollen grains within the canopy.These parameters are left out in S13.
Resuspension is considered in H orig only.H orig and S13 use a total amount of pollen that can be produced per season.In S13, this number defines the end of the pollen season.In H orig it can shorten the pollen season if the model runs out of pollen before the prescribed end of the pollen season after 30 days.Since this feature is unwanted, the total amount of pollen has been removed in H opt .The two main problems associated with the total amount of pollen are (i) the actual number is basically unknown, and (ii) for some plants (such as birch trees) this number varies considerably by year (years with high total amounts of pollen are referred to as "mast years").Some plants (such as birch trees) produce less pollen when they grow at higher altitudes (e.g.Sveinbjörnsson et al., 1996;Gehrig and Peeters, 2000).To represent this fact in the model, a reducing factor, f Q,alt , is introduced as a function of altitude in H opt .This factor is plant-specific and takes into account the influence of the changing climatic conditions with altitude on the plant.

Development of an emission parameterization for pollen grains
The emission process varies from species to species.For example, some grasses need high relative humidities for the opening of their anthers since they have to swell in order to crack.By contrast, low relative humidities create favorable conditions for the release of birch pollen, since birch pollen's anthers open when they are dry (e.g.Fuckerieder, 1976;Puls, 1985;Keijzer et al., 1996;Matsui et al., 1999;Dahl et al., 2013).Since all of the emission parameterizations mentioned in Sect. 3 have been developed for tree pollen, the question remains as to whether they are suitable for herbaceous plants like grasses or ragweed.Since both species are allergenic and responsible for important pollen seasons, it is desirable to have an emission parameterization that is appropriate for these plants as well.Since the model system COSMO-ART is operationally used at MeteoSwiss, we have looked more closely at the two emission parameterizations associated with COSMO-ART (H orig and H opt ).The parameterizations show the following disadvantages.Two plant-dependent parameters in the emission formula are inversely proportional to the emission flux: the leaf area index and the height of the plant.For trees, the variability of these parameters lies in a range where their influence is relatively low.For plants with mean heights of 1 m or less and low leaf area indices, however, these plant-dependent parameters become prominent.Moreover, their use only makes sense if their specific values vary over the model domain or time.Otherwise, they can simply be incorporated into a tuning factor.An important problem in this respect is that the true values of both parameters cannot be known in real time for the entire model domain.Therefore, they are not suited to be incorporated into the parameterization.Additionally, the sensitivity of the simulated emission flux on the height of the plant shows the undesirable feature that small plants emit more pollen than big plants.
Meteorological influences on the emission flux are incorporated into the parameterization via a threshold friction velocity (see Sect. 3.2).Only if the friction velocity reaches this threshold is emission allowed in the model.The threshold value is not constant but includes functions of temperature, relative humidity, and wind speed.Due to the nature of the parameterization, these functions cannot be determined empirically.In our opinion, one of the main drawbacks of this emission parameterization is the mixing of biological and meteorological factors that influence different aspects of the emission process.This leads to complicated formulations that cannot be easily validated via straightforward experiments.
Consequently, we have developed a new emission parameterization, EMPOL, which is based on the biological and physical processes leading to pollen emission.It can easily be adapted to different plant species by adjusting just a few key factors.Plant-dependent values that are variable over the model domain, such as the actual height of plants, are omitted.The meteorological influences are designed in a way that allows simple experimental determination of the corresponding functions.
The description of the pollen season is not part of EM-POL but is taken from the seasonal model developed for H opt (compare Sect. 3.1).It is read into the model as an input parameter.

Basic concepts
Given that the pollen season has started (in other words: plants are ready to release pollen), the emission of pollen can be seen as a two-step process: first, changes in the meteorological conditions lead to a rupture of the pollen sacs (anthesis).Pollen grains are released from the flowers.Second, the pollen grains that are now exposed to air motions can be entrained into the atmosphere.We have adopted this view by dividing our emission parameterization into two parts: 1. Depending on biological and meteorological conditions, a certain number of pollen grains are released from the flowers and fill up a pollen reservoir.Botanically, this process is called pollen presentation.
2. If meteorological conditions are favorable, pollen in the reservoir are entrained into the atmosphere.
Figuratively, the pollen reservoir can be seen as a surface where the pollen grains rest after being released from the anthers, e.g., a leaf.Such a process has been described by Bianchi et al. (1959) for ragweed; most of the pollen first fall onto the foliage below the flowers before being entrained into the atmosphere.However, this descriptive view should not be exclusive: if the conditions are favorable, pollen grains can be released from the flowers and entrained directly into the atmosphere.In that case, the "reservoir" should only be seen as a way to describe the fact that pollen grains have to be made available before they can be carried into the atmosphere.
Conceptually, the parameterization can be described as follows: a constant factor, Q pollen,day , gives the maximal daily amount of pollen that could be released at the height of the pollen season on one square meter if the conditions for pollen release and entrainment were perfect.All other factors take values in the range between 0 and 1 and describe resistances to the pollen release.These factors include, e.g., unfavorable meteorological conditions, but also a low plant coverage, or an early or late date in the pollen season.
The reservoir R pollen is made up of the pollen that are released from the flowers at that very time step ( R pollen ) and the pollen that were left in the reservoir after the previous time step (R pollen,old ).We assume that the pollen in the reservoir can be lost due to random processes (such as animals brushing against the plant and causing pollen to fall to the ground).This loss is described in the factor random .Additionally, rain washes out a specific portion of the reservoir which is described via the function precip .Combining the factors described above, the content of the reservoir is given by (1) The amount of pollen that the flowers release into the reservoir at each time step is given by (2) The factor plant combines the plant-specific variables that define the amount of pollen that could be released per time step under perfect meteorological conditions at a given grid point.It consists of the figure Q pollen, t (calculated from Q pollen,day ) describing the maximum amount of pollen that could be released per time step and per square meter if the grid box was totally covered with the specific plant in the perfect growing state.This maximum number is reduced by factors describing the percentage of ground actually covered with the specific plant (f Q,cov [0, 1]), the course of the pollen season using a mathematical description f Q,seas [0, 1] (see Sect. 3.1), and the influence of the altitude on the productivity of the plants (3) The meteorological influences on pollen release are described via mathematical functions: (4) These equations describe the probability that anthesis will happen under the current meteorological conditions.Up to now, only temperature and relative humidity have been considered.Since the processes leading to pollen release are slightly different for different plant species, these functions are plant-dependent and need to be adapted for the different species.
Additionally, a switch, biol , is introduced that turns off pollen release as soon as a certain daily amount of released pollen is reached.Under optimal conditions, all ripe pollen could potentially be released before the end of the day.Once the flowers have run out of ripe pollen grains, pollen release will stop even if good conditions continue.
The second part of the emission parameterization describes the entrainment of the pollen from the reservoir into the atmosphere.This process is mainly driven by meteorological conditions, namely moisture and turbulence.The pollen flux is given by with t being the time step of the simulation.It is assumed that the pollen are dispersed instantaneously and homogeneously within the grid box.The pollen concentration mainly depends on the amount of turbulence that lifts the pollen into the air.This is considered through the function f E,TKE [0, 1].
In moist conditions, entrainment is reduced as pollen grains tend to stick to the surface.The strength of this constraintf E,RH [0, 1] -is given as a function of the relative humidity.
If pollen grains are released from the anthers in an explosive manner (compare Rohwer, 1993) and entrained into the atmosphere regardless of the turbulent kinetic energy and relative humidity, these two functions f E,TKE and f E,RH can be set to a fixed value of one.In that case, the pollen reservoir will not be filled, but all pollen released from the anthers will be entrained into the atmosphere directly.
A flowchart in the Supplement shows the different steps of the parameterization and the influences for each of these steps.If EMPOL is used for other pollen species, the plant-specific values in the following parameters need to be adapted: Q pollen,day , Q pollen, t , f R,T , f R,RH , f E,TKE , f E,RH , random , and precip .Additionally, the following input fields/values need to be provided: f Q,cov , f Q,seas , f Q,alt , the pollen diameter, and the pollen density.

Tuning of the emission parameterization
One of the ideas behind EMPOL is the possibility to deduce the main parameters via dedicated experiments (e.g.Michel et al., 2012).In a laboratory, it should be possible to measure the functions relating the amount of released/entrained pollen and specific environmental conditions (e.g.temperature) while keeping the remaining parameters constant.For lack of such experimental data, we had to formulate the missing functions in the parameterization on the basis of measured pollen concentrations.EMPOL contains several parameters that have to be tuned: (April 2010 and April 2011) to improve this first guess.It should be noted that -using birch pollen measurements as a basis -the formulations and values described here are valid only for birch.It should also be kept in mind that using the full modeling system and measured pollen concentrations at this stage introduces some strong assumptions.The most important of these is that any atmospheric transport and dispersion processes are disregarded, i.e., larger pollen concentrations are solely due to larger emissions.The present exercise, therefore, is not more than a "second guess" and still leaves room for improvement based on a true parameter tuning exercise in which the error is minimized by varying all the parameters; this will be performed as soon as corresponding data are available.The present tuning only attempts to render the resulting emissions in a broadly reasonable range, while attempting to describe the physical and biological overall performance.
The parameter Q pollen,day reflects the overall level of the pollen flux.It was tuned by calculating the overall bias between measurements and model.Based on these values, Q pollen,day was set to 2.133×10 9 pollen per square meter and per day.The amount of pollen that can be released per time step is calculated from Q pollen,day considering the following requirements/assumptions: -Under optimal conditions for pollen release, the flowers can run out of pollen grains before the end of the day.
-The daily cycle of pollen release is not prescribed in the model.The amount of released pollen does not depend on the time of the day but results from the functions f R,T and f R,RH .In our implementation, we assume that the flowers will run out of pollen grains after 16 h of constant pollen release.Q pollen, t can then be calculated by dividing Q pollen,day by the number of time steps during a 16 h-period: The functions f R,T , f R,RH and f E,TKE were tuned by calculating the absolute error between the modeled and the measured pollen concentrations for each measuring station.These errors were plotted against the different meteorological variables.These plots were then used to adjust the functions describing the meteorological influences on pollen release/entrainment.The resulting functions are (the corresponding curves can be found in Figs. 1, 2 and 3) In these functions, T denotes temperature, rh the relative humidity, and TKE the turbulent kinetic energy on the lowest model level.
The number of pollen that are lost from the reservoir due to random processes is derived by using the concept of a halflife.Under the premise that only the random losses are effective, we assume a continuous removal from the reservoir such that after 12 h, half of the pollen present in the reservoir is lost.This assumption is used to calculate the percentage of the pollen in the reservoir that are lost per time step due to random processes: where t is the time step of the model given in seconds.
The washout of the reservoir is given by: Wherein p denotes the sum of convective and grid-scale precipitation in the model.It is given in kg m −2 s −1 .
The tendency of the pollen to stick to the surface under moist conditions is given as a function of the relative humidity rh:

Testing the new parameterization
After debiasing the model using Swiss pollen data from 2010 and 2011, we have simulated the birch pollen season of 2012 using two model configurations.Both use an identical setup, with the emission parameterization being the only difference.First, simulations were carried out using the parameterization H opt (see Sect. 3).Second, we have adapted COSMO-ART to run with our newly developed emission parameterization EMPOL.

Setup of the simulations
We We use the birch distribution data set that was produced by Pauling et al. (2012) as the map of possible pollen sources.Pauling et al. (2012) present a method to create plant distribution data sets using birch as an example.This method combines forest inventory and land use data from Switzerland to derive a distribution map.The transfer to a larger domain (southern and central Europe) is achieved by using the Global Land Cover 2000 data set in combination with pollen data.
The mathematical description of the pollen season f Q,seas is taken from the phenological model developed for the operational numerical pollen forecasts at MeteoSwiss (compare Sect. 3.1).It is used as an input parameter for the emission parameterization in both model configurations.

Comparison to pollen measurements
The simulated pollen concentrations (mean over 9 grid points) are compared to daily measurements at 34 observational sites throughout Europe for the entire pollen season of 2012.A list of the sites, including their geographical location, can be found in Table 1.Pollen grains were sampled using Hirst type traps (Hirst, 1952).
The level of the simulated pollen concentrations at the start and end of the pollen season strongly depends on the seasonal description (f Q,seas , Sect.3.1).Since the aim of our exercise is the comparison of the emission parameterizations and explicitly not that of the seasonal description, it has to be made sure that the day-to-day differences between modeled and observed pollen concentrations can be attributed mainly to the emission parameterization.Therefore, we exclude days outside the main pollen season from the exercise.The main pollen season is defined as the period between the first and last occurrence of 70 pollen per cubic meter in the observations (daily means).This corresponds to the pollen class "strong" (compare Table 2).Secondly, we exclude days at the beginning and end of the simulated pollen season, since the upswing and downswing of the simulated pollen season is not yet very well defined.To ensure that these days are not taken into account, we exclude days where the value of the modeled pollen season f Q,seas [0, 1] is below 0.3.The observational data has been checked for outliers.Additionally, we excluded sites for which there were less than 7 days of observations within the chosen period (see above).

Statistical measures
Manual operational pollen forecasts are usually done for pollen classes that reflect more or less the strength of the allergenic symptoms that are induced by the given pollen concentration.These thresholds depend on the sensitization rates and are not constant over regions (Jaeger, 2011).Table 2 gives the thresholds for birch pollen concentrations used for the operational pollen forecasts at MeteoSwiss (Gehrig et al., 2013).Taking this classification, and using a lower and upper limit, the continuous values of pollen concentrations were Table 2. Pollen classes for birch pollen concentrations (C pollen ) used for the operational pollen forecasts at MeteoSwiss (Gehrig et al., 2013).

Pollen class
Pollen concentration in m −3 low pollen load C pollen < 10 moderate pollen load 10 ≤ C pollen < 70 strong pollen load 70 ≤ C pollen < 300 very strong pollen load 300 ≤ C pollen  3) was completed, taking the occurrence of the desired class as an event and the occurrence of any other class as a non-event.
Whether the modeled pollen class is higher or lower than the observed one -and, if it is higher or lower, by how muchcannot be taken into account.Additionally, it has to be noted that the occurrence of any non-event in measurements and simulations is counted as a correct negative.Since a nonevent is any pollen class other than the one currently under observation, a correct negative does not necessarily mean that the correct pollen class has been forecast.In addition to this approach based on the usual manner of manual pollen forecasts, we classified the pollen data using single thresholds as it is usually done for precipitation.This reflects the fact that for most allergenic patients, a pollen concentration in excess of a certain personal threshold triggers symptoms, and thus necessitates the intake of medication.Whether this threshold is just reached or strongly overpassed is of minor importance.
Based on the four numbers in a 2 × 2 contingency table (compare Table 3), a series of skill scores has been computed (Wilks, 2006).Generally, several of these scores should be looked at together in order to form a complete picture of the performance of a model.We have chosen the Threat Score, False Alarm Rate, as well as the Pierce Skill Score.
The Threat Score, TS, (Eq.13) measures the proportion of correct forecasts without taking into account the correct non-events: Usually, this is done for rare events where correct nonevents are meaningless.In our case, disregarding the correct non-events is especially wanted since they are not necessarily correct, as described above.TS ranges between 0 and 1, with 1 being a perfect score.The False Alarm Ratio, FAR, (Eq.14) gives the fraction of simulated events that were not observed: FAR takes values between 0 and 1, and a perfect score renders a value of 0.
The Pierce Skill Score, PSS, (Eq.15) is a measure that describes the performance of a model compared to an unbiased random forecast: PSS can take values between −1 and 1. Forecasts worse than a random forecast render negative values.Forecasts better than the random forecast result in positive values.Rare events that are correctly forecast count more than frequent events.
Additionally, we calculated some statistical measures that reflect the pollen concentrations as opposed to pollen classes.This overcomes the problem of classification: values close to the thresholds might lead to a wrong class even though they are not far away from the observed value.By contrast, classes that cover a large interval of concentrations lead to events being counted as correct even if the real value is several factors wrong.We have calculated the correlation coefficient (r 2 ), the p-value, the root mean square error (rmse), the index of agreement (d 1 ), the fraction of predictions within a factor of two of observations (FAC2), and the geometric mean bias (GMB) (GAW Report No. 181, 2008).
We have calculated the p-value for r 2 .It gives the probability of obtaining the observed data if the null hypothesis is true.In our case, the null hypothesis is: "Observations and modeled simulations do not correlate."If the p-value falls below a predetermined significance level (we use a value of 0.05), one can assume that the null hypothesis is wrong.It has to be noted that the p-value is not a proof that the alternative hypothesis is correct.However, if the p-value is above the chosen significance level, it is not justified to reject the null hypothesis.
The index of agreement (Willmott et al., 1985) is based on the sums of the absolute values of the errors between observations and modeled values.It can take values between zero and one, with one being a perfect score.It is given by where P i denotes the simulated values, O i the observations, O the observational mean, and N the number of data points.FAC2 gives the fraction of the predictions that are within a factor of two of the observations.It is calculated as with n i = 1 if the modeled value lies within a factor of two of the observed value.Otherwise, n i is zero.A "perfect" forecast (with respect to the factor of two) renders a score of one.
The geometric mean bias is given as It has to be noted that GMB is sensitive to the relative error.For example, an observed value of 1 and a simulated value of 0.1 render the same GMB as an observed value of 100 and a simulated value of 10.In reality, the former is of little importance to allergic people whereas the latter is crucial.In addition, the measuring systems cannot distinguish between 0.1 and 1, although they are clearly sensitive to the difference between 10 and 100.Furthermore, in the observed pollen concentrations, values of 0 pollen per cubic meter can occur.This obviously results in an error for the logarithm.Hence, the logarithm of pollen concentrations smaller than 20.1 pollen per cubic meter was set to 3.

Results regarding pollen classes
We have studied the ability of the model versions to forecast a certain pollen class using TS and FAR.The results using a double threshold to define a pollen class (upper and lower bound) can be found in Figs. 4 and 5.Both model versions show that the ability to correctly forecast a pollen class is different for the different pollen classes.Apparently, it is easier to predict higher pollen classes.This is desirable since higher pollen classes induce stronger allergenic symptoms.In any case, the class "low pollen load" scores worst.This can partly be explained by the fact that the higher pollen classes cover a wider range of pollen concentrations (see Table 2): e.g., for the "very strong" pollen class, the concentration needs to be anything above 300 pollen per cubic meter, whereas for the class "low" the concentration needs to be below 10 pollen per cubic meter.Obviously, the latter is more difficult to hit.It has to be noted that the class "low pollen load" is very rare during the main pollen season, both in observations and in the models.The scores are therefore based on only a few data points where the observations fall into this class.Additionally, measurements of small pollen concentrations are uncertain due to the nature of the measuring system.The same scores have also been computed using a single threshold to define a pollen class rather than an upper and lower limit.The results given in Table 4 still show strong differences between the classes.However, the conclusion drawn from the exercise is reversed: now, the results indicate that it is easier to forecast the lower pollen classes.This demonstrates that the scores are very sensitive to the way an "event" or "non-event" is defined (single or double threshold).Which definition is appropriate depends very much on the question of the study.
The PSS using a double threshold is shown in Fig. 6.For the three lower pollen classes, the values of PSS are on the order of 0.3 (EMPOL) and less than 0.1 (H opt ).The scores for the "very strong" pollen class are better for both parameterizations: 0.49 and 0.24, respectively.The results indicate that, indeed, the highest pollen class is better predicted than the lower pollen classes.This is true for both emission parameterizations and for both definitions of an "event" (not shown).
In order to get a sense of the quality of our results, we have compared the PSS values for pollen to the results obtained for operational COSMO forecasts of precipitation at MeteoSwiss.Since precipitation, in contrast to birch pollen concentrations, is not limited to only one season, the analysis has been done for all four seasons.For precipitation, 8 different thresholds are used for the classification, starting with 0.1 kg m −2 and ending with 50 kg m −2 .The corresponding values can be found in Table 5.Unlike pollen classes, precipitation is better forecast for the lower thresholds.The best score for pollen classes (EMPOL, class "very strong", see Fig. 6) is on the order of the best scores for precipitation (0.6).Likewise, the worst scores for precipitation and pollen (H opt , classes "low", "moderate" and "strong") are in the same range and less than 0.1.Considering that precipitation has already been forecast in NWP models for some decades, this is an encouraging outcome for simulations of pollen concentrations that have been introduced to NWP systems just recently.Summing up the results, EMPOL scores higher than H opt for any of the pollen classes and for all of the computed skill scores, regardless of the definition of an "event" (see, e.g., Figs. 4 to 6).

Results regarding pollen concentrations
The mean correlation coefficient is clearly better for EMPOL than for H opt (see caption of Table 6).However, to reject the null hypothesis ("Modeled and observed concentrations are uncorrelated".), the p-values have to be less than 0.05.Neither the mean p-value of H opt nor that of EMPOL fall below this threshold (see caption of Table 6).The picture changes when looking at individual measuring sites (see Table 6).For H opt , the p-values are generally higher, with only the values for three stations below 0.05 (stations 4, 16, and 30).These coincide with the three highest correlation coefficients for H opt .By contrast, EMPOL has p-values of less than 0.05 at 17 stations and only a few stations with very high p-values.Again, the high p-values generally coincide with small correlation coefficients (e.g., stations 2, 14, 15, 21, 23, 33, 34) and vice versa.H opt only performs better at 7 out of the 34 measuring sites.This can be interpreted such that the biological and physical processes are better represented in EMPOL than in H opt .Both parameterizations have been tuned using Swiss pollen data of the previous years.For EMPOL, this results in better correlation coefficients and very low p-values at most of the Swiss pollen stations (numbers 5 to 13).Nevertheless, apparently the chosen parameterization also works well for many stations outside of Switzerland (e.g.,stations 3,16,18,19,24,26,27,29,30).Such a refinement of the results for the stations included in the tuning process cannot be found for H opt .
The model's ability to forecast a concentration in reasonable proximity to the observed value can be measured using FAC2 (see Fig. 7).It can be seen that EMPOL performs better at 28 of the 34 stations, which leads to a considerably higher mean value than H opt has.Overall, about 50 % of EMPOL's simulated concentrations are within a factor of The mean rmse is very high for both parameterizations (see Table 7).The individual values for different stations are quite diverse (not shown).In many cases, high values of rmse coincide for the two parameterizations.This suggests that the error could have its origin outside the emission parameterizations.Examples of such external factors are the plant distribution or a wrong altitude of Alpine stations in the model due to the low spatial resolution. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33   A different picture can be found for the GMB (see Fig. 8).Here, again, similar values coincide at some of the stations (e.g., sites 12, 19, 20, 25, 26, 33).However, at the majority of stations, one parameterization clearly shows a stronger bias than the other (e.g., sites 1-4, 14, 21, 24).Here, the difference in the error has to originate in the emission parameterizations since the other influencing parameters are the same for both model versions.

Sensitivity to mast years
As mentioned before, birch shows a bi-annual variation of the pollen production (years with high pollen production are referred to as "mast years").The overall level of the pollen concentrations in the model is tuned based on simulations and observations of pollen concentrations of a predefined set of years.Usually, such a set of years will contain both mast years and normal years.This will result in a simulated pollen level somewhere between the level of the mast years and the level of normal years.Therefore, the occurrence or non-occurrence of a mast year in the period chosen for the experiment will have an influence on the performance of the model for the statistical scores that are sensitive to the overall level of the pollen concentrations (e.g., the fraction of predictions within a factor of 2 of observations).Scores that are not sensitive to the overall level (e.g. the correlation coefficient) will not be influenced by the occurrence of a mast year.
We tried to identify the mast and normal years in the period used in this study (2010 to 2012) based on the Swiss observational data.Values of the seasonal pollen index SPI (yearly sum of the observed daily pollen concentrations) for the Swiss observational sites are given in Table 8.It is not possible to make a clear statement.Four of the Swiss observational sites show the highest SPI in the year 2010, three in the year 2011, and two in the year 2012.The variation between the years can be very small (e.g., sites 5 and 9) or very strong (e.g., sites 6 and 11).Looking at these data, it is obvious that it is not possible to make an overall statement about a certain year being a mast year or not even for these stations that are relatively close to each other.The conclusions drawn from the data are not consistent between the different observational sites.Additionally, the difference between mast years and normal years can be strong or very weak, depending on the observational site.

Summary and conclusions
We have developed an emission parameterization, EMPOL, for pollen grains for use in NWP systems.EMPOL needs the plant distribution and a description of the pollen season as input parameters.The emission process is separated into two steps.The first step is the release of the pollen from the flowers into a pollen reservoir (pollen presentation).This is driven by temperature and relative humidity.The second step is the entrainment of the pollen from the reservoir into the atmosphere.This is driven by the turbulent kinetic energy.Additional processes that are included: washout of the reservoir by precipitation, loss of pollen in the reservoir due to random processes, sticking of the pollen grains in the reservoir when there is high relative humidity.Under favorable conditions, the available pollen grains can be released rapidly.In that case, pollen release is turned off for the rest of the day.The maximum daily amount of available pollen is a function of the progression of the pollen season.
The meteorological functions have been implemented in a way that allows adaptation for different pollen species.This is important since the opening processes of the anthers are different for different plant species.Additionally, EMPOL uses individual functions for each of the meteorological parameters.This and the separation of the different steps leading to pollen emission facilitate the design of experiments to determine the different meteorological parameters.Another advantage over the parameterizations based on Helbig et al. (2004) (H orig and H opt ) is the avoidance of unknown parameters like LAI that introduce unnecessary uncertainty.
Statistical scores show that the model version using EM-POL performs better than the model version using H opt .This is true both for analyses using continuous values and for analyses based on pollen classes.
TS and FAR show that the performance of the model differs for different pollen classes.Using a lower and upper limit to define classes, the higher pollen classes are better predicted than the lower pollen classes.Using a single threshold to discriminate between classes, the lower pollen classes reach better scores.For PSS, the scores of the higher pollen classes are better, both for single and double thresholds.Overall, the scores are comparable to the ones that are reached for operational forecasts of precipitation using the COSMO model.The best score (0.49) is reached for the pollen class "very strong" and EMPOL.The mean correlation coefficient for H opt is 0.12, whereas for EMPOL it reaches a value of 0.43.The results are quite diverse among the different observational sites.Using EMPOL, 49 % of the simulated values deviate less than a factor of two from the observations (H opt : 29 %).
It has to be noted, however, that this parameterization is not yet fully tuned.The functions in the parameterization work in the timescale of a time step.In our case, these are 60 s.It is hardly possible to tune these functions using daily resolved observations.Pollen observations with a higher temporal resolution (bihourly values) have started to be registered on an operational basis only recently.Additionally, it would be desirable to conduct laboratory experiments for the explicit determination of the functions that connect the amount of released/entrained pollen to meteorological variables.Taking these possibilities into account, EMPOL has a great potential to become considerably better in the future.
For the time being, the functions have only been tested for birch pollen emission.However, during the last months, EM-POL has also been employed to parameterize the emission of grasses and ragweed using plant-specific constants (e.g. in the functions for temperature, humidity and TKE).First results are very promising (not shown).Parameterizing the emissions of further plant species, such as hazel, alder, or ash, should follow to enable the operational use of numerical predictions for the main allergenic pollen species.
For the future development of EMPOL, the following paragraph lists some of the possible improvements.
-Conducting field or laboratory experiments to deduce better functions relating meteorological conditions to the different steps of pollen emission.
-Introducing a mechanism that hinders pollen release and/or entrainment for a certain time period after a precipitation event.
-Introducing the influence of rising CO 2 on the pollen production e.g., Ziska and Caulfield, 2000;Rogers et al., 2006).With respect to the present parameterization this could be done in several ways: (1) Q pollen,day

Fig. 1 .
Fig. 1.Function describing the influence of temperature on the release of birch pollen.

Fig. 2 .
Fig. 2. Function describing the influence of relative humidity on the release of birch pollen.

Fig. 3 .
Fig. 3. Function describing the influence of turbulence on the entrainment of birch pollen grains into the atmosphere.

Fig. 4 .
Fig. 4. Threat Score, based on classes defined by an upper and lower limit.

Fig. 5 .Fig. 6 .
Fig. 5. False Alarm Ratio, based on classes defined by an upper and lower limit.

Fig. 7 .Fig. 8 .
Fig. 7. Fractions of predictions within a factor of two of observations for the two model configurations at each of the measuring sites.The colored lines denote the mean over all stations.

www.geosci-model-dev.net/6/1961/2013/ Geosci. Model Dev., 6, 1961-1975, 2013
have used the COSMO-ART version 2.0 in combination with the COSMO version 4.19.The operational COSMO-ART domain of MeteoSwiss covers a large part of central and western Europe, reaching from Portugal in the west to the Balkans in the east, and from southern Italy in the south to the southern parts of Scandinavia in the north.

Table 1 .
Sites of the pollen measurements and their geographical locations, numbers refer to the numbering in the figures and tables of Sect.6.

Table 3
classified for both the modeled and the measured values.For each pollen class, a 2×2 contingency table (see Table

Table 4 .
Threat Score (TS) and False Alarm Ratio (FAR) for pollen classes using a single threshold.The threshold is given in pollen per cubic meter.

Table 5 .
Values of the Pierce Skill Score for operational COSMO forecasts of precipitation at MeteoSwiss using different thresholds.The four seasons are evaluated separately.

Table 6 .
Correlation coefficients (r 2 ) and their corresponding pvalues for the two model configurations at each of the measuring sites.The mean r 2 over all stations is 0.43 for EMPOL and 0.12 for H opt .The corresponding mean p-values are 0.19 for EMPOL and 0.49 for H opt .

Table 7 .
Statistical results for the two model configurations based on pollen concentrations.Values are means over all measuring sites.The rmse is given in pollen per cubic meter.Figures in the last column give the number of sites where EMPOL scores better than H opt .The total number is 34 sites.

Table 8 .
Seasonal pollen indices for the Swiss observational sites for the years 2010 to 2012.
could be transformed into a variable field (right now, its value is fixed for the entire domain), (2) a new input field could be introduced that reflects the influence of CO 2 , e.g., f Q,CO 2 , (3) the influence of CO 2 could even be calculated within the model if COSMO-ART is run in a "full mode" including reactive trace gases.