Two new submodels for the Modular Earth Submodel System ( MESSy ) : New Aerosol Nucleation ( NAN ) and small ions ( IONS ) version 1 . 0

Two new submodels for the Modular Earth Submodel System (MESSy) were developed. The New Aerosol Nucleation submodel (NAN) includes new parameterisations of aerosol particle formation rates published in recent years. These parameterisations include ion-induced nucleation and nucleation of pure organic species. NAN calculates the rate of new particle formation based on the aforementioned parameterisations for aerosol submodels in the ECHAM/MESSy Atmospheric chemistry Climate (EMAC) model. The Ion pair production rate, needed to calculate the ion-induced or -mediated nucle5 ation, is described using the new submodel IONS, which provides ion pair production rates for other submodels within the MESSy framework. Both new submodels were tested in EMAC simulations. These simulations showed good agreement with ground based observations.


Introduction
The influence of aerosol particles on various aspects of climate and human health (Knibbs et al., 2011;Lelieveld et al., 2015) is well established.Aerosol particles influence climate through aerosol-cloud and aerosol-radiation interactions (Lohmann et al., 2010).A detailed understanding of the sources of aerosol particles is necessary to study their climate and health effects.New particle formation (NPF), i.e. nucleation and growth of new aerosol particles from vapours, is an important source of secondary aerosol particles in the troposphere and planetary boundary layer, and observed events of NPF are well documented (Weber et al., 1999;Kulmala et al., 2004).Manninen et al. (2010) give examples of NPF at various European measurement sites, Pierce et al. (2014) in Canada, Bae et al. (2010) in the USA and Suni et al. (2008) in Australia, and Sipilä et al. (2016) observed NPF in a coastal region of Ireland.According to Merikanto et al. (2009) and Yu and Luo (2009), a significant proportion, about 50 % globally, of cloud condensation nuclei (CCN) originate from NPF.
Many global model studies of atmospheric aerosols rely on the binary homogeneous nucleation (BHN) parameterisation of Vehkamäki et al. (2002), which describes aerosol particle nucleation using a polynomial fit to a microphysical model of nucleation as a function of H 2 SO 4 concentration, temperature and relative humidity.Yu (2010) and Kazil et al. (2010) published look-up tables for a nucleation parameterisation that includes the effect of airborne ions, ion-mediated nucleation (IMN) and ion-induced nucleation (IIN), respectively.Ball et al. (1999) showed that NH 3 can enhance nucleation rates in a mixture with H 2 SO 4 and water vapour.Napari et al.Published by Copernicus Publications on behalf of the European Geosciences Union.
(2002) derived a parameterisation of the H 2 SO 4 -NH 3 -H 2 O system based on theoretical calculations and an improved parameterisation was developed by Merikanto et al. (2007).However, observed boundary layer nucleation rates can not be explained by H 2 SO 4 -NH 3 -H 2 O nucleation alone (Kirkby et al., 2011).Sihto et al. (2006), Kuang et al. (2008) and Paasonen et al. (2010) developed parameterisations based on ground-based observations of boundary layer nucleation events.These parameterisations are typically least-square fits to a power law dependency of observed particle formation rates as a function of vapour concentration and are only valid for environments that match the observation sites.
New parameterisations of aerosol nucleation based on experiments in the European Organization for Nuclear Research (CERN) Cosmics Leaving Outdoor Droplets (CLOUD) chamber were published in the past years.These parameterisations include a variety of chemical species and in most cases the influence of air ions.Additionally, these parameterisations offer a description of boundary layer and upper tropospheric nucleation.Dunne et al. (2016) derived parameterisations for systems that include H 2 SO 4 , NH 3 and ions over a wider range of atmospheric temperatures.Riccobono et al. (2014) describes secondary organic aerosol nucleation from biogenic vapours and H 2 SO 4 , while Kirkby et al. (2016) showed that nucleation can even occur without H 2 SO 4 , purely from biogenic vapours and air ions.Furthermore, Riccobono et al. (2014) and Kirkby et al. (2016) provided a parameterisation used by Gordon et al. (2016) to study the effect of NPF on climate.Most of the recent parameterisations of particle formation use atmospheric ions or ionising radiation (Yu, 2010;Kazil et al., 2010;Dunne et al., 2016;Kirkby et al., 2016).
Aside from production of aerosol particles, the chemical conversion and transport of aerosols in the atmosphere are of importance.Various general circulation models (GCMs) include aerosols to study global aspects of aerosol particles.Mann et al. (2014) compared 12 global chemical transport models (CTMs) and GCMs, which included aerosol microphysics.Estimates on the fraction of CCN particles from secondary aerosol formation vary between different models, e.g.Merikanto et al. (2009) and Yu and Luo (2014).
In this work, the implementation of the CLOUD-based parameterisations into the Modular Earth Submodel System (MESSy) is described, as well as their application in the ECHAM/MESSy Atmospheric Chemistry (EMAC) GCM.These parameterisations are part of the New Aerosol Nucleation (NAN) submodel.The new parameterisation requires the inclusion of tropospheric and stratospheric ions; therefore, the submodel IONS treating production of ions from galactic cosmic rays and radon was created.

MESSy
MESSy is a collection of models for various aspects of Earth system modelling.Most of the models are organised as submodels, which form the submodel core layer (SMCL).Models in the SMCL can either be used as box models or be part of a larger model, the so-called base model.A commonly used combination of MESSy with a GCM is EMAC (Pozzer et al., 2012;Klingmüller et al., 2014).Initialisation and acquiring data from other submodels is done within the submodel interface layer (SMIL).The control of each submodel is performed through variables in Fortran 90 namelists.Each submodel uses a file with these namelists to set variables and allow coupling to other submodels.As described in Jöckel et al. (2010), submodels can share values via the channel infrastructure.
Several submodels describing aerosol dynamics exist within the MESSy framework.The current most-developed submodels for aerosol dynamics within the MESSy framework are GMXe (Pringle et al., 2010), MADE and its successor MADE3 (Lauer et al., 2005).The GMXe submodel is based on M7 (Vignati et al., 2004), which describes the aerosol size distribution as seven overlapping log-normal distributions, of which four modes are soluble and three modes are insoluble.M7 and GMXe were developed and optimised for inorganic aerosol particles; therefore, Tsimpidi et al. (2014) developed the ORACLE submodel for the treatment of secondary organic aerosol (SOA); see also Tsimpidi et al. (2017).ORACLE uses the volatility basis set approach based on Donahue et al. (2006) to calculate partitioning of gases between the particle and gas phases.The aerosol particle size distribution is taken from GMXe.Gas-phase chemical reactions are calculated with the Module Efficiently Calculating the Chemistry of the Atmosphere (MECCA) submodel (Sander et al., 2011).

IONS submodel
Atmospheric ions are produced by galactic cosmic rays (GCRs) and by the radioactive decay of radon and its subsequent decay products.In order to provide ion pair production rates independent of the global electric circuit (GEC) submodel (Baumgaertner et al., 2013), ion pair production and the calculation of a steady-state ion concentration were included in a new MESSy submodel IONS.For the calculation of ion pair production from radon decay, the diagnostic radon (DRADON) submodel (Jöckel et al., 2010) must provide tendencies for all tracers in the decay chain.The submodel can provide the ion pair production rate and steadystate ion pair concentration to other submodels via MESSy's coupling scheme.
Radon emissions are described either by constant emissions over land (value set via namelist) and ocean (also set Geosci.Model Dev., 11, 4987-5001, 2018 www.geosci-model-dev.net/11/4987/2018/via namelist) or by an emission flux map, e.g.Zhang et al. (2011).For a detailed description of possible input parameters, see the electronic Supplement.The ion pair production from a single decay event is calculated in the same way as described by Zhang et al. (2011).It is assumed that each α decay creates an ion pair for every 35.6 eV of initial energy, while every β decay produces an ion pair for every 32.5 eV of initial energy.The radon decay chain and the corresponding energies are given by the reaction chain given in Reactions (R1) to (R5).Half-life times are given above the reaction arrows.
The α decay of 214 Po to 210 Pb is not explicitly mentioned in Reaction (R4) due to a half-life time of only 164 µs, though the released α particle is included in the calculation of produced ion pairs.The radon decay chain ends with the stable isotope 206 Pb.Under atmospheric conditions, however, if the optional coupling of DRADON submodel to an aerosol model is chosen, 210 Pb is already taken up into aerosol particles, due to a lifetime with respect to radioactive decay of 22.3 years.Since the half-life time of this decay exceeds the lifetime of atmospheric aerosols by more than 2 orders of magnitude, the last decay chain is not included in the model.The IONS submodel includes the cosmic-ray-induced ionisation (CRII) scheme by Usoskin et al. (2010).The CRII tables contain the ion pairs produced per second and gram of air as a function of atmospheric depth, cosmic ray modulation and geomagnetic cut-off rigidity.Values between the tabulated points are calculated by linear interpolation in the same way as in Dunne et al. (2016).The geomagnetic cut-off rigidity is calculated by the method of Fraser-Smith (1987).The main difference between this implementation and the one described in Dunne et al. (2016) is the use of more recent tables for both the modulation of GCRs and geomagnetic cut-off rigidity.For the GCR modulation, a choice between a table of monthly averages from 1936 to 2016 (Usoskin et al., 2005(Usoskin et al., , 2011;;McCracken and Beer, 2007) or yearly averages since 1600 (Asvestari and Usoskin, 2016;Asvestari et al., 2017) is available.The MESSy import for time series data provides a linear interpolation for dates between the listed values.The geomagnetic cut-off rigidity uses the first three coefficients of the International Geomagnetic Reference Field (IGRF) (Thébault et al., 2015) coefficients of Earth's magnetic field.For 1900-2015, the IGRF table is applied, while for years prior to that the reconstruction of the magnetic field by Jackson et al. (2000) is used.The coupling to the GEC model makes it possible to use the new parameterisation of ionisation in the GEC submodel to calculate the conductivity of air.
The number concentration of small ion pairs (n ± ) due to production and their loss in the atmosphere can be described by The first two terms (Q d and Q g ) are the ion pair production due to radioactive decay and galactic cosmic rays.The other terms describe the various loss processes.The first loss process is ion-ion recombination.The rate constant of ion-ion recombination (k r ) is calculated with the parameterisation of Brasseur and Chatel (1983) which gave reasonable agreement with ion-ion recombination in the CERN CLOUD chamber (Franchin et al., 2015), although under high-pressure, low-temperature conditions.The second loss process is uptake of ions by aerosol particles with a number concentration of A. The particle-size-dependent coefficient k a is calculated using the same method as in Tinsley and Zhou (2006) and Baumgaertner et al. (2013).For particles with a diameter larger than 20 nm, the expression from Hoppel (1985) is used to calculate the attachment rate coefficient.d µm is the aerosol particle diameter in µm.For particles smaller than this, Tinsley and Zhou (2006) provided as extrapolation for nucleation-mode particles.The size of the aerosol particles is provided by aerosol submodels such as GMXe.
The third loss process is ion-induced nucleation, which is negligible outside of nucleation events but becomes important during nucleation events.However, this loss is only taken into account in the nucleation submodel when calculating the ion-induced nucleation rate.The reason for this is to limit the maximum possible ion-induced nucleation to the ion pair production rate.Nevertheless, small ions that are lost due to nucleation simply become slightly larger ions, and removing them from the simulation can cause an imbalance in the small ion concentration.Since only small ions are considered here, this would lead to an overall ion imbalance.Furthermore, singly charged particles up to a diameter of a few nanometres have the same recombination coefficient; see, for example, Hoppel (1985) or López-Yglesias and Flagan (2013).Therefore, losses due to nucleation are not used in the ion submodel.

NAN submodel
The channel objects in MESSy (Jöckel et al., 2010) allow for a flexible transfer of variables between models.Therewww.geosci-model-dev.net/11/4987/2018/Geosci.Model Dev., 11, 4987-5001, 2018 fore, the implementation of the aerosol nucleation parameterisations can be used by several aerosol submodels within MESSy.Further, this approach allows code which is easier to maintain and adjust to new scientific findings, such as refined parameterisations, including additional species and new nucleation mechanisms.The steady-state new particle formation rates described in Dunne et al. (2016), Kirkby et al. (2016) and Riccobono et al. (2014) were implemented into the nucleation model core layer of the NAN submodel.This resulted in several functions that return the formation rates of aerosol particles with a diameter of 1.7 nm.A short summary of the parameterisation will be given here, while details, such as the choice of functions, number of parameters and optimisation, are explained in the supporting information of Dunne et al. (2016), Kirkby et al. (2016) and Riccobono et al. (2014).The neutral binary homogeneous nucleation of sulfuric acid and water is given by and neutral homogeneous ternary nucleation of sulfuric acid, ammonia and water by The indices indicate the type of nucleation with "b" binary, "t" ternary, "n" neutral and "i" ion-induced nucleation.The function k x,y (T ) has the same form for all four nucleation pathways but uses different parameters and basically describes the temperature dependence of the particle formation rate as with x ∈ (b, t), y ∈ (n, i) and the temperature T in Kelvin.The function is shared with the ion-induced ternary channel and controls the saturation behaviour of the ternary nucleation.The equations for ion-induced nucleation take a similar form but with the concentration of negative ions, [n − ], included as a factor.This leads to and Although called binary and ternary nucleation, the influence of water vapour is not explicitly indicated in the parameterisation.Although the experimental data that form the basis of this parameterisation were conducted at various water vapour concentrations, most of the measurements were done at a relative humidity of 38 %.Dunne et al. (2016) give a scaling factor dependent on the relative humidity as a fraction (RH) and temperature (T ) in Kelvin: with c 1 = 1.5 and c 2 = 0.045 K −2 .However, this scaling factor is more of an ad hoc solution and based on very few measurements.The overall effect of this scaling is described as relatively small in Dunne et al. (2016) and is not used here.Two functions describe nucleation by oxidised organic species, named HOM in Kirkby et al. (2016), which is again split into a neutral channel, and an ion-induced channel, A major difference between this channel and Eqs. ( 8) or ( 9) is that the organic nucleation can proceed with positive and negative ions.The original form of Eq. ( 12) given by Kirkby et al. (2016) assumed charge balance; the equation given above remains valid even if charge balance is not given.The description of nucleation from oxidised organic species and sulfuric acid is described according to the power law dependency of Riccobono et al. (2014).The definition of oxidised organic species varies between Kirkby et al. (2016) and Riccobono et al. (2014).The latter defined the oxidised organics as BioOxOrg, which are produced by the oxidation of pinanediol with OH radicals, while the former named the oxidised organics HOM and defined it as a product of αpinene oxidation by O 3 and OH.Mass spectra from both sets of experiments show similar species with high oxygen to carbon ratios, so it can be assumed that the nucleating species are also the same to a large extent.However, Riccobono et al. ( 2014) only provides evidence for nucleation of OH oxidation products with sulfuric acid.While it is reasonable to assume that O 3 oxidation products will also nucleate with H 2 SO 4 , the parameterisation is strictly only valid for OH oxidation products.An additional problem is that the nucleation rate parameterisation given in Kirkby et al. (2016) cannot be separated into nucleation channels driven by OH and O 3 oxidation products.Therefore, it is assumed that the species HOM is the sum of monoterpene oxidation products from O 3 , denoted HOM O 3 and OH radicals, HOM OH .With this definition, the power law dependence from Riccobono et al. (2014) can be written as The yield of HOM OH production, 0.6 % for lumped atmospheric terpenes according Tröstl et al. (2016), was included in the parameter k R since the original parameterisation did not include a yield.
Nucleation between amines and sulfuric acid is described as if [amines] > 2.0 × 10 8 cm −3 and in all other cases.This approach is the same as in Dunne et al. (2016), with a more generalised notation of the parameters.This allows straightforward and flexible switching between different parameterisation for amine nucleation.This is also of importance since different amine species can have different nucleating potential (Jen et al., 2014;Glasoe et al., 2015).
The parameterisation of Bergman et al. (2015) can easily be applied by setting the threshold concentration to 0 and setting the parameters with integer index 1 to the values used in Bergman et al. (2015).With all nucleation pathways, J j , described above the total nucleation rate, is described as the sum of all particle formation rates.It is assumed here that the different nucleation channels do not interact with each other as subcritical clusters or particles below the threshold of 1.7 nm.
All fit parameters can be set in the nucleation submodels' namelist PARAM (see the electronic Supplement for details).If no setting is chosen, the published default values are used.This makes it possible to study the sensitivity of model results to these parameters and change parameterisations easily.None of the organic nucleation channels described above have an experimental basis for a temperature dependence of the nucleation rate.Nevertheless, a temperature dependence is defined in the model using an exponential scaling factor: which is applied to Eqs. ( 11), ( 12) and ( 13).Setting the parameter B = 0 leads to no temperature dependence in the model for the organic nucleation channels and is the default setting.
The existing subroutines for calculating nucleation rates according to the parameterisations of Vehkamäki et al. (2002) and Kulmala et al. (1998) were copied from GMXe so that these legacy nucleation parameterisations can also be used.The set of parameterisations for a model run is set in the submodels' namelist.If multicomponent nucleation is chosen, the submodel tests whether nucleation depletes the gasphase concentration of nucleating vapours.If this is the case, an Euler integration is performed for the length of the global model time step which calculates the vapour depletion, derives the average particle formation rate for each pathway and the total number concentration of newly formed particles.
The newly formed particles can either be added directly to the nucleation mode, as is done in GMXe, or optionally the method of Anttila et al. (2010) can be used to grow the freshly formed particles to a fixed size.The latter method is useful if the smallest size bin or mode of the aerosol model is larger than the size of the nucleated particles.The implementation of Anttila et al. (2010) into MESSy does not include iteration, in order to keep computational cost at a minimum.The condensation sink is provided by the aerosol dynamics model via MESSy's channel objects.The major drawback of this approach is that it requires additional parameterisations for the growth rates of freshly nucleated aerosol particles.For use with GMXe, the freshly nucleated particles are added directly into the nucleation mode.

Simulations
Nucleation rates in MESSy are usually calculated within the calling aerosol submodel.Therefore, EMAC simulations were performed to evaluate whether the call to the nucleation subroutine can be moved outside of GMXe.A simulation that used the Dunne et al. (2016) parameterisation within the GMXe submodel served as the baseline for comparison with the new nucleation submodel.This baseline simulation was compared with a simulation where the new submodel was called after GMXe and a simulation with the nucleation called before GMXe.
A full list of the simulations is given in Table 1.The set of chemical reactions in these simulations was the same as in Jöckel et al. (2016).Simulations were carried out with a spectral resolution of T42 and 31 hybrid sigma-pressure levels.The dynamics were nudged towards ERA-Interim data of the European Centre for Medium-Range Weather Forecasts (ECMWF).Tracer nudging and data initialisation were the same as in Jöckel et al. (2016).
To test the organic nucleation scheme, the years 2007 and 2008 were simulated, with the first year acting as spin-up.The chemical reactions and emissions from Tsimpidi et al. (2014) were used.Reactions of terpenes with OH and ozone that form HOM species were added, similar to Gordon et al. (2016) with the refined yields from Tröstl et al. (2016).As mentioned in Sect.2.3, the terpene oxidation product is split into the product of ozonolysis of terpenes and oxidation of terpenes with OH radicals, leading to as the reactions of the aerosol precursor gas.The lumped terpene tracer, LTERP, is based on terpene emissions from Tsimpidi et al. (2014).The gas-to particle-phase partitioning of the added organic species is calculated by ORA-CLE (Tsimpidi et al., 2014).A saturation vapour pressure The Stationary Column OUTput (SCOUT) submodel provides instantaneous values of nucleation rates, aerosol particle and precursor gas concentrations at each 600 s model time step at the coordinates of 22 atmospheric measurement stations from the EBAS database (Tørseth et al., 2012).The stations and their coordinates are given in Table 2.The year 2008 was chosen for the overlap with ion measurements from Manninen et al. (2010).The aerosol particle number concentrations were measured with condensation particle counters, which provide the total concentration of particles exceeding a threshold diameter.For comparison with observational data, the concentration of particles N d exceeding a diameter d, here 10 nm, is calculated as for a set of m modes, in the case of GMXe m = 7, of overlapping log-normal size distributions.The count mean diameter for mode j is given by D p j and the standard deviation as σ j .

Ion model evaluation
A total of 6 of the 22 stations listed in EBAS with aerosol particle data for 2008 (see Table 2) were used in the analysis of ion spectrometer measurements in Manninen et al. (2010).
The ion concentration measured at these stations is compared to the simulated concentration in Fig. 1.For this plot, the measured concentration of positive and negative ions was averaged in order to compare with the simulation, which retains ion balance.The simulated time series was matched onto the observed time series by linear interpolation, using the timestamps of the observation as grid for both time series.Simulation and observation are in good agreement for most data points, with 65 % of the data points within a factor of 2 and 93 % within a factor of 5.However, EMAC also  2).
tends to overpredict ion concentrations by a factor of up to 2 in many cases, typically when the observed ion concentration is below 500 i.p. cm −3 (where "i.p." indicates ion pairs).This can in part be attributed to model assumptions, e.g.ion balance and the lack of a binned ionised aerosol model, and in part to the instruments used for the measurements.Wagner et al. (2016) showed that the transmission efficiency for (Neutral cluster and) Air Ion Spectrometer (NAIS/AIS) can be as low as 70 % for small ions, depending on instrument and inversion used.This correction cannot be applied ad hoc to historic measurements due to changes in instruments and inversions.Nevertheless, this provides an indication that the measured small ion concentrations may be too low by up to a factor of approximately 0.7.Certain specific events in high-altitude locations which can lead to high ion concentrations, such as splashing rain drops (Tammet et al., 2009) or strong wind episodes (Virkkula et al., 2007), are not accounted for in the model.These events are the reason why the plot was limited to 3000 i.p. cm −3 on both axes, as some observations showed extremely high ion concentrations for certain days.All the observed ion concentrations exceeding 3000 cm −3 in Fig. 1 were measured at the high-altitude stations.
Time series and distributions of monthly ion concentrations, modelled and measured for two stations (Hyytiälä and Hohenpeissenberg), are shown in Fig. 2. The blue (left) part of each area shows the distribution of simulated small ion concentrations, while the red part (right) shows the measured concentrations.The horizontal dashes in each area give the quantiles.The distribution for the high-elevation Hohenpeissenberg site shows a few extremely high ion concentrations of up to 6000 i.p. cm −3 .These are common on high-elevation sites and Manninen et al. (2010) attributed their formation to strong winds.The low-level station at Hyytiälä shows no such behaviour.The time series indicate also that the model does not capture the seasonality shown in the observations.This can have various reasons, such as seasonality in the radon emissions or differences in the aerosol number concentrations and hence differences in losses of ions to aerosol particles between model and observation.However, the data set shown here is rather small and lacks some measurements in the first months of 2008.
Figure 3 shows the zonal distribution of the total ion pair production rate for the year 2008.Ion pair production rates are highest close to the poles and at pressure levels of around 200 hPa, due to higher flux of GCR particles close to the magnetic poles.The ion pair production rate is a factor of 2 lower along the Equator at these pressure levels.Towards ground level, the effect of GCR particles becomes less important and radon decay becomes an important contributor over land.Figure 4 shows the global ion pair production rates at ground level (Fig. 4a) and at 200 hPa (Fig. 4b).The ground-level distribution shows that ion production over land exceeds the production over oceans.This is due to radon emissions over land.Examining the production rate over the oceans shows a negligible dependence on the latitude.At 200 hPa, the latitude correlates with the ion pair production due to Earth's magnetic field.The orientation of the magnetic field also causes the sinusoidal shape visible in the distribution.The overall distributions of small airborne ions and ion pair production rates obtained with EMAC agree well with similar simulations from other models, e.g.Usoskin et al. (2008) and Baumgaertner et al. (2013).

Intra-model comparison
Comparison between the new implementation of the Dunne et al. (2016) parameterisation outside the GMXe submodel and an implementation within GMXe is done by comparing number concentrations for all soluble modes in all grid cells at 10 h intervals over a given month.Figure 5 shows the aerosol particle number concentration from the EMAC simulations.The ordinate axis shows values with the nucleation calculated within GMXe, while the abscissa axis shows values of particle formation rates calculated in NAN before the call to GMXe.The panels show the results for the two smallest soluble modes, nucleation and Aitken modes, in GMXe.The colour indicates the total number of occurrences within each hexagonal bin.Most values differ by less than a factor of 10, indicated by the dashed lines.The percentages of points within a factor of 2, 5 and 10 are 84 %, 94 % and 96 %, respectively.
Figure 6 is the same as Fig. 5, except that the NPF rate was calculated after GMXe calculated the aerosol size distribution for the time step.Calling the nucleation submodel after GMXe gives slightly better agreement with the baseline model, with 88 % of points within a factor of 2. The differ-  ence between the implementation before and after GMXe is the result of numerical errors due to the linearisation of nonlinear processes.Similar effects can be expected for other submodels within MESSy.To test this, the GMXe submodel was called with the radiation microphysics or with general physics and the difference between these two simulations leads to a comparable statistics as the presented comparison between GMXe and NAN.Nucleation rates typical follow a power law with respect to vapour concentrations; see, for example, Kashchiev (1982) and Oxtoby and Kashchiev (1994).Therefore, small changes in the vapour concentration, here H 2 SO 4 and NH 3 , can have a large influence on the nucleation rate.Condensation proceeds typically faster than nucleation; it is reasonable to place the nucleation after the condensation in a time step.Therefore, the original implementation of GMXe calculates nucleation after it calculates the amount of vapour that condensed on aerosol particles.There is no internal shorter time  step in GMXe.However, condensation is not the only process affecting vapour concentrations or particle concentration.Therefore, aerosol particle concentrations are also sensitive to the placement of GMXe within MESSy's interface layer.Unfortunately, making microphysical processes available for as many submodels and potential users as possible is best achieved as a submodel, as MESSy has currently no unified interface definition for sub-submodels, i.e. a submodel of a submodel.Therefore, implementation of NAN and IONS as submodels was preferred, as both models can be called independently of the choice of other submodels.

Comparison with observations
A comparison between atmospheric observations and modelled particle concentrations, for 22 locations from the EBAS (Tørseth et al., 2012) database, is shown in Fig. 7.For the comparison with observations, a cut-off diameter of 10 nm was used since most condensation particle counters (CPCs) in the database appear to exceed a 50 % counting efficiency at this size.The simulated time series of particle concentrations was matched onto the observed time series by linear interpolation, using the timestamps of the observation as a grid for both time series.The overall agreement between both data sets is good: 44 % of the data within a factor of 2, 77 % within a factor of 5 and 88 % within a factor of 10.However, it is clear that the difference between both data sets is not normally distributed.Excellent agreement exists in a large central area of the distribution.
For three of the stations, the monthly distributions of particle concentrations are shown in   2.
and third quantiles.The missing right areas for Bondville indicate missing data.From this plot, it can be seen that EMAC and observations differ to varying degrees in their distribution of values within each month.Nevertheless, the model catches certain seasonality for some stations, shown here for Hyytiälä and to a lesser degree Mace Head, while the seasonality predicted by EMAC is not evident from the observational data for Bondville.This can best be seen from the medians.Additionally, the observations go through certain extreme values which are in most cases not exceeded by the model (aside from 2 months in Hyytiälä).This could be due to not-yet-included nucleation mechanisms or local pollution events not captured by a global model.

Conclusions
Two new submodels were introduced to MESSy and tested with EMAC.The submodel IONS provides ion pair production rates that can be used in other submodels such as GEC (Baumgaertner et al., 2013) or the here-presented NAN submodel.NAN calculates new particle formation rates based on several optional nucleation parameterisations.Having the nucleation rates outside of the aerosol microphysics models comes with several advantages.New parameterisations can be implemented easily without major rearrangements in existing source code.The same parameterisations can be used by different aerosol microphysical models.Furthermore, the submodel can be used in a box model or other base models.
The calculated ground-level ion concentration was compared to a small set of field measurements and overall gives reasonable agreement.Some extreme events are not reproduced by the model, perhaps due to a lack of suitable parameterisations, unknown microphysical process or their potentially localised nature.The global distribution of ion pair pro- duction rates follows known patterns from theoretical considerations and numerical models.
The effect of calculating nucleation rates outside of GMXe has some influence on the results.This is expected when linearising non-linear processes and is an intrinsic problem of operator splitting.Nevertheless, it has been shown that the new NAN submodel agrees well with results from GMXe, with 84 % of the data within a factor of 2.
Large uncertainties in new particle formation remain, mainly due to the incomplete nature of the implemented nucleation rate parameterisations.Incomplete aspects include the temperature dependence of nucleation involving organic species, the chemistry of HOM formation and details about the interaction of the parameterisations of Riccobono et al. (2014) and Kirkby et al. (2016).The latter is in part due to the different definition of oxidised organic species, to different instrumentation available and to differences in the experimental design.The largest open question is certainly whether the parameterisation in Riccobono et al. (2014) is also valid for species from terpene ozonolysis.

Table 1 .
Overview of the EMAC simulations.The chemistry was taken fromJöckel et al. (2016).The column "experiment" gives the name of the experiment, which is used for axis labelling in figures.Column "parameterisation" gives the citation for the nucleation parameterisation used."Position" in EMAC indicates in which part of the code the nucleation rate was calculated.10 −2 µg m −3 was assumed for HOM OH and HOM O 3 .This places the saturation vapour pressure within the lowvolatility organic compound (LVOC) regime as described inTröstl et al. (2016).

Figure 1 .
Figure 1.Comparison of observed ground-level ion concentration with simulated concentration at the six measurement sites with ion measurements (Table2).

Figure 2 .
Figure 2. Monthly distribution of observed and simulated ion concentrations at two locations in 2008.The station codes are above each panel.Blue areas (left half of each area) modelled distribution; red (right) areas modelled observed values.For January and February 2008, no station had ion data available.

Figure 3 .
Figure 3. Zonal average yearly mean ion pair production rate, Q, from EMAC for 2008.The white line shows the tropopause.

Figure 4 .
Figure 4. Global distribution of yearly mean ion pair production rate, Q, at ground level (a) and at 200 hPa (b).

Figure 5 .
Figure 5. Logarithm of the aerosol particle number concentration with the Dunne et al. (2016) nucleation scheme implemented in NAN and called in EMAC just before the call of GMXe (y axis) vs. implementation inside the GMXe submodel (x axis).Panel (a) shows the results for nucleation-mode particles and panel (b) for Aitken-mode particles.The colour indicates the total number of counts in each hexagonal bin.

Figure 6 .
Figure 6.Logarithm of the aerosol particle number concentration with the Dunne et al. (2016) nucleation scheme implemented in NAN and called in EMAC after the call of GMXe (y axis) vs. implementation inside the GMXe submodel (x axis).Panel (a) shows the results for nucleation-mode particles and panel (b) for Aitkenmode particles.The colour indicates the total number of counts in each hexagonal bin.
Fig. 8.The left (blue) part of the areas gives the distribution from the EMAC simulation for each month, while the right (red) areas are from observations.The central horizontal line indicates the median concentration; the upper and lower vertical lines indicate the first www.geosci-model-dev.net/11/4987/2018/Geosci.Model Dev., 11, 4987-5001, 2018

Figure 7 .
Figure 7.Comparison of particle concentration from EMAC with atmospheric observations for the year 2008.The used stations and their coordinates are in Table2.

Figure 8 .
Figure 8.Comparison of EMAC-simulated aerosol particle number concentrations, including the parameterisations of Riccobono et al. (2014), Dunne et al. (2016) and Kirkby et al. (2016), with atmospheric observations for three stations and the year 2008.The area shows an estimate of the monthly distribution of values for EMAC simulation (left, blue) and observation (right, red).The central vertical line within each area gives the monthly median for each month; the upper and lower lines are first and third quantiles.The station names are above each panel;Table 2 contains the coordinates for each station.Figures for all stations are in the Supplement.

Table 2 .
Measurement stations used in the comparison with atmospheric particle concentrations in Figs.7 and 8. Station coordinates are taken from the EBAS data files."Altitude" is given in metres.Station names in italic indicate locations with ion measurements.