Regional evaluation of the performance of the global CAMS chemical modeling system over the United States (IFS cycle 47r1)

The Copernicus Atmosphere Monitoring Service (CAMS) provides routine analyses and forecasts of trace gases and aerosols on a global scale. The core is ECMWF’s Integrated Forecast System (IFS), where modules for atmospheric chemistry and aerosols have been introduced, and which allows data-assimilation of satellite retrievals of composition. We have updated both the homogeneous and heterogeneous NOx chemistry applied in the three independent tropospheric-stratospheric chemistry modules maintained within CAMS, referred to as IFS(CB05BASCOE), IFS(MOCAGE) and IFS(MOZART). Here we focus on the evaluation of main trace gas products from these modules that are of interest as markers of air quality, namely lower tropospheric O3, NO2 and CO, with a regional focus over the contiguous United States without data assimilation. Evaluation against lower tropospheric composition reveals overall good performance, with chemically induced biases within 10 ppb across species and across regions within the US with respect to a range of observations. The versions show overall equal or better performance than the CAMS Reanalysis. Evaluation of surface air quality aspects shows that annual cycles are captured well, albeit with variable seasonal biases. During wintertime conditions there is a large model spread between chemistry schemes in lower-tropospheric O3 (~10-35%) and, in turn, oxidative capacity related to NOx lifetime differences. Analysis of differences in the HNO3 and PAN formation, which act as reservoirs for reactive nitrogen, revealed a general underestimate in PAN formation over polluted regions likely due to too low organic precursors. Particularly during wintertime, the fraction of NO2 sequestered into PAN has a variability of 100% across chemistry modules indicating the need for further constraints. Notably a considerable uncertainty in HNO3 formation associated with wintertime N2O5 conversion on wet particle surfaces remains.
In summary, this study indicates that the chemically induced differences in the quality of CAMS forecast products over the United States depend on season, trace gas, altitude and region. Whilst analysis of the three chemistry modules in CAMS provides a strong handle on uncertainties associated with chemistry modeling, further improvement of operational products additionally requires coordinated development involving emissions handling, chemistry and aerosol modeling, complemented with data-assimilation efforts.
https://doi.org/10.5194/gmd-2021-318 Preprint. Discussion started: 1 October 2021 © Author(s) 2021. CC BY 4.0 License.


Introduction
Poor air quality has a significant impact on visibility, human health and lifespan, crop production and ecosystems, and this impact is expected to be accentuated by climatic change (Silva et al., 2017; Reddington et al., 2019; Schneidemesser et al., 2020). High concentrations of pollutants induce premature mortality (e.g. Lelieveld et al., 2015) and spark episodes in people with asthma. For these reasons a predictive capability at local scale is deemed desirable in order to provide forewarning of intense pollution episodes and to perform retrospective monitoring of annual exposure. Thus, as for forecasting weather events, during the last decades there has been a focus on integrating interactive chemistry and aerosol modules into global weather-forecasting models such as the European Centre for Medium-Range Weather Forecasts (ECMWF) Integrated Forecasting System (IFS) (Benedetti et al., 2009; Morcrette et al., 2009; Flemming et al., 2015; Huijnen et al., 2016; Rémy et al., 2019), amongst others for the purpose of providing short-term Air-Quality Forecasts (AQF) at global scale in the framework of the Copernicus Atmosphere Monitoring Service (CAMS). The IFS is upgraded frequently, allowing rapid benefit from advances in meteorological aspects, the chemistry modeling and their interactions. Note that beside the global system, an operational suite of regional-scale models provides timely AQF for Europe (Marécal et al., 2015). For other domains such as the United States (US), AQF are provided by the global CAMS system (http://atmosphere.copernicus.eu; last access 28.09.21), updated twice daily and run at a resolution of approximately 0.4° x 0.4° on 137 vertical levels. The operational configuration furthermore relies on data assimilation of trace gases and aerosol from a suite of satellite retrievals, combined with a state-of-the-art atmospheric chemistry and aerosol model.
Until recently, model development of the CAMS global system has focused on the performance at global scale with emphasis on more pristine regions and conditions. Only limited attention has been paid to regions directly affected by high anthropogenic emission sources, such as the US, for which AQF are provided. This motivates the study presented here, which assesses the quality of the CAMS global chemistry modeling with regard to the seasonal mean performance in tropospheric pollutants for a typical year, as compared against a suite of independent measurements for the US.
For the operational suite, the CAMS system adopts the IFS(CB05) version of the model, i.e. based on tropospheric chemistry originating from the TM5 chemistry transport model (Huijnen et al., 2010; Williams et al., 2013) without the explicit modelling of stratospheric ozone chemistry. This is convenient with respect to stability and run time, recognizing that thus far the focus of CAMS products is mainly on the troposphere, while stratospheric ozone can be constrained sufficiently well by a linear model combined with (ozone) data assimilation, e.g. Inness et al. (2020). Another application of this system has been the production of a consistent, long-term reanalysis dataset from 2003 to present (Inness et al., 2019; Wagner et al., 2021), which can be used to analyze interannual variability in atmospheric composition. Also, as the reanalysis uses a fixed model configuration (IFS cycle 42R1), this dataset can be used as a reference for assessing changes in the performance of the IFS and updates to the operational chemistry modules since that cycle. One constraint is that whatever changes are adopted, there should be a net improvement of key products, such as tropospheric O3, as measured with respect to a comprehensive set of reference observations. Key trace gases from the reanalysis dataset (hereafter referred to as CAMSRA) have recently been compared against a host of different aircraft data to allow an assessment of biases at global scale over multiple years (Wang et al., 2020). For CO and O3, regional negative biases of 15-30% were found, and large biases have also been found in both OH and HO2 radicals, which act as key oxidants in the troposphere. For CO there is typically an underestimation in the simulated surface concentration, albeit for regions far away from high emission sources, while this is efficiently corrected by the data assimilation in CAMSRA.
For the surface over the Northern Hemisphere midlatitude continents, CAMSRA typically shows seasonal biases in the monthly mean O3, with a negative bias during wintertime and a positive bias during summertime (Wagner et al., 2021). A similar performance is noticed for its control simulation, which excludes data assimilation of trace gases. This indicates that further improvements can be attained by improving the relevant chemical processes included in the CAMS operational system, because the impact of the data assimilation of atmospheric composition is limited at the model surface and does not constrain many key species.
To date within the CAMS global modelling system three independent atmospheric chemistry modules are maintained apart from the operational configuration, as described in Huijnen et al. (2019). These are the modules based on the modified CB05 scheme (Williams et al., 2013;, optionally coupled to the BASCOE
Tropospheric O3 is principally formed via the photolysis of nitrogen dioxide (NO2), with the regeneration of NO2 occurring via the oxidation of nitric oxide (NO) with peroxy-radicals (HO2, CH3O2, RO2) and the titration of O3. The chain length of this cycle is determined by the loss of NO2 into more stable nitrogen compounds, namely nitric acid (HNO3), peroxy-acetyl nitrate (PAN) and organic nitrates (commonly referred to as NOy species), where a large fraction of HNO3 is formed via the heterogeneous conversion of nitrogen pentoxide (N2O5) on wet surfaces (e.g. Brown et al., 2009). HNO3 is soluble and is lost via wet deposition and/or the formation of inorganic aerosol particles in the form of nitrate (NO3−), whereas PAN exists in a thermal equilibrium. It can dissociate to release NO2, allowing transport of reactive NOx away from the source regions (Fischer et al., 2014). The length of the NOx cycle depends on both the chemical mechanism and the rate parameters employed, determining the regional O3 production efficiency. Therefore, fully understanding differences in O3 production efficiency between the various chemical schemes requires analysis of the major NOy components.
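For orientation, the cycle described above can be written out with the textbook reactions below. This is a minimal sketch of standard tropospheric chemistry and not the reaction list of any of the three modules; RO2 denotes a generic organic peroxy radical, M a third body, and the last line represents the heterogeneous conversion on wet surfaces.

```latex
\begin{align}
\mathrm{NO_2} + h\nu &\rightarrow \mathrm{NO} + \mathrm{O(^3P)} \\
\mathrm{O(^3P)} + \mathrm{O_2} + \mathrm{M} &\rightarrow \mathrm{O_3} + \mathrm{M} \\
\mathrm{NO} + \mathrm{O_3} &\rightarrow \mathrm{NO_2} + \mathrm{O_2} \\
\mathrm{NO} + \mathrm{HO_2} &\rightarrow \mathrm{NO_2} + \mathrm{OH} \\
\mathrm{NO} + \mathrm{RO_2} &\rightarrow \mathrm{NO_2} + \mathrm{RO} \\
\mathrm{NO_2} + \mathrm{OH} + \mathrm{M} &\rightarrow \mathrm{HNO_3} + \mathrm{M} \\
\mathrm{CH_3C(O)O_2} + \mathrm{NO_2} + \mathrm{M} &\rightleftharpoons \mathrm{PAN} + \mathrm{M} \\
\mathrm{NO_2} + \mathrm{NO_3} + \mathrm{M} &\rightleftharpoons \mathrm{N_2O_5} + \mathrm{M} \\
\mathrm{N_2O_5} + \mathrm{H_2O} &\xrightarrow{\text{wet surface}} 2\,\mathrm{HNO_3}
\end{align}
```

The first five reactions cycle NOx and produce O3 when NO is oxidised by peroxy radicals rather than by O3; the remaining reactions terminate or temporarily store NOx as NOy.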
In this study we present an evaluation of key trace gas products (tropospheric O3, NO2, CO) simulated by the chemistry modules implemented in the CAMS system for the contiguous United States. This evaluation is performed for the years 2014-2015, spanning an entire summer and winter period. We also include the corresponding products from the CAMS Reanalysis dataset (Inness et al., 2019) to provide an anchor point towards this previous model version. In Sect. 2 we provide details of the chemistry modules employed in the CAMS system, with emphasis on recent updates to all three chemical mechanisms. In Sect. 3 we provide details of the observations used for evaluating the regional performance across the United States. In Sect. 4, we present analyses for the three main chosen gases and in Sect. 5 we provide further discussion concerning the variability in the main trace species across the different modules with a focus on differences due to NOy. Finally in Sect. 6 we present our conclusions. Additional information in support of the main findings is also provided in the supplementary material.

Model Description
In this section we provide a brief description of the various configurations of the CAMS system for global atmospheric chemistry modelling. Here we focus on the upgrades which have been made to the three chemistry modules (CB05BASCOE, MOCAGE, MOZART) available in the IFS, as compared to the extensive description of each of the modules provided in Huijnen et al. (2019). Hereafter we refer to these model configurations as IFS(CBA), IFS(MOC) and IFS(MOZ). A brief overview of the contents and parameterized differences for each of the various chemical modules is provided in Table 1. For details pertaining to the CAMSRA reanalysis dataset the reader is referred to Inness et al. (2019). There is significant variability in the number of thermal (photolytic) reactions across schemes, with IFS(MOC) and IFS(MOZ) being the most explicit (condensed). Compared to Huijnen et al. (2019), the heterogeneous scavenging and conversion for N2O5 and HO2 has also been homogenized across the different schemes. As in previous versions, the calculation of photolysis rates is specific to each scheme; a recent inter-comparison by Hall et al. (2018) showed differences in the key photolysis frequencies (J values) of ~5%. The percentage cloud cover and droplet size provided by the IFS are identical throughout. Heterogeneous conversion and scavenging are described using the approach of Chang et al. (2011), where the loss of HO2 and NO3 also occurs as pseudo-first-order sink processes.

IFS(CB05BASCOE), or IFS(CBA) in short, is a merge of tropospheric chemistry originally based on a modified version of the CB05 mechanism (Yarwood et al., 2005), combined with stratospheric chemistry originating from BASCOE (Skachko et al., 2016).
The CB05 tropospheric chemistry in the IFS, of primary relevance in this study, adopts a lumping approach for organic species by defining a separate tracer species (Williams et al., 2013; Flemming et al., 2015).

The modified band approach (MBA), which is adopted for the computation of tropospheric photolysis rates (Williams et al., 2012), uses 7 absorption bands across the spectral range 202−695 nm. In the MBA, the radiative transfer calculation is performed with a two-stream solver, with the absorption and scattering components introduced by gases, aerosols and clouds computed on-line for each of the predefined band intervals. Heterogeneous reactions and photolysis rates are calculated using the CAMS IFS-AER aerosol model (Rémy et al.,

For IFS(CBA) there have been extensive modifications to four main components of the tropospheric chemistry module, namely: (i) the inclusion of HONO and CH3O2NO2 into the NOx reaction cycle, (ii) the replacement of the isoprene (C5H8) oxidation scheme with a hybrid from the literature, (iii) the coupling of the formation of Secondary Organic Aerosol (SOA) to oxidation products of aromatics and (iv) the inclusion of hydrogen cyanide (HCN) and acetonitrile (CH3CN) from Biomass Burning (BB) sources. This tropospheric chemistry version is referenced as 'tc06f' and is further described below.
The updated rate data for NOx chemistry and NOy components are listed in Table 2 and are based on the updates tested in the chemistry transport model TM5 (Williams et al., 2017). One important update is the use of the new recommendation for the formation of HNO3 (Mollner et al., 2010), which has been shown to have significant effects on the tropospheric O3 burden (Søvde et al., 2011). HONO acts as an important source of OH and NO during the early morning from efficient photolytic destruction after nocturnal build-up (Stutz et al., 2004), whereas CH3O2NO2 alters the NOy chemistry in the free troposphere (Browne et al., 2011). Additionally, updates have been made to the reaction data for O3 + C2H4 and NO3 + C5H8.

To date IFS(CBA) has only included a very simplistic parameterization for the oxidation of C5H8. To improve the realism of the product distribution from the oxidation of C5H8 by OH, we have developed a mechanism which is a hybrid of those developed by Stavrakou et al. (2010) and Lamarque et al. (2012). This new hybrid method includes the direct formation of glyoxal (CHOCHO), hydroxy-aldehydes (HPALD1, HPALD2), a peroxy-product (ISOPOOH), glycolaldehyde (GLYALD), hydroxy-acetone (HYAC) and methyl-glyoxal (CH3COCHO). Explicit J values are calculated online for these products using the latest recommendations for absorption data, as for other species. Most of these intermediates are soluble, with Henry solubility analogous to ALD2, except CHOCHO, where the approach of Ip et al. (2009) is used. We still retain the ISPD intermediate representative of methyl-vinyl-ketone and methacrolein from previous versions of the scheme. Reaction rates in this mechanism have been updated following the latest recommendations, as aligned with the mechanism described by Myriokefalitakis et al. (2020).
Validation of this updated component of the IFS(CBA) is beyond the scope of this study, but we have found that OH recycling increases over forested regions with high biogenic emission fluxes, as shown in Stavrakou et al. (2010), thus affecting the atmospheric lifetime of individual trace gases in regions with high resident mixing ratios of C5H8. It should be noted that for O3 and NO3 we still adopt the original stoichiometry in product distribution as in previous versions of IFS(CBA) (Huijnen et al., 2016). We provide details related to this extensive update in Table 3.

To date aromatics were not explicitly treated in IFS(CBA), but rather as part of the generic paraffinic bond tracer PAR. This is now updated, following the work of Karl et al. (2009), who describe the oxidation of the aromatic tracers toluene (TOL) and xylene (XYL), allowing a coupling to Secondary Organic Aerosol (SOA) formation. In addition, in our model version the product distributions and rate expressions for NO and HO2 radical-radical reactions are taken from Myriokefalitakis et al. (2020). This links the aromatic species to oxidant loss (O3, OH, NO3) and the production of CHOCHO/CH3COCHO, and allows the introduction of gas-phase precursors for SOA formation (SOG) from anthropogenic and biomass burning sources. We provide details on the extension to the aromatic chemistry in IFS(CBA) in Table 4.

Table 4 (excerpt):
Reaction | Rate expression
OH + XYL → 5PAR + AROO2 | avg of (1.3 x 10^-11, 2.36 x 10^-11, 1.43 x 10^-11)*
O3 + XYL → 5PAR + AROO2 | avg of (5.37 x 10^-13 x exp(-6039/T), 1.91 x 10^-13 x exp(-5586/T), 2.4 x 10^-13 x exp(-5586/T))
NO3 + XYL → CH3COCHO + PAR | avg of (3.6 x 10^-16, 2.33 x 10^-16, 4.5 x 10^-16)*
NO + AROO2 → NO2 + CHOCHO + 0.33

In Table 5 we provide the rate data used for the oxidation of hydrogen cyanide (HCN) and acetonitrile (CH3CN) by OH. For CH3CN we define a fraction (30%) to be converted to HCN, following Li et al. (2009) but alternatively we prescribe it as a purely chemical sink process in the troposphere, on top of the loss at the surface due to ocean uptake. This allows it to be used as a tracer for selected air-masses with the dominant emission sources being open burning fires (Li et al., 2000). Again, validation of these new tracers is not relevant to this study and is therefore not presented here.

(Stockwell et al., 1997) with the reactions relevant to the stratospheric chemistry of REPROBUS (Lefèvre et al., 1994; Lefèvre et al., 1998). As in the IFS(CBA) implementation, IFS(MOC) uses a lumping approach for organic trace gas species. The initial tropospheric RACM chemistry scheme of IFS(MOC) has been extended and now includes the sulfur cycle in the troposphere, leading to the introduction of the DMSO and NH3 gas species (Ménégoz et al., 2009; Guth et al., 2016). The current version of the IFS(MOC) chemistry scheme now uses 123 species, including long-lived and short-lived species, family groups and a polar stratospheric clouds (PSC) tracer, 319 gas-phase thermal reactions, 56 photolysis reactions and 14 heterogeneous reactions (9 for the stratosphere and 5 for the troposphere). The version adopted here is that reported in Huijnen et al. (2019), with four major differences.

Firstly, in IFS(MOC), the formation of nitrate, ammonium and sulfur particles from NH3, SO2 and HNO3 gaseous species is now activated in an analogous way to what is used in IFS(CBA). The nitrate and ammonium formation depends on resident sulphate concentrations (cf. Huijnen et al., 2019; Rémy et al., 2019). This primarily affects the modeled NH3 and HNO3 atmospheric burdens, but indirectly also affects the other trace gases through heterogeneous reactions.

Secondly, the formation of gaseous secondary organic aerosol precursors (SOG) from biogenic, anthropogenic and biomass burning volatile organic compounds (VOC) was implemented in IFS(MOC) following the simplified approach proposed by Spracklen et al. (2011). While biogenic sources (namely isoprene and monoterpene) are provided by the IFS(MOC) chemistry scheme, anthropogenic and biomass burning emissions were scaled from the corresponding CO emissions. In this simplified approach, only two SOG low-volatility classes are considered, one for biogenic SOG and the other gathering both the anthropogenic and biomass burning contributions. As in IFS(CBA), this SOG chemistry is coupled in IFS(MOC) with the aerosol module by solving the equilibrium of the partitioning between gas and aerosol phase.

Recent developments in IFS(MOC) also include the modelling of the HCN and CH3CN tracers with chemical loss being limited to oxidation by OH and photolysis in the stratosphere, using the same rate data as those used in IFS(CBA) (see Table 5), but with no products defined. As already stated for IFS(CBA), validation of these new tracers is not relevant to this study and therefore not presented here.

Last, the heterogeneous scavenging on aerosol, cloud droplets and ice particles for N2O5, HO2 and NO3 has been fully updated and made consistent with the IFS(CBA) configuration. For the sake of simplicity, the heterogeneous reaction probabilities (γ) are considered constant for most surfaces, as summarized in Table 6. The reaction probability γ(HO2) is computed following Thornton et al. (2008), taking into account the role of pH and the partitioning between gaseous HO2 and its dissociated form (O2−), and adopting a constant pH of 5.5. It follows the description given in Huijnen et al. (2014), where further details are given.

The heterogeneous chemistry in the troposphere is implemented according to the corresponding module from IFS(CBA) to account for heterogeneous uptake of N2O5, HO2 and NO3 on aerosols as described in the previous section. However, the heterogeneous uptake on ice and cloud droplets is currently not included in IFS(MOZ).
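To illustrate how a reaction probability γ translates into a pseudo-first-order sink, the sketch below uses the commonly applied resistance formulation k = A / (r/Dg + 4/(γ c)), with A the aerosol surface area density, r the particle radius, Dg the gas-phase diffusivity and c the mean molecular speed. This formulation is an assumption made here for illustration: the text does not state the expression used in the IFS, and the function names as well as the values of γ, r, A and Dg in the usage note are hypothetical rather than taken from Table 6.

```python
import math

R_GAS = 8.314462618  # universal gas constant, J mol-1 K-1


def mean_molecular_speed(molar_mass_kg, temperature_k):
    """Mean thermal speed (m s-1) of a gas molecule from kinetic theory."""
    return math.sqrt(8.0 * R_GAS * temperature_k / (math.pi * molar_mass_kg))


def uptake_rate(gamma, radius_m, area_density_m2_m3, diffusivity_m2_s,
                molar_mass_kg, temperature_k):
    """Pseudo-first-order loss rate (s-1) for uptake on particles.

    Implements k = A / (r/Dg + 4/(gamma*c)); this is an illustrative
    assumption, not necessarily the expression coded in the IFS.
    """
    if gamma <= 0.0 or area_density_m2_m3 <= 0.0:
        return 0.0  # no reactive uptake without a surface or with gamma = 0
    speed = mean_molecular_speed(molar_mass_kg, temperature_k)
    diffusion_term = radius_m / diffusivity_m2_s   # gas-phase diffusion limit
    kinetic_term = 4.0 / (gamma * speed)           # surface reaction limit
    return area_density_m2_m3 / (diffusion_term + kinetic_term)


# Hypothetical wintertime N2O5 example (all inputs are placeholders).
k_n2o5 = uptake_rate(gamma=0.02, radius_m=1.0e-7, area_density_m2_m3=1.0e-4,
                     diffusivity_m2_s=1.0e-5, molar_mass_kg=0.108,
                     temperature_k=280.0)
lifetime_hours = 1.0 / k_n2o5 / 3600.0
```

For small particles and small γ the kinetic term dominates, so k is close to γ c A / 4 and scales almost linearly with the reaction probability, which is why the choice of γ matters for the wintertime HNO3 formation discussed in this study.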

Setup of model simulations
The model simulations using the new developments described above have been performed with IFS cycle 47r1. The simulations have been run using 137 vertical levels and a horizontal resolution of TL255, corresponding to ~0.7° x 0.7°, excluding data assimilation of atmospheric composition.
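The quoted grid spacing can be checked with a short calculation. The sketch assumes the standard linear Gaussian grid paired with a TL255 truncation (N128, i.e. 512 longitude points at the equator); the grid type is not stated in the text and is an assumption here.

```latex
\begin{align}
N_{\lambda} &= 2\,(T + 1) = 2 \times 256 = 512, \\
\Delta\lambda &= \frac{360^{\circ}}{N_{\lambda}} = \frac{360^{\circ}}{512} \approx 0.703^{\circ}.
\end{align}
```

Here T = 255 is the spectral truncation and N_λ the number of longitude points along the equator, which gives the ~0.7° spacing quoted above.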
The simulations evaluated here are executed as a series of 24 h hindcasts, initialized daily at 0h UTC from ERA5 meteorology (Hersbach et al., 2020). A 30-minute time step was used. A four-month spin-up period from September to December 2013 is employed to allow the system to reach chemical equilibrium, after which an 18-month simulation was performed to allow the use of winter-time measurements for deriving some seasonality in the

Apart from Biomass Burning (BB) and SO2, all emissions are applied in the lowest model level. Information on emission totals used in the simulations is given in Table 7 for IFS(CBA). This corresponds essentially to the emissions adopted in IFS(MOZ) and IFS(MOC), with small variations in the partitioning of some of the lumped VOCs such as the higher alkanes, alkenes and aldehydes, as well as the aromatics. When comparing emission totals with those used in Huijnen et al. (2019), the main differences are 10% lower primary emissions for anthropogenic CO and SO2, a 20% increase in anthropogenic NH3, approximately equal NOx emissions and, most importantly, a 20% reduction in C5H8. Some of these changes with respect to the anthropogenic contributions are furthermore due to the choice of another evaluation year and trends in the annual emission estimates.

In this section we provide an overview of all the observational data used to evaluate the performance of the three IFS versions for the years 2014 and 2015. Figure 1 shows the location of all the measurement stations and regions covered by the aircraft campaigns utilized for assessing the different versions of the IFS.

Surface flasks and soundings for CO and O3
aircraft campaigns focus on industrial regions or those influenced by energy production techniques such as fracking for the chosen evaluation year. Data from the following campaigns are used:
• Peischl et al., 2018; Koss et al., 2017).
The aircraft campaigns, whose regional coverage and trace species are given in Table 8, are either segregated into different legs for different US states or sample air over a wide area on different days. In the latter case we segregate spatially during the analysis to provide more regional coverage. Lower limits of 40 ppt, the instrument detection limit stipulated in the data files, are placed on the NO2 and NOy observations. For NO2, HNO3 and PAN, dual measurements are available from different instruments, which are merged to increase the available sampling frequency. For N2O5, a maximum of 1.3 ppb is placed on the observations for the comparisons presented, to avoid spurious values.
We interpolate 3-hourly data output from the model simulations using the time, geolocation and pressure of the observations, then average over predefined pressure bins, ensuring that there is a sufficient number of observations per bin. We also calculate mean bias statistics for the lower troposphere (LT) using selected pressure tops of 815 hPa (Col), 900 hPa (EC), 880 hPa (ND) and 965 hPa (Texas), accounting for the variable orography and including enough points to be statistically robust. The resulting values are presented in the Supplementary Material as Tables S1 through S4 (O3, CO, NO2, NO).
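The binning and bias steps described above can be sketched as follows. This is a minimal illustration that assumes the model values have already been interpolated to the observation times and locations; the function names, the bin edges and the minimum sample count per bin are hypothetical choices, not those of the study.

```python
import math


def binned_means(pressures_hpa, values, bin_edges_hpa, min_count=10):
    """Average `values` within pressure bins.

    `bin_edges_hpa` must be increasing; bin i covers [edge[i], edge[i+1]).
    Bins with fewer than `min_count` valid samples are returned as NaN.
    """
    n_bins = len(bin_edges_hpa) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for p, v in zip(pressures_hpa, values):
        if math.isnan(p) or math.isnan(v):
            continue  # skip missing samples
        for i in range(n_bins):
            if bin_edges_hpa[i] <= p < bin_edges_hpa[i + 1]:
                sums[i] += v
                counts[i] += 1
                break
    return [s / c if c >= min_count else math.nan
            for s, c in zip(sums, counts)]


def lower_troposphere_mean_bias(pressures_hpa, model, obs, pressure_top_hpa):
    """Mean of (model - obs) for samples at pressures >= the chosen top."""
    diffs = [m - o for p, m, o in zip(pressures_hpa, model, obs)
             if p >= pressure_top_hpa
             and not (math.isnan(m) or math.isnan(o))]
    return sum(diffs) / len(diffs) if diffs else math.nan
```

A region-specific pressure top, such as the 900 hPa quoted for EC, would be passed as `pressure_top_hpa`, so that the bias is computed only from samples between the surface and that level.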

For evaluating surface concentrations of the model output we exploit observations of O3, CO and NO2 from the AirNow (www.airnow.gov; last access: 28 Sept 2021) air quality observations network. Within the AirNow centralized system, hourly measurements of O3, NO2, CO, PM2.5 and PM10 are collected from measurement locations across the U.S., submitted by state or local monitoring agencies and made available in near real time, after preliminary data quality assessments have been performed. Here we use data collected over the 2014 period for the designated domains, which include 570, 222 and 123 monitoring stations for O3, NO2 and CO, respectively. The list and extent of the sub-domains over which statistical scores are provided is given in Table 9, together with the number of stations used in the current study. Each station is classified as urban, suburban or rural depending on its location, in order to differentiate between clean and polluted sampling locations. Only clean background comparisons are shown, due to the difficulty global models have in representing small-scale gradients in concentrations. Additionally, a filtering procedure was applied to the data before computing the daily mean values to remove stations where the time series displayed more than 50% identical values, denoting a failure in the measurement sensor. Moreover, for CO the observational values < 50 μg m-3 and > 1200 μg m-3 were filtered out at the rural stations to avoid spurious instrumental effects. Model 3-hourly outputs were spatially interpolated from the model grid to the station network. Time series of daily mean composites were then obtained by computing the daily mean values across stations with identical station classification for each domain, using both the observed and simulated data during 2014, providing a spatial mean of the concentrations. Details on the accuracy of the instrumentation used across the network can be found in Williams et al. (2014).
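The station filtering and compositing described above could look like the sketch below. It is a hedged illustration only: the function names and the data layout (a dictionary mapping a station id to a list of days, each day a list of hourly values with None for missing data) are assumptions, and only the 50% identical-value criterion and the CO range of 50 to 1200 μg m-3 come from the text.

```python
from collections import Counter


def is_stuck_sensor(series, max_identical_fraction=0.5):
    """True if more than `max_identical_fraction` of valid values are identical."""
    valid = [v for v in series if v is not None]
    if not valid:
        return True  # nothing usable, treat the station as failed
    most_common_count = Counter(valid).most_common(1)[0][1]
    return most_common_count / len(valid) > max_identical_fraction


def filter_co(series, lower=50.0, upper=1200.0):
    """Mask CO values (ug m-3) below `lower` or above `upper` as None."""
    return [v if v is not None and lower <= v <= upper else None
            for v in series]


def daily_mean(hourly_values):
    """Mean of the valid hourly values of one day, or None if all are missing."""
    valid = [v for v in hourly_values if v is not None]
    return sum(valid) / len(valid) if valid else None


def composite_daily_means(station_days):
    """Per-day mean across stations that pass the stuck-sensor check."""
    kept = {sid: days for sid, days in station_days.items()
            if not is_stuck_sensor([v for day in days for v in day])}
    n_days = max((len(days) for days in kept.values()), default=0)
    composite = []
    for d in range(n_days):
        means = [daily_mean(days[d]) for days in kept.values() if d < len(days)]
        means = [m for m in means if m is not None]
        composite.append(sum(means) / len(means) if means else None)
    return composite
```

The same compositing would be applied to the model values interpolated to the stations, so that observed and simulated daily means are built from an identical set of stations per domain and classification.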
Table 9: Details of the various regions defined for assessing simulated surface concentrations for O3, CO and NO2 using data from the AirNow measurement network. The total number of individual stations for each respective species is given in the rightmost column.

locations away from strong NOx emission sources. This dataset is therefore a measure of the chemical processing which occurs in transported air-masses at regional scale. The inferred mixing ratios result from the application of the Multilayer Model (Finkelstein et al., 2000) to these samples, which uses site-specific meteorological parameters to determine deposition values. We only use the near-surface mixing ratios here, as assessing the wet deposition component of the chemical modules is beyond the scope of this manuscript and more associated with the aerosol modules applied in IFS. For brevity, we aggregate and compare seasonal mean values of the mixing ratios as interpolated from the model data using the location and height of each station for correct interpolation.
It should be noted that in an intercomparison of gaseous HNO3 values, CASTNET data were found to be typically lower than those of other measurement networks (Lavery et al., 2009). However, CASTNET data composites are typically used for assessing the performance of deposition processes in global transport models

Figure 2. We select all levels below 1 km for calculating the seasonal means, which allows a direct comparison for coastlines and elevated regions (namely Colorado). For DJF we choose 2014-2015 data to be fully consistent with the validation of the vertical profiles (see later) and with a period when the system has achieved chemical equilibrium for longer-lived species.

For JJA, the seasonal distribution shows higher mixing ratios exceeding 65 ppb at both the East and West coasts driven by regional NOx emissions. There is a minimum in central rural regions centred around Colorado, with a variability of 35-45 ppb in IFS(CBA). Examining regional differences shows that when enhancements in (near-)surface O3 typically occur, both IFS(MOC) and IFS(MOZ) show a ±5% variability with respect to IFS(CBA), with maximal differences on the east coast of +4-6%. Over the surrounding oceans, IFS(MOZ) has a decrease in mixing ratios of 5-10 ppb due to reduced transport. During DJF the model shows a weak west-east gradient with a lower range of mixing ratios of 30-50 ppb. For DJF, the range of the differences is substantially higher (~8-35%) than for JJA, with IFS(MOZ) having a significant excess in O3 towards the east coast with respect to IFS(CBA). Therefore, under identical NOx emissions, the O3 production efficiency via the reactive NOx cycle is highest for IFS(MOZ), indicating a lower rate of termination towards NOy (see Sect. 6). This subsequently increases mixing ratios of the hydroxyl radical (OH) from the primary production term involving photolysis of O3 in the presence of water vapour (H2O) (see Fig. S1).
Evaluations against seasonal O3 sonde composites of all available measurements for JJA and DJF located across the US for the lower troposphere, as shown in Figure 3, also exhibit a signature of the seasonal mean differences discussed for Fig. 2. For JJA there are positive biases for stations on the east coast, whereas for Trinidad Head (at the west coast) the agreement is more favorable. The highest bias, of 10-20 ppb, is shown for Huntsville in Texas, where the station is affected by transported plumes from nearby cities (Newchurch et al., 2003). In general, the correct profile shape is captured for most sites, except for Trinidad Head where a steep gradient is observed. For DJF, there is more variability across the chemical modules, where IFS(MOZ) has the highest mixing ratios towards the east coast, leading to a positive bias of 15 ppb at Yarmouth. Again, profile shapes are

In Figure 4 the corresponding annual cycle in monthly mean mixing ratios for surface O3 at the three continuous measurement sites available in the US is shown, with two (BAO and THD) being located in the vicinity of the sonde measurements. For THD, which is situated at the outermost west coast of the US, the observed monthly mean mixing ratios are very low. This indicates the absence of significant local NOx emissions, with the site therefore being influenced by land-sea air movements allowing the sampling of clean Pacific air (Oltmans et al., 2008). For most months at THD and NWR, there is a bias of up to 100% across the different chemical modules, with a more muted amplitude in monthly variability than observed, indicating difficulties of the global model configuration in capturing the correct seasonal cycle. Apart from model resolution effects, the transport of air out of the US, as represented by the modelled transport processes, could be too efficient, or the production and loss over the ocean surface may not be in the correct equilibrium. Comparing IFS(CBA) and CAMSRA, there has been an increase of 2-10 ppb in surface O3 mixing ratios, increasing the simulated biases somewhat.
Even though the stations BAO and NWR are relatively close together, the shape of the annual cycle differs, showing the influence of regional NOx chemistry, meteorology and station height (1.5 km versus 2.9 km above sea level). Transport of pollutants from the Denver region affects O3 mixing ratios at NWR (Chin et al., 1994; McDuffie et al., 2016), which is thus representative of a chemically aged air-mass. The BAO site is also influenced by anthropogenic emissions, with measurements sampling the boundary layer (Gilman et al., 2013; McDuffie et al., 2016). At BAO the annual cycle and maximal mixing ratios are captured well, with IFS(CBA) performing better than CAMSRA during the winter months. For NWR, which samples air above the surface, there is no strong annual cycle in the measurements, whereas for CAMSRA and the mini-ensemble a maximum occurs during JJA with a bias of between ~50-70%.
https://doi.org/10.5194/gmd-2021-318 Preprint.

The variability in the daily mean values of surface O3 simulated by the chemistry modules is evaluated against regional composites of measurements taken from the AirNow network assembled for 2014, see Figure 5. This is done for the regions defined in Table 10 [...] observations and simulations ranges between r=0.4-0.8 for Maine and the East Coast across stations, whilst being anti-correlated in California-Nevada (r ranging from 0 to -0.4), with strong similarities between IFS(CBA) and CAMSRA. For boreal summertime, significant positive biases occur for all chemical modules at both coasts (40-80 μg m-3), with Colorado exhibiting smaller negative biases (up to 20 μg m-3) due to strong regional NO and VOC mixing ratios (more daytime titration and radical chemistry). In terms of correlation, there is a weak-to- [...] during the summer months, indicating too efficient O3 production and/or too little transport and mixing, as seen in previous studies. The description of small-scale urban chemical processing with influence from street canyons and local point sources (factories, processing) is not included in the current CAMS description, thus leading to such biases. However, the seasonal variability in the outflow from such polluted regions is described well enough not to lead to corresponding biases for the rural comparisons.

Vertical profiles in the LT across the various US regions are shown in Figure 6, with the boundaries of various regions given in Fig. 1 and Table 3. The corresponding mean bias statistics are documented in Table S1. The shallow to moderate gradients in LT O3 seen in the observations are simulated relatively well across the various chemistry modules, albeit with a changing variability and bias across the regions.
For summertime, most comparisons are for the Colorado region over Denver and the surrounding area, where O3 production is heavily influenced by oil and natural gas production and industry (Cheadle et al., 2017), exhibiting mean values of 60-65 ppb for the (near-)surface. During July the region experienced a strong cyclonic front, whereas in August non-cyclonic conditions occurred (Vu et al., 2016), resulting in different transport dynamics for each period, although little impact is seen in the mean value. Here, the monthly negative biases are of the order of -2 to -10 ppb, indicating too low regional NOx emissions (c.f. Table S3) and showing that the persistent positive bias seen at more globally remote regions does not occur during summertime for more polluted urban regions. For CAMSRA more significant negative biases exist of between 10-15 ppb.

The evaluation during boreal wintertime over the East Coast, which has lower observed O3 mixing ratios of 38-45 ppb, reveals more divergence across the chemistry modules under these conditions, with IFS(MOZ) showing high positive biases of ~10 ppb, whereas IFS(CBA) captures the observational mean profile to within a few ppb. This leads to a high oxidative capacity for IFS(MOZ), which has a subsequent impact on tropospheric CO (see next section). This evaluation highlights differences in model performance in the lower troposphere compared to those presented for the surface O3 analysis. Because vertical mixing has occurred, this analysis is not subject to issues related to the location of measurement sites (local emission sources such as roads, the effects of buildings on transport, etc.). For springtime over Colorado mixing ratios are somewhat lower than in summertime (~50 ppb), where IFS(CBA) shows small negative biases of a few ppb, with the positive bias for IFS(MOZ) persisting (5 ppb). For Texas, a positive bias of 5 ppb occurs across chemistry modules, with the CAMSRA dataset having similar biases throughout regions during springtime.
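Since the surface comparisons above are expressed in μg m-3 whereas the sonde and aircraft comparisons are expressed in ppb, the two sets of biases are easier to relate after conversion to a common unit. The sketch below uses the ideal gas law; the reference temperature and pressure (298.15 K, 1013.25 hPa) are assumptions made here for illustration, as reporting conventions differ between networks, and the function name is hypothetical.

```python
R_GAS = 8.314462618  # universal gas constant, J mol-1 K-1

# Molar masses in g mol-1
MOLAR_MASS_G = {"O3": 47.998, "NO2": 46.006, "CO": 28.010}

def ppb_to_ugm3(ppb, species, temperature_k=298.15, pressure_pa=101325.0):
    """Convert a volume mixing ratio (ppb) to a mass concentration (ug m-3)."""
    molar_volume_m3 = R_GAS * temperature_k / pressure_pa  # m3 per mol of air
    # ppb * 1e-9 mol/mol / (m3/mol) * (g/mol) * 1e6 ug/g = ppb * M / Vm * 1e-3
    return ppb * MOLAR_MASS_G[species] / molar_volume_m3 * 1.0e-3

def ugm3_to_ppb(ugm3, species, temperature_k=298.15, pressure_pa=101325.0):
    """Inverse conversion, from ug m-3 to ppb."""
    return ugm3 / ppb_to_ugm3(1.0, species, temperature_k, pressure_pa)
```

Under these assumed conditions 1 ppb of O3 corresponds to roughly 2 μg m-3, so a surface O3 bias of 40 μg m-3 is equivalent to about 20 ppb.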

Tropospheric CO
The corresponding US continental distribution of seasonal mean tropospheric CO for IFS(CBA) for JJA and DJF is shown in the top and bottom left panels of Figure 7, respectively. A distinct east-west gradient exists, with ~50% higher mixing ratios towards the East coast reaching ~150-160 ppb. No distinct burning regions are visible towards the west coast, associated with comparatively low BB activity for 2014 in the US (Petetin et al., 2018). The two other chemistry modules have consistently lower mixing ratios under identical primary CO emissions, indicating differences in the chemical production rate from the oxidation of formaldehyde and higher Volatile Organic Compounds (VOC), combined with differences in the chemical lifetimes as a result of OH variability. For IFS(MOZ) we diagnose a comparatively low tropospheric CO burden associated with a fast oxidation rate due to higher mixing ratios of OH in the LT of between 20-50% (c.f. Fig. S1). This is directly associated with the higher O3 (c.f. Fig. 2) in IFS(MOZ). For DJF, a much shallower continental gradient exists with average mean mixing ratios of between 100-120 ppb, with IFS(MOC) and IFS(MOZ) again exhibiting [...]

The seasonal cycle in surface CO mixing ratios is compared against monthly mean composites from the ESRL observational network. The seasonal cycle is somewhat determined by the regional CO emissions, as exemplified by the increase observed for July at Park Falls (LEF), which is possibly due to local BB events for this year. A [...] Fig. 7. In some instances, biases are of the order of 100%, especially for winter months where OH is typically lower (c.f. Fig. S1) and pollution is less mixed into the free troposphere.
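The sensitivity of the CO lifetime to the OH differences noted above can be illustrated with a simple first-order estimate, τ = 1/(k[OH]). The sketch below is illustrative only: the pressure-dependent rate expression for CO + OH is an older, approximate recommendation adopted here as an assumption and is not necessarily the rate used in any of the three chemistry modules, and the OH concentration of 1e6 molecules cm-3 is an assumed round number rather than a model diagnostic.

```python
SECONDS_PER_DAY = 86400.0

def co_lifetime_days(oh_molec_cm3, pressure_atm=1.0):
    """First-order lifetime of CO against oxidation by OH, in days."""
    # Assumed approximate rate constant (cm3 molecule-1 s-1) for CO + OH.
    k_co_oh = 1.5e-13 * (1.0 + 0.6 * pressure_atm)
    return 1.0 / (k_co_oh * oh_molec_cm3) / SECONDS_PER_DAY

base = co_lifetime_days(1.0e6)            # assumed reference OH
plus20 = co_lifetime_days(1.2e6)          # OH higher by 20%
plus50 = co_lifetime_days(1.5e6)          # OH higher by 50%
```

Because the lifetime scales as 1/[OH], an OH increase of 20-50% shortens the CO lifetime by about 17-33%, which is qualitatively in line with the lower CO burden diagnosed for IFS(MOZ).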

More extended regional composites for surface CO have been assembled using data available from the AirNow measurement network, with three regional comparisons of the daily variability in surface CO at rural locations being shown in Figure 9. The number of stations used for the comparison is lower than that used for the other species, as limited by data availability (c.f. Table 10 [...]

Tropospheric CO profiles in the LT are compared against the corresponding aircraft composites across the various campaigns in Figure 10. As for O3, the shape of the vertical profiles is captured well with a significant variability [...] Table S2). Comparing IFS(CBA) against CAMSRA shows there is only a modest difference in CO, with slight increases in the associated negative biases.
Biases presented in Table S2 show that, assuming comparable data quality, the influence of local effects related to positioning and selection of stations can result in more extreme biases. For the aircraft composites surface effects are less important, as mixing in the boundary layer provides a more homogenized value. For wintertime and springtime the higher observed mean values of 130-175 ppb are not captured by the mini-ensemble, with underestimates of 10-40 ppb depending on the month and region, suggesting too low primary emissions over a wide area, with CAMSRA exhibiting the lowest biases.
One dominant precursor for the chemical production of CO is the oxidation of formaldehyde (CH2O), e.g. Zeng et al. (2015). The corresponding comparisons of CH2O for all chemical modules are shown in the Supplementary Material in Figure S4, with associated biases in Table S3. In the LT during boreal summertime conditions, isoprene (C5H8) acts as a dominant source of CH2O, resulting in high mixing ratios over Colorado of 1.7-2.5 ppb. Generally, all chemical modules exhibit negative biases, except for the CAMSRA dataset, associated with the higher C5H8 emissions which are applied (MEGAN-MACC; Sindelarova et al., 2014). There is a higher variability in the simulations than observed, showing the sensitivity towards both photolysis and dissolution into cloud droplets, which introduces complications in short-term (daily) modelling of CH2O. Note that the higher CH2O in CAMSRA is not directly correlated with higher CO, due to effective CO data assimilation being applied.
During wintertime biogenic fluxes are low due to the seasonality in biogenic activity, thus CO comparisons shown for February and March can be considered to be representative of the background supplemented with regional anthropogenic emission sources, considering the tropospheric lifetime of 1-2 months (e.g. Williams et al., 2017).

This results in the resident mean CH2O value being only a quarter of that seen for boreal summertime over Colorado. Under these conditions, the relative negative biases for CH2O increase to ~40-60% across the chemical modules, with IFS(CBA) having twice the negative bias compared to IFS(MOZ). Given the low biogenic precursors, the overall negative bias suggests a deficit in the chemical production term of CH2O, likely from the limited oxidation of other VOCs or peroxy-radical termination reactions, combined with missing direct CH2O emissions (Green et al., 2021). For springtime, relative biases are higher than in boreal summertime for Colorado across chemical modules, with values for the Southern US being simulated with low relative biases of around 20%.

[...] Figure 11. The large spatial variability in NOx emissions can be clearly seen, resulting in much more distinct regional differences compared to the corresponding plot for CO (Fig. 7) for both seasons. For JJA, higher (near-)surface values occur on the Eastern and Western seaboards, as well as in Colorado and Texas, associated with urban conurbations. Comparing differences shows that the export of NO2 out of the source regions is more effective in IFS(MOC), associated with the conversion of a large regional fraction of NO2 into PAN under conditions of high VOC fluxes (see Sect. 5; Fig. S7; Fischer et al., 2014). Thus, increases of > 40% occur over the surrounding oceans as a result of long-range transport of NOx out of the continent. A contributory factor to the simulated differences is the application of the new recommendation for HNO3 formation, which results in a ~10% lower gas-phase formation rate in IFS(CBA) (Stavrakou et al., 2013) under identical OH availability. The higher regional OH (c.f. Fig. S1) results in a stronger termination flux into HNO3, which is typically over-estimated in the region (c.f. Fig. S6).
Colorado is the only region where a negative offset occurs with respect to IFS(CBA), indicating locally higher conversion of NO2 into other NOy components due to high regional VOC fluxes. For IFS(MOZ) lower NO2 mixing ratios occur in the North US, with less export and high mixing ratios in the Southern US. Some of these differences exist due to differences in the flux of the NO + HO2 recycling term between chemical modules as a result of a difference in the HOx chemistry (c.f. Fig. S1).

Tropospheric NO2
The associated seasonal mean distribution for NO is shown in the supplement (Fig. S2), where maximal mixing ratios (0.3-0.4 ppb) occur over the more polluted urban areas. Substantially higher NO is simulated in the LT in IFS(MOC) and IFS(MOZ) compared to IFS(CBA), with continental increases of between 40-50%. Higher NO increases the direct titration term for O3, with IFS(MOZ) having the lowest biases in O3 for boreal summertime as diagnosed in the aircraft campaigns (Table S1). Moreover, the higher OH for JJA in both IFS(MOC) and IFS(MOZ) (c.f. Fig. S1) also increases direct gas-phase conversion of NO2 into HNO3 (c.f. Fig. S5). Heterogeneous conversion of N2O5 to HNO3 on wet surfaces and particles during nighttime is another important pathway for reducing NOx recycling, as analyzed in more detail in the next section.
In Figure 12 we compare the daily mean variability in surface NO2 in the simulations against daily mean composites assembled from the AirNow network for rural stations within the regions defined in Table 10. The number of stations used for assembling the composite is lower (higher) than that used for surface O3 (CO), and NO2 is typically not measured at the same location (c.f. Fig. 1 and Table 10). The uncertainty in the observations is higher than for the other longer-lived species and dependent on the instrumentation used in the local networks, especially for the lower concentrations. Except for the East Coast, Maine and Colorado, there is a notable annual cycle, with minima occurring during boreal summertime, which is captured across the simulations. Mean daily biases for the various chemical modules are region specific, showing the influence of the chemical mechanisms across different chemical regimes. For boreal wintertime, negative mean daily biases of between 0-30 μg m-3 occur throughout the US except for Colorado, where there are lower negative biases of between 0-10 μg m-3, associated with high NOx and VOC emissions (c.f. surface O3 discussed above). For Colorado, CAMSRA has a significantly lower mean daily bias of up to 2 μg m-3 under different emission estimates and the use of assimilation. The corresponding correlation ranges between R=0-0.5 for all of the simulations across domains, revealing only a weak-to-moderate correlation in the simulated fields. For boreal summertime the mean daily bias is typically negative, between 5-30 μg m-3, with only a limited difference between the three chemical modules and with CAMSRA exhibiting strong similarities. The extent of correlation is somewhat station specific with no latitudinal or longitudinal influence, typically within R=0±0.2, thus weakly correlated throughout the stations. For CAMSRA, there is more anti-correlation (R ranging from -0.1 to -0.3). For the more polluted urban stations, the seasonal cycle is captured well, with observed mixing ratios ranging from 30-40 μg m-3 during boreal wintertime to 7-15 μg m-3 during boreal summertime, thus approximately twice those measured for the rural stations (not shown).
Biases are somewhat higher for the winter months than shown for the rural comparisons (10-15 μg m-3), compared to summer months (<10 μg m-3). Therefore the differences in the biases between rural and urban environments are not as large as for CO (not shown).
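For reference, the mean daily bias and correlation statistics quoted in the surface comparisons can be computed as in the following sketch. It assumes paired daily mean series of model and observed values with missing days marked as NaN; the function name and the choice of the Pearson coefficient are assumptions made for illustration and may differ from the exact procedure used for the figures and tables.

```python
import numpy as np

def bias_and_correlation(model, obs):
    """Return (mean bias, normalised mean bias in %, Pearson R) for paired daily means."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    valid = np.isfinite(model) & np.isfinite(obs)   # drop days missing in either series
    m, o = model[valid], obs[valid]
    mean_bias = float(np.mean(m - o))               # same unit as the input
    normalised_bias = 100.0 * mean_bias / float(np.mean(o))
    pearson_r = float(np.corrcoef(m, o)[0, 1])
    return mean_bias, normalised_bias, pearson_r
```

A negative mean bias with R close to zero, as found for summertime NO2, indicates that the model is both too low on average and does not follow the observed day-to-day variability.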

As for other species, tropospheric NO2 profiles are compared against the corresponding aircraft composites for the various campaigns in Figure 13, with the quantification of the biases given in Table S5. The corresponding profiles for tropospheric NO are shown in the Supplementary Material (Figure S2 and Table S4). As for the other trace gases, the shape is captured relatively well, with the profile exhibiting a negative gradient with respect to pressure. Comparing the observational mean values in Table S4 shows that both Colorado and the East Coast have similar environments for NOx levels (around 1.0-2.5 ppb). Model biases increase from summertime to wintertime under such NOx-rich conditions. For IFS(CBA) there is no significant change in the biases for many months when compared to CAMSRA. In the various aircraft composites for Colorado, the mini-ensemble shows a significant peak around ~820-830 hPa that is typically not seen in the observations and can result in a positive bias in the LT. For NO, the corresponding biases are lower than for NO2, with only marginal differences between chemical modules. This suggests that differences shown in Fig. S2 are [...]

The NO/NO2 ratio (R) is an indication of the equilibrium position of the NOx chemical system, as determined by the balance between fast titration of O3 by NO plus conversion by peroxy-radicals, and its photochemical production from NO2 photolysis. Low values for R are indicative of an equilibrium which favours O3 production and high values correspond to an equilibrium which favours O3 destruction. A comparison of R between the different chemistry modules and values derived from in-situ aircraft observations for the chosen campaigns is shown in Figure 14. For summertime over Colorado the variability in R across the various campaigns ranges from 0.3-0.6, with significant differences across chemical modules. The profile shapes of R in the LT are captured fairly well, although with lower variability in the chemical modules than in the observations. IFS(CBA) captures the correct R value in the LT for many of the months, whilst both other chemistry modules have a higher R, which moderates the O3 biases shown in Table S1. Also, CAMSRA has higher R values, typically overestimating resident NO mixing ratios (c.f. [...]

The formation of O3 is determined by the resident NOx mixing ratios and the chain-length in the chemical recycling between NO and NO2 before termination to HNO3 occurs. One rapid route for HNO3 production is the conversion of dinitrogen pentoxide (N2O5) on wet surfaces and aerosols, which has been directly observed during the WINTER campaign (Kenagy et al., 2018). For this campaign, N2O5 hydrolysis was shown to account for 58% of the chemical loss of NOx (Jaegle et al., 2018), with reaction of NO2 with OH accounting for another third (thus the two most dominant chemical processes). In-situ observations of N2O5 are rare; daytime mixing ratios are typically of the order of tens of ppt due to efficient loss by photolysis and are therefore subject to high measurement uncertainty. During nighttime, surface observations range from 50-3000 ppt, with high daily variability (e.g. Wood et al., 2005; Brown et al., 2009). The WINTER campaign includes observations taken during night, when accumulations of N2O5 occur, allowing model evaluations to be made.
Evaluations against the nighttime N2O5 measurements from the WINTER campaign are shown in Figure 15, along with the corresponding profiles of HNO3. Unfortunately, no model data was available for N2O5 in the CAMSRA dataset for comparison. The formation of N2O5 involves the NO3 radical, principally formed by the slow oxidation process from the reaction of NO2 with O3 during nighttime. The relatively small biases of NO2 and O3 for the WINTER campaign (c.f. Fig. 6 and Fig. 13) provide some confidence in the flux of NO3 production, where different reaction kinetics for thermal equilibrium are applied in each chemical module. Little difference exists across the chemistry modules for the simulated mixing ratios of N2O5, although a signature does exist for February regarding IFS(MOZ), which has higher mixing ratios in the LT by 10-20%. In this chemistry module no N2O5 conversion on cloud particles and ice droplets is assumed. For February a positive bias exists across chemical modules with respect to HNO3 observations, where, counterintuitively, IFS(MOZ) has higher mixing ratios and biases (similar to IFS(MOC)) in spite of less efficient heterogeneous conversion. The corresponding N2O5 profiles indicate strong negative biases, thus suggesting too rapid hydrolysis into HNO3. As described in Sec. 2.2, the conversion rate is computed in the IFS from the available Surface Area Density (SAD) of clouds and aquated aerosols and the uptake coefficient γ on these particles (Brown et al., 2009), which is here assumed to be in the range of 0.01-0.02 depending on aerosol type, see Table 6. Derivations of γ(N2O5) from a chemical box modelling study based on the measurements across regions and scenarios taken during the WINTER campaign found a median value of 0.0143, with a spread of two orders of magnitude (~0.001-0.1, McDuffie et al., 2018). This suggests that, to reconcile the negative N2O5 model bias shown here, the adopted γ(N2O5) needs to be made more variable.
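To indicate how strongly the assumed γ(N2O5) controls the conversion rate, the sketch below evaluates the commonly used free-molecular expression k = (1/4) γ c̄ SAD, where c̄ is the mean molecular speed of N2O5. This simplified form neglects gas-phase diffusion limitation (which matters for cloud droplets) and is an assumption made here for illustration; it is not necessarily the formulation implemented in the IFS. The SAD value of 100 μm2 cm-3 and the temperature are likewise assumed example values.

```python
import math

R_GAS = 8.314462618   # J mol-1 K-1
M_N2O5 = 0.10801      # molar mass of N2O5, kg mol-1

def mean_molecular_speed(temperature_k, molar_mass_kg=M_N2O5):
    """Mean thermal speed (m s-1) of a gas molecule."""
    return math.sqrt(8.0 * R_GAS * temperature_k / (math.pi * molar_mass_kg))

def n2o5_uptake_rate(gamma, sad_um2_cm3, temperature_k=273.15):
    """First-order heterogeneous loss rate (s-1), free-molecular limit (assumed)."""
    sad_m2_m3 = sad_um2_cm3 * 1.0e-6   # 1 um2 cm-3 = 1e-12 m2 / 1e-6 m3
    return 0.25 * gamma * mean_molecular_speed(temperature_k) * sad_m2_m3

def lifetime_hours(gamma, sad_um2_cm3=100.0, temperature_k=273.15):
    """N2O5 lifetime against uptake, in hours."""
    return 1.0 / n2o5_uptake_rate(gamma, sad_um2_cm3, temperature_k) / 3600.0
```

For the assumed SAD, the median γ of 0.0143 gives an N2O5 lifetime of a few hours, whereas the reported range of ~0.001-0.1 spans lifetimes from roughly two days down to about half an hour, which illustrates why a fixed γ of 0.01-0.02 may not capture the observed variability.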
The inclusion of N2O5 conversion on clouds [...] Table S4 shows that NO2 has a small negative bias for these months, although no validation of the nitrate radical (NO3) can be performed to determine whether the deficit in N2O5 is only due to heterogeneous processes. IFS(MOC) exhibits the most occurrences of high ratios for February (> 20), whilst for March both IFS(MOC) and IFS(CBA) have a similar incidence of high ratios (> 50).

For the spring and summer months over Colorado there is a high degree of correlation between measured and modelled R', with correlations in the range 0.8-0.9, with the exception of August 2014, where the simulations become uncorrelated. For low R' values there is a tight agreement between the mini-ensemble members and the observational values. For higher R' values (> 10) the chemical modules exhibit low biases under high NOx emissions (see Table S3 and Fig. S5), although with variable biases for HNO3 (c.f. Fig. S6). Previous derivations of R' have found values in the range 0.8-10.4 for low NOx environments (Huebert et al., 1990), which is in the range of those observed in the remote free-troposphere. Biases for CAMSRA are overall much lower than those from the three recent chemistry modules during springtime.
Both daytime and nighttime measurements are used for deriving the wintertime (February-March) correlations, where the observational range in R' is approximately half that derived for summertime. Correlations become much weaker and the R' values are an order of magnitude higher for the mini-ensemble than those in the observations, indicating that the NOx chain-length for the chemistry versions is shorter than observed. One main difference is that uncertainties associated with heterogeneous conversion of N2O5 play a dominant role in HNO3 production during nighttime, which may explain the reduced correlation. The high R' values (> 30) seen during February in the three chemistry modules do not occur for CAMSRA.

In order to further evaluate the regional differences in HNO3 across the chemical modules, we make seasonal comparisons of surface HNO3 mixing ratios against observational composites taken from the CASTNET network throughout the US (Figure 17). The maximal mixing ratios for HNO3 occur near regions with high NOx emissions. For JJA differences between configurations are modest, with the largest percentage spread over the comparatively clean northern part. Comparing seasons shows that for IFS(CBA) during DJF much more HNO3 exists towards the western US than for JJA. Instead, there is a significant reduction in IFS(CBA) during DJF on the East Coast that is larger than for the other two modules.
Despite relatively small absolute values in OH mixing ratios during DJF, the significant percentage differences between modules (Fig. S1) could be responsible for differences in the direct production term across chemical [...] Figure S7). Here colder temperatures increase its tropospheric lifetime by suppressing thermal decomposition, but simultaneously decrease its formation in the absence of biogenic and BB precursor emissions (Fischer et al., 2014) [...], and a lower photolysis frequency. As a result, a significant overestimation in the fraction of NOx exported out of the source regions will occur, as shown in the seasonal zonal mean PAN distributions in Figure S9, where twice as much northerly transport occurs for IFS(MOC) compared to the other chemistry modules.
The ratio F, defined as F=PAN/NO2, can be used to examine the ability of the chemical modules to capture the correct partitioning of resident NOx into PAN, which can then be transported out of the source regions by convective uplift and long-range transport, affecting background O3 budgets (Fischer et al., 2014). In general, the observations show an increase in F with altitude, with F typically ranging between 0-1 during summertime and 0-0.2 during wintertime (Figure 18). For most months and regions, IFS(CBA) and IFS(MOZ) provide accurate simulations of the vertical variability in F values below 850 hPa. IFS(MOC) generally has a positive bias in F, particularly during wintertime, where F is up to a factor 2 higher than that observed. This is indicative of PAN being too stable in this chemistry version, which affects the O3 production efficiency via the availability and distribution of NO2.
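A vertical profile of F of the kind discussed here can be derived from co-located PAN and NO2 samples as sketched below. The pressure-bin edges, the function name and the choice to take the ratio of bin-mean mixing ratios (rather than the mean of individual ratios) are assumptions made for illustration; the text does not specify which averaging was used for Figure 18. The same function applies to R (NO/NO2) or R' by passing the corresponding numerator and denominator.

```python
import numpy as np

def ratio_profile(numerator, denominator, pressure_hpa, bin_edges_hpa):
    """Ratio of bin-mean mixing ratios (e.g. F = PAN/NO2) for each pressure bin.

    bin_edges_hpa must be increasing; bins without valid data return NaN.
    """
    num = np.asarray(numerator, dtype=float)
    den = np.asarray(denominator, dtype=float)
    p = np.asarray(pressure_hpa, dtype=float)
    edges = np.asarray(bin_edges_hpa, dtype=float)
    valid = np.isfinite(num) & np.isfinite(den)
    ratios = np.full(len(edges) - 1, np.nan)
    for i in range(len(edges) - 1):
        in_bin = valid & (p >= edges[i]) & (p < edges[i + 1])
        if in_bin.any() and den[in_bin].mean() > 0.0:
            ratios[i] = num[in_bin].mean() / den[in_bin].mean()
    return ratios
```

Using the ratio of means limits the influence of individual samples with very low NO2, for which a point-by-point ratio becomes dominated by measurement noise.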

During summertime over Colorado IFS(MOC) exhibits good agreement in F in the boundary layer, with IFS(CBA) and IFS(MOZ) under-estimating by 0.03-0.05. In spite of the updates of the NOx chemistry in IFS(CBA), CAMSRA has slightly lower biases for summertime. For wintertime (springtime) towards the East Coast (Colorado), F ratios for IFS(MOC) are nearly double those observed. However, agreement is quite good for North/South Dakota for April, showing the regional variability in performance of the chemical modules. As PAN is transported out of the boundary layer, the contribution of the loss rate due to photolysis increases (albeit with a low frequency, thus allowing long transport lifetimes). This highlights the importance of a correct parameterization of the photolysis frequency across the various chemistry modules. The response to the colder temperatures in the FT markedly affects the PAN lifetime, where different reaction kinetics are applied across chemical modules. On average an improvement for IFS(CBA) is seen with respect to CAMSRA, which can be attributed to improvements in model NOy chemistry when considering the corresponding negative bias for NO2 shown in Figure 13.

Conclusions
In this study we have presented a detailed description of the recent updates which have been made to the chemistry modules that are integrated in ECMWF's IFS global model for the purpose of performing global Air Quality forecasts. We have evaluated a set of three simulations covering the years 2014/2015, using the latest model configuration as developed in an experimental version of the ECMWF IFS cycle 47r1. This provides insight into the performance of the modeling of trace gases (here excluding data assimilation) in the CAMS global system. This study has focused on lower tropospheric composition for the contiguous United States with an emphasis on tropospheric O3, NO2 and CO. We also included comparisons against the most recent reanalysis dataset that is based on a previous CAMS configuration (CAMSRA). This allows assessment of model changes compared to this established dataset.
By comparing seasonal means in the lower troposphere between the various chemistry modules we have shown a strong seasonality in the regional inter-model differences for O3, CO and NO2 in the US. For O3 these differences are limited to ±5% during boreal summertime, during which higher mixing ratios occur. The ability to capture the regional seasonality in surface concentrations for the background is somewhat region dependent, with relatively good agreement for the West Coast and an overestimation towards the East Coast. Comparing seasonal composites against ozonesondes shows that there is generally good agreement in more remote locations and high positive biases of 10-30 ppb for more polluted regions, especially at the surface near the US East Coast. Comparisons for more southerly regions show lower mean daily biases in Texas and California-Nevada with limited correlation in the daily variability. For the Colorado region, there are biases of ±6 ppb across chemistry modules (±10-15%). At the surface there are small negative biases of around 5 μg m-3 for IFS(CB05BASCOE) and 15 μg m-3 for CAMSRA. For boreal wintertime a significant variability in the O3 production efficiency occurs across chemistry modules, resulting in IFS(MOCAGE) and IFS(MOZART) exhibiting increases in mixing ratios of +6-15% and +20-25% across a wide region as compared to IFS(CB05BASCOE), especially in the Northern US. A significant positive bias in surface concentrations occurs for 2014 in the Northern US, indicating too efficient O3 production, whereas CAMSRA exhibits a significant negative bias. Other regions show less difference across the simulations.
Associated differences occur for the OH radical for both seasons, which leads to significant differences in the tropospheric distribution of CO of between 8-20%, especially during wintertime. In general, the seasonal cycle at the surface is captured well when compared to both ESRL background observations and surface AirNow [...] with between 5-40% depending on the region, which contributes to the negative CO biases. Biases of CO for all chemical modules are typically larger than for the CAMSRA dataset, which is strongly constrained by assimilation of CO observations from satellite retrievals.
As was the case for O3, NOx also shows a seasonal variation in the simulated inter-model differences, of the order of 5-10% for NO2 and up to 50% for NO. Comparing profiles for both trace gases against aircraft measurements shows that significant negative biases exist for both NO and NO2 for the NOx-rich environment of Colorado across all chemical modules, indicating regional emissions which are too low. The performance of the three chemistry versions is overall better than for CAMSRA, which can be understood from the limited impact of NO2 data assimilation in the CAMS reanalysis. Comparisons against AirNow surface observations show that the regional annual cycles are captured well across the simulations, with negative biases and only a weakly correlated daily variability.
Examining NO/NO2 ratios shows that the equilibrium between NO and NO2 is mostly captured well by IFS(CB05BASCOE) in the boundary layer, with the other chemical modules overestimating the fraction of NO (albeit with lower NOx mixing ratios). A strong correlation exists in the HNO3/[NO + NO2] ratio across days for boreal summertime between the modelled and measured fields (R > 0.9), albeit with a negative model bias of ~50%. This is indicative of a lower NOy burden in the simulations due to cumulative differences in emissions, chemistry, aerosol formation and deposition processes. For CAMSRA, the HNO3/[NO + NO2] ratio is overall better. For nighttime under cold conditions, the NO/NO2 ratio is typically underestimated, implying a lack of NO regeneration by slow redox reactions.

There is generally an overestimation in HNO3, both at the surface and in the free troposphere, which may be due to overly efficient N2O5 hydrolysis on wet surfaces under some conditions. Model analysis suggests that this conversion on cloud surfaces is not a dominant term with respect to the associated N2O5 comparisons for the East Coast during the wintertime period.
One dominant factor in the seasonal distributions of NO2 is the fraction stored as PAN and transported out of the source regions. For boreal summertime, IFS(MOZART) simulates 20-50% less resident PAN than IFS(CB05BASCOE), which contributes to the more efficient O3 formation in IFS(MOZART). When comparing against aircraft profiles around Colorado for July and August, there is generally an underestimation in resident PAN of 40-60% across chemical modules, suggesting a lack of volatile organic compound (VOC) precursors and subsequent acetyl-peroxy radicals, in line with previous studies. For boreal wintertime, when there is an extended tropospheric lifetime under cold temperatures, significant positive biases in regional PAN were diagnosed for IFS(MOCAGE) when compared against aircraft profiles, pointing to differences in model assumptions regarding the stability of PAN, as determined by the rate data employed. This is also reflected in the PAN/NOx ratios, which show a strong overestimate in IFS(MOCAGE) that requires future development.
As presented in this manuscript, a significant divergence exists between the key air quality products simulated by each of the chemistry modules, depending on seasonal and regional conditions. This divergence is due to fundamental differences associated with the oxidative capacity and the regional efficiency of tropospheric O3 production, which are in turn determined by the chemical mechanism, the parametrizations adopted and the rate data used.
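One simple way to express such divergence as a percentage is a relative inter-model spread. The sketch below assumes the spread is defined as the range across modules divided by the multi-module mean, which is only one of several possible definitions and is not necessarily the one used in this evaluation. The module labels and mixing ratios are placeholders, not simulated values.

```python
def relative_spread(values):
    """Range across modules divided by their mean (assumed definition)."""
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / mean


# Placeholder regional-mean O3 mixing ratios (ppb), invented for illustration.
regional_mean_o3 = {
    "module_A": 30.0,
    "module_B": 36.0,
    "module_C": 33.0,
}

spread = relative_spread(list(regional_mean_o3.values()))
print(f"relative inter-model spread = {spread:.1%}")
```

Applying the same function per season and per region would make it possible to compare where the chemistry modules diverge most, for example wintertime versus summertime.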
In future studies, attention should be paid to (i) improving the variability in surface O3, CO and NO2 with respect to air quality observations, through a joint effort to improve emissions and deposition handling and through improved diagnostics, (ii) further homogenizing the physical conversion processes across modules with respect to radicals and N2O5, (iii) improving the VOC tropospheric burdens to provide sufficient peroxy radicals for better PAN formation, and (iv) further investigating what determines PAN mixing ratios under cold/low-light conditions in terms of dissociation and stability.
Whilst analysis of the three chemistry modules in CAMS provides a strong handle on uncertainties associated with chemistry modeling, the further improvement of operational products additionally requires coordinated development involving emissions handling, chemistry and aerosol modeling, complemented by data assimilation efforts.

Author Contributions
https://doi.org/10.5194/gmd-2021-318 Preprint. Discussion started: 1 October 2021 © Author(s) 2021. CC BY 4.0 License.
JEW and VH were principal authors of the paper and conducted most of the evaluation against observational datasets. VH, IB, SP, BJ and VM performed the three individual simulations which were used for the evaluation. MM and TS helped with the evaluation against the AirNow surface observations. JF contributed to the interpretation of results.

Code availability
Model codes developed at ECMWF are the intellectual property of ECMWF and its member states, and therefore the IFS code is not publicly available. ECMWF member-state weather services and their approved partners can be granted access. Access to a version of the IFS (OpenIFS) that includes this experimental cycle may be obtained from ECMWF under an OpenIFS licence. More details are available at https://confluence.ecmwf.int/display/OIFS/OpenIFS+Home (last access: 28 Sept 2021).

Competing Interests
The authors declare that they have no conflict of interest.