Description and evaluation of tropospheric chemistry and aerosols in the Community Earth System Model (CESM1.2)

The Community Atmosphere Model (CAM), version 5, is now coupled to extensive tropospheric and stratospheric chemistry, called CAM5-chem, and is available in addition to CAM4-chem in the Community Earth System Model (CESM) version 1.2. The main focus of this paper is to compare the performance of configurations with internally derived “free running” (FR) meteorology and “specified dynamics” (SD) against observations from surface, aircraft, and satellite, as well as understand the origin of the identified differences. We focus on the representation of aerosols and chemistry. All model configurations reproduce tropospheric ozone for most regions based on in situ and satellite observations. However, shortcomings exist in the representation of ozone precursors and aerosols. Tropospheric ozone in all model configurations agrees for the most part with ozonesondes and satellite observations in the tropics and the Northern Hemisphere within the variability of the observations. Southern hemispheric tropospheric ozone is consistently underestimated by up to 25 %. Differences in convection and stratosphere to troposphere exchange processes are mostly responsible for differences in ozone in the different model configurations. Carbon monoxide (CO) and other volatile organic compounds are largely underestimated in Northern Hemisphere mid-latitudes based on satellite and aircraft observations. Nitrogen oxides (NOx) are biased low in the free tropical troposphere, whereas peroxyacetyl nitrate (PAN) is overestimated in particular in high northern latitudes. The present-day methane lifetime estimates are compared among the different model configurations. These range between 7.8 years in the SD configuration of CAM5-chem and 8.8 years in the FR configuration of CAM4-chem and are therefore underestimated compared to observational estimations. 
We find that differences in tropospheric aerosol surface area between CAM4 and CAM5 play an important role in controlling the burden of the tropical tropospheric hydroxyl radical (OH), which causes differences in tropical methane lifetime of about half a year between CAM4-chem and CAM5-chem. In addition, different distributions of NOx from lightning explain about half of the difference between SD and FR model versions in both CAM4-chem and CAM5-chem. Remaining differences in the tropical OH burden are due to enhanced tropical ozone burden in SD configurations compared to the FR versions, which are not only caused by differences in chemical production or loss but also by transport and mixing. For future studies, we recommend the use of CAM5-chem configurations, due to improved aerosol description and inclusion of aerosol–cloud interactions. However, smaller tropospheric surface area density in the current version of CAM5-chem compared to CAM4-chem results in larger oxidizing capacity in the troposphere and therefore a shorter methane lifetime.
S. Tilmes et al.: Evaluation of tropospheric chemistry and aerosols in CESM1.2. Published by Copernicus Publications on behalf of the European Geosciences Union.


Introduction
The Community Earth System Model (CESM) is a comprehensive model that couples different independent models for atmosphere, land, ocean, sea ice, land ice, and river runoff (e.g., Neale et al., 2013; Lamarque et al., 2012). It can be used in various configurations, depending on the use of different components and the coupling between them. The atmospheric component of CESM, the Community Atmosphere Model (CAM), has the capability of including chemistry of varying complexity. Default CESM configurations used for long-term climate model simulations usually include prescribed chemical fields in the atmosphere using monthly averages. To produce those prescribed input fields, simulations with a detailed representation of chemistry and aerosol processes are required. Furthermore, nonlinear interactions between chemistry and aerosols in the atmosphere are important for chemistry-climate interactions (e.g., Lamarque et al., 2005; Isaksen et al., 2009) or for the simulation of air quality.
In CESM version 1.2, extensive tropospheric and stratospheric chemistry has been successfully implemented in CAM version 5 (CAM5); this configuration is hereafter referred to as CAM5-chem. The performance of CAM version 4 (CAM4) with interactive chemistry, referred to as CAM4-chem, has been discussed in Lamarque et al. (2012). In this study, a similar setup of both CAM4-chem and CAM5-chem allows for the comparison of both versions and their performance in comparison to observations. The two atmospheric configurations CAM4-chem and CAM5-chem differ in various aspects, including the treatment of cloud, convection, turbulent mixing, and aerosol processes (e.g., Neale et al., 2013; Gent et al., 2011; Kay et al., 2012; Liu et al., 2012), whereas the gas-phase chemistry is identical. Resulting differences in dynamics, clouds, precipitation, and radiation will alter chemical reactions in the gas, aqueous, and aerosol phases, as well as removal processes, and therefore the chemical composition of the atmosphere in these configurations.
In addition to exploring differences between the two atmospheric model versions using internally produced meteorology, we also perform simulations in which the meteorology (temperature, winds, and surface fluxes) is nudged towards meteorological analysis (or reanalysis) fields to reduce differences in the dynamics of the two configurations. Furthermore, two slightly different aerosol schemes of the modal aerosol model (MAM) are tested in CAM5-chem, the three-mode version (MAM3) (Liu et al., 2012) and the four-mode version (MAM4) (Liu et al., 2015). In addition, sensitivity studies are performed to explore differences in the oxidizing capacity of the atmosphere and therefore in tropospheric methane lifetime in the different model configurations. In this way, relationships between methane lifetime, aerosol and chemistry composition, and meteorological parameters are explored.
A comprehensive evaluation of all configurations is performed, using a set of present-day observational climatologies of different chemistry and aerosol species from ground-based, aircraft, and satellite observations. Strengths and weaknesses of the various model configurations are discussed. Evaluation tools for trace gases and aerosols developed in this study are merged into the Atmospheric Model Working Group (AMWG) diagnostics package, and are available to the community on the CESM website (https://www2.cesm.ucar.edu/working-groups/amwg/amwg-diagnostics-package).
This paper is structured as follows. Section 2 gives details of the model configurations and experiments performed for this study. Section 3 describes present-day climatological data sets used in this study to evaluate the model. Model-to-model differences in dynamics, chemistry and aerosols, and global budgets are discussed in Sect. 4.1. A comprehensive evaluation of chemistry and aerosols, based on satellite and in situ observations, is performed in Sect. 4.2. We discuss reasons for differences in tropospheric methane lifetime of the different model configurations, an indicator of the oxidizing capacity of the atmosphere, in Sect. 5. A summary and discussion of the results is given in Sect. 6.

Model configurations and experiments
The presented results are based on output from simulations performed with the NCAR Community Earth System Model (CESM) version 1.2 (https://www2.cesm.ucar.edu/models/current). All model simulations are performed with prescribed sea surface temperatures and sea ice distribution data for present-day climatological conditions, since we focus on the atmospheric component. Dry deposition of gases and aerosols is implemented in the Community Land Model (CLM) (Oleson, 2010) as described in Lamarque et al. (2012). For all experiments CLM version 4.0 was used. CESM 1.2 can also include online calculation of biogenic emissions in CLM using the Model of Emissions of Gases and Aerosols from Nature (MEGAN) version 2.1 (Guenther et al., 2012). In this study, biogenic emissions are prescribed (see below) to ensure the same amount of emissions in all configurations, and interactive biogeochemistry was not included.
CAM4-chem uses 26 vertical levels while CAM5-chem uses 30, and they both have a model top around 40 km. The horizontal resolution of the performed simulations is 1.9° × 2.5° and we use the finite volume dynamical core. An important difference between the two atmospheric models is the cloud microphysics, which in CAM4-chem predicts only the mass concentrations of the cloud species, but in CAM5-chem predicts the number as well as mass concentrations. CAM5-chem consequently treats the microphysical effect of aerosols on clouds (Ghan et al., 2012), while in CAM4-chem aerosols impact physics and dynamics only through their interaction with radiation.
CAM4-chem and CAM5-chem further differ in the parameterization of aerosols. CAM4-chem runs with a bulk aerosol model (BAM), which considers a fixed size distribution of externally mixed sulfate, black carbon (BC), organic carbon (OC), sea salt, and dust (Tie, 2005). Sea salt and dust are described using four different bins. In CAM4-chem, the formation of secondary organic aerosols (SOA) is coupled to chemistry. SOA are derived using the two-product model approach with laboratory-determined yields for SOA formation from monoterpene oxidation, isoprene, and aromatic photooxidation, as described in Heald et al. (2008).
The current standard CAM5 model version as well as CAM5-chem uses the modal aerosol model with three modes (MAM3) (Liu et al., 2012). The aerosol components, including BC, primary organic matter (POM), SOA, sea salt, dust, and sulfate, are internally mixed in each lognormal mode, and the aerosol mass and the total number in each mode are predicted. CAM5-chem is also tested with the four-mode version, MAM4, called CAM5-MAM4-chem from here on. The main difference between these two modal versions used here is the representation of BC and OC. In MAM3 all BC and OC is assumed to be aged and hence is emitted directly into the accumulation mode with other soluble aerosol species, whereas MAM4 emits the BC and OC in the primary carbon mode and represents the aging process of BC and OC from the primary carbon mode to the accumulation mode, as done in BAM. For the SOA production in CAM5-chem, mass yields of several biogenic and anthropogenic volatile organic compounds (VOCs) are prescribed. The resulting condensable secondary organic gas reversibly and kinetically partitions to the aerosol phase, as described in detail in Liu et al. (2012). This approach results in a much larger burden of SOA in CAM5-chem than in CAM4-chem, as shown in Tsigaridis et al. (2014). The dust emissions are calibrated so that the global dust aerosol optical depth (AOD) is between 0.025 and 0.030 (Mahowald et al., 2006). Furthermore, sea salt emissions are calibrated to present-day conditions so that the global mean AOD (for all species) is within a reasonable range. Those values have been evaluated in Liu et al. (2012), who have shown that the differences between model simulations and observations are generally within a factor of 2.
The production of sulfate aerosol (SO4) in CAM4-chem and CAM5-chem is also parameterized differently. In this paper we always consider SO4 in the solid particle phase, SO4(p), and sulfur dioxide (SO2) and sulfuric acid (H2SO4) in CAM5 in the gas phase, SO2(g) and H2SO4(g), if not explicitly noted differently. In CAM5-chem, sulfate aerosols are assumed to be in the form of ammonium hydrogen sulfate (NH4HSO4(p)), considering partial neutralization by ammonia (NH3), since NH3 and ammonium (NH4+) cycles are not explicitly treated in this version. In CAM4-chem, SO4 is produced directly from SO2 by oxidation through heterogeneous reactions on aerosols. In CAM5-chem, sulfates are produced via H2SO4 condensation on existing aerosols, where H2SO4 is formed by the oxidation of SO2. Both CAM4-chem and CAM5-chem include aqueous-phase production of SO4 from SO2(aq), with more than half formed by the hydroperoxyl (HO2) uptake and subsequent hydrogen peroxide (H2O2) oxidation in cloud droplets (Liu et al., 2012). In addition, CAM5-chem includes homogeneous nucleation of sulfate particles from H2SO4 gas, which contributes less than 1 % to the production of SO4 mass but is an important source of aerosol number. Also, while in CAM4-chem sulfur oxide emissions are in the form of SO2 only, in CAM5 2.5 % of SO2 is emitted in the form of sulfate aerosol.
Furthermore, the representation of removal processes is different in CAM4-chem and CAM5-chem. In CAM4-chem all of the aerosol in the cloudy fraction of the grid cell is assumed to reside within cloud droplets and is removed in proportion to the cloud water removal rate. In CAM5-chem the mass and number fraction of the cloud-borne aerosol is determined from the aerosol activation parameterization (Ghan and Easter, 2006), so that smaller particles are not removed by nucleation scavenging.
CAM4-chem has been run and tested with comprehensive tropospheric and stratospheric chemistry (Lamarque et al., 2012). The chemical mechanism is based on the Model for Ozone and Related chemical Tracers (MOZART), version 4 mechanism for the troposphere (Emmons et al., 2010), extended stratospheric chemistry (Kinnison et al., 2007), further updates as described in Lamarque et al. (2012), and additional reaction rate updates following JPL-2010 recommendations (Sander et al., 2011). In CESM1.2 CAM4-chem, the lumped aromatic ("TOLUENE") was replaced with the specific species benzene, xylene, and toluene, along with simplified oxidation products for the two new species, to accommodate the two-product formation of SOA (new reactions listed in Appendix A). These changes do not have an impact on the chemical performance of the model.
As in CAM4-chem, CAM5-chem couples tropospheric aerosols to chemistry through heterogeneous reactions, as listed in Lamarque et al. (2012, Table 4). Tropospheric heterogeneous reactions of chemical species are parameterized based on aerosol surface area density (SAD) and therefore depend on the overall aerosol loading. The total tropospheric SAD in both model configurations is derived using the mass and size distributions of ammonium sulfates, black carbon, and organic aerosols. The contribution of very small particles, such as the Aitken mode in MAM3 and the primary carbon mode in MAM4, to the SAD is neglected in the model calculation of surface area density. Furthermore, sea salt and mineral dust aerosols do not contribute to SAD in either model version, as heterogeneous reactions are not assumed to occur on these surfaces. Since reactions on very small particles are important, this may lead to an underestimation of SAD in the model.
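As an illustration of how a surface area density can follow from aerosol mass and an assumed size distribution, the following minimal sketch converts the mass concentration of a single lognormal mode into SAD using the ratio of the second and third moments of the distribution. It is not the model's code: the function name, the units, and all parameter values are hypothetical, and the particles are assumed to be spheres of uniform density.

```python
import math

def sad_from_mass(mass_conc_g_cm3, density_g_cm3, dg_cm, sigma_g):
    """Surface area density (cm^2 cm^-3) of one lognormal aerosol mode.

    Assumes spherical particles with number median diameter dg_cm and
    geometric standard deviation sigma_g. For a lognormal distribution,
    surface/volume = 6 / (Dg * exp(2.5 * ln(sigma_g)^2)).
    """
    # Aerosol volume per volume of air (cm^3 cm^-3).
    volume = mass_conc_g_cm3 / density_g_cm3
    return 6.0 * volume / (dg_cm * math.exp(2.5 * math.log(sigma_g) ** 2))

# Hypothetical sulfate-like mode: 1.8 ug m^-3 = 1.8e-12 g cm^-3.
sad = sad_from_mass(1.8e-12, 1.8, 1.0e-5, 1.6)
```

For a fixed mass and median diameter, a broader distribution (larger sigma_g) gives a smaller SAD in this sketch, because more of the mass sits in large particles with a low surface-to-volume ratio.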
For all simulations, model configurations simulate wet deposition of gas species using the Neu and Prather (2012) scheme, including a bug fix to CESM1.2, where the SO2 Henry's law coefficient has been updated, resulting in reduced washout rates. This fix resulted in an increased burden of SO4 in CAM4-chem, which has been adjusted by increasing the in- and below-cloud solubility factor of SO4 from 0.3 to 0.4. In addition, improved calculations of dry deposition velocities for gas species, as discussed in Val Martin et al. (2014), are added to this study, which results in an improved representation of surface ozone, as discussed below.

Experiments
Two different configurations of both CAM4-chem and CAM5-chem are used in this study. In the free running (FR) version the meteorology and dynamics are internally derived. We also run CAM4-chem and CAM5-chem in a specified dynamics (SD) version of the model, called SD-CAM4-chem and SD-CAM5-chem, respectively. In this configuration, the internally derived meteorological fields are nudged every time step (30 min) by 10 % towards analysis fields (i.e., a 5 h Newtonian relaxation timescale for nudging) from the Modern-Era Retrospective Analysis For Research And Applications (MERRA) reanalysis product (http://gmao.gsfc.nasa.gov/merra/) (Rienecker et al., 2011), regridded to the model horizontal resolution. The SD model version adopts the vertical levels of the analysis data up to the top of the model (around 40 km), resulting in 56 vertical levels for both CAM4-chem and CAM5-chem simulations; see Lamarque et al. (2012) and Ma et al. (2013) for details. For the SD simulations, we use meteorological analysis for the years 2000-2010.
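The relation between the 10 % nudging per 30 min time step and the 5 h relaxation timescale can be made explicit with a minimal sketch of Newtonian relaxation. The function name and the example temperatures are hypothetical and serve only to show the arithmetic; this is not the model's implementation.

```python
def nudge(model_value, analysis_value, dt_minutes=30.0, tau_hours=5.0):
    """Relax a model field toward an analysis field over one time step."""
    # Fraction of the model-analysis difference removed per step:
    # 30 min / (5 h * 60 min/h) = 0.1, i.e., 10 %.
    weight = dt_minutes / (tau_hours * 60.0)
    return model_value + weight * (analysis_value - model_value)

# Hypothetical temperatures (K), for illustration only.
t_model, t_analysis = 250.0, 252.0
t_nudged = nudge(t_model, t_analysis)  # moves 10 % of the way to the analysis
```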
Emissions and prescribed chemical fields for longer-lived substances follow the protocol defined by the Chemistry Climate Model Initiative (CCMI) hindcast simulations for the year 2000 (Eyring et al., 2013), which are repeated for all the simulated model years for both FR and SD configurations. In particular, greenhouse gases are from Meinshausen et al. (2011), surface mixing ratios of ozone depleting substances are taken from WMO (2010, Tables 5-A3), anthropogenic and biofuel emissions are from the MACCity emission data set (Granier et al., 2011), and biomass burning emissions are taken from the Atmospheric Chemistry and Climate Model Intercomparison Project (ACCMIP) historical emissions data set (Lamarque et al., 2010). Biogenic emissions are prescribed in this study for all model configurations using a climatology based on MEGAN version 2.1, with the same emissions for all model experiments: carbon monoxide (CO), 1053 Tg yr−1; isoprene, 525 Tg yr−1; monoterpene, 97 Tg yr−1; and methanol, 170 Tg yr−1. All experiments use the same solar forcing, with lower boundary conditions fixed for the year 2000.
Two additional sensitivity experiments are performed to test differences between CAM4-chem and CAM5-chem that may be caused by differences in the aerosol description in the model, in particular the amount of tropospheric SAD in the different configurations. CAM5-chem simulates significantly lower SAD than CAM4 (as discussed in Sect. 4.1.2). We perform an additional CAM5-chem (CAM5-chem*) simulation where SAD is increased by a factor of 1.5 to match the averaged tropospheric SAD amount that is simulated in CAM4-chem. We also perform SD-CAM5-chem*, which matches the averaged tropospheric SAD of the SD-CAM4-chem simulation, requiring SAD to increase by a factor of 1.9. Finally, we perform a simulation that uses the MAM4 modal scheme, CAM5-MAM4-chem, as described above. An overview of the setup and global model diagnostics of the different model configurations is given in Table 1.

Present day climatological data sets
To evaluate the performance of the different model configurations, we made use of several satellite and in situ chemical data sets. We use present-day climatological data sets with a focus on the troposphere that have been derived from observations between 1995 and 2012.

Satellite climatologies
The comparison of the model simulations to satellite observations provides a global picture of the representation of CO and ozone columns. To evaluate tropospheric and stratospheric column ozone in the model simulations, we compare the model to a present-day column ozone climatology compiled by Ziemke et al. (2011). This climatology was derived by combining retrievals from the Aura Ozone Monitoring Instrument (OMI) and Microwave Limb Sounder (MLS) observations over the period between October 2004 and December 2010. The monthly-mean thermal tropopause is used to separate tropospheric from stratospheric ozone for the model results and satellite climatology.
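To illustrate how a tropospheric ozone column can be separated from the total column at the tropopause, the following minimal sketch integrates a layered mixing ratio profile from the surface up to a given tropopause pressure and converts the result to Dobson units. The function name, the layer layout, and the assumption of constant gravity are hypothetical simplifications, not the procedure used for the model diagnostics.

```python
# Physical constants (standard values).
AVOGADRO = 6.02214076e23      # molecules per mole
GRAVITY = 9.80665             # m s^-2, assumed constant with height
MOLAR_MASS_AIR = 0.0289644    # kg per mole of dry air
MOLECULES_PER_DU = 2.6867e20  # molecules m^-2 in one Dobson unit

def tropospheric_column_du(edge_pressures_hpa, layer_vmr, tropopause_hpa):
    """Integrate ozone from the surface up to the tropopause (Dobson units).

    edge_pressures_hpa: layer edge pressures, ordered surface to top (hPa).
    layer_vmr: ozone volume mixing ratio (mol/mol) in each layer.
    """
    column = 0.0  # molecules m^-2
    for i, vmr in enumerate(layer_vmr):
        p_bottom = edge_pressures_hpa[i]
        # Clip the layer top at the tropopause pressure.
        p_top = max(edge_pressures_hpa[i + 1], tropopause_hpa)
        if p_top >= p_bottom:
            break  # this layer lies entirely above the tropopause
        delta_p_pa = (p_bottom - p_top) * 100.0
        column += vmr * delta_p_pa * AVOGADRO / (GRAVITY * MOLAR_MASS_AIR)
    return column / MOLECULES_PER_DU

# Hypothetical profile: 50 ppbv below 200 hPa, 200 ppbv above.
edges = [1000.0, 600.0, 200.0, 100.0]
ozone = [50e-9, 50e-9, 200e-9]
trop_column = tropospheric_column_du(edges, ozone, tropopause_hpa=200.0)
```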
For comparison with CO, a new climatology is compiled based on Measurements of Pollution in The Troposphere (MOPITT) version 6 Level 3 data, using the multispectral (thermal-infrared plus near-infrared) total column product. This monthly mean gridded climatology at 1° × 1° horizontal resolution includes data between 2003 and 2012. Only daytime MOPITT data were analyzed. The version 6 (V6) MOPITT product is similar to the validated version 5 (V5) product (Deeter et al., 2013) with several differences (Deeter et al., 2014). The V5 products relied on a priori CO concentrations based on the MOZART chemistry transport model and National Centers for Environmental Prediction (NCEP) analysis fields. The a priori for V6 products is based on CAM4-chem simulations for the period from 2000 to 2009 (Lamarque et al., 2012) and the retrieval processing exploits the MERRA reanalysis product. Finally, geolocation (latitude and longitude) data are more accurate for the V6 product as the result of a correction for a slight misalignment between the MOPITT instrument and the Terra spacecraft. The V6 product is described in more detail in a user's guide available on the MOPITT website (http://www2.acd.ucar.edu/mopitt/publications). Monthly mean Level 3 MOPITT a priori and averaging kernels are applied to monthly mean model results to account for the a priori dependence and vertical resolution of the MOPITT data. CO columns are derived for altitudes between the surface and 100 hPa.
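The general idea of applying an a priori profile and averaging kernels to model output can be sketched with the standard smoothing relation from retrieval theory, x_sim = x_a + A (x_model − x_a). The sketch below assumes that the state vector is expressed as log10 of the volume mixing ratio and that the kernel is a square profile matrix; both are assumptions made for illustration, as are the function name and the example values. The MOPITT user's guide should be consulted for the exact conventions of the column product.

```python
import math

def apply_averaging_kernel(model_vmr, apriori_vmr, kernel):
    """Smooth a model profile: x_sim = x_a + A (x_model - x_a) in log10(VMR).

    model_vmr, apriori_vmr: profiles on the retrieval levels (mol/mol).
    kernel: averaging kernel matrix A as a list of rows.
    """
    log_model = [math.log10(v) for v in model_vmr]
    log_apriori = [math.log10(v) for v in apriori_vmr]
    diff = [m - a for m, a in zip(log_model, log_apriori)]
    # Matrix-vector product A * diff, added to the a priori profile.
    log_sim = [
        xa + sum(a_ij * d for a_ij, d in zip(row, diff))
        for xa, row in zip(log_apriori, kernel)
    ]
    return [10.0 ** x for x in log_sim]

# Hypothetical two-level example (mol/mol).
model = [100e-9, 80e-9]
apriori = [90e-9, 70e-9]
kernel = [[0.6, 0.2], [0.1, 0.5]]
smoothed = apply_averaging_kernel(model, apriori, kernel)
```

With an identity kernel the smoothed profile equals the model profile, and with a zero kernel it collapses to the a priori, which is a quick check on the sign and ordering conventions.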
For the comparison of AOD, we use a 1° × 1° monthly averaged climatology for present-day AOD at 550 nm, derived using various satellite data including observations from the AErosol RObotic NETwork (AERONET) (Kinne, 2009).

Ozonesonde climatology
For a detailed evaluation of tropospheric ozone profiles and seasonality, a present-day ozonesonde climatology is used (Tilmes et al., 2012). This climatology covers available ozonesonde observations between 1995 and 2011 for 42 stations around the globe. Ozonesonde observations agree reasonably well with surface and aircraft observations (Tilmes et al., 2012). Maximum summertime ozonesonde data over the Eastern US are biased high by about 10 ppb compared to surface observations, but otherwise the ozone climatology provides reliable ozone vertical profiles for different seasons and regions. In this study, monthly mean model results are interpolated to the locations of the data and aggregated over defined regions, as suggested in Tilmes et al. (2012).

Aircraft climatologies
For the evaluation of various chemical species, averaged profiles from various aircraft campaigns between 1995 and 2010 were derived for different regions and seasons around the globe. Details of aircraft campaigns included between 1995 and 2010 are given in Table 2. More details, including information on earlier aircraft campaigns, are provided on https://www2.acd.ucar.edu/gcm/aircraft-climatology. As discussed in Emmons et al. (2000), for each aircraft campaign, regions with a high frequency of vertical profiles from the aircraft are identified. Mean and median profiles of available species are compiled over these regions, as well as percentiles of the distribution with a 1 km vertical resolution. Profiles that are outliers of the distribution were removed. Following this approach, we extended the existing climatology as described in Emmons et al. (2000) to include additional aircraft campaigns up to 2010.
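The compilation of regional profiles at 1 km vertical resolution can be sketched as a simple binning of individual measurements by altitude, followed by summary statistics per bin. The function name and the sample values below are hypothetical, and the sketch omits the outlier removal and percentile calculation described above.

```python
import statistics
from collections import defaultdict

def bin_profile(altitudes_km, values, bin_width_km=1.0):
    """Group measurements into altitude bins and summarize each bin.

    Returns a dict keyed by the lower edge of each bin (km).
    """
    bins = defaultdict(list)
    for z, v in zip(altitudes_km, values):
        bins[int(z // bin_width_km)].append(v)
    return {
        index * bin_width_km: {
            "mean": statistics.mean(samples),
            "median": statistics.median(samples),
            "n": len(samples),
        }
        for index, samples in sorted(bins.items())
    }

# Hypothetical ozone samples (ppbv) at three altitudes (km).
profile = bin_profile([0.2, 0.8, 1.5], [40.0, 50.0, 60.0])
```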
The largest sampling frequency of aircraft observations included in this study is over Europe and the US during spring and summer. For each observed regional profile, monthly-mean model results are averaged over the location and months of the observations. It is assumed that these regional profiles represent typical background conditions. However, one has to keep in mind that aircraft campaigns often target specific atmospheric conditions that may not be captured in multiyear average model results. Nevertheless, the combination of the numerous aircraft campaigns provides a general overview of the behavior of the chemistry in the model. In this way, aircraft data provide a very powerful evaluation tool, because various species were observed at the same time during the flight and can be evaluated side by side. A comparison is performed for ozone (O3), CO, nitrogen oxides (NOx), peroxyacetyl nitrate (CH3COO2NO2 or PAN), selected hydrocarbons, SO2, and sulfate aerosol for selected aircraft campaigns. In addition, we averaged profiles over certain altitude intervals and grouped them into four regions and four seasons, to identify systematic differences between models and observations.
A data set derived during the HIAPER (High-Performance Instrumented Airborne Platform for Environmental Research) Pole-to-Pole Observations (HIPPO) campaigns (Wofsy et al., 2011) is available for model evaluation purposes (Wofsy et al., 2012). During the campaigns, profiles from 85° N to 65° S over the Pacific Ocean and North America were sampled in January and November 2009, March/April 2010, June/July 2011, and August/September 2011. Each of the campaigns sampled very similar flight tracks over the Pacific and North America, which provides information for comparing similar regions in different seasons (Wofsy et al., 2011). For this paper, we use O3, BC, and PAN data (Schwarz et al., 2013; Wofsy et al., 2011). The aircraft profiles sampled during different HIPPO campaigns were averaged over 5° latitude intervals along the flight path over the Pacific Ocean to produce a gridded data set that can be easily compared to model output. Likewise, model results are binned over the same latitude regions as done for the aircraft observations. Here, we compare the observations to monthly mean model data that are aligned with the months of the corresponding campaign. It has to be kept in mind that the HIPPO data set, even though observing the background atmosphere over the Pacific, is influenced by the specific situation for the particular year. This climatological comparison has shortcomings, in particular because the emissions of the particular year were not considered.

Surface observations
We use two sets of surface observations in this study. Surface observations from the United States Interagency Monitoring of Protected Visual Environments (IMPROVE) data set (http://vista.cira.colostate.edu/improve/) (Malm, 2004) are used for years 1998-2009, to compare sulfur dioxide and sulfate aerosol with the model results. The IMPROVE network includes 165 sites in the US. Major fine particles (with diameter < 2.5 µm) are monitored, including aerosol species, sulfates, nitrates, organics, light-absorbing carbon, and windblown dust. IMPROVE sites are located in rural environments and therefore will not describe the conditions found in large urban areas. Ozone surface observations are used to evaluate daily ozone concentration in our model configurations. Daily averages from available hourly surface ozone data were derived from the Clean Air Status and Trends Network (CASTNET) (http://java.epa.gov/castnet/) and the European Monitoring and Evaluation Programme (EMEP) network in Europe (http://www.emep.int/) for years 1995-2010, as shown in Tilmes et al. (2012).

Model-to-model comparison
Differences in the physics, including cloud and aerosol schemes, between CAM4-chem and CAM5-chem (as described above) result in large differences in tropospheric surface area density, temperature, relative humidity, and cloud fraction, with implications for the chemistry, particularly of ozone. Additional differences in the vertical resolution of different model configurations influence convection and dynamics in the troposphere and stratosphere and therefore atmospheric composition. The comparison of zonal and annual mean meteorological fields as well as chemical constituents between different model versions helps to explain differences in ozone and other chemical tracers.

Dynamics and chemistry
CAM5-chem simulates more ozone in the stratosphere than CAM4-chem, most pronounced in high latitudes in the lower stratosphere. This is aligned with lower temperatures in the stratosphere in the tropics and mid-latitudes in CAM5-chem compared to CAM4-chem, resulting in reduced ozone-destroying gas-phase chemistry. Furthermore, lower ozone mixing ratios and a cold bias are present in CAM5-chem right around the tropical tropopause in comparison to CAM4-chem. Reduced ozone around the tropical tropopause can affect temperatures at the cold point and above (Bardeen et al., 2013).
Differences in zonal winds point to a weaker polar vortex in CAM5-chem compared to CAM4-chem, whereby zonal winds in CAM5-chem are more aligned with analysis fields than in CAM4-chem (not shown). Corresponding higher temperatures in the polar lowermost stratosphere are consistent with higher ozone mixing ratios in high latitudes due to a reduction in halogen activation.
Differences in the microphysics between CAM4-chem and CAM5-chem result in significantly larger relative humidity in the troposphere in mid- and high latitudes in CAM5-chem compared to CAM4-chem (Fig. 1, as discussed in Bardeen et al., 2013). The fraction of low clouds in all configurations varies between 34 % and about 60 % (Table 1); these differences are caused by the different parameterizations of cloud fraction and cloud condensation, with some contribution from the cloud microphysics. Differences exist in the assumed minimum relative humidity values that influence where clouds form. Differences in cloud fraction between different configurations impact photolysis rates in the lower troposphere and therefore ozone photochemistry (discussed below), as well as precipitation and removal processes.
Large differences between CAM5-chem and CAM4-chem configurations are present in the tropospheric SAD, as discussed below. Those differences impact tropospheric chemistry, whereby less SAD in CAM5-chem results in the reduction of NOx, OH, and therefore changes in CO and ozone production (see further discussion in Sect. 5). However, differences in dynamics between CAM5-chem and CAM4-chem have a stronger impact on ozone than differences in clouds and SAD, as shown by comparing SD-CAM5-chem and SD-CAM4-chem (Fig. 1, bottom row). In these two configurations, winds and temperatures are nudged to analyzed meteorological fields. Similarities in the meteorological fields lead to much smaller differences in ozone than between the FR versions, despite the large differences in relative humidity, cloud fractions, and SAD, which are similar to the differences between the two free running model versions.
The impact of differences in dynamics on tropospheric chemistry is further supported by comparing CAM5-chem and SD-CAM5-chem (Figs. 2, 3). In these two model simulations, differences in clouds and SAD are much smaller than between CAM4-chem and CAM5-chem. However, the FR version produces a significantly stronger polar vortex and lower temperatures in high latitudes than the SD version. Temperatures in the SD simulations, which are driven by MERRA, are higher than in the FR model versions. As shown in Bardeen et al. (2013), differences in the microphysics between different model versions determine the relative humidity in the model, and therefore the relationship between water and temperature. Warmer temperatures in SD-CAM5-chem compared to CAM5-chem therefore cause an increase in water vapor in the stratosphere.
Dynamical differences in the tropics and the stratosphere are investigated for the different model configurations by analyzing the H2O tape recorder (Mote et al., 1996) (Fig. 4) and stratospheric age of air (AOA), as described in Garcia et al. (2011) (Fig. 5). The tropical vertical transport between 23° S and 23° N and between 100 and 10 hPa is analyzed for different model configurations based on the magnitude and slope of the H2O tape recorder (Fig. 4). The slope and magnitude of the tape recorder, as derived from MLS observations between 2005 and 2011 (Fig. 4, bottom row), are best reproduced by the SD configurations, even though H2O mixing ratios are too large in SD-CAM5-chem. CAM5-chem reproduces the magnitude of the tape recorder, while minimum H2O mixing ratios are too low, and shows a reduced slope compared to SD-CAM5-chem. This points to a faster updraft of air masses above the TTL (tropical tropopause layer). CAM4-chem poorly simulates the slope compared to other model configurations, whereas SD-CAM4-chem shows a reasonable magnitude of the tape recorder in comparison to MLS observations. Consistent with the poor representation of the slope of the tape recorder compared to observations, CAM4-chem and CAM5-chem produce much shorter stratospheric AOA compared to the SD configurations (Fig. 5). This is consistent with a stronger Brewer-Dobson circulation (BDC) in both free running model configurations and stronger stratosphere to troposphere exchange (STE) (Table 1). Slightly larger AOA values in the tropics and high latitudes are simulated in CAM5-chem compared to CAM4-chem configurations.
The comparison of chemical constituents in the two model configurations further supports a stronger tropical vertical transport in CAM5-chem compared to SD-CAM5-chem and stronger STE in high latitudes (Fig. 3). Stronger tropical vertical transport (mostly in deep convection) in CAM5-chem is evident due to higher mixing ratios of CO and lower mixing ratios of nitric acid in the upper tropical troposphere. The resulting higher CO mixing ratios in the upper troposphere together with increased lightning NOx (LNOx) production in mid-latitudes lead to greater ozone production, while reduced LNOx production in the tropical belt reduces ozone production. Furthermore, increased nitric acid in addition to higher ozone mixing ratios in high northern latitudes point to more STE. Additionally, lower NOx and CO values in the boundary layer in CAM5-chem indicate that increased STE rather than chemical processing results in larger ozone mixing ratios in CAM5-chem than in SD-CAM5-chem. Differences in low clouds between CAM5-chem and SD-CAM5-chem also impact chemistry and result in reduced ozone production in the boundary layer in CAM5-chem. Similar differences are present between CAM4-chem and SD-CAM4-chem, however, with smaller differences in STE in high latitudes compared to the CAM5-chem configurations (not shown).

Aerosol burden and surface area density (SAD)
Optical depth and aerosol loading from the different model configurations are listed in Table 1. Total optical depth is somewhat smaller in CAM4-chem than in the CAM5-chem configuration, which is due to different amounts of internally derived sea salt and dust emissions, but also to differences in the sulfate burden in comparison to observations, as discussed in Sect. 4.2.1. The largest differences in aerosol burden between the configurations occur in the burden of SOA, with about 50 % larger values in CAM5-chem compared to CAM4-chem (as discussed above). The burden of organic matter and black carbon is slightly larger in CAM4-chem than in CAM5-chem using MAM3, due to the different handling of these aerosols in the two configurations. Values of BC and OC closer to those in CAM4-chem are simulated in CAM5-MAM4-chem. Running two modes for BC in CAM5-MAM4-chem compared to CAM5-chem increases the BC burden by 37 % (see Table 1). SO4 burdens in CAM4-chem are slightly larger than in CAM5-chem. This is because of the different way SO4 formation and washout are parameterized, as described in Sect. 2.
Heterogeneous reactions on aerosol particles in the model do not directly relate to the aerosol burden but rather depend on the amount of tropospheric SAD. SAD depends not only on aerosol burden or mass but also on the aerosol size distribution. For the same aerosol burden, smaller particles provide a larger SAD than larger particles. Both the SD and FR versions of CAM5-chem simulate much smaller SAD than CAM4-chem. This has implications for chemistry and climate (see Sect. 5). The total tropospheric SAD in the model includes SAD from SO4, nitrates, POM, SOA, and BC modes.
We compare the burden and SAD between SD-CAM5-chem and SD-CAM4-chem for SO4, BC, and SOA (Fig. 6). Neither the magnitude nor the sign of the differences in burden agrees with the differences in SAD, which is caused by the different descriptions of the aerosol size distribution in the two model versions. In CAM4-chem, BAM assumes a fixed mean radius of 69.5 nm (Emmons et al., 2010; Lamarque et al., 2012), while in MAM3, the size distribution of aerosols is represented in three different modes. For instance, most of the SO4 in the middle and upper troposphere is in the accumulation mode, with a dry diameter size range of 58-270 nm (Liu et al., 2012). On average, SO4 particles are larger in CAM5-chem than in CAM4-chem. Larger particles in CAM5-chem in the upper troposphere result in smaller SAD despite the slightly larger SO4 burden compared to CAM4-chem. The increase of BC burden in CAM5-MAM4-chem does not result in an increase of SAD in the model, because only the aged mode of BC is considered in the calculation of SAD. Instead, SAD in MAM4 is slightly reduced compared to MAM3 (see Sect. 5).
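The dependence of SAD on particle size for a fixed burden can be made concrete with a minimal sketch. It assumes monodisperse spherical particles, for which the surface area per unit air volume is 3V/r, with V the aerosol volume concentration and r the radius. The function name, the sulfate density of 1.7 g cm−3, and the mass concentration are illustrative assumptions; the models use modal or bulk size distributions rather than a single radius.

```python
def surface_area_density(mass_conc_ug_m3, radius_nm, density_g_cm3):
    """SAD in um^2 cm^-3 for monodisperse spheres of the given radius.

    A mass concentration in ug m^-3 divided by a density in g cm^-3
    gives the aerosol volume concentration directly in um^3 cm^-3.
    """
    volume_um3_cm3 = mass_conc_ug_m3 / density_g_cm3
    radius_um = radius_nm / 1000.0
    # Surface-to-volume ratio of a sphere is 3 / r.
    return 3.0 * volume_um3_cm3 / radius_um


# Same hypothetical burden, two radii: the fixed radius quoted for BAM
# and a particle twice as large.
small = surface_area_density(1.0, 69.5, 1.7)
large = surface_area_density(1.0, 139.0, 1.7)
print(small, large)
```

Doubling the radius at constant mass halves the SAD in this idealized case, which is the qualitative reason why a larger burden can coexist with a smaller SAD.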

Aerosols and aerosol optical depth (AOD)
For the evaluation of aerosols, we compare simulated SO2 and SO4 at the surface with observations over the US from the IMPROVE network (see Sect. 3.4), shown in Fig. 7 for SD-CAM4-chem and SD-CAM5-chem only. All model configurations overestimate SO2 at the surface, as shown here for the SD configurations (Fig. 7), with larger values in CAM5-chem than in CAM4-chem. Annual SO4 concentrations for all model configurations are about twice as large as observations in rural areas over the US suggest, particularly in summer. In winter, median SO4 values in SD-CAM4-chem are biased low compared to observations while SD-CAM5-chem is biased high, whereas CAM4-chem values are biased high and CAM5-chem values are biased low (not shown).
Comparisons to aircraft observations over the US (Fig. 8) show very good agreement for SO2, with simulated values very close to the observed values for two of the campaigns, while simulated values are slightly larger for ARCTAS-CARB.

The evaluation of simulated BC for CAM4-chem, CAM5-chem, and CAM5-MAM4-chem is performed by comparisons to HIPPO aircraft campaigns over the Pacific Ocean (Sect. 3.3), as shown in Fig. 9. All model configurations overestimate background BC (about 1 µg m−3 or less), as is the case for other climate models (Schwarz et al., 2010; Wang et al., 2014; Samset et al., 2014). The most realistic representation of background BC is in CAM5-chem, where primary BC is assumed to transition immediately into the aged mode and is therefore directly emitted in the aged mode. On the other hand, all configurations largely underestimate BC plumes, especially in NH mid- and high latitudes in winter and spring, and in August in the Southern Hemisphere (SH). Shortcomings in the simulation of BC plumes are likely caused by a potential underestimate of BC emissions, as well as shortcomings in transport and wet removal by convection (Ma et al., 2013; Wang et al., 2013), while the overestimation of background values may be in part caused by a too long lifetime of BC in the model configurations (Samset et al., 2014).
More work is also needed to improve the representation of POM and SOA, which are not further discussed in this study but were evaluated in Tsigaridis et al. (2014). Large uncertainties exist in the global SOA distribution derived from observations and in the representation of these aerosols in models, and future work is needed to understand observational yields in comparison to model results.
The overall aerosol load can be evaluated by comparing AOD from satellite and AERONET observations (see Sect. 3.1) with model results, as shown for CAM4-chem and CAM5-chem (Fig. 10). AOD derived using CAM5-MAM4-chem (not shown) is very similar to CAM5-chem. The global AOD average in CAM4-chem is slightly lower than in the observational data set, while it is higher in CAM5-chem. An overestimation of AOD compared to the climatology occurs in CAM5-chem in northern Africa, the Middle East, and around 30° N and 30° S over the ocean. The AOD bias in the subtropical ocean (mostly from coarse mode sea salt) may be due to model deficiencies in representing sea salt emission or sedimentation (scavenging) processes, which requires further investigation. Using reanalysis winds does not reduce this bias (not shown). Furthermore, AOD values are underestimated over polluted regions like India and Southeast Asia in both model configurations. CAM5-chem has a tendency towards lower AOD in northern mid- and high latitudes, which could be a result of the significant underestimation of high BC plumes in these regions. Larger values than observed in CAM4-chem over the Eastern US and Europe may be in part a result of the larger simulated SO4 burden.
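Global AOD averages are only comparable between model and observations when both are averaged over the same grid cells. The sketch below illustrates one way to do this: an area-weighted mean restricted to cells where the observational composite is valid. The cos(latitude) weighting on a regular grid, the function name, and the sample values are assumptions made for illustration, not the procedure documented for Fig. 10.

```python
import math


def masked_global_mean(field, lats_deg, valid):
    """Area-weighted mean of field[i][j] over cells where valid[i][j] is True.

    Rows follow latitude; each row is weighted by cos(latitude), which
    approximates the cell area on a regular latitude-longitude grid.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for i, lat in enumerate(lats_deg):
        weight = math.cos(math.radians(lat))
        for j, value in enumerate(field[i]):
            if valid[i][j]:
                weighted_sum += weight * value
                weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no valid grid cells")
    return weighted_sum / weight_total


# Hypothetical 2 x 2 AOD field at the equator and at 60 degrees N.
aod = [[0.1, 0.2], [0.3, 0.4]]
lats = [0.0, 60.0]
all_valid = [[True, True], [True, True]]
tropics_only = [[True, True], [False, False]]

print(masked_global_mean(aod, lats, all_valid))
print(masked_global_mean(aod, lats, tropics_only))
```

Applying the same mask to the model and the observations avoids attributing a difference in sampling to a difference in aerosol.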

Ozone
The zonal mean seasonal cycle of the tropospheric and stratospheric O3 column is evaluated in comparison to a monthly mean OMI/MLS climatology (Sect. 3.1) in Fig. 11 (middle and right columns). The tropospheric ozone column in CAM4-chem and CAM5-chem is overestimated between fall and spring in the NH mid-latitudes, while it is slightly underestimated in the tropics. On the other hand, SD configurations overestimate column ozone in the tropics in summer. All configurations underestimate the tropospheric O3 column in the SH, with the largest deviations from the observations between September and December. Differences between the FR and SD configurations in NH mid- to high latitudes are aligned with a stronger STE and stronger BDC between fall and spring in the FR versions, as discussed in Sect. 4.1.1. The reasons for the differences among the model configurations in the tropical tropospheric ozone column are further discussed in Sect. 5. The underestimation of tropospheric ozone in the SH, especially in October in the tropics and mid-latitudes, may be caused by an underestimation of biomass burning at this time of the year, which is consistent with the underestimation of the CO column in the same season in the SH (see below). The stratospheric ozone column is reasonably well reproduced for the tropics and mid-latitudes, showing slightly more ozone in the SD versions compared to the FR versions. In high latitudes, the ozone column is largely overestimated in winter and spring in each hemisphere compared to the climatology, which points to shortcomings in stratospheric transport that are most pronounced in the FR simulations. On the other hand, the underestimation of column O3 in the SH in October and December points to the well-known cold bias of polar vortex temperatures in the FR model versions (Eyring et al., 2010). SD configurations do not show the low bias in the ozone column during the ozone hole season in either hemisphere, but instead slightly overestimate column ozone at that time. The reason for this is that temperatures in the SD configurations are slightly higher than in the FR versions, especially in the lower stratosphere in high latitudes.
Ozonesonde observations (Sect. 3.2), aircraft data (Sect. 3.3), and surface observations (Sect. 3.4) are used to evaluate the simulated tropospheric chemical composition in more detail. We use a Taylor-like diagram to illustrate relative differences between model configurations and ozonesonde observations, and correlations of the seasonal cycle for different regions, seasons, and pressure levels; see Figs. 12 and 15. In addition, seasonal cycle comparisons between model results and observations for specific regions are illustrated in Fig. 13 and, for western and eastern North America and Western Europe, in Fig. 14. Near-surface ozone at 900 hPa is for the most part within the range of variability of ozonesonde observations in both SD and FR configurations (Figs. 12, 13, top row). The high bias in summer over the Eastern US and Western Europe, as reported in earlier studies (e.g., Lamarque et al., 2012), has been significantly reduced, due to an improved calculation of dry deposition velocities (Val Martin et al., 2014). In comparison to surface observations (Fig. 14), in winter, FR model configurations slightly overestimate maximum ozone values for North America and Western Europe. SD configurations show a low bias for eastern North America and Western Europe. In summer, all model configurations show a high bias of about 10-15 ppb. However, maximum ozone mixing ratios do agree with observations, whereas low ozone mixing ratios are overestimated. A high bias of about 10 ppb can be attributed to the coarse model resolution, which leads to an overestimate of ozone production because of diluted emissions of ozone precursors, and therefore to an increase in the lower ozone mixing ratios of the distribution (e.g., Pfister et al., 2014). Ozonesondes are not compared to the model configurations at the surface; they agree well with surface observations apart from a high bias over the Eastern US in summer, as discussed in Tilmes et al. (2012).
In the mid-troposphere, model results agree well with ozonesonde observations at 500 hPa (Fig. 12, bottom row). The seasonal cycle is well reproduced, in particular for the FR configurations in mid- and high latitudes, with correlations around 0.95 compared to the observations (Fig. 13, bottom row). The somewhat higher bias in winter and spring over Western Europe and high latitudes in CAM5-chem at 500 hPa contributes to the high bias at 900 hPa, as more ozone is transported downward, as discussed in Sect. 4.1. The low bias in ozone in the western Pacific/eastern Indian Ocean is due to the stronger convection in the FR model configurations compared to SD. This bias is also shown in the comparisons at 250 hPa (Figs. 15, 16). At 50 hPa, all configurations show a high ozone bias of at least 20 % in the tropics during winter and spring. Mid- and high-latitude ozone in the stratosphere is reproduced well for all configurations within the range of variability. Comparisons to the aircraft climatology in the free troposphere (2-7 km) (Fig. 17, top row) confirm the high bias of ozone in CAM5-chem and the low bias in the SD configuration at NH high latitudes, as well as the low bias in the tropics in fall. Deviations from the aircraft climatology are much larger (up to 40 % in mid- and high latitudes and up to 60 % in winter in the tropics) than those from the ozonesonde observations (up to 25 %).
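The two quantities summarized in a Taylor-like diagram of this kind, the relative difference of the mean and the correlation of the seasonal cycle, can be computed from monthly means as in the following sketch. The function names and the monthly values are hypothetical and serve only to illustrate the metrics; they are not the regional data shown in Figs. 12 and 15.

```python
import math


def relative_bias_percent(model, obs):
    """Relative difference of the model mean from the observed mean in %."""
    model_mean = sum(model) / len(model)
    obs_mean = sum(obs) / len(obs)
    return 100.0 * (model_mean - obs_mean) / obs_mean


def seasonal_correlation(model, obs):
    """Pearson correlation between two seasonal cycles of equal length."""
    n = len(obs)
    model_mean = sum(model) / n
    obs_mean = sum(obs) / n
    cov = sum((m - model_mean) * (o - obs_mean) for m, o in zip(model, obs))
    var_m = sum((m - model_mean) ** 2 for m in model)
    var_o = sum((o - obs_mean) ** 2 for o in obs)
    if var_m == 0.0 or var_o == 0.0:
        raise ValueError("correlation undefined for a constant series")
    return cov / math.sqrt(var_m * var_o)


# Hypothetical monthly mean ozone (ppb) at one pressure level.
obs_cycle = [42.0, 45.0, 52.0, 58.0, 60.0, 57.0,
             53.0, 50.0, 46.0, 43.0, 41.0, 40.0]
model_cycle = [1.1 * value for value in obs_cycle]

print(relative_bias_percent(model_cycle, obs_cycle))
print(seasonal_correlation(model_cycle, obs_cycle))
```

A model that scales the observed cycle by a constant factor, as in this example, has a perfect correlation but a nonzero relative bias, which is why both metrics are needed.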
In comparison to HIPPO aircraft observations over the Pacific, ozone mixing ratios are biased high in mid- and high latitudes in both CAM4-chem and CAM5-chem configurations, mainly in fall and winter (Fig. 18, second and third columns). In addition, in spring CAM5-chem simulates larger ozone in the NH mid- and high latitudes than the other model configurations. The high ozone bias in both CAM4-chem and CAM5-chem in the remote region of the Pacific further points to a too strong STE in the FR versions. In the tropical troposphere, CAM5-chem reproduces observed mean ozone mixing ratios very well, although a low bias exists in summer and fall. However, SD configurations simulate larger ozone mixing ratios in winter and spring compared to ozonesondes and HIPPO observations.
The better representation of tropical ozone in the SD configurations in summer and fall may therefore be the result of more realistic convection, or due to a larger production of LNOx in this region. The observations further confirm that STE in winter and spring in mid- and high latitudes is slightly too strong in CAM5-chem compared to the other configurations.

CO and hydrocarbons
In comparison to MOPITT satellite observations (Fig. 11, left column), all model configurations show a significant low bias in column CO, with a maximum in spring and fall in the NH and a smaller bias in October in the SH. The tropical CO column agrees to within 5 % with the observations. Regional differences in column CO between CAM5-chem and MOPITT (Fig. 19) occur over polluted regions, especially in April and July for the NH and over South America and southern Africa in October. This points to a significant underestimation of CO biomass burning emissions over those regions. Furthermore, CO is largely overestimated in January over central Africa, which points to an overestimation of fire emissions.
CO and other hydrocarbons are strongly controlled by emissions but are also directly impacted by the amount of OH in the atmosphere. The comparison of CO between aircraft measurements and CAM5-chem model results, averaged over 2-7 km (Fig. 20), confirms the pronounced underestimation of CO mixing ratios in the NH troposphere for seasons where data are available. Intermodel differences can be explained by differences in the oxidizing capacity of the atmosphere, with the largest CO values in CAM4-chem, consistent with the longest methane lifetime in that configuration (Table 1, and further discussed in Sect. 5). Furthermore, in the tropics in spring, aircraft campaigns show in some regions larger propane (C3H8) and, to some degree, larger acetylene (C2H2) and CO values (Fig. 17). Too strong convection in the tropics may lead to enhanced mixing ratios of short-lived species, like C3H8 (with an approximately 10-day lifetime), in this region, while longer-lived species are still underestimated by the models for the same campaigns.

NOx and PAN
Differences in the simulation of NOx and PAN between the configurations will have implications for simulated distributions of tropospheric ozone. As for ozone, in the FR versions, especially CAM5-chem, both PAN and NOx mixing ratios in the NH mid- and high latitudes are slightly larger compared to the SD versions (Fig. 17). Model comparisons to aircraft observations show in general an underestimation of NOx and PAN of up to 80 %. Some aircraft campaigns observed much higher NOx and PAN values than simulated, for instance ARCPAC in 2008 and SOS in 1999. Both of these campaigns targeted regions with a significant contribution of biomass burning pollution and local pollution.
In the tropics, ozone deviations from specific aircraft observations often occur along with biases in the ozone precursors NOx, PAN, CO, and C3H8; see Figs. 17 and 20. Variations in biases between observations and model results are expected when comparing to aircraft campaigns that targeted specific conditions. We investigate aircraft profiles from those campaigns where the models reproduced ozone and CO mixing ratios reasonably well in the troposphere (Fig. 21). In this way, shortcomings in NOx and PAN can be identified. In general, PAN is overestimated in the free tropical troposphere, which can be an indicator of too much convection in the model compared to observations (e.g., Fischer et al., 2014). In comparison to HIPPO observations of PAN (Fig. 22), all model configurations strongly overestimate PAN in the upper troposphere, and in the NH troposphere especially in winter. Values in the lower troposphere in the tropics and the SH are reasonably well reproduced.
Sensitivity studies, CAM5-chem* and SD-CAM5-chem* (Sect. 2), where SAD is increased in CAM5-chem configurations to the amount simulated in CAM4-chem simulations (see Table 1), show that only a small fraction of the differences in PAN mixing ratios between the different configurations can be attributed to differences in SAD (Fig. 21). One would expect that larger SAD values result in a faster transition of NOx to NOy and therefore reduced PAN production. However, adjustments of the SAD between CAM4-chem and CAM5-chem configurations are less important in most cases, as shown in Fig. 21.

Methane lifetime and OH differences in CAM4-chem and CAM5-chem
Tropospheric chemistry is strongly controlled by the oxidizing capacity of the atmosphere. The most abundant oxidants in the troposphere are OH, ozone, and the nitrate radical (NO3). These control the atmospheric lifetimes of trace gases, including methane. The methane lifetime can therefore be considered an indicator of the performance of the model. Model configurations differ largely in tropospheric methane lifetime and often underestimate recent observational estimates of 10.2 years (Prinn, 2005) and 11.3 years (Prather et al., 2012). The reason for differences cannot be easily ascribed to specific processes in models that contributed to intercomparison projects such as ACCMIP (Voulgarakis et al., 2013; Naik et al., 2013).
In this study, all simulations are based on the same framework and run with the same emissions, the same gas-phase chemistry and, in the case of the SD versions, nudged with the same dynamics.Differences in the oxidizing capacity of the atmosphere can be therefore attributed to model physics, aerosol description, and differences in dynamics between SD and FR versions, caused by differences in vertical resolution and transport processes.
The tropospheric methane lifetime in all model configurations in this study varies between 7.6 and 8.8 years (Table 1), which is significantly lower than observational estimates. The tropospheric methane lifetime and CO burden in the tropics (between 30° S and 30° N) are both correlated with the tropical OH burden (e.g., Wang and Jacob, 1998; Murray et al., 2014), with slightly different correlations for different model configurations (Fig. 23, left and middle panels). Since CO and methane are both controlled by OH, all model configurations show a very similar CH4/CO correlation (see Fig. 23, right panel).
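The link between OH and methane lifetime can be illustrated with a common definition of the lifetime as the methane burden divided by its loss rate through reaction with OH. The sketch below is not the diagnostic code used for Table 1: the box structure, the function names, and the sample values are hypothetical, and the Arrhenius coefficients are assumed from a JPL-type recommendation for the OH + CH4 reaction.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24.0 * 3600.0


def k_oh_ch4(temperature_k):
    """Assumed Arrhenius rate constant for OH + CH4 (cm^3 molecule^-1 s^-1)."""
    return 2.45e-12 * math.exp(-1775.0 / temperature_k)


def methane_lifetime_years(boxes):
    """Burden divided by OH loss, summed over boxes.

    Each box is (ch4_molecules, oh_molecules_per_cm3, temperature_k).
    """
    burden = 0.0
    loss_per_second = 0.0
    for ch4, oh, temperature in boxes:
        burden += ch4
        loss_per_second += k_oh_ch4(temperature) * oh * ch4
    return burden / loss_per_second / SECONDS_PER_YEAR


# Hypothetical two-box troposphere: a warm, OH-rich box and a colder one.
reference = [(1.0e30, 1.0e6, 272.0), (5.0e29, 5.0e5, 250.0)]
more_oh = [(ch4, 2.0 * oh, temp) for ch4, oh, temp in reference]

print(methane_lifetime_years(reference))
print(methane_lifetime_years(more_oh))
```

Because the loss term is linear in OH, doubling OH everywhere halves the lifetime in this sketch, which is why the tropical OH burden is the natural quantity to compare across configurations.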
To understand the processes that lead to the spread of tropical OH in the different model configurations in this study, we explore relationships between annual averages of the tropical OH burden and other variables averaged over 30° S-30° N in the troposphere, including tropospheric SAD, H2O2, LNOx, HNO3, tropospheric and stratospheric column ozone, and ozone production (Figs. 24, 25).
A consistent difference in OH burden exists between CAM5-chem and CAM4-chem in both FR and SD versions, whereby the CH4 lifetime of CAM4-chem is about half a year longer than in CAM5-chem (Fig. 23). Based on the sensitivity simulations (CAM5-chem* and SD-CAM5-chem*), most of the differences in OH burden can be attributed to the differences in SAD between CAM4-chem and CAM5-chem (Fig. 24, left top panel). The increased SAD results in increased heterogeneous reactions, and therefore increased H2O2 (Fig. 24, right top), and further reductions in NOx burden in comparison to LNOx production (Fig. 25, left panel). This is due to the fact that enhanced tropospheric heterogeneous reactions increase both the uptake of dinitrogen pentoxide (N2O5) and the uptake of HO2 on aerosols, which is the major aqueous-phase source of H2O2. The hydrolysis of N2O5 on aerosols results in a reduction of NOx. Increased H2O2 further results in increased production of sulfate, since the reaction of H2O2 with SO2 in cloud drops is the most significant contributor to sulfate formation (Seinfeld and Pandis, 2012). For the gas-phase chemistry, the decrease of NOx leads to a reduction of ozone and, together with the reduction in HOx, this leads to reduced OH and therefore to an increase in methane lifetime.
However, SAD differences do not explain all the differences in the OH burden, especially between FR and SD configurations. To further analyze factors that control the OH burden, we scale OH to a fixed SAD value for all configurations and use the mean tropical tropospheric SAD derived using CAM4-chem results (SAD_CAM4-chem) as a reference. For this, we use the slope of the line that describes the OH/SAD change between the CAM5-chem and CAM5-chem* configurations, S_SAD (see the blue and cyan lines in Fig. 24, left top panel), to adjust the OH burden for all configurations to the SAD reference for SD and FR configurations.

As discussed in Murray et al. (2014), OH is strongly correlated with NOx and CO emissions, as well as with the stratospheric ozone column. Since all the simulations were performed with the same CO and NOx emissions, differences in NOx emissions are due to variations in LNOx. Indeed, Fig. 24, middle top panel, shows a strong dependency of the OH burden on LNOx. The annual variability in LNOx production is much larger in the SD simulations than in the FR configurations, which is likely introduced by the use of climatological SSTs in the FR configurations. However, the same LNOx in FR and SD does not result in the same OH burden, which shows that intermodel differences are only in part (about half) a result of differences in LNOx (Fig. 25, top and middle panels).
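The scaling of OH to a reference SAD described above can be sketched as a linear adjustment. The form used here, OH_adj = OH + S_SAD × (SAD_ref − SAD), is an assumption inferred from the description of the slope S_SAD; the function names and all numbers are hypothetical and in arbitrary units.

```python
def sad_slope(oh_a, sad_a, oh_b, sad_b):
    """Slope of the OH/SAD line through two configurations, e.g. a
    standard run and its sensitivity run with increased SAD."""
    return (oh_b - oh_a) / (sad_b - sad_a)


def adjust_oh_to_reference_sad(oh, sad, slope, sad_reference):
    """Shift an OH burden along the OH/SAD line to the reference SAD
    (assumed linear form, see the lead-in)."""
    return oh + slope * (sad_reference - sad)


# Hypothetical tropical means (arbitrary units): run with smaller SAD
# and its sensitivity run in which SAD is raised to the reference value.
oh_std, sad_std = 10.0, 2.0
oh_sens, sad_sens = 9.0, 4.0

slope = sad_slope(oh_std, sad_std, oh_sens, sad_sens)
print(slope)
print(adjust_oh_to_reference_sad(oh_std, sad_std, slope, sad_sens))
```

By construction, adjusting the standard run to the SAD of its sensitivity run recovers the OH burden of the sensitivity run; applying the same slope to other configurations removes the SAD contribution so that the remaining OH differences can be related to LNOx and ozone.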
On the other hand, variations in OH cannot be explained by differences in stratospheric column ozone between the different model simulations.Stratospheric column ozone in the model increases between FR and SD configurations.One would expect a decrease in OH as a result of reduced photolysis rates with increasing stratospheric ozone.
Tropospheric ozone is an important driver of the OH burden in all the different model configurations. More tropospheric ozone results in a higher OH burden. The question remains why tropospheric ozone is larger in the SD than in the FR versions. Considering ozone production, increased SAD between CAM5-chem and CAM5-chem* reduces ozone production as a result of the reduced NOx burden. However, the same amount of ozone production in FR and SD versions does not result in the same OH burden (see Fig. 25, bottom right panel). Therefore, enhanced ozone in the SD versions is not only due to differences in chemical production of ozone but must also be due to differences in transport processes between the SD and FR versions. This is further supported by the OH to HNO3 correlations (Fig. 25, middle panel). A larger HNO3 burden is simulated in the SD configurations than in the FR versions, which points to a smaller stratospheric contribution in the FR configurations. Another source of HNO3 in the troposphere is LNOx. The correlation between HNO3 and LNOx clearly supports the conclusion that larger HNO3 mixing ratios in the SD configurations compared to the FR simulations are not due to differences in HNO3 production (Fig. 25, right panel). Furthermore, the smaller tropical tropospheric ozone burden in CAM5-chem compared to CAM4-chem is not aligned with the larger ozone production in CAM5-chem due to larger LNOx. Differences are therefore likely a result of differences in transport and mixing processes in the tropics.

Conclusions
The evaluation of the different model configurations using various observations of aerosol and chemical species shows the following:
- SO2 at the surface is overestimated by at least a factor of 4 and sulfate aerosol (SO4) is overestimated by about 100 % compared to IMPROVE observations. In the discussed simulations, anthropogenic emissions of SO2 and SO4 are emitted at the surface, which can lead to an underestimated transport into the free troposphere. Comparisons to aircraft observations in the troposphere show a reasonable agreement between models and observations in SO2 and SO4, apart from a high bias in SO4 in CAM4-chem over the US. Profiles of SO2 and SO4 in high latitudes are for the most part underestimated in the model.
- The different representation of BC in CAM4-chem and CAM5-chem results in a larger burden of BC in CAM4-chem, which is due to its consideration of primary and aged BC. A similar description in CAM5-MAM4-chem leads to enhanced BC burden compared to CAM5-chem. BC plumes are in general underestimated in all model configurations while background values over the Pacific Ocean are overestimated, whereby CAM5-chem agrees best with observations.
- AOD points to a significant underestimation of biomass burning emissions in the model, and some overestimation in CAM4-chem over Western Europe and the Eastern US that may be due to the overestimation of SO4. An overestimation of AOD over the Pacific points to too large background values in aerosols, potentially also from sea salt, which is more pronounced in CAM5-chem than in CAM4-chem.
- Tropospheric ozone in the tropics and the Northern Hemisphere is very well represented in all model configurations and agrees within the variability of ozonesonde observations of about 25 %. Surface observations are well reproduced in winter. The summer high bias of all models over Western Europe and North America can for the most part be attributed to a high bias in low and medium ozone mixing ratios as a result of the coarse resolution of the model configurations.
In the free troposphere, FR configurations slightly overestimate ozone in mid- and high latitudes and underestimate ozone in the tropical free troposphere in summer and fall, while SD configurations slightly overestimate ozone in the upper tropical troposphere and in part underestimate ozone in high latitudes. Southern Hemisphere tropospheric ozone is underestimated by 10-25 % in all model configurations. The comparison to aircraft observations confirms the differences based on ozonesonde observations, but models show a large bias of up to 40 % compared to observations.
- CO is largely underestimated in the Northern Hemisphere, especially in spring, and in the SH in October, pointing to the underestimation of emissions. Other hydrocarbons that are most frequently observed during aircraft campaigns are also significantly underestimated for all seasons. The lowest values of CO and hydrocarbons occur in SD-CAM5-chem in the tropics. CO is in reasonable agreement with the observations in the tropics.
- PAN is in general overestimated in the upper troposphere in comparison to aircraft observations for all model configurations, while NOx is underestimated in comparison to aircraft observations, particularly in high latitudes. The largest bias of simulated PAN in comparison to HIPPO observations occurs in mid- and high northern latitudes throughout the troposphere in winter months.
Differences between CAM4-chem and CAM5-chem, and between FR and SD configurations, are to a large part driven by differences in dynamics, including temperature, transport, and mixing processes. Differences in the H2O tape recorder and in AOA indicate that the Brewer-Dobson circulation is too strong in the FR model configurations, while both diagnostics are reasonably reproduced in the SD configurations. This is consistent with the overestimation of ozone in high latitudes in FR, particularly in winter and spring for CAM5-chem. Furthermore, shortcomings in transport and mixing are likely responsible for slightly larger ozone mixing ratios in the tropical troposphere in SD compared to FR versions of the model. Differences in the oxidizing capacity of the atmosphere, which impacts the methane and CO lifetimes, between different model configurations are largely controlled by tropospheric surface area density, lightning NOx, and differences in tropospheric ozone. Smaller SAD values in CAM5-chem are responsible for the shorter methane lifetime compared to CAM4-chem. Smaller values in surface area density in CAM5-chem compared to CAM4-chem are a result of different aerosol descriptions in the two model configurations. An underestimation of SAD in the model is possible, because BC plumes are significantly underestimated over source regions. Since background aerosols are in general overestimated, shortcomings may exist in the calculation of SAD. For example, sea salt and dust provide surfaces for heterogeneous reactions that have not been taken into account in any of the simulations (Evans and Jacob, 2005).
Besides SAD, tropospheric ozone impacts the oxidizing capacity of the model.For the SD configuration, larger ozone mixing ratios in the tropics compared to FR result in reduced methane lifetime.Therefore, variations in transport and mixing are an important driver for differences in ozone and therefore methane lifetime, which is critical for climate simulations.
Methane lifetime is in general underestimated in all model configurations compared to observational estimates, with a difference of about 1 year between the different configurations. The main reason for the underestimation compared to observations is likely shortcomings in CO and other hydrocarbon emissions, as also found in other model studies (Stein et al., 2014; Monks et al., 2015; Emmons et al., 2014). This is supported by the underestimation of CO over source regions but also by the underestimation of AOD over source regions, pointing to a general underestimation of biomass burning emissions. Also, the underestimation of isoprene emissions can result in a significant underestimation of methane lifetime (Pike and Young, 2009).
In summary, both CAM4-chem and CAM5-chem configurations are well-suited tools for atmospheric-chemistry modeling studies, considering the shortcomings discussed in this study.We recommend the use of CAM5-chem in future studies, due to the improved description of aerosol processes and cloud interactions.Ongoing work is contributing to further improving CAM5-chem configurations.

Figure 3. Comparison of ozone, nitric acid, ozone production, lightning NOx, carbon monoxide, NOx, hydroxyl radical, and water vapor between CAM5-chem and SD-CAM5-chem.

Figure 7. Comparison between IMPROVE network observations over the US and SD-CAM5-chem (blue) and SD-CAM4-chem (red) for SO2 (left) and sulfate aerosol (SO4) (right) in different seasons, DJF (top) and JJA (bottom). The median and correlation coefficient (R) between observations and model results are given at the top left of each panel.

Figure 8. Comparison of SO2 (left) and sulfate aerosol (SO4) (right) between different model configurations and aircraft observations over the US (two left columns) and at high latitudes (two right columns). Black lines show the median of aircraft profiles and error bars indicate the range between the 25th and 75th percentiles of the distribution. Model results are averaged over the region and months of each campaign.
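The median and interquartile range shown for the aircraft profiles can be computed along the following lines. This is a minimal sketch assuming the observations are available as (altitude, value) pairs and are grouped into fixed-width altitude bins; the bin width, the data layout, and the function name are hypothetical and do not describe the processing actually applied to the campaign data.

```python
from collections import defaultdict
from statistics import median, quantiles


def profile_statistics(samples, bin_width_km=1.0):
    """Median and interquartile range per altitude bin.

    samples: iterable of (altitude_km, value) pairs (hypothetical layout).
    Returns {bin_lower_edge_km: (p25, median, p75)} for bins that hold
    at least two values.
    """
    bins = defaultdict(list)
    for altitude_km, value in samples:
        lower_edge = (altitude_km // bin_width_km) * bin_width_km
        bins[lower_edge].append(value)

    stats = {}
    for lower_edge, values in sorted(bins.items()):
        if len(values) < 2:
            continue  # quantiles() needs at least two data points
        p25, _, p75 = quantiles(values, n=4, method="inclusive")
        stats[lower_edge] = (p25, median(values), p75)
    return stats
```

Using percentiles rather than the mean and standard deviation keeps the comparison robust against individual pollution plumes sampled by the aircraft.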

Figure 9. HIPPO BC observations for different HIPPO aircraft campaigns taken over the Pacific (left column) and differences between the different model configurations and observations, CAM4-chem (second column), CAM5-chem (third column) and CAM5-MAM4-chem (fourth column).

Figure 10. Top row: aerosol optical depth at 550 nm for CAM4-chem (left) and CAM5-chem (right). Bottom row: differences between model and observations from a satellite and AERONET composite (Kinne, 2009). Numbers in parentheses are the global average AOD, computed only over areas where the satellite composite has a valid value.

Figure 11. Differences between model results and observations of zonally averaged CO columns below 100 hPa from the present-day MOPITT climatology (left), and of the OMI/MLS tropospheric and stratospheric ozone column climatology (right).

Figure 12. Taylor-like diagram comparing the mean and correlation of the seasonal cycle between observations using a present-day ozonesonde climatology from 1995 to 2011 and model results, interpolated to the same locations as sampled by the observations and for different pressure levels, 900 hPa (top panel) and 500 hPa (bottom panel). The numbers correspond to specific regions, as defined in Tilmes et al. (2012). Left panels: 1 - NH Subtropics; 2 - W Pacific/E Indian Ocean; 3 - equat. Americas; 4 - Atlantic/Africa. Middle panels: 1 - Western Europe; 2 - Eastern US; 3 - Japan; 4 - SH Mid-latitudes. Right panels: 1 - NH Polar West; 2 - NH Polar East; 3 - Canada; 4 - SH Polar.

Figure 13. Seasonal cycle comparison between observations using a present-day ozonesonde climatology from 1995 to 2011 (black) and model results: CAM5-chem (cyan) and CAM4-chem (orange), SD-CAM5-chem (blue) and SD-CAM4-chem (red). Model results are interpolated to the same locations as sampled by the observations and for different pressure levels, 900 hPa (top panel) and 500 hPa (bottom panel) for selected regions. The standard deviations of ozonesonde observations are shown as error bars and the mean and correlation of the seasonal cycle between observations and model results are printed at the top of each figure.
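The two scores used to compare seasonal cycles, a mean difference and a correlation, can be sketched as follows for a pair of 12-month climatologies. Expressing the mean difference as a percentage of the observed mean and using the Pearson correlation are assumptions of this example, as is the function name; the exact definitions follow Tilmes et al. (2012) and may differ in detail.

```python
import math


def seasonal_cycle_scores(obs_monthly, model_monthly):
    """Relative mean difference (%) and Pearson correlation of two
    monthly climatologies given as equally long lists of monthly means."""
    n = len(obs_monthly)
    if n != len(model_monthly) or n < 2:
        raise ValueError("need two series of equal length with >= 2 values")

    obs_mean = sum(obs_monthly) / n
    mod_mean = sum(model_monthly) / n
    rel_diff_percent = 100.0 * (mod_mean - obs_mean) / obs_mean

    # Pearson correlation from centered sums of products and squares.
    cov = sum((o - obs_mean) * (m - mod_mean)
              for o, m in zip(obs_monthly, model_monthly))
    var_obs = sum((o - obs_mean) ** 2 for o in obs_monthly)
    var_mod = sum((m - mod_mean) ** 2 for m in model_monthly)
    if var_obs == 0.0 or var_mod == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    correlation = cov / math.sqrt(var_obs * var_mod)
    return rel_diff_percent, correlation
```

A model that reproduces the phase of the observed cycle but is uniformly 10 % too high would score a relative difference of 10 % and a correlation of 1, so the two numbers separate magnitude errors from errors in seasonality.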

Figure 14. Probability distribution function (PDF) of the regionally aggregated ozone distribution for western North America, eastern North America, and Western Europe from surface ozone observations (grey shaded area) in comparison to regionally aggregated ozone distributions from the model results interpolated to the location of the ozone stations (different colors), for winter (left) and summer (right).

Figure 18. HIPPO O3 observations for different HIPPO aircraft campaigns taken over the Pacific (left column) and differences between the different model configurations and observations, CAM4-chem (second column), CAM5-chem (third column) and SD-CAM5-chem (fourth column).

Figure 19. Regional comparison of CO columns for different months between CAM5-chem model results and MOPITT observations. Model results are shown on the left, and differences between CAM5-chem and MOPITT on the right. The MOPITT averaging kernels and a priori are applied to the model results to account for the a priori dependence and vertical resolution of the MOPITT data.
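Applying averaging kernels and a priori to a model profile generally takes the form x_hat = x_a + A (x_m - x_a), where x_m is the model profile on the retrieval levels, x_a the a priori profile, and A the averaging kernel matrix. The sketch below illustrates this step under the assumptions that the state vector is expressed in log10 of the volume mixing ratio and that the model profile has already been interpolated to the retrieval levels. The function name and argument layout are hypothetical and do not reproduce the processing code used for the figure.

```python
import math


def apply_averaging_kernel(model_vmr, apriori_vmr, kernel):
    """Smooth a model CO profile as a retrieval would see it.

    Implements x_hat = x_a + A (x_m - x_a) with the state in log10(VMR);
    the log10 convention and the argument layout are assumptions.
    model_vmr, apriori_vmr: lists of mixing ratios on the retrieval levels.
    kernel: square list-of-lists averaging kernel matrix A.
    """
    n = len(model_vmr)
    if (len(apriori_vmr) != n or len(kernel) != n
            or any(len(row) != n for row in kernel)):
        raise ValueError("profile and kernel dimensions do not match")

    log_model = [math.log10(v) for v in model_vmr]
    log_apriori = [math.log10(v) for v in apriori_vmr]
    diff = [m - a for m, a in zip(log_model, log_apriori)]

    smoothed = []
    for i in range(n):
        increment = sum(kernel[i][j] * diff[j] for j in range(n))
        smoothed.append(10.0 ** (log_apriori[i] + increment))
    return smoothed
```

With an identity kernel the model profile is returned unchanged, and with a zero kernel the result collapses to the a priori. Real kernels lie in between, so the smoothed model profile inherits the same limited vertical resolution and a priori dependence as the retrieval before the two are differenced.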

Figure 21. Comparisons of vertical profiles of ozone, CO, NOx and PAN, from different tropical aircraft campaigns and different model configurations. Black lines show the median of aircraft profiles and error bars indicate the range between the 25th and 75th percentiles of the distribution. Model results are averaged over the region and months of each campaign.

Figure 22. HIPPO PAN observations for different HIPPO aircraft campaigns taken over the Pacific (left column) and differences between the different model configurations and observations, CAM4-chem (second column), CAM5-chem (third column) and SD-CAM5-chem (fourth column).

Figure 23. Correlations between tropospheric OH burden, methane lifetime, and CO, for different simulations. OH and CO burden are column-integrated tropical averages (30° S–30° N). Each symbol of each configuration (see legend) represents an annual average value.

Figure 24. Column-integrated tropical tropospheric OH burden (30° S–30° N), top left panel, and OH burden adjusted to a reference SAD value (see text), other panels, in correlation to different variables that are integrated over the same region. Each symbol of each configuration (see legend) represents an annual average value.

Figure 25. Correlations of tropospheric column-integrated NOx to column-integrated lightning NOx over the tropics (left panel); correlation of OH burden, adjusted to a reference SAD value (see text), to column-integrated HNO3 over the tropics (middle panel); correlations of column-integrated HNO3 to column-integrated lightning NOx over the tropics (right panel).

Table 1. Overview of model experiments, setup between different simulations, and global model diagnostics. Lifetimes and burdens are calculated for the troposphere, defined as the region where ozone is below 150 ppb.
a Net chemical tendency of O3. b Top of the atmosphere (TOA) residual. c Downwelling solar flux at surface. d Clear-sky downwelling solar flux at surface.

Table 2. Measurements from aircraft campaigns used in this study.
16. A comparison of surface ozone is performed, showing probability distribution functions between model results and observations for

Relative differences between different model configurations and aircraft observations (different colors) over different regions and seasons as listed in Table 1 and sorted with regard to season and location (see text for more details), averaged over 2-7 km, for O3, NOx, NOy, PAN, and HNO3.