Articles | Volume 17, issue 22
https://doi.org/10.5194/gmd-17-8373-2024
Methods for assessment of models | 26 Nov 2024

Source-specific bias correction of US background and anthropogenic ozone modeled in CMAQ

T. Nash Skipper, Christian Hogrefe, Barron H. Henderson, Rohit Mathur, Kristen M. Foley, and Armistead G. Russell
Abstract

United States (US) background ozone (O3) is the counterfactual O3 that would exist with zero US anthropogenic emissions. Estimates of US background O3 typically come from chemical transport models (CTMs), but different models vary in their estimates of both background and total O3. Here, a measurement–model data fusion approach is used to estimate CTM biases in US anthropogenic O3 and multiple US background O3 sources, including natural emissions, long-range international emissions, short-range international emissions from Canada and Mexico, and stratospheric O3. Spatially and temporally varying bias correction factors adjust each simulated O3 component so that the sum of the adjusted components evaluates better against observations compared to unadjusted estimates. The estimated correction factors suggest a seasonally consistent positive bias in US anthropogenic O3 in the eastern US, with the bias becoming higher with coarser model resolution and with higher simulated total O3, though the bias does not increase much with higher observed O3. Summer average US anthropogenic O3 in the eastern US was estimated to be biased high by 2, 7, and 11 ppb (11 %, 32 %, and 49 %) for one set of simulations at 12, 36, and 108 km resolutions and 1 and 6 ppb (10 % and 37 %) for another set of simulations at 12 and 108 km resolutions. Correlation among different US background O3 components can increase the uncertainty in the estimation of the source-specific adjustment factors. Despite this, results indicate a negative bias in modeled estimates of the impact of stratospheric O3 at the surface, with a western US spring average bias of 3.5 ppb (25 %) estimated based on a stratospheric O3 tracer. This type of data fusion approach can be extended to include data from multiple models to leverage the strengths of different data sources while reducing uncertainty in the US background ozone estimates.

1 Introduction

United States (US) background ozone (O3) is the counterfactual O3 that would exist if US anthropogenic emissions were zero. The National Ambient Air Quality Standard (NAAQS) for O3 was set at a level of 70 ppb in 2015 and may be lowered. In its recent reviews of the O3 NAAQS, the US Environmental Protection Agency (EPA) noted the importance of US background O3 (US EPA, 2013, 2014, 2020b, a). US background O3 takes up a larger portion of the allowed ozone as the NAAQS is tightened and is a larger portion of total observed O3 as anthropogenic precursor emissions decline (Lin et al., 2017; Guo et al., 2018; Jaffe et al., 2018). US background O3 cannot be observed (Fiore et al., 2003; Dentener et al., 2010; McDonald-Buller et al., 2011; Fiore et al., 2014; Jaffe et al., 2018; US EPA, 2013, 2014, 2020b, a). It is typically quantified using a chemical transport model (CTM), most commonly using the zero-out method in which US anthropogenic emissions are set to zero. There is much uncertainty in CTM estimates of US background O3 due to model biases and differences in CTM-estimated US background O3 among different models (McDonald-Buller et al., 2011; Fiore et al., 2014; Dolwick et al., 2015; Huang et al., 2015; Guo et al., 2018; Jaffe et al., 2018). Jaffe et al. (2018) estimated that the typical uncertainty in CTM-simulated seasonal mean US background O3 is ±10 ppb.

Sources of US background O3 include naturally occurring emissions such as wildfires, biogenic volatile organic compounds (VOCs), oxides of nitrogen (NOx) from soil, lightning NOx, stratosphere-to-troposphere exchange, and oxidation of methane (Fiore et al., 2014; Jaffe et al., 2018; US EPA, 2020a). Some portions of total O3 contributions from soil NOx and methane oxidation are US background sources, while some are anthropogenic. Soil NOx is emitted by microbial processes in both natural and agricultural lands and is limited by availability of nitrogen in the soil. There is a pre-industrial level of methane that contributes to US background O3 formation, but any O3 created through oxidation of methane above the pre-industrial level is anthropogenic. Soil NOx and methane oxidation are often treated as US background O3 sources in their entirety in CTM studies due to the complexity of splitting up the natural and anthropogenic portions (US EPA, 2020a). Wildfires are treated as US background O3 sources, but the impacts of wildfires on O3 can be affected by US anthropogenic emissions when VOCs from fires are transported over NOx-rich urban areas, leading to enhanced O3 production (Jaffe et al., 2013; Langford et al., 2023; Rickly et al., 2023). US background O3 sources also include non-US anthropogenic pollution, which may be from long-range transport (Lin et al., 2012b) or from short-range transport from neighboring countries (Wang et al., 2009).

In previous work (Skipper et al., 2021), we developed a bias correction method that used regression modeling to adjust CTM-simulated US anthropogenic and US background O3 to better align with observations and to improve agreement among differing US background O3 estimates from different model configurations. We developed spatially and temporally varying scaling factors to adjust US anthropogenic and US background O3. In that work, US background O3 was treated as a single quantity with no separation of individual sources of US background O3. A consistent low bias in US background O3 in spring was identified, though the specific source of this low bias could not be determined. Here, we extend the bias correction method to estimate biases in separate components of US background O3. Separating the US background O3 components provides new insights into the inferred CTM error in US background O3 that were not possible when US background O3 was treated as a lumped quantity.

2 Methods

2.1 Chemical transport model simulations

Total O3 (i.e., base O3), US background O3, and individual US background O3 components are simulated at both regional and hemispheric scales using the Community Multiscale Air Quality (CMAQ) model. We use maximum daily 8 h average (MDA8) O3 as the metric of interest since this is the metric used in determining attainment of the NAAQS. References to O3 throughout are to MDA8 O3. CMAQ results are from two recent sets of simulations by the US EPA (Table 1). The two sets of simulations include different US background O3 components, allowing us to explore how different components of US background O3 affect the bias in O3.
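To make the MDA8 metric concrete, the following is a minimal sketch that takes the maximum of all complete 8 h running means in a series of hourly values. It is a simplification for illustration only: the function name `mda8` and the sample values are hypothetical, and the regulatory rules on which 8 h windows count toward a given day and on data completeness are not applied.

```python
def mda8(hourly_ppb):
    """Maximum of the 8 h running means over a sequence of hourly O3 values.

    Simplified illustration: every complete 8 h window in the input is used;
    regulatory window and data-completeness rules are NOT applied.
    """
    if len(hourly_ppb) < 8:
        raise ValueError("need at least 8 hourly values")
    # One window per possible start hour that still leaves 8 values.
    windows = [hourly_ppb[i:i + 8] for i in range(len(hourly_ppb) - 7)]
    return max(sum(w) / 8.0 for w in windows)


# Hypothetical hourly values (ppb) for one day with an afternoon peak.
hourly = [30] * 10 + [50, 60, 70, 70, 70, 70, 60, 50] + [30] * 6
daily_mda8 = mda8(hourly)
print(daily_mda8)
```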

Table 1. Simulation names and descriptions for hemispheric-scale and regional-scale simulations. Table adapted from the 2020 O3 policy assessment Table 2-1 (US EPA, 2020a).

a Emissions estimated to be associated with intentionally set fires (prescribed fires) are grouped with anthropogenic fires. b Only for PA simulations. c Only for EQUATES simulations.


The first set of simulations was conducted for the policy assessment (PA) for the review of the O3 NAAQS in 2020 (US EPA, 2020a). These simulations also support the draft PA for the reconsideration of the O3 NAAQS. The PA simulations cover the entire year of 2016 and provide estimates of US anthropogenic and US background O3 as well as natural and international anthropogenic contributions to US background O3. International O3 is also further divided into short-range international anthropogenic contributions from Canada and Mexico (Canada+Mexico) and long-range international contributions from other countries. The PA simulations consist of nested simulations from a hemispheric scale (Mathur et al., 2017) at 108 km horizontal resolution to a continental scale at 36 km resolution and to a finer continental scale at 12 km resolution.

US background O3 components are determined by the zero-out method in which the model is run in the same configuration as the base case but with specified emissions sources removed. The zero-out method is the most common approach for simulating US background O3, though other approaches, such as sensitivity simulations and source tagging techniques, have also been employed previously (Jaffe et al., 2018). The zero-out method neglects non-linear interactions between sources, which can affect the simulated source contribution (Wu et al., 2009; Dolwick et al., 2015). However, the zero-out method is consistent with the definition of US background O3 as the level of O3 in the absence of US anthropogenic emissions, while sensitivity or tagging techniques would instead provide an estimate of source contributions to total simulated O3 (including O3 from US anthropogenic sources). US background O3 is estimated by removing US anthropogenic emissions (ZUSA simulation). US anthropogenic O3 is calculated as base O3 minus US background O3. Natural O3 is estimated by removing all anthropogenic emissions (ZANTH simulation). The non-US anthropogenic O3 contribution is estimated by removing anthropogenic emissions everywhere except the US (ZROW simulation). The international contribution is calculated as base O3 minus O3 from the ZROW simulation. Canada+Mexico O3 is estimated by removing Canada and Mexico anthropogenic emissions (ZCANMEX simulation). The Canada+Mexico O3 contribution is calculated as base O3 minus O3 from the ZCANMEX simulation. Long-range international O3 is estimated as international O3 minus Canada+Mexico O3. Due to non-linear chemistry, there is some residual anthropogenic contribution to base O3, which is not attributed to US or international emissions. Descriptions of these CMAQ simulations and the calculation of O3 components are given in Tables S1 and S2 in the Supplement. 
Further details of the modeling setup are available in the 2020 policy assessment (US EPA, 2020a).
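The zero-out arithmetic described above can be sketched as follows. The function and key names are hypothetical, and the residual term is written here as base O3 minus the US anthropogenic, natural, and international components, which is our reading of the text rather than a formula taken from Tables S1 and S2. The inputs can be scalars or arrays of MDA8 O3 from the corresponding simulations.

```python
def o3_components(base, zusa, zanth, zrow, zcanmex):
    """Derive O3 components from a base run and four zero-out runs.

    base    : MDA8 O3 from the base simulation
    zusa    : US anthropogenic emissions removed   -> US background O3
    zanth   : all anthropogenic emissions removed  -> natural O3
    zrow    : anthropogenic emissions outside the US removed
    zcanmex : Canada and Mexico anthropogenic emissions removed
    """
    us_anthropogenic = base - zusa
    international = base - zrow
    canada_mexico = base - zcanmex
    return {
        "us_background": zusa,
        "us_anthropogenic": us_anthropogenic,
        "natural": zanth,
        "international": international,
        "canada_mexico": canada_mexico,
        "long_range_international": international - canada_mexico,
        # Assumed form of the residual left over by non-linear chemistry.
        "residual": base - (us_anthropogenic + zanth + international),
    }


# Hypothetical MDA8 O3 values (ppb) for a single grid cell and day.
parts = o3_components(base=50.0, zusa=35.0, zanth=25.0, zrow=42.0, zcanmex=48.0)
print(parts)
```

Because each component is a difference between two full simulations, the components need not sum exactly to base O3, which is why a residual term appears.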

The second set of simulations was developed from EPA's Air QUAlity TimE Series (EQUATES) project, which spans 2002–2019. Additional simulations using the EQUATES modeling framework were conducted for 2016–2017 to estimate US background O3 and US anthropogenic O3 using the zero-out method. The EQUATES simulations consist of hemispheric-scale simulations at 108 km horizontal resolution and nested US continental-scale simulations at 12 km horizontal resolution. Descriptions of these CMAQ simulations and the calculation of O3 components are given in Table S3. Further details on the model configuration for EQUATES are available from Foley et al. (2020, 2023). More details on both the PA and the EQUATES simulations are summarized in Tables S4 and S5.

The 108 km EQUATES simulations also include an inert tracer species, which serves as a proxy for simulated stratospheric O3 contributions. Separate stratospheric O3 contributions were not available from the PA simulations, so the EQUATES simulations provide an opportunity to assess potential biases specific to stratospheric O3 contributions. CMAQ simulates stratospheric O3 using a parameterization based on the relationship between O3 and potential vorticity (PV) in the upper troposphere and lower stratosphere (UTLS) (Xing et al., 2016). The parameterization was developed using 21 years of ozonesonde data from the World Ozone and Ultraviolet Radiation Data Centre and PV data from the Weather Research and Forecasting (WRF) model for 1990–2010. In the EQUATES 108 km simulations, the parameterization is applied to the top model layer only. A PV tracer species tracks O3 injected into the UTLS throughout the rest of the model domain for the hemispheric simulations. The 12 km continental simulations inherit the PV tracer species through lateral boundary conditions from the hemispheric simulations. This tracer is subject to transport and deposition but not chemistry. We refer to the PV tracer concentration as stratospheric O3 since it relates to the stratospheric influence, but it only partly replicates the impact of stratospheric O3 since it does not undergo chemical losses. The stratospheric O3 tracer does however provide a measure of the spatiotemporal variability of stratospheric O3 impacts. We also estimate the contribution to US background O3 from sources other than the stratosphere as US background O3 minus the stratospheric O3 tracer and refer to it as non-stratospheric US background O3. The use of the chemically inert PV tracer to split up stratospheric and non-stratospheric influences on US background O3 introduces uncertainty as the stratospheric O3 component may be unrealistically high, especially in areas and times with more active chemistry.

The modeling configurations of the PA and EQUATES simulations differ in some respects, which is expected to lead to some differences in simulated O3, though they do share some of the same configuration options. Both the PA and the EQUATES simulations use a 44-layer vertical structure for hemispheric-scale applications (at 108 km resolution) and a 35-layer vertical structure for continental applications (i.e., 36 and 12 km resolutions) with a vertical extent from the surface to 50 hPa and a surface layer height of approximately 20 m for both the hemispheric and the continental configurations (see Mathur et al., 2017, for more details on these vertical-layer structures). CMAQ v5.2.1 was used for the PA simulations, while CMAQ v5.3.2 was used for the EQUATES simulations. These were the latest versions of CMAQ at the respective times that each set of simulations was conducted. One potential source of differences is updates to halogen chemistry that were introduced in CMAQ v5.3 (Sarwar et al., 2019). These updates in the EQUATES simulations enhance halogen-mediated O3 losses, which are strongest over the oceans. These losses are most relevant for O3 contributions (natural and anthropogenic) that are transported over long distances across oceans. An intercomparison of CMAQ v5.2.1 and CMAQ v5.3.1 (which is not significantly different from CMAQ v5.3.2) showed that the newer version typically had lower O3 compared to the older version, with a mean bias ∼ 1 ppb lower in CMAQ v5.3.1 (Appel et al., 2021). Besides the updates to halogen chemistry, there are other differences in the chemical mechanisms used for each set of simulations. The mechanisms used for the hemispheric simulations were cb6r3_ae6_aq for the PA simulations and cb6r3m_ae7_kmtbr for the EQUATES simulations. 
The part of the mechanism name labeled cb6r3m indicates additional chemistry relevant in marine environments (the halogen chemistry described above), ae6 and ae7 indicate the version number for chemistry relevant to aerosols, and aq and kmtbr indicate different treatments of cloud chemistry. The chemical mechanisms used for continental-scale PA and EQUATES simulations (cb6r3_ae6nvPOA_aq and cb6r3_ae7_aq) also differ in their representation of organic aerosols (Murphy et al., 2017; Pye et al., 2019; Qin et al., 2021; Appel et al., 2021), which could affect O3 concentrations. Different versions of WRF (v3.8 for PA simulations and v4.1.1 for EQUATES simulations) employed may also contribute to differences in O3.

Emission inputs also differ between the PA and EQUATES simulations. Different US anthropogenic emission inventories were used for the simulations. The PA simulations used an early version (sometimes called the “alpha” version) of a 2016 emissions modeling platform developed by the National Emissions Inventory Collaborative (US EPA, 2019b). The EQUATES simulations used an inventory that was developed as part of the broader EQUATES framework to model a long time series using consistent methods for emissions estimates (Foley et al., 2023). For emissions in Canada and Mexico, both sets of simulations use emission inventories developed by the respective national governments, though the EQUATES simulations use more recent inventories (as described by Foley et al., 2020) than the PA simulations (as described by US EPA, 2019b). Both the PA and the EQUATES simulations use the Tsinghua University inventory of emissions in China (Zhao et al., 2018). For other countries, both sets of simulations use the Hemispheric Transport of Air Pollution (HTAP) v2.2 inventory (Janssens-Maenhout et al., 2015) with scaling factors derived from the Community Emissions Data System (CEDS) (Hoesly et al., 2018) to account for yearly changes. Differences in the anthropogenic emissions used in the two model configurations are expected to contribute to differences in simulated O3, most notably for the different US anthropogenic emissions since here we focus on O3 in the US.

For hemispheric-scale simulations, biogenic VOC emissions are from the Model of Emissions of Gases and Aerosols from Nature version 2.1 (MEGAN2.1) (Guenther et al., 2012). The PA simulations additionally replace MEGAN emissions with emissions from the Biogenic Emission Inventory System (BEIS) (Bash et al., 2016) over North America (US EPA, 2019a). The EQUATES MEGAN emissions are obtained from a compilation by Sindelarova et al. (2014). Soil NOx emissions for the PA hemispheric simulations are also from MEGAN with replacement by BEIS soil NOx over North America. Soil NOx emissions for the hemispheric EQUATES simulations are from a dataset by the Copernicus Atmosphere Monitoring Service (CAMS, 2018) based on methods by Yienger and Levy (1995). Both the PA and the EQUATES simulations use BEIS for biogenic VOC and soil NOx emissions in the continental-scale simulations. Lightning NOx emissions for both the PA and the EQUATES hemispheric simulations are from monthly climatology obtained from the Global Emissions Initiative (GEIA) and are based on Price et al. (1997). Lightning NOx was not included in the PA continental-scale simulations, while lightning NOx for the EQUATES continental-scale simulations is calculated using an inline module in CMAQ (Kang et al., 2019). For both PA and EQUATES, wildfire emissions outside of North America are based on the Fire Inventory from NCAR (FINN) v1.5 (Wiedinmyer et al., 2011), which provides day-specific fire emissions. Wildfires are vertically allocated with 25 % of emissions distributed to the lowest two layers (∼ 0–45 m), 35 % distributed to layers 3–9 (∼ 45–350 m), and the remaining 40 % distributed to layers 10–19 (∼ 350–2000 m) as described in the technical support document for northern hemispheric emissions (US EPA, 2019a). Wildfire emissions within North America are based on the Hazard Mapping System (HMS) fire product, which provides day-specific fire activity data. 
Emission processing for North American wildfires is further described in the technical support document for North American emissions (US EPA, 2019b) (applicable to PA simulations) and Foley et al. (2023) (applicable to EQUATES simulations). Although the methods are similar, North American wildfire emissions may differ between PA and EQUATES based on the specific fire activity data that were used in each case. Fire plume injection height for North American fires is determined by an inline plume rise algorithm in CMAQ based on fire heat content (see, e.g., Wilkins et al., 2022, for more details on fire plume injection height in CMAQ). Stratospheric O3 in both the PA and the EQUATES simulations is from the PV parameterization by Xing et al. (2016) (described in more detail above) in the hemispheric simulations. Stratospheric O3 in the continental-scale simulations only comes from any stratospheric O3 inherited from the lateral boundary conditions provided by the hemispheric simulations.
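The fixed vertical allocation of wildfire emissions outside of North America (25 % to layers 1–2, 35 % to layers 3–9, and 40 % to layers 10–19) can be illustrated with a short sketch. The text does not state how emissions are divided among the layers within each group, so an even split is assumed here purely for illustration; the function name and the 44-layer default (taken from the hemispheric configuration) are likewise illustrative choices.

```python
def allocate_fire_emissions(total, n_layers=44):
    """Distribute a column-total fire emission over model layers.

    Layer groups and fractions follow the text; the even split WITHIN each
    group is an assumption made for this sketch.
    """
    groups = [((1, 2), 0.25), ((3, 9), 0.35), ((10, 19), 0.40)]
    profile = [0.0] * n_layers
    for (first, last), fraction in groups:
        n_in_group = last - first + 1
        for layer in range(first, last + 1):
            # Layers are 1-based in the text, 0-based in the list.
            profile[layer - 1] = total * fraction / n_in_group
    return profile


profile = allocate_fire_emissions(100.0)
print(profile[:20])
```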

2.2 O3 observations

O3 observational data are from the Air Quality System (AQS) database, which provides data from federal, state, local, and tribal air quality monitoring networks across the US. The average precision of O3 monitors in the AQS database was reported as 2.2 % and 2.4 % in 2016 and 2017, respectively, and the national average absolute bias was reported as 1.5 % in both 2016 and 2017 (https://www.epa.gov/amtic/amtic-ambient-air-monitoring-assessments, last access: 17 March 2024). There were ∼ 360 000 MDA8 O3 observations available per year for 2016 and 2017 from ∼ 1250 unique monitoring sites. These numbers take into account monitoring sites where O3 is measured by multiple instruments at the same location (as indicated in the AQS database by a parameter occurrence code). In these cases, the MDA8 O3 observations from multiple instruments are averaged for a given site and day and treated as a single observation. The observations overrepresent the eastern US compared to the western US. About 40 % of MDA8 O3 observations and ∼ 36 % of O3 monitoring sites are in the western US (as defined by longitude west of 97° W). Within the western US, sites in the state of California are overrepresented. About 40 % of MDA8 O3 observations and ∼ 40 % of O3 monitoring sites in the western US are in California. The observations also overrepresent the high-O3 season of April–October (Fig. 1) since many monitors are only required to be operated during the high-O3 season.
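The treatment of collocated instruments described above amounts to a group-and-average step. A minimal sketch is shown below, assuming each record is a tuple of site identifier, date, parameter occurrence code, and MDA8 O3; the record layout, function name, and sample values are hypothetical and do not reflect the actual AQS file format.

```python
from collections import defaultdict


def average_collocated(records):
    """Average MDA8 O3 across instruments for each (site, day) pair.

    records: iterable of (site_id, date, poc, mda8_ppb) tuples, where poc is
    the parameter occurrence code distinguishing instruments at one site.
    """
    grouped = defaultdict(list)
    for site_id, date, _poc, value in records:
        grouped[(site_id, date)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in grouped.items()}


# Hypothetical records: site A has two instruments on the same day.
records = [
    ("A", "2016-07-01", 1, 60.0),
    ("A", "2016-07-01", 2, 62.0),
    ("B", "2016-07-01", 1, 45.0),
]
daily = average_collocated(records)
print(daily)
```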

https://gmd.copernicus.org/articles/17/8373/2024/gmd-17-8373-2024-f01

Figure 1. Locations of O3 observational sites in 2016 indicated with a circle whose color shows the number of MDA8 O3 observations available from each site in 2016 (a). Total number of MDA8 O3 observations in each month of 2016 (b).

2.3 O3 data fusion model

We use multivariate ordinary least squares regression to model the relationship between the individual model components and observed MDA8 O3. Regression parameters provide estimates of the spatial and temporal model bias attributable to each individual O3 component. The regression model for the ozone mixing ratio O3 on day d and location (long, lat, z) is formulated as follows:

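$$\mathrm{O_3} = \sum_i \alpha_i \, \mathrm{O}_{3,i}^{\mathrm{simulated}} + \varepsilon , \tag{1}$$

where $\alpha_i = \alpha_{0,i} + \alpha_{x,i}\,\mathrm{long} + \alpha_{y,i}\,\mathrm{lat} + \alpha_{z,i}\,z + \alpha_{\sin,i}\sin(d) + \alpha_{\cos,i}\cos(d)$; $d$ is the day of the year in radians; $z$ is the elevation above sea level; long, lat, $z$, $\sin(d)$, and $\cos(d)$ are normalized to a zero mean and unit standard deviation (Table S6); $\varepsilon \sim N(0, \sigma^2)$; and index $i$ represents different sets of O3 components. Specifically, we consider four sets of $i$:

$$
\begin{aligned}
& i \in \{\text{US anthropogenic},\ \text{US background}\} && (\text{PA and EQUATES}) \\
& i \in \{\text{US anthropogenic},\ \text{natural},\ \text{international}\} && (\text{PA}) \\
& i \in \{\text{US anthropogenic},\ \text{natural},\ \text{long-range international},\ \text{Canada+Mexico}\} && (\text{PA}) \\
& i \in \{\text{US anthropogenic},\ \text{non-stratospheric US background},\ \text{stratospheric}\} && (\text{EQUATES}).
\end{aligned}
$$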

Each simulated O3 component ($\mathrm{O}_{3,i}^{\mathrm{simulated}}$) is multiplied by the adjustment factor for that component ($\alpha_i$), which varies as a function of space and time, to calculate an adjusted estimate of each O3 component. The inferred model bias for a particular component is calculated as the difference between the original simulated O3 and adjusted O3 for that component. The individual adjusted O3 components are summed to calculate the total adjusted O3. The longitude and latitude terms of $\alpha_i$ are intended to capture the spatial variability of O3 biases, while the $z$ term of $\alpha_i$ is intended to capture biases in O3 related to elevation. The sinusoidal day-of-year terms of $\alpha_i$ are intended to capture the cyclical nature of O3 production and to identify any seasonal dependence in O3 biases. The modeled O3 components do not add up to observed O3 because of biases in the model or its inputs. The CMAQ-simulated O3 components are adjusted by applying estimated regression coefficients to the gridded data so that the sum of the components more closely aligns with observed O3. A more complex method (e.g., non-linear regression or machine learning) may give a better fit to observed O3, but the interest here is to estimate potential biases in the modeled O3 components, which is more straightforward with a linear regression. Empirical orthogonal function (EOF) analysis was used to further explore the spatial and temporal structure of the inferred bias fields and is discussed in the Supplement.
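The regression described above can be sketched with NumPy as follows. This is an illustrative reconstruction on synthetic data, not the code used in the study: all function names are hypothetical, the conversion of day of year to radians (2π × day / 366) is an assumption, and predictors are standardized with the sample's own mean and standard deviation, whereas applying the fit to gridded data would require reusing the training statistics (Table S6).

```python
import numpy as np


def standardize(x):
    """Scale to zero mean and unit standard deviation."""
    return (x - x.mean()) / x.std()


def predictors(lon, lat, elev, doy, days_in_year=366):
    """Columns [1, long, lat, z, sin(d), cos(d)] with the last five standardized."""
    d = 2.0 * np.pi * doy / days_in_year  # assumed radian conversion
    cols = [lon, lat, elev, np.sin(d), np.cos(d)]
    ones = np.ones(len(lon), dtype=float)
    return np.column_stack([ones] + [standardize(c) for c in cols])


def fit_alpha(obs, components, P):
    """Ordinary least squares for O3 = sum_i alpha_i * O3_i + eps.

    components: (n_obs, k) simulated O3 components at the monitor locations.
    P: (n_obs, 6) predictor matrix. Returns a (k, 6) coefficient array.
    """
    k = components.shape[1]
    # Each component contributes six columns: O3_i times each predictor.
    X = np.hstack([components[:, [i]] * P for i in range(k)])
    coef, *_ = np.linalg.lstsq(X, obs, rcond=None)
    return coef.reshape(k, P.shape[1])


def adjust(components, P, coef):
    """Multiply each component by its space- and time-varying alpha_i."""
    alpha = P @ coef.T  # shape (n_obs, k)
    return alpha * components


# Synthetic demonstration with two components and known coefficients.
rng = np.random.default_rng(0)
n = 500
lon = rng.uniform(-125.0, -67.0, n)
lat = rng.uniform(25.0, 49.0, n)
elev = rng.uniform(0.0, 3000.0, n)
doy = rng.integers(1, 367, n)
P = predictors(lon, lat, elev, doy)
components = rng.uniform(5.0, 50.0, (n, 2))
true_coef = np.array([[0.8, 0.05, -0.02, 0.0, 0.03, 0.01],
                      [1.1, -0.03, 0.04, 0.06, -0.02, 0.05]])
obs = (components * (P @ true_coef.T)).sum(axis=1)  # noise-free "observations"

coef = fit_alpha(obs, components, P)
adjusted = adjust(components, P, coef)
inferred_bias = components - adjusted  # simulated minus adjusted, per component
```

In this noise-free example the fit recovers the coefficients used to generate the synthetic observations; with real data, correlation among components makes the individual coefficients less well constrained.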

A separate regression model is developed for each separate model configuration (i.e., model resolution, PA or EQUATES simulation, and US background O3 component split). There are three model resolutions and three US background O3 splits for the PA simulations, resulting in nine PA models. There are two model resolutions for the EQUATES simulations. The 12 km EQUATES data have two US background O3 splits, while the 108 km EQUATES data have one US background O3 split, resulting in three EQUATES models. For the PA models, only 2016 PA simulation data are used to train the models since these simulations are only for that year. For the EQUATES models, both 2016 and 2017 EQUATES simulation data are used to train the models. The location and sampling schedule of the monitoring sites overrepresent the eastern US, low elevations, and the high-O3 season, which may impact how representative the results are of non-monitored locations. Overfitting of the regression model is tested using three cross-validation approaches in which the data are split in both space and time, in space only, and in time only. In the first approach (spatial and temporal withholding), 10 % of all observational data are randomly selected and reserved as a test set, while the remaining 90 % are used as the training set. In the second approach (spatial withholding), data from 10 % of randomly selected observation sites are used as a test set, while data from the remaining 90 % of sites are used as the training set. In the third approach (temporal withholding), data from 10 % of randomly selected days of the year are used as a test set, while data from the remaining 90 % of days of the year are used as the training set. The root mean square error (RMSE) and mean bias for the test and training set are compared to evaluate the potential of the model to overfit the data.
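The three withholding schemes can be sketched as boolean test-set masks. This is a minimal illustration with hypothetical function names and a synthetic set of 50 sites by 20 days; the random number generator and seed are arbitrary choices and are not those used in the study.

```python
import numpy as np


def withhold_observations(n_obs, rng, frac=0.1):
    """Spatial and temporal withholding: a random 10 % of all observations."""
    n_test = int(round(frac * n_obs))
    mask = np.zeros(n_obs, dtype=bool)
    mask[rng.choice(n_obs, size=n_test, replace=False)] = True
    return mask


def withhold_groups(labels, rng, frac=0.1):
    """Withhold every observation from a random 10 % of unique labels.

    Pass site identifiers for spatial withholding or day-of-year values for
    temporal withholding.
    """
    unique = np.unique(labels)
    n_held = max(1, int(round(frac * unique.size)))
    held = rng.choice(unique, size=n_held, replace=False)
    return np.isin(labels, held)


def rmse(pred, obs):
    return float(np.sqrt(np.mean((pred - obs) ** 2)))


def mean_bias(pred, obs):
    return float(np.mean(pred - obs))


rng = np.random.default_rng(42)
site_ids = np.repeat(np.arange(50), 20)  # 50 hypothetical sites, 20 days each
days = np.tile(np.arange(20), 50)

mask_both = withhold_observations(site_ids.size, rng)
mask_space = withhold_groups(site_ids, rng)
mask_time = withhold_groups(days, rng)
```

Each mask marks the test set; its complement is the training set, and RMSE and mean bias are then computed separately on both to check for overfitting.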

3 Results and discussion

3.1 CTM results

The overall performance of MDA8 O3 for each simulation is summarized here by the normalized mean bias (NMB) compared to O3 monitoring sites. The 12 km PA simulations were biased high for 2016 (NMB = 1.2 %), while the 12 km EQUATES simulations were biased low for 2016 and 2017 (NMB = −3.7 % and −5.1 %). The 36 and 108 km PA simulations were biased high over the US for 2016 (NMB = 5.2 % and 10.0 %). The 108 km EQUATES simulations were also biased high over the US for 2016 and 2017 (NMB = 2.8 % and 0.5 %). The two sets of simulations are broadly consistent with one another for base, US anthropogenic, and total US background O3, which are common to both. Details on the contributions from the different O3 components in the PA and EQUATES simulations follow hereafter.
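The NMB is not defined explicitly in the text. Assuming the commonly used definition, in which the summed model-minus-observation differences are divided by the summed observations and expressed in percent, it can be computed as in the sketch below. The function name and sample values are hypothetical.

```python
def normalized_mean_bias(model, obs):
    """NMB in percent: 100 * sum(model - obs) / sum(obs) (assumed definition)."""
    if len(model) != len(obs):
        raise ValueError("model and obs must be paired")
    return 100.0 * sum(m - o for m, o in zip(model, obs)) / sum(obs)


# Hypothetical paired MDA8 O3 values (ppb) at monitoring sites.
nmb = normalized_mean_bias([44.0, 55.0], [40.0, 60.0])
print(nmb)
```

A positive value indicates that the simulation is biased high relative to the observations, and a negative value indicates that it is biased low.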

https://gmd.copernicus.org/articles/17/8373/2024/gmd-17-8373-2024-f02

Figure 2. Annual average MDA8 O3 from policy assessment CMAQ simulations. Results are shown for 12 km (top row), 36 km (middle row), and 108 km (bottom row) horizontal resolutions. O3 concentrations include total (base) O3 and O3 components from US anthropogenic, natural, long-range international, and Canada+Mexico sources.

Table 2. Summary of annual average of MDA8 O3 components for the policy assessment set of simulations. Averages are shown for the entire US and separately for the eastern and western US, with a longitude of 97° W serving as the east–west dividing line. The mean across all grid cells within the given area is shown along with the minimum and maximum for any grid cell within the given area in parentheses. The numbers in the table are in units of parts per billion. Seasonal averages are provided in Table S13.


CMAQ-simulated annual average MDA8 O3 from the PA simulations shows similar results across the three different model resolutions for US background O3 sources (Fig. 2; Table 2). Simulated US anthropogenic O3 tends to increase with coarser model resolution, which results in corresponding increases in base O3. Natural O3 makes the largest contribution to annual average O3 across the US, with a larger contribution in the western US (∼ 55 % of base) than in the eastern US (∼ 45 % of base). US anthropogenic O3 is the second-largest component of annual average O3, with a larger contribution in the eastern US (∼ 35 % of base) than in the western US (∼ 20 % of base). There are a small number of US grid cells with negative annual averages for US anthropogenic O3. This means that US background O3 was greater than base O3 and indicates that anthropogenic emissions suppress O3 through NOx titration. Long-range international sources impact the western US (∼ 15 % of base) more strongly than the eastern US (∼ 10 % of base). Both natural and long-range international O3 levels tend to be higher at higher elevations, suggesting that some of the effects from natural and long-range international O3 are from O3 in the free troposphere. In spring, O3 lifetimes are longer, and trans-Pacific transport of O3 is more likely, which is consistent with the spring peak in long-range international O3 (Liu et al., 1987). The other components and base O3 peak in the summer with some exceptions (Fig. 3). In the southeastern US, natural O3 is lower during summer compared to surrounding areas and is lower than natural O3 in the southeastern US during spring. This is likely because O3 loss through reaction with biogenic VOCs (which peak in the summer and are abundant in the southeastern US) reduces O3 under the extremely low NOx conditions with zero anthropogenic emissions. 
The Canada+Mexico contribution to O3 is small, except at some locations along the border with Mexico where the contributions can be high, especially in the summer. For US grid cells within 100 km of the border with Canada, the annual average impact is ∼ 2 ppb, while for US grid cells within 100 km of the border with Mexico, the annual average impact is ∼ 5 ppb.

https://gmd.copernicus.org/articles/17/8373/2024/gmd-17-8373-2024-f03

Figure 3. Seasonal average MDA8 O3 from policy assessment CMAQ simulations. Results are shown for 12 km horizontal resolution for winter (DJF), spring (MAM), summer (JJA), and fall (SON). Seasonal averages for the 36 and 108 km simulations are provided in Figs. S1 and S2. O3 concentrations include total (base) O3 and O3 components from US anthropogenic, natural, long-range international, and Canada+Mexico sources.

https://gmd.copernicus.org/articles/17/8373/2024/gmd-17-8373-2024-f04

Figure 4. MDA8 O3 on the day of the fourth-highest base case MDA8 O3 from policy assessment CMAQ simulations. Results are shown for 12 km (top row), 36 km (middle row), and 108 km (bottom row) horizontal resolutions. O3 concentrations include total (base) O3 and O3 components from US anthropogenic, natural, long-range international, and Canada+Mexico sources.

While the annual (Fig. 2, Table 2) and seasonal (Fig. 3) average MDA8 O3 contributions provide insight into longer-term contributions, compliance with the NAAQS is determined based on the fourth-highest observed MDA8 O3, averaged over 3 years. We examine the fourth-highest total (base) MDA8 O3 along with the contribution from each of the MDA8 O3 components on the same day (Fig. 4). The areas with the greatest fourth-highest MDA8 O3 in the base simulation mostly have large contributions from the US anthropogenic O3 component. This includes much of California and major metropolitan areas in the rest of the US. The eastern US has a higher level of US anthropogenic O3 outside of the metropolitan areas compared to most of the western US where US anthropogenic O3 outside of urban areas is typically in the range of 5–20 ppb. Although the western US and eastern US have similar fourth-highest MDA8 O3 values for base O3 (western US 60 ppb, eastern US 61 ppb for 12 km simulations), the western US has a lower average contribution from the US anthropogenic component (14 ppb) compared to the eastern US (33 ppb).
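Selecting the day of the fourth-highest base MDA8 O3 in a grid cell and reading off each component on that same day can be sketched as follows. The function names and the seven-day sample series are hypothetical, and ties are broken by taking the earliest day, which is a choice made for this sketch.

```python
def fourth_highest_day(base_by_day):
    """Index of the day with the fourth-highest base MDA8 O3."""
    if len(base_by_day) < 4:
        raise ValueError("need at least four days")
    # Sort day indices by descending base O3; the sort is stable, so tied
    # values keep their chronological order.
    order = sorted(range(len(base_by_day)),
                   key=lambda day: base_by_day[day], reverse=True)
    return order[3]


def components_on_fourth_highest(base_by_day, components):
    """Return the selected day and each component's value on that day."""
    day = fourth_highest_day(base_by_day)
    return day, {name: series[day] for name, series in components.items()}


# Hypothetical daily values (ppb) for one grid cell.
base = [50.0, 72.0, 65.0, 80.0, 61.0, 70.0, 68.0]
parts_by_day = {
    "us_anthropogenic": [10.0, 30.0, 25.0, 35.0, 20.0, 28.0, 27.0],
    "natural": [30.0, 32.0, 31.0, 33.0, 31.0, 32.0, 31.0],
}
day, on_day = components_on_fourth_highest(base, parts_by_day)
print(day, on_day)
```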

The contribution to the fourth-highest MDA8 O3 from natural O3 is the largest in parts of the western US with extreme wildfire effects. Large impacts on natural O3 from wildfire events can be seen in Idaho, Wyoming, and California. The contribution from natural O3 is nearly always less than the contribution from US anthropogenic O3 in the eastern US. However, in much of the western US (excluding California and large urban areas), the contribution from natural O3 typically exceeds that of US anthropogenic O3. On average the natural contribution is higher in the western US than in the eastern US (western US 34 ppb, eastern US 22 ppb for 12 km simulations), which reflects the greater prevalence of wildfires in the western US, a larger background contribution from stratospheric O3 due to the higher elevation of the western US, and a larger impact from both long-range and short-range (Canada+Mexico) international sources. The contribution from long-range international MDA8 O3 is a maximum of 20 ppb in the western US and is typically lower in the eastern US compared to the western US on average (western US 6 ppb, eastern US 2 ppb for 12 km simulations). The seasonal average of the long-range international contribution is the highest in the spring, while base MDA8 O3 is typically the highest in the summer (Fig. 3), so days with the highest total O3 tend not to be the same days with the highest long-range international O3. The contribution from Canada+Mexico MDA8 O3 is the largest in states along the southern and northern borders, as expected. Contributions from Canada+Mexico tend to be small, except in border areas. The average MDA8 O3 contributions on days of the top 10 highest base MDA8 O3 levels are similar to the results for the fourth-highest MDA8 O3 shown here (Fig. S3).

https://gmd.copernicus.org/articles/17/8373/2024/gmd-17-8373-2024-f05

Figure 5. Annual average MDA8 O3 from EQUATES CMAQ simulations. Results are shown for 12 km resolution (top and middle rows) and 108 km resolution (bottom row). O3 concentrations include total (base) O3 and O3 components from US anthropogenic, non-stratospheric US background, and stratospheric sources for 12 km. For both the 12 km and the 108 km simulations, base, US anthropogenic, and total US background O3 concentrations are also shown.

Table 3. Summary of annual average of MDA8 O3 components for the EQUATES set of simulations. Averages are shown for the entire US and separately for the eastern and western US, with a longitude of 97° W serving as the east–west dividing line. The mean across all grid cells within the given area is shown along with the minimum and maximum for any grid cell within the given area in parentheses. The numbers in the table are in units of parts per billion. Seasonal averages are provided in Table S14.


A second set of simulations (EQUATES) splits US background O3 into different components compared to the PA simulations. The use of different US background O3 components provides additional insight into the source-specific biases in US background O3. CMAQ-simulated O3 results from the 2016 EQUATES simulations are comparable to the results from the PA simulations for the 12 km simulations, though the EQUATES simulations have slightly less O3 from US anthropogenic sources and more from US background sources compared to the PA simulations (Fig. 5; Table 3). US anthropogenic O3 contributed ∼ 20 % of the annual average base O3 across all US model grid cells (∼ 25 % for PA simulations). As in the PA simulations, the contribution from US anthropogenic O3 was higher in the eastern US (∼ 25 % of base) than in the western US (∼ 15 % of base). Stratospheric O3 is higher in the western US, especially at higher elevations, which is consistent with previous studies (Jaffe et al., 2018). On average, stratospheric O3 is 40 % of the base O3 in the western US and 34 % of the base O3 in the eastern US. Stratospheric O3 represents an upper bound of stratospheric influences because the tracer species used for its calculation in this study does not undergo chemical losses. Non-stratospheric US background O3 contributes 47 % of the annual average base O3 in the western US and 42 % in the eastern US. Non-stratospheric US background O3 is likely underestimated in regions and seasons with more active chemistry because a chemically inert tracer species is used in its calculation. The 108 km hemispheric CMAQ (H-CMAQ) results for the EQUATES and PA simulations are similar on average but do have some notable differences. The H-CMAQ simulations are similar in their simulation of US background O3. The US anthropogenic O3 contributions are also similar on average, though the PA simulations have higher maximum values compared to the EQUATES simulations, which leads to higher maximum values of base O3.

https://gmd.copernicus.org/articles/17/8373/2024/gmd-17-8373-2024-f06

Figure 6. Seasonal average MDA8 O3 from EQUATES CMAQ simulations. Results are shown for 12 km horizontal resolution for winter (DJF), spring (MAM), summer (JJA), and fall (SON). O3 concentrations include total (base) O3 and O3 components from US anthropogenic, non-stratospheric US background, and stratospheric sources. Seasonal averages for the other US background O3 split cases are provided in the Supplement (Figs. S4 and S5).

Base O3 in EQUATES is the highest in the summer (Fig. 6). US background O3 is the highest during spring throughout most of the US. However, in much of the Mountain West, US background O3 is the highest during the summer (Figs. S4 and S5). The stratospheric O3 tracer is the highest in the western US. Much of the western US has stratospheric O3 at about the same level in the spring and summer. In the southeastern US, stratospheric O3 is the highest in the summer, while in the northeastern US, there are similar levels of stratospheric O3 in the spring and summer. Most previous studies have indicated that stratospheric O3 peaks in the spring (Lin et al., 2015). The stratospheric contribution to O3 from H-CMAQ calculated using the decoupled direct method (which does account for chemical losses) also showed higher stratospheric contributions in spring than in summer (Mathur et al., 2022). The elevated summer stratospheric O3 found here is therefore explained by the lack of chemical sinks for the inert tracer species used to estimate stratospheric O3. Potential biases are explored further in Sect. 3.3. US anthropogenic O3 is the highest in the summer in the eastern US and in California, consistent with the PA simulations. Non-stratospheric US background O3 is relatively uniform outside of summer, though it tends to be slightly lower in the southeast and higher in the western US.

The results from both the PA and the EQUATES simulations indicate that US background O3 contributes more than US anthropogenic O3 to base O3 on an annual average basis. Simulated US background O3 is higher in the western US than in the eastern US due to greater impacts from both natural and non-domestic anthropogenic sources. Simulated US anthropogenic O3 is higher in the eastern US than in the western US due to the higher population density and consequently greater anthropogenic emissions. The contributions from US anthropogenic O3 peak in the summer, which causes base O3 to peak in the summer as well. US background O3 varies by season but is not as seasonally variable as US anthropogenic O3. These results are broadly consistent with previous efforts to quantify US background and US anthropogenic O3 using CTMs (McDonald-Buller et al., 2011; Jaffe et al., 2018).

Similar to the PA simulations, we examine the fourth-highest total (base) MDA8 O3 along with the contribution from each of the MDA8 O3 components on the same day for the EQUATES simulations. As in the PA simulations, the areas with the greatest fourth-highest MDA8 O3 values for base MDA8 O3 tend to have a larger contribution from US anthropogenic O3 than from US background O3. The EQUATES fourth-highest base MDA8 O3 is slightly lower than in the PA simulations (56 ppb in the western US and 57 ppb in the eastern US compared to 60 and 61 ppb in the PA simulations at 12 km). The US anthropogenic contribution is similarly lower in the EQUATES simulations (10 ppb in the western US, 25 ppb in the eastern US compared to 14 and 33 ppb in the PA simulations at 12 km). The contributions from US background O3 are higher in the western US than in the eastern US on average (eastern US 32 ppb, western US 46 ppb for 12 km simulations). The contribution from non-stratospheric US background O3 (western US 25 ppb, eastern US 18 ppb) is generally greater than the contribution from stratospheric US background O3 (western US 21 ppb, eastern US 14 ppb). The western US has larger contributions from stratospheric O3, long-range international O3, and wildfires. In the EQUATES simulations, the Flint Hills area of Kansas stands out as an area influenced by fires. The fires in this area are typically prescribed burning of grasslands used for agricultural land management. While these were included in the fire emissions for the US background O3 simulation, prescribed burns are typically classified as anthropogenic sources rather than background sources. The average MDA8 O3 contributions on days of the top 10 highest base MDA8 O3 levels are similar to the results for the fourth-highest MDA8 O3 shown here (Fig. S6).

3.2 Cross-validation of regression modeling

Overfitting is tested using a cross-validation analysis as described in Sect. 2.2. Three different cross-validation methods are used: spatial and temporal withholding, spatial withholding, and temporal withholding. The parameters derived from the training set are then used to predict the observed O3 in the test set. The RMSE and mean bias with respect to the true observations of both the training and the test sets are compared to one another (Table 4; Tables S7 and S8). For each of the three cross-validation methods, the RMSE and mean bias of the training and test sets are similar to one another. This indicates that the model does not overfit and is generalizable to data outside of its training data, providing confidence that we can apply the regression models to the gridded CTM results to estimate the bias in O3 and individual O3 components across the US.
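The spatial withholding check can be sketched as follows. The regression form of Sect. 2.2 is not reproduced here: the sketch assumes ordinary least squares with constant coefficients and no intercept, and it uses synthetic data, so the function names, the 20 % withholding fraction, and all numbers are illustrative assumptions rather than the configuration used in this study.

```python
import numpy as np

def rmse(pred, obs):
    """Root-mean-square error of predictions against observations."""
    return float(np.sqrt(np.mean((pred - obs) ** 2)))

def mean_bias(pred, obs):
    """Mean of prediction minus observation."""
    return float(np.mean(pred - obs))

def spatial_withholding_cv(X, y, site_ids, test_fraction=0.2, seed=0):
    """Fit on a subset of sites, then score the training and withheld sites."""
    rng = np.random.default_rng(seed)
    sites = np.unique(site_ids)
    n_test = max(1, int(round(test_fraction * sites.size)))
    test_sites = rng.choice(sites, size=n_test, replace=False)
    test = np.isin(site_ids, test_sites)
    # Assumed model: observed O3 ~ X @ alpha, with one factor per component.
    alpha, *_ = np.linalg.lstsq(X[~test], y[~test], rcond=None)
    stats = {}
    for name, mask in (("train", ~test), ("test", test)):
        pred = X[mask] @ alpha
        stats[name] = {"rmse": rmse(pred, y[mask]),
                       "mean_bias": mean_bias(pred, y[mask])}
    return alpha, stats

# Synthetic data: 50 sites, 40 days each, two simulated O3 components (ppb).
rng = np.random.default_rng(1)
site_ids = np.repeat(np.arange(50), 40)
X = rng.uniform(5.0, 40.0, size=(site_ids.size, 2))
y = X @ np.array([0.8, 1.1]) + rng.normal(0.0, 2.0, size=site_ids.size)
alpha, stats = spatial_withholding_cv(X, y, site_ids)
```

Withholding whole sites, rather than random rows, tests whether the fitted factors transfer to locations that were not used in the fit. Temporal withholding follows the same pattern with day indices in place of site identifiers.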

Table 4. Summary of the performance for cross-validation of the MDA8 O3 data fusion model. Values shown are the average over all regression model cases. RMSE and mean bias statistics for individual cases are provided in Tables S7 and S8. The performance for the base O3 simulations prior to applying the bias adjustment is also provided for comparison.


https://gmd.copernicus.org/articles/17/8373/2024/gmd-17-8373-2024-f07

Figure 7. MDA8 O3 on the day of the fourth-highest base case MDA8 O3 from EQUATES CMAQ simulations. Results are shown for 12 km resolution (top and middle rows) and 108 km resolution (bottom row). O3 concentrations include total (base) O3 and O3 components from US anthropogenic, non-stratospheric US background, and stratospheric sources for 12 km. For both the 12 km and the 108 km simulations, base, US anthropogenic, and total US background O3 concentrations are also shown.

3.3 Inferred CTM biases

The coefficients from the regression models (Tables S9–S12) are applied to the gridded CTM data to calculate adjusted values of each O3 component. The inferred CMAQ bias for each component is the difference between the original CMAQ-simulated value and the adjusted value. The inferred bias in base O3 is the original CMAQ-simulated base O3 minus the sum of adjusted O3 components. For the PA simulations, there is a residual anthropogenic component of base O3 that is not apportioned to either US anthropogenic or international sources due to the effects of non-linear chemistry (Table S2). The residual anthropogenic component is equal to base O3 minus natural O3 minus international O3 minus US anthropogenic O3. This means that the sum of biases in the individual components does not add up to the bias in base O3 as the residual anthropogenic component was not included in the adjusted O3 results. In the PA simulations, base O3 is inferred to be biased high in most of the eastern US and in some parts of California and Arizona (Fig. 8). US anthropogenic O3 is inferred to be biased high in the same areas. Reducing the amount of US anthropogenic O3 improves the fit to base O3, which suggests that biases in the effects from US anthropogenic emissions contribute to the high biases inferred in base O3. The inferred high biases in base and US anthropogenic O3 increase with increasing coarseness of model resolution in the eastern US. Similarly, the high bias increases with coarser model resolution in the Canada+Mexico component along the border with Mexico. The inferred high biases in US anthropogenic O3 in the eastern US are primarily driven by biases in the summer and fall (Table S15, Figs. S7–S9). Inferred eastern US anthropogenic O3 biases average 2, 7, and 11 ppb in the summer and 3, 4, and 5 ppb in the fall for the 12, 36, and 108 km simulations. In the western US, where US anthropogenic O3 is mostly found to be biased low, coarser model resolution results in the summer average bias changing from slightly negative in the 12 km simulations (−0.5 ppb) to slightly positive in the 36 and 108 km simulations (+0.7 and +1.0 ppb).
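The bias bookkeeping described above can be sketched as follows. For simplicity the sketch assumes a single scalar correction factor per component, whereas the factors in this study vary in space and time; the function names and all numerical values are hypothetical.

```python
import numpy as np

def component_biases(components, alphas):
    """Inferred bias per component: original simulated value minus adjusted value."""
    return {name: components[name] - alphas[name] * components[name]
            for name in components}

def base_bias(base, components, alphas):
    """Inferred base bias: simulated base minus the sum of adjusted components."""
    adjusted_sum = sum(alphas[name] * components[name] for name in components)
    return base - adjusted_sum

def residual_anthropogenic(base, natural, international, us_anthropogenic):
    """Part of base O3 not apportioned to any component (non-linear chemistry)."""
    return base - natural - international - us_anthropogenic

# Hypothetical values at one grid cell and day (ppb).
base = 45.0
components = {"natural": 25.0, "international": 6.0, "us_anthropogenic": 12.0}
# Hypothetical correction factors (scalar here; an assumption of this sketch).
alphas = {"natural": 1.0, "international": 1.5, "us_anthropogenic": 0.75}

biases = component_biases(components, alphas)
total_component_bias = sum(biases.values())
inferred_base_bias = base_bias(base, components, alphas)
residual = residual_anthropogenic(base, **components)
```

In this example the component biases sum to zero while the inferred base bias is 2 ppb. The difference equals the residual anthropogenic component, which is why the component biases do not add up to the base bias for the PA simulations.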

https://gmd.copernicus.org/articles/17/8373/2024/gmd-17-8373-2024-f08

Figure 8. Annual average of inferred MDA8 O3 model bias from policy assessment CMAQ simulations. Results are shown for 12 km (top row), 36 km (middle row), and 108 km (bottom row) horizontal resolutions. O3 concentrations include total (base) O3 and O3 components from US anthropogenic, natural, long-range international, and Canada+Mexico sources. Seasonal averages are provided in Figs. S7–S9.

In contrast to our results showing an increase in O3 with coarser resolution, Schwantes et al. (2022) found that O3 tended to increase for a finer-resolution simulation (∼ 14 km vs. ∼ 111 km over the CONUS) during the summer over urban areas using the Community Earth System Model (CESM)/Community Atmosphere Model with full chemistry (CAM-chem), which was attributed to improvements in the spatial resolution of NOx emissions, resulting in less artificial dilution of NOx and enhanced O3 production. Similarly, Lin et al. (2024) found that a variable-resolution global model (AM4VR with a horizontal resolution of 13 km over CONUS) had increased O3 over urban areas compared to a fixed-resolution model (AM4.1 with a horizontal resolution of ∼ 100 km globally). In particular for the Los Angeles Basin and Central Valley regions of California, Lin et al. (2024) found that the increased resolution of AM4VR led to better simulation of observed O3 levels in these areas due to the finer-resolution model's ability to represent sharp spatial gradients in areas with NOx-limited vs. NOx-saturated O3 production regimes. Our analysis of the fourth-highest MDA8 O3 levels shows similar findings over California (Figs. 4 and 7). Given the previous results that found increased O3 with finer-resolution simulations, our results that found higher biases in US anthropogenic O3 in the eastern US with coarser resolution should be taken to apply specifically to the CMAQ model results described here rather than as a general finding on the impact of model resolution on O3 production. Additionally, given that the finding of higher US anthropogenic O3 with coarser model resolution does not hold for the analysis of the fourth-highest MDA8 O3 levels, this finding should be taken to apply only to longer-term (e.g., annual or seasonal) averages.

There are offsetting inferred biases in the long-range international and natural O3 components in much of the western US. The offsetting inferred biases may reflect an inability of the regression model to separate the signals from long-range international and stratospheric O3. Long-range international and stratospheric O3 levels are expected to impact sites at similar spatial and temporal scales, with larger impacts expected at high elevations in the western US during spring. Stratospheric O3 effects are not limited to episodic intrusion events but also come from constant entrainment of stratospheric air into the free troposphere. The impacts from long-range international emissions are primarily from long-range transport in the free troposphere, so stratospheric O3 and long-range international O3 are expected to be correlated. The regression model may be assigning bias due to stratospheric O3 to long-range international O3 because the CTM-modeled long-range international component has better correlation with the stratospheric O3 impact than the CTM-modeled natural component. This could result in the regression model adjusting long-range international O3 upwards (i.e., inferred negative bias) to add stratospheric O3. The natural O3 is then adjusted downwards (i.e., inferred positive bias) in the same locations because some of the effects of stratospheric O3 are captured in the CTM-modeled natural O3 component but need to be offset because of the O3 that was added to the long-range international component. This indicates a limitation of this method in that it is sensitive to correlation between modeled O3 components. Correlation of the O3 components is a major confounding issue in this analysis. In interpreting the results, it is necessary to consider both the inferred biases and the correlation of the components together.
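One way to check how strongly the modeled components are correlated before interpreting the fitted factors is the variance inflation factor (VIF). This diagnostic was not part of the analysis described here; the sketch below is illustrative only, uses synthetic series rather than CMAQ output, and assumes that two of the components share a common free-tropospheric signal.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return one VIF per column of X (rows are samples).

    The VIFs are the diagonal of the inverse of the predictor
    correlation matrix; values well above 1 flag collinearity.
    """
    corr = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    return np.diag(np.linalg.inv(corr))

# Synthetic, illustrative component series (not CMAQ output).
rng = np.random.default_rng(0)
n = 2000
free_troposphere = rng.normal(size=n)  # assumed shared transport signal
natural = free_troposphere + 0.3 * rng.normal(size=n)
long_range_intl = free_troposphere + 0.3 * rng.normal(size=n)
us_anthropogenic = rng.normal(size=n)  # independent of the other two

X = np.column_stack([natural, long_range_intl, us_anthropogenic])
vif = variance_inflation_factors(X)
```

In this synthetic case the two components that share a signal have much larger VIFs than the independent component, meaning a regression can trade adjustment between them with little change in the fit to total O3. That is the behavior suspected above for the long-range international and natural components.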

https://gmd.copernicus.org/articles/17/8373/2024/gmd-17-8373-2024-f09

Figure 9. Daily average of inferred MDA8 O3 model bias from policy assessment CMAQ simulations averaged across US model grid cells in the eastern and western US. A longitude of 97° W is used as the dividing line between east and west. PA O3 concentrations include total (base) O3 and O3 components from US anthropogenic, natural, long-range international, and Canada+Mexico sources. US background indicates the sum of biases for individual US background components.


In the temporal trends of inferred base O3 bias, the PA simulations show a consistent low bias in winter and spring and high bias in summer and fall, which is consistent across model resolution scales (Fig. 9). There is also a consistent high bias in US anthropogenic O3 in summer and fall in the eastern US, which increases with coarser model resolution. Inferred bias in US anthropogenic O3 in the western US has some small seasonal variability but is near zero on average. The seasonal patterns of long-range international O3 bias have the largest underestimate in the winter and spring and the smallest underestimate in late summer and early fall. The temporal trend of natural O3 differs in the 12 km simulation compared to the 36 and 108 km simulations. In the 12 km simulation, natural O3 biases are higher in the middle of the year than in the beginning and end of the year. In the 36 and 108 km simulations, the opposite is found. This change in sign is a result of changes in the spatial patterns of natural O3 inferred bias in different seasons. In the 12 km simulation, natural O3 is inferred to be biased low in the southern part of the US and biased high in the northern part of the US. In the 36 and 108 km simulations, natural O3 is inferred to be biased low in the eastern US and mostly biased high in the western US, particularly in the Mountain West region. These spatial changes in the seasonal average natural O3 bias are enough to change the sign of the US average temporal bias trend. As described before, the offsetting negative long-range international bias and positive natural O3 bias in the high-elevation areas of the western US are thought to be a result of the regression model allocating stratospheric O3 bias to the long-range international O3 signal while removing some stratospheric O3 from the natural O3 signal. Canada+Mexico O3 biases are very small when averaged across the US since this source primarily affects border areas and only has small impacts elsewhere.

https://gmd.copernicus.org/articles/17/8373/2024/gmd-17-8373-2024-f10

Figure 10. Annual average of inferred MDA8 O3 model bias from EQUATES CMAQ simulations. Results are shown for 12 km resolution (top and middle rows) and 108 km resolution (bottom row). O3 concentrations include total (base) O3 and O3 components from US anthropogenic, non-stratospheric US background, and stratospheric sources for 12 km. For both the 12 km and the 108 km simulations, base, US anthropogenic, and total US background O3 concentrations are also shown. Seasonal averages are provided in Figs. S10–S12.

The spatial results for the EQUATES 12 km simulations are shown for two O3 split cases. One case splits US background O3 into stratospheric and non-stratospheric sources, while the other considers all US background O3 together. Results show a mostly low bias inferred in base O3 throughout most of the US for the 12 km simulation (Fig. 10). For the 108 km H-CMAQ simulation, there is a high bias in the eastern US and a low bias in the western US for base O3. As with the PA results, there is a high bias in US anthropogenic O3 in the eastern US that increases with coarser model resolution. The inferred low bias in the stratospheric O3 component indicates that there is too little stratospheric O3 in the western US. There is an inferred high bias in stratospheric O3 in the eastern US. The stratospheric O3 results should be interpreted with some caution because the stratospheric component comes from a chemically inert tracer. The stratospheric O3 biases are partly offset by opposite biases in the non-stratospheric US background O3. The low biases in stratospheric O3 and the lack of low biases in the non-stratospheric US background O3 provide more evidence that the low biases in the long-range international O3 from the PA simulations are related to low biases in stratospheric O3.

In the case where US background O3 is not split into stratospheric and non-stratospheric components, the 12 and 108 km simulations both have low biases in US background O3, but the magnitude of the bias is greater in the 12 km simulation than in the 108 km simulation. This may be a result of differences in the impacts of stratospheric O3 at the surface level in the H-CMAQ simulation compared to the continental-scale simulation. Differences in the estimation of stratospheric O3 impacts may arise from differences in how the vertical structure of the model in the H-CMAQ simulations is configured compared to the continental simulations. The UTLS PV O3 scaling is turned on during the H-CMAQ simulation. For the continental simulation, PV O3 scaling is turned off because the continental model configuration uses fewer vertical layers and a coarser vertical resolution in the UTLS compared to the H-CMAQ simulations. The stratospheric O3 influences in the continental simulation are only influences that are inherited from the lateral boundary conditions. Previous work indicates that O3 in the upper layers of the continental-scale model is driven mostly by horizontal advection of the lateral boundary conditions (Hogrefe et al., 2018), meaning that if stratospheric intrusion events are captured by the hemispheric-scale simulation, the effects of these events are also expected to be captured by the continental-scale simulation. However, a sensitivity test with UTLS PV O3 scaling turned on during the continental simulation may be an area for future study. This would require the addition of more vertical layers with finer resolution in the UTLS in the continental simulation to support the PV O3 scaling parameterization. The differences in the vertical structure of the hemispheric and continental simulations can affect the vertical mixing of stratospheric O3 from the upper layers down to the surface, which may explain the differences in the inferred bias of US background O3. Alternatively, the differences in US background O3 biases could also occur due to differences in O3 production from local US background O3 sources across model resolution scales and may not necessarily be affected by differences in stratospheric O3.

https://gmd.copernicus.org/articles/17/8373/2024/gmd-17-8373-2024-f11

Figure 11. Daily average of inferred MDA8 O3 model bias from EQUATES CMAQ simulations averaged across US model grid cells in the eastern and western US. A longitude of 97° W is used as the dividing line between east and west. EQUATES O3 concentrations include base O3 and O3 components from US anthropogenic, non-stratospheric US background, and stratospheric sources for 12 km. For both the 12 km and the 108 km simulations, base, US anthropogenic, and total US background O3 concentrations are also shown. For the case with multiple US background O3 components, US background indicates the sum of biases for individual US background components.


For the EQUATES temporal results, base O3 is biased low in the spring and high in the summer in the eastern US (Fig. 11). In the western US, base O3 is biased low throughout most of the year. Averaged across the US, bias is near zero in the summer and fall in the 12 km simulation, with high biases in the 108 km simulation during the same period (+1 ppb in summer; +2 ppb in fall). The high biases in base O3 in the eastern US are mostly due to high biases in the US anthropogenic O3 component, which peak in the summer (average +1.4 and +6.0 ppb for the 12 and 108 km simulations) and continue to be biased high into the fall (average +0.8 and +2.2 ppb for the 12 and 108 km simulations). The stratospheric O3 component is inferred to be biased low, except in the summer and early fall. In the western US, stratospheric O3 bias is near zero in the summer and fall, while in the eastern US, stratospheric O3 is biased high in the summer and fall. The lowest biases in stratospheric O3 occur in the winter. The stratospheric O3 biases are partially offset by opposing biases in the non-stratospheric US background O3. The regression model formulation without the separate stratospheric O3 indicates that there is a low bias in US background O3 throughout most of the year in the 12 km simulation, which is at its lowest in the spring. The 108 km simulations show a low bias for US background O3 in the spring and summer and high bias in the fall and winter.

In the 12 km EQUATES simulations, the stratospheric O3 tracer averages 14 ppb in the western US during spring, with a maximum spring average across all western US grid cells of 17 ppb. Using the bias correction approach developed here, we find that the spring average stratospheric O3 in the western US is biased low by 3.5 ppb, resulting in an adjusted (i.e., bias-corrected) estimate of western US spring average stratospheric O3 of 17 ppb. Consistent with the low bias in stratospheric O3 suggested here, other CTMs have estimated higher stratospheric O3 contributions compared to those simulated here with CMAQ. The spring average of stratospheric O3 contributions estimated with the AM3 model has been estimated at 20–25 ppb (Lin et al., 2012a; Langford et al., 2015; Lin et al., 2015). The AM3 estimates of stratospheric O3 have sometimes been estimated to be biased high (Lin et al., 2012a) and have also been shown to lead to overestimated springtime O3 concentrations when used as boundary conditions for regional-scale CMAQ simulations (Hogrefe et al., 2018), but at other times they have been estimated to be relatively unbiased based on evaluation against observations from intensive field studies (Langford et al., 2015). The stratospheric O3 contribution simulated by AM3 has previously been found to be higher than that of the GEOS-Chem global model (Fiore et al., 2014). Using GEOS-Chem, Zhang et al. (2014) found the spring mean stratospheric O3 influence in the Intermountain West to range from 8–10 ppb, as estimated using the standard GEOS-Chem definition of stratospheric O3 as described in Zhang et al. (2011), and, alternatively, they found a spring mean of 12–18 ppb using a definition of stratospheric O3 adopted from Lin et al. (2012a) (the same method used for the AM3 estimates reported here). Itahashi et al. (2020) previously found that the stratospheric O3 representation in CMAQ was biased low in the free troposphere and suggested that improvements to the CMAQ representation of stratosphere to troposphere transport were needed. Our bias-adjusted estimate of western US spring mean stratospheric O3 (17 ppb) falls in between the estimates from the default GEOS-Chem representation (8–10 ppb) and from AM3 (20–25 ppb). As these are seasonal averages, the values are more representative of the continual entrainment of stratospheric air into the troposphere rather than episodic deep stratospheric intrusion events.

3.4 CTM biases by O3 concentration

The contributions and biases of different O3 components have so far been presented as annual or seasonal averages (Figs. 2–3, 5–6, 8, and 10), as the fourth-highest value that is relevant from a regulatory perspective (Figs. 4 and 7), or as daily averages over US model grid cells (Figs. 9 and 11). However, the relative contributions of O3 components at different total O3 concentrations are also of interest. For example, the relative contribution of US anthropogenic and US background O3 to total O3 may be different on days with higher total O3 vs. days with lower total O3. Situations where O3 exceeds the NAAQS, which is currently set at a level of 70 ppb, are of particular interest. We analyze the different O3 components at O3 monitoring sites for cases where O3 is less than 60 ppb, between 60 and 70 ppb (inclusive), and greater than 70 ppb. These concentration bins are selected because they reflect the current level of the standard (70 ppb) and a potential range that might be considered the level of the standard in the future (60–70 ppb). We compare the results of the analysis when using both simulated and observed O3 bins. Simulated O3 has a positive bias on average when simulated O3 is high and a negative bias on average when observed O3 is high, so selection bias influences these results. For this analysis, we consider the 12 km resolution simulations for the PA and EQUATES simulations. The resolution of 12 km is the resolution that is typical of simulations that support regulatory analyses. Monitoring sites are split into the western and eastern US using a longitude of 97° W as the dividing line. The division into the western and eastern US is made because there are differences in the contribution of US anthropogenic vs. background emissions between the two parts of the country.
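The concentration bins and the east–west split described above can be sketched as follows. The function names are hypothetical, and the assignment of a site located exactly at 97° W to the eastern US is an assumption of this sketch, since the text does not specify it.

```python
import numpy as np

def concentration_bin(mda8):
    """Label MDA8 O3 (ppb) as '<60', '60-70' (inclusive), or '>70'."""
    mda8 = np.asarray(mda8, dtype=float)
    return np.where(mda8 < 60.0, "<60",
                    np.where(mda8 <= 70.0, "60-70", ">70"))

def region(longitude_deg_east):
    """Split sites at 97 deg W (i.e., -97 deg E); the line itself counts as east."""
    lon = np.asarray(longitude_deg_east, dtype=float)
    return np.where(lon < -97.0, "west", "east")

# Illustrative values only.
bins = concentration_bin([59.9, 60.0, 70.0, 70.1])
regions = region([-120.0, -80.0])
```

Applying `concentration_bin` once to simulated O3 and once to observed O3 at the same sites and days gives the two binnings that are compared in the analysis.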

https://gmd.copernicus.org/articles/17/8373/2024/gmd-17-8373-2024-f12

Figure 12. Bias compared to MDA8 O3 observations of original simulations (black) and residual bias (purple) obtained as the difference between the adjusted MDA8 O3 and observations for the PA (top row) and EQUATES (bottom row) simulations. The horizontal line shows the median, the box shows the 25th–75th percentiles, and the whiskers show the 5th and 95th percentiles. The vertical grey lines separate the boxplots for each MDA8 O3 concentration bin. The numbers at the bottom of each panel are the number of data points falling within each concentration bin.


The impacts of the linear regression adjustment technique at the observation sites are examined by comparing the original simulated bias to the residual bias (i.e., the sum of the adjusted individual O3 components minus observed O3) (Fig. 12). The change in bias from the original to residual bias is the inferred bias that has been referenced elsewhere. In all cases when O3 is binned by simulated O3 levels, the adjustment brings the bias closer to zero. In the eastern US, high biases at higher simulated O3 levels were reduced for both the PA and the EQUATES simulations. In the western US, low biases when simulated O3 was below 60 ppb were brought closer to zero for both the PA and the EQUATES simulations. At higher simulated O3 levels, the PA simulations originally had high biases in the western US, which were reduced in the adjusted results, while the EQUATES simulations originally had low biases in the western US, which were improved in the adjusted results. The effects on bias when binning by observed O3 are mixed. In both the western and the eastern US for both the PA and the EQUATES simulations, the simulations were originally biased low at higher observed O3 levels, with the EQUATES simulations being more biased low than the PA simulations. The low bias is improved in the EQUATES simulations, but in the PA simulations the bias either stays about the same or becomes more negative. The inability of the adjustment to improve the bias across the range of both observed and simulated O3 levels is a limitation of this technique. The fitting of multi-axis (latitude, longitude, season) linear correction factors (αi) will be strongly influenced by the larger population of lower (O3 < 70 ppb) concentrations and will only correct the upper end if the bias structure is consistent across the concentration range.
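The relation between the original, residual, and inferred biases can be sketched as follows, together with the percentiles drawn in the boxplots. The function names and the three-sample arrays are hypothetical and serve only to make the definitions concrete.

```python
import numpy as np

def bias_terms(simulated, adjusted, observed):
    """Return original, residual, and inferred bias arrays (ppb)."""
    simulated, adjusted, observed = (
        np.asarray(a, dtype=float) for a in (simulated, adjusted, observed)
    )
    original = simulated - observed   # bias before adjustment
    residual = adjusted - observed    # bias remaining after adjustment
    inferred = original - residual    # identical to simulated - adjusted
    return original, residual, inferred

def boxplot_stats(values):
    """5th, 25th, 50th, 75th, and 95th percentiles of the given values."""
    return np.percentile(np.asarray(values, dtype=float), [5, 25, 50, 75, 95])

# Hypothetical MDA8 O3 values (ppb) at three site-days.
simulated = [72.0, 65.0, 58.0]
adjusted = [68.0, 63.0, 59.0]   # sum of adjusted components
observed = [66.0, 64.0, 60.0]
original, residual, inferred = bias_terms(simulated, adjusted, observed)
median_original = boxplot_stats(original)[2]
```

Because the inferred bias depends only on the simulated and adjusted values, it can be evaluated on the full model grid, whereas the original and residual biases are defined only where observations exist.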

https://gmd.copernicus.org/articles/17/8373/2024/gmd-17-8373-2024-f13

Figure 13. Contributions to MDA8 O3 from the PA simulation (top row) and inferred biases (bottom row) of US anthropogenic, natural, long-range international, and Canada+Mexico O3 separated by both observed and simulated base MDA8 O3 concentrations at O3 monitoring sites. The sum of natural, long-range international, and Canada+Mexico O3 is shown as the US background O3. The horizontal line shows the median, the box shows the 25th–75th percentiles, and the whiskers show the 5th and 95th percentiles. The vertical grey lines separate the boxplots for each MDA8 O3 concentration bin. The numbers in the bottom row of the panels are the number of data points falling within each concentration bin.

For the PA simulations, the contribution from US anthropogenic O3 tends to increase with higher simulated O3 and with higher observed O3 (Fig. 13), indicating that domestic anthropogenic pollution is driving the highest O3 concentrations. The contribution from US anthropogenic O3 is higher at eastern US sites than at western US sites due to higher anthropogenic precursor emissions in the east. There may also be impacts on US anthropogenic O3 in the eastern US from O3 or precursor pollutants transported from the western to eastern US. The median US anthropogenic O3 contribution is biased high (+1 ppb in the western US; +4 ppb in the eastern US) when base O3 is between 60 and 70 ppb with higher median biases (+2 ppb in the western US; +6 ppb in the eastern US) when base O3 exceeds 70 ppb. When observed O3 is between 60 and 70 ppb, the median US anthropogenic O3 contribution is biased slightly low in the western US (0.2 ppb) and biased high in the eastern US (+2 ppb). Bias is higher in the western US when observed O3 exceeds 70 ppb (+1 ppb) but is about the same in the eastern US (+2 ppb). Inferred biases of US anthropogenic O3 are higher across the range of simulated and observed O3 levels in the eastern US compared to the western US.

In the western US, natural O3 tends to be higher when either simulated or observed O3 is greater than 60 ppb; however, the distribution of natural O3 when O3 is above 70 ppb is similar to the distribution of natural O3 when O3 is between 60 and 70 ppb. In the eastern US, the distribution of natural O3 is similar across the range of simulated and observed O3 concentration bins but is slightly higher when O3 is greater than 60 ppb. Long-range international O3 makes a small contribution to O3 across concentration bins and tends to be lower as simulated or observed O3 increases. Canada+Mexico O3 is typically very small and only makes significant contributions at a few near-border sites (not shown). The natural and long-range international O3 components are biased slightly low at monitoring sites in the western US. For western US sites, the sum of the median biases in US anthropogenic and US background (i.e., natural + long-range international + Canada+Mexico) O3 at monitoring sites is negative across the simulated and observed O3 concentration bins but gets closer to zero at higher O3 levels. For eastern US sites, the bias in US anthropogenic O3 is inferred to be the main contributor to biases when simulated O3 concentrations exceed 60 ppb. When the O3 components are binned by observed O3 rather than simulated O3, the sum of the median biases in US anthropogenic and US background O3 at monitoring sites in the eastern US is negative across the range of observed O3, with US background O3 becoming less negatively biased as observed O3 increases and US anthropogenic O3 becoming more positively biased as observed O3 increases.

https://gmd.copernicus.org/articles/17/8373/2024/gmd-17-8373-2024-f14

Figure 14. Contributions to MDA8 O3 by the EQUATES simulation (top row) and inferred biases (bottom row) of US anthropogenic, non-stratospheric US background, and stratospheric sources separated by both observed and simulated base MDA8 O3 concentrations at O3 monitoring sites. The sum of non-stratospheric US background and stratospheric O3 is shown as the US background O3. The horizontal line shows the median, the box shows the 25th–75th percentiles, and the whiskers show the 5th and 95th percentiles. The vertical grey lines separate the boxplots for each MDA8 O3 concentration bin. The numbers in the bottom row of the panels are the number of data points falling within each concentration bin.

For the 12 km EQUATES simulations, the US anthropogenic O3 contribution is similar to the 12 km PA results across the simulated O3 concentration bins (Fig. 14). At higher observed O3, the EQUATES simulations generally simulate lower US anthropogenic O3 than the PA simulations. As in the PA simulations, the US anthropogenic O3 contribution increases with increasing simulated and observed O3, meaning that domestic anthropogenic emissions are mostly driving the highest O3 levels. There is an inferred negative bias in US anthropogenic O3 in the western US, which becomes increasingly negative as simulated or observed O3 increases. In the eastern US, there is an inferred positive bias in US anthropogenic O3, which becomes larger at higher simulated O3 concentrations (median bias of +0.05, +2, and +4 ppb at < 60, 60–70, and > 70 ppb simulated O3). There is also an inferred high bias across the range of observed O3; however, the magnitude is smaller, and the bias does not increase much at higher levels of observed O3 (median bias of +0.05, +0.5, and +0.6 ppb at < 60, 60–70, and > 70 ppb observed O3).

The contribution from stratospheric O3 is higher in the western US than in the eastern US across simulated and observed O3 concentrations. In the western US, stratospheric O3 tends to be higher when either observed or simulated O3 is above 60 ppb. In the eastern US, stratospheric O3 is at similar levels across the range of simulated and observed O3. In the western US, stratospheric O3 has a negative bias, which gets closer to zero when simulated and observed O3 levels are above 60 ppb. In the eastern US, stratospheric O3 has a positive bias, which gets higher when simulated and observed O3 levels are above 60 ppb. In both the western and the eastern US, non-stratospheric US background O3 makes similar contributions across different O3 concentrations. In the western US, non-stratospheric US background O3 has a negative bias when simulated or observed O3 is below 60 ppb and a positive bias when O3 is above 60 ppb. In the eastern US, non-stratospheric US background O3 has a negative bias across the range of simulated and observed O3. The magnitude of the negative bias is smaller when simulated or observed O3 is below 60 ppb than when O3 is above 60 ppb.

Binning the O3 contributions and inferred biases by observed and simulated O3 results in different numbers of data points in each sample. In the western US, there were 4145 instances when observed O3 exceeded 70 ppb, while there were 3302 (PA) and 627 (EQUATES) instances when simulated O3 exceeded 70 ppb at a monitoring site, with a large fraction of the observed and simulated exceedances occurring in California. In the eastern US there were 2135 instances when observed O3 exceeded 70 ppb and 2901 (PA) and 556 (EQUATES) instances when simulated O3 exceeded 70 ppb. The PA simulations more accurately simulated the number of exceedances compared to EQUATES, though this does not consider the timing or location of exceedances. Given the different number of samples in the observed vs. simulated bins and the lower number of data points for EQUATES-simulated O3 exceeding 70 ppb, it is possible that the population of data points when simulated O3 exceeds 70 ppb is not spatially representative of the population when observed O3 exceeds 70 ppb.
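The exceedance tallies discussed here amount to counting site-days above the 70 ppb level separately for observations and simulations. The sketch below is illustrative only: the site names, the paired values, and the helper `exceedance_counts` are hypothetical and are not taken from the study data.

```python
THRESHOLD_PPB = 70.0

# Hypothetical (observed, simulated) MDA8 O3 pairs in ppb for each site-day.
site_days = {
    "site_A": [(72.0, 68.0), (75.0, 71.0), (65.0, 73.0)],
    "site_B": [(55.0, 58.0), (71.0, 66.0)],
    "site_C": [(60.0, 62.0), (64.0, 69.0)],
}


def exceedance_counts(site_days, threshold=THRESHOLD_PPB):
    """Per site: (observed exceedances, simulated exceedances, days where both exceed)."""
    counts = {}
    for site, days in site_days.items():
        n_obs = sum(1 for obs, sim in days if obs > threshold)
        n_sim = sum(1 for obs, sim in days if sim > threshold)
        n_both = sum(1 for obs, sim in days if obs > threshold and sim > threshold)
        counts[site] = (n_obs, n_sim, n_both)
    return counts


counts = exceedance_counts(site_days)
total_obs = sum(c[0] for c in counts.values())
total_sim = sum(c[1] for c in counts.values())
# Sites with at least one observed exceedance (the kind of subset a site map would show).
sites_with_obs = [site for site, c in counts.items() if c[0] > 0]
print(counts, total_obs, total_sim, sites_with_obs)
```

In this toy case `site_A` has the same number of observed and simulated exceedances, yet only one day on which both exceed the threshold. Matching totals therefore say little about whether the timing or location of exceedances is captured.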

https://gmd.copernicus.org/articles/17/8373/2024/gmd-17-8373-2024-f15

Figure 15. Spatial distribution of the number of times MDA8 O3 exceeded 70 ppb for observed and simulated O3. The circles show the locations of sites, and the color indicates the number of times MDA8 O3 exceeds 70 ppb at each site for the observations (a), PA 12 km simulation (b), and EQUATES 12 km simulation (c). Only sites with at least one exceedance are shown. The dotted black line shows the longitude of 97° W, which is used to divide west and east. Similar results for other model resolutions are shown in Fig. S13.

For the western US, the PA simulations largely capture the spatial distribution of exceedances seen in the observations, although the number of exceedances is underestimated (Fig. 15). The exceedances from the EQUATES simulations are not very representative of the spatial distribution of observed exceedances in the western US as there are very few sites with more than one or two exceedances outside of California. In particular, the numbers of exceedances in the Denver, Colorado; Phoenix, Arizona; Las Vegas, Nevada; and Boise, Idaho areas are underestimated in EQUATES relative to both the PA simulations and the observations. Both the PA and the EQUATES simulations underestimate the number of exceedances in the state of Utah. For the eastern US, the PA simulations generally capture the spatial distribution of observed exceedances but simulate too many exceedances. This is particularly notable in the northeastern US and along the Gulf Coast. The EQUATES simulations underestimate the number of exceedances, although the spatial distribution is generally similar to the observations. The degree of spatial representativeness provides additional context for interpreting the findings for the O3 component contributions and biases binned by O3 levels. For the western US, the findings for instances when O3 exceeds 70 ppb are not more broadly applicable to the western US. There are a limited number of instances when O3 exceeds 70 ppb in the western US outside of California. These results are mostly indicative of conditions in the Los Angeles area and in the Central Valley in California. This applies especially to the EQUATES results, but it is also the case for the PA simulations and the observations. For the eastern US, on the other hand, there is enough spatial variability in the observations and in both sets of simulations to interpret the findings for the eastern US more generally. 
These results are informative in an average sense but are not expected to hold in all cases when applied to specific monitoring sites or to specific days (e.g., fourth-highest O3). The biases for bins of 60–70 ppb and greater than 70 ppb should be interpreted with caution because the inferred biases apply the mean tendency to these high concentration subpopulations.

4 Conclusions

In this work, we use two sets of CMAQ simulations to analyze the contributions to US background O3 from different sources. Naturally occurring sources, long-range international anthropogenic pollution, and short-range international anthropogenic pollution from Canada and Mexico are considered separately for one set of simulations. In the other set of simulations, stratospheric and non-stratospheric sources of US background O3 are also considered separately. We also consider the contribution to total O3 from US domestic anthropogenic sources. The measurement–model data fusion approach for apportioning bias to US anthropogenic and US background O3 components from our previous study (Skipper et al., 2021) was extended to identify biases in separate US background O3 components. The results generally confirm previous high-level results but provide new insights from additional components and more detailed analysis.

Results indicated that US anthropogenic O3 was consistently inferred to be biased high (on an annual and seasonal average basis) in the eastern US, where domestic anthropogenic emissions are the dominant contributor to total O3, with increasingly higher biases with coarser model resolution and at higher simulated O3 concentrations. This is consistent with our previous findings. This does not necessarily imply that the trend of decreasing biases with finer resolutions would continue at resolutions finer than 12 km, as we have not tested this approach at those resolutions. As noted in Sect. 3.3, previous modeling studies examining the effects of horizontal resolution have found that O3 increased over urban areas with finer resolution, so the findings for the effects of model resolution should be taken to apply to our current results rather than as a general finding on the impacts of model resolution. Our finding that US anthropogenic O3 biases increase with higher O3 does not hold when O3 is binned by observed rather than simulated concentrations. There is much less variation in the US anthropogenic O3 bias across the range of observed O3 than for simulated O3. Although the choice of binning O3 by observed or simulated levels changes the sample of data, the results for the eastern US are generalizable to this part of the country because the samples have consistent spatial representation across the eastern US. In the western US, US anthropogenic O3 was inferred to be biased high at higher O3 levels for the PA simulations and biased low at higher O3 levels for the EQUATES simulations. These differences are explained by the use of different emission inventories in the two sets of simulations. Regardless, the findings for inferred O3 biases at higher O3 levels in the western US are not broadly applicable to the entire western US because the sample that these findings are based on is dominated by sites in California.
There are relatively few sites in other states in the western US that contribute to this sample, so the results are not likely to be indicative of conditions in other parts of the western US.

The correction of US background components provided results that are consistent with previous studies but with more detail. As in Skipper et al. (2021) and Hosseinpour et al. (2024), simulated US background O3 was inferred to be biased slightly low overall. The original simulated annual averages of US background O3 across all the PA and EQUATES modeling configurations considered here ranged from 30 to 33 ppb, while the adjusted annual average US background O3 ranged from 31 to 34 ppb. The annual average of simulated US background O3 for the hemispheric-scale (108 km resolution) and continental-scale (12 km resolution) modeling was slightly higher for the EQUATES simulations (32–33 ppb) than for the PA simulations (30–31 ppb). The differences are not explainable by the updated chemical mechanism used in EQUATES because the most relevant updates (halogen-mediated O3 loss) tend to reduce O3 at the northern mid-latitudes (Sarwar et al., 2019; Appel et al., 2021). The difference is also not likely due to anthropogenic emissions outside of the US, which are similar between the two sets of simulations. Therefore, the higher US background O3 in EQUATES likely relates to differences in the natural emissions. The EQUATES simulations used MEGAN for biogenic emissions throughout the entire Northern Hemisphere, while the PA simulations used BEIS for biogenic emissions in North America and MEGAN elsewhere. The two hemispheric model configurations also used different sources for soil NOx emissions (see Sect. 2.1), which could contribute to differences in US background O3. Lightning NOx emissions were the same in EQUATES and PA hemispheric-scale simulations, but the continental-scale PA simulations did not include lightning in the continental domain.
Given that US background O3 levels in both the EQUATES and the PA 12 km continental-scale simulations are 1 ppb lower than their northern hemispheric counterparts, the differences in US background O3 in the continental-scale simulations are more likely driven by the large-scale background inherited through the lateral boundary conditions than by differences in lightning NOx configurations.

This work separated US background O3 into natural, short-range international, and long-range international components, and each had distinct seasonality in its inferred bias. Short-range international (Canada+Mexico) O3 was marginally biased high in spring and winter and marginally biased low in summer. Natural and long-range international O3 have larger seasonality, and their seasonal cycles are slightly out of phase. Natural O3 bias was low in winter but high in summer, peaking in July. Long-range international O3 was consistently biased low with a minimum in April and a maximum (near unbiased) in August–September. From May to October, the natural and long-range international O3 biases largely offset each other, while they reinforced each other in other parts of the year.

The seasonality of inferred long-range international bias highlights a key uncertainty in correlative bias attribution. The biases associated with long-range international O3 may be misattributed due to the difficulty of the regression model formulation in isolating stratospheric influences from other natural sources such as lightning and soil NOx, wildfires, and biogenic VOC emissions, all of which have a high degree of uncertainty. Stratospheric O3 is expected to have similar temporal and spatial patterns to long-range international O3, with contributions being higher in spring and at high elevations. It is suspected that the regression model formulation may be assigning a negative bias in long-range international O3 to make up for missing stratospheric O3 that has a similar pattern to long-range international O3 while at the same time assigning a high bias to natural O3 to reallocate some of stratospheric O3 that is present in natural O3 to long-range international O3 instead. Results for the stratospheric O3 tracer in the second set of simulations support the idea that there is missing stratospheric O3 at the surface level in the western US as the stratospheric O3 is inferred to be biased low. Taken together, there is an overall low bias in the simulated US background O3 that is most pronounced in the spring. This may be a result of too little stratospheric O3 reaching the surface. Photolysis of particulate nitrate over oceans has been found to increase O3 (Shah et al., 2023; Sarwar et al., 2024). This process is not included in the chemical mechanism, which could contribute to low biases in O3 during the same time of the year. The potential for misattribution is not specific to the methods employed here but is inherent to correlative bias approaches with incomplete information contained in independent variables.
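The misattribution mechanism described above can be reproduced with a deliberately simple synthetic case. The sketch below is not the regression used in this work: the monthly values, the variable names, and the helper `fit_two_factors` are hypothetical, the natural component is treated as unbiased for simplicity, and the missing stratospheric term is constructed to follow the long-range international pattern exactly, which is stronger than the correlation expected in practice.

```python
# Hypothetical monthly-mean components (ppb) at a high-elevation western site.
lri = [8.0, 6.0, 4.0, 2.0, 4.0, 6.0]          # simulated long-range international O3
nat = [10.0, 14.0, 20.0, 24.0, 18.0, 12.0]    # simulated natural O3
# Stratospheric O3 absent from the simulation; by construction it tracks the LRI pattern.
missing_strat = [0.5 * x for x in lri]
# Synthetic "observations": both simulated components are unbiased; only the
# missing stratospheric term separates simulated from observed O3.
obs = [a + b + c for a, b, c in zip(lri, nat, missing_strat)]


def fit_two_factors(x1, x2, y):
    """Least-squares factors (a1, a2) for y ~ a1*x1 + a2*x2 via the 2x2 normal equations."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return (t1 * s22 - t2 * s12) / det, (s11 * t2 - s12 * t1) / det


alpha_lri, alpha_nat = fit_two_factors(lri, nat, obs)
# Inferred bias = simulated minus adjusted component; negative means "biased low".
inferred_lri_bias = [x - alpha_lri * x for x in lri]
print(alpha_lri, alpha_nat, inferred_lri_bias)
```

Because the regression only sees the two simulated components, it scales up long-range international O3 to absorb the missing term and reports that component as biased low, even though it was constructed to be unbiased.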

Analyses of the original bias and residual bias emphasize the importance of subpopulation diversity. The correction factors are optimized for the whole population and can degrade performance at any subpopulation (e.g., a site, a day, or a subgroup). For example, in the western US, the PA simulation was originally biased high for days with high predictions and biased low for days with high observations (> 70 ppb). The overall correction was downwards for both populations because they are generally consistent spatially and seasonally. This means that the corrected model has more bias on days with high observations in the western US than the uncorrected model. This is not unexpected but highlights that correlative adjustments should be considered to be broad conclusions and should only be applied cautiously to narrower circumstances (e.g., to specific monitors or days). This is a limitation of the linear formulation, as noted by Hosseinpour et al. (2024).
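A small synthetic example shows how a factor optimized for the whole population can worsen a subpopulation. This is a sketch under stated assumptions rather than the fitting procedure of this study: the five site-days are invented, a single component with one scalar factor `alpha` is fit by least squares through the origin, and the helper `rmse` is hypothetical.

```python
from math import sqrt

# Hypothetical simulated and observed MDA8 O3 (ppb). The model is biased high on
# the first four days and strongly biased low on the one high-observation day.
sim = [50.0, 55.0, 60.0, 65.0, 60.0]
obs = [45.0, 50.0, 54.0, 58.0, 75.0]

# Single least-squares factor through the origin, fit to all days at once.
alpha = sum(s * o for s, o in zip(sim, obs)) / sum(s * s for s in sim)
adjusted = [alpha * s for s in sim]


def rmse(pred, target):
    """Root-mean-square error between two equal-length sequences."""
    return sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))


# Subpopulation: days with observed O3 above 70 ppb.
high = [i for i, o in enumerate(obs) if o > 70.0]
orig_high_bias = sum(sim[i] - obs[i] for i in high) / len(high)
resid_high_bias = sum(adjusted[i] - obs[i] for i in high) / len(high)

print(alpha, rmse(sim, obs), rmse(adjusted, obs), orig_high_bias, resid_high_bias)
```

The fitted factor is below 1 because most days are biased high, so the overall error decreases while the high-observation day, already biased low, is pushed further from the observation.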

This work only focused on surface O3. We are not able to draw a conclusion as to whether the potential lack of stratospheric O3 is a result of biases in the UTLS PV scaling in the upper layers or errors in vertical transport from the upper layers to the surface. More detailed studies that analyze the entire vertical structure, such as a recent study of CMAQ stratospheric O3 by Itahashi et al. (2020), are needed to identify the exact causes of and solutions for the surface biases identified here. Another potential area for future work is to separate stratospheric O3 from natural sources in sets of simulations like those conducted for the O3 policy assessment. This might solve the suspected issue of bias in stratospheric O3 being allocated to long-range international emissions that may be caused by the correlation of stratospheric O3 and long-range international impacts. While details on the spatial and temporal characteristics of biases in different O3 components are provided here, the correlational bias attribution method employed here does not necessarily identify the specific factors that drive the biases. These results provide estimates of potential biases in US background and US anthropogenic O3 that can inform more targeted future work examining the individual sources in greater detail. Additional future work could take a process-oriented approach rather than the source-oriented approach described here. A process-oriented approach would focus on how different physical and chemical processes (deposition, transport, photochemical activity, etc.) relate to biases in O3 simulations. The role of uncertainties in O3 deposition and in O3 production efficiency across various chemical regimes could be examined in a more process-focused analysis. A further area for future work is to apply the data fusion bias correction method to an ensemble of US background O3 estimates from different models. This work only used the CMAQ model. 
A test of the method would be to apply it to several different models to determine whether it is able to reduce the uncertainty in US background O3 estimates while also reducing bias in total O3.

Code and data availability

The CMAQ source code is available from GitHub (https://github.com/USEPA/CMAQ, last access: 11 February 2024) and Zenodo (https://doi.org/10.5281/zenodo.1079878, US EPA Office of Research and Development, 2024). O3 observational data are available via the AQS website (https://aqs.epa.gov/aqsweb/airdata/download_files.html, US EPA, 2024).

Supplement

The supplement related to this article is available online at: https://doi.org/10.5194/gmd-17-8373-2024-supplement.

Author contributions

TNS: conceptualization, investigation, methodology, software, visualization, writing – original draft. CH: data curation, software, writing – review and editing. BHH: data curation, software, writing – review and editing. RM: software, writing – review and editing. KMF: data curation, software, writing – review and editing. AGR: conceptualization, methodology, resources, supervision, writing – review and editing.

Competing interests

The contact author has declared that none of the authors has any competing interests.

Disclaimer

Funding organizations have not dictated the topic or content of this work nor have they had any editorial role. The views expressed in this paper are those of the authors and do not necessarily represent the view or policies of the U.S. Environmental Protection Agency, the Phillips 66 Company, or NASA.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.

Acknowledgements

T. Nash Skipper and Armistead G. Russell received funding from the Phillips 66 Company. Armistead G. Russell also received funding from NASA HAQAST. We thank Benjamin Murphy and Sergey Napelenok for their comments on a draft version of the paper.

Financial support

This research has been supported by the National Aeronautics and Space Administration (grant no. 80NSSC21K0506) and the Phillips 66 Company.

Review statement

This paper was edited by Jason Williams and reviewed by four anonymous referees.

References

Appel, K. W., Bash, J. O., Fahey, K. M., Foley, K. M., Gilliam, R. C., Hogrefe, C., Hutzell, W. T., Kang, D., Mathur, R., Murphy, B. N., Napelenok, S. L., Nolte, C. G., Pleim, J. E., Pouliot, G. A., Pye, H. O. T., Ran, L., Roselle, S. J., Sarwar, G., Schwede, D. B., Sidi, F. I., Spero, T. L., and Wong, D. C.: The Community Multiscale Air Quality (CMAQ) model versions 5.3 and 5.3.1: system updates and evaluation, Geosci. Model Dev., 14, 2867–2897, https://doi.org/10.5194/gmd-14-2867-2021, 2021. 

Bash, J. O., Baker, K. R., and Beaver, M. R.: Evaluation of improved land use and canopy representation in BEIS v3.61 with biogenic VOC measurements in California, Geosci. Model Dev., 9, 2191–2207, https://doi.org/10.5194/gmd-9-2191-2016, 2016. 

CAMS: Soil N emissions for 2000–present, D81.3.6.1, CAMS. https://atmosphere.copernicus.eu/sites/default/files/2019-11/25_CAMS81_2017SC1_D81.3.6.1-201810_APPROVED_Ver1.pdf (last access: 16 June 2024), 2018. 

Dentener, F., Keating, T., and Akimoto, H. (Eds.): Hemispheric Transport of Air Pollution 2010, Part A: Ozone and Particulate Matter. Task Force on Hemispheric Transport of Air Pollution, Air Pollution Studies, No. 17, United Nations Economic Commission for Europe, Geneva, https://doi.org/10.18356/2c908168-en, 2010. 

Dolwick, P., Akhtar, F., Baker, K. R., Possiel, N., Simon, H., and Tonnesen, G.: Comparison of background ozone estimates over the western United States based on two separate model methodologies, Atmos. Environ., 109, 282–296, https://doi.org/10.1016/j.atmosenv.2015.01.005, 2015. 

Fiore, A., Jacob, D. J., Liu, H., Yantosca, R. M., Fairlie, T. D., and Li, Q.: Variability in surface ozone background over the United States: Implications for air quality policy, J. Geophys. Res.-Atmos., 108, D244787, https://doi.org/10.1029/2003jd003855, 2003.

Fiore, A. M., Oberman, J. T., Lin, M. Y., Zhang, L., Clifton, O. E., Jacob, D. J., Naik, V., Horowitz, L. W., Pinto, J. P., and Milly, G. P.: Estimating North American background ozone in U.S. surface air with two independent global models: Variability, uncertainties, and recommendations, Atmos. Environ., 96, 284–300, https://doi.org/10.1016/j.atmosenv.2014.07.045, 2014. 

Foley, K., Pouliot, G., Eyth, A., Possiel, N., Aldridge, M., Allen, C., Appel, W., Bash, J., Beardsley, M., Beidler, J., Choi, D., Eder, B., Farkas, C., Gilliam, R., Godfrey, J., Henderson, B., Hogrefe, C., Koplitz, S., Mason, R., Mathur, R., Misenis, C., Pye, H., Reynolds, L., Roark, M., Roberts, S., Schwede, D., Seltzer, K., Sonntag, D., Talgo, K., Toro, C., and Vukovich, J.: EQUATES: EPA's Air QUAlity TimE Series Project, 19th Annual CMAS Conference 2020, virtual, 26–30 October 2020, https://www.cmascenter.org/conference/2020/slides/KFoley_EQUATES_CMAS_2020.pdf (last access: 16 November 2024), 2020. 

Foley, K. M., Pouliot, G. A., Eyth, A., Aldridge, M. F., Allen, C., Appel, K. W., Bash, J. O., Beardsley, M., Beidler, J., Choi, D., Farkas, C., Gilliam, R. C., Godfrey, J., Henderson, B. H., Hogrefe, C., Koplitz, S. N., Mason, R., Mathur, R., Misenis, C., Possiel, N., Pye, H. O. T., Reynolds, L., Roark, M., Roberts, S., Schwede, D. B., Seltzer, K. M., Sonntag, D., Talgo, K., Toro, C., Vukovich, J., Xing, J., and Adams, E.: 2002–2017 anthropogenic emissions data for air quality modeling over the United States, Data in Brief, 47, 109022, https://doi.org/10.1016/j.dib.2023.109022, 2023. 

Guenther, A. B., Jiang, X., Heald, C. L., Sakulyanontvittaya, T., Duhl, T., Emmons, L. K., and Wang, X.: The Model of Emissions of Gases and Aerosols from Nature version 2.1 (MEGAN2.1): an extended and updated framework for modeling biogenic emissions, Geosci. Model Dev., 5, 1471–1492, https://doi.org/10.5194/gmd-5-1471-2012, 2012. 

Guo, J. J., Fiore, A. M., Murray, L. T., Jaffe, D. A., Schnell, J. L., Moore, C. T., and Milly, G. P.: Average versus high surface ozone levels over the continental USA: model bias, background influences, and interannual variability, Atmos. Chem. Phys., 18, 12123–12140, https://doi.org/10.5194/acp-18-12123-2018, 2018. 

Hoesly, R. M., Smith, S. J., Feng, L., Klimont, Z., Janssens-Maenhout, G., Pitkanen, T., Seibert, J. J., Vu, L., Andres, R. J., Bolt, R. M., Bond, T. C., Dawidowski, L., Kholod, N., Kurokawa, J.-I., Li, M., Liu, L., Lu, Z., Moura, M. C. P., O'Rourke, P. R., and Zhang, Q.: Historical (1750–2014) anthropogenic emissions of reactive gases and aerosols from the Community Emissions Data System (CEDS), Geosci. Model Dev., 11, 369–408, https://doi.org/10.5194/gmd-11-369-2018, 2018. 

Hogrefe, C., Liu, P., Pouliot, G., Mathur, R., Roselle, S., Flemming, J., Lin, M., and Park, R. J.: Impacts of different characterizations of large-scale background on simulated regional-scale ozone over the continental United States, Atmos. Chem. Phys., 18, 3839–3864, https://doi.org/10.5194/acp-18-3839-2018, 2018. 

Hosseinpour, F., Kumar, N., Tran, T., and Knipping, E.: Using machine learning to improve the estimate of U.S. background ozone, Atmos. Environ., 316, 120145, https://doi.org/10.1016/j.atmosenv.2023.120145, 2024. 

Huang, M., Bowman, K. W., Carmichael, G. R., Lee, M., Chai, T., Spak, S. N., Henze, D. K., Darmenov, A. S., and da Silva, A. M.: Improved western U.S. background ozone estimates via constraining nonlocal and local source contributions using Aura TES and OMI observations, J. Geophys. Res.-Atmos., 120, 3572–3592, https://doi.org/10.1002/2014jd022993, 2015.

Itahashi, S., Mathur, R., Hogrefe, C., and Zhang, Y.: Modeling stratospheric intrusion and trans-Pacific transport on tropospheric ozone using hemispheric CMAQ during April 2010 – Part 1: Model evaluation and air mass characterization for stratosphere–troposphere transport, Atmos. Chem. Phys., 20, 3373–3396, https://doi.org/10.5194/acp-20-3373-2020, 2020. 

Jaffe, D. A., Wigder, N., Downey, N., Pfister, G., Boynard, A., and Reid, S. B.: Impact of Wildfires on Ozone Exceptional Events in the Western U.S, Environ. Sci. Technol., 47, 11065–11072, https://doi.org/10.1021/es402164f, 2013. 

Jaffe, D. A., Cooper, O. R., Fiore, A. M., Henderson, B. H., Tonnesen, G. S., Russell, A. G., Henze, D. K., Langford, A. O., Lin, M. Y., and Moore, T.: Scientific assessment of background ozone over the US: Implications for air quality management, Elem. Sci. Anth., 6, 30, https://doi.org/10.1525/elementa.309, 2018. 

Janssens-Maenhout, G., Crippa, M., Guizzardi, D., Dentener, F., Muntean, M., Pouliot, G., Keating, T., Zhang, Q., Kurokawa, J., Wankmüller, R., Denier van der Gon, H., Kuenen, J. J. P., Klimont, Z., Frost, G., Darras, S., Koffi, B., and Li, M.: HTAP_v2.2: a mosaic of regional and global emission grid maps for 2008 and 2010 to study hemispheric transport of air pollution, Atmos. Chem. Phys., 15, 11411–11432, https://doi.org/10.5194/acp-15-11411-2015, 2015. 

Kang, D., Pickering, K. E., Allen, D. J., Foley, K. M., Wong, D. C., Mathur, R., and Roselle, S. J.: Simulating lightning NO production in CMAQv5.2: evolution of scientific updates, Geosci. Model Dev., 12, 3071–3083, https://doi.org/10.5194/gmd-12-3071-2019, 2019. 

Langford, A. O., Senff, C. J., Alvarez, R. J., Brioude, J., Cooper, O. R., Holloway, J. S., Lin, M. Y., Marchbanks, R. D., Pierce, R. B., Sandberg, S. P., Weickmann, A. M., and Williams, E. J.: An overview of the 2013 Las Vegas Ozone Study (LVOS): Impact of stratospheric intrusions and long-range transport on surface air quality, Atmos. Environ., 109, 305–322, https://doi.org/10.1016/j.atmosenv.2014.08.040, 2015. 

Langford, A. O., Senff, C. J., Alvarez II, R. J., Aikin, K. C., Ahmadov, R., Angevine, W. M., Baidar, S., Brewer, W. A., Brown, S. S., James, E. P., McCarty, B. J., Sandberg, S. P., and Zucker, M. L.: Were Wildfires Responsible for the Unusually High Surface Ozone in Colorado During 2021?, J. Geophys. Res.-Atmos., 128, e2022JD037700, https://doi.org/10.1029/2022JD037700, 2023. 

Lin, M., Fiore, A. M., Cooper, O. R., Horowitz, L. W., Langford, A. O., Levy II, H., Johnson, B. J., Naik, V., Oltmans, S. J., and Senff, C. J.: Springtime high surface ozone events over the western United States: Quantifying the role of stratospheric intrusions, J. Geophys. Res.-Atmos., 117, D00V22, https://doi.org/10.1029/2012jd018151, 2012a.

Lin, M., Fiore, A. M., Horowitz, L. W., Cooper, O. R., Naik, V., Holloway, J., Johnson, B. J., Middlebrook, A. M., Oltmans, S. J., Pollack, I. B., Ryerson, T. B., Warner, J. X., Wiedinmyer, C., Wilson, J., and Wyman, B.: Transport of Asian ozone pollution into surface air over the western United States in spring, J. Geophys. Res.-Atmos., 117, D00V07, https://doi.org/10.1029/2011jd016961, 2012b. 

Lin, M., Fiore, A. M., Horowitz, L. W., Langford, A. O., Oltmans, S. J., Tarasick, D., and Rieder, H. E.: Climate variability modulates western US ozone air quality in spring via deep stratospheric intrusions, Nat. Commun., 6, 7105, https://doi.org/10.1038/ncomms8105, 2015. 

Lin, M., Horowitz, L. W., Payton, R., Fiore, A. M., and Tonnesen, G.: US surface ozone trends and extremes from 1980 to 2014: quantifying the roles of rising Asian emissions, domestic controls, wildfires, and climate, Atmos. Chem. Phys., 17, 2943–2970, https://doi.org/10.5194/acp-17-2943-2017, 2017. 

Lin, M., Horowitz, L. W., Zhao, M., Harris, L., Ginoux, P., Dunne, J., Malyshev, S., Shevliakova, E., Ahsan, H., Garner, S., Paulot, F., Pouyaei, A., Smith, S. J., Xie, Y., Zadeh, N., and Zhou, L.: The GFDL Variable-Resolution Global Chemistry-Climate Model for Research at the Nexus of US Climate and Air Quality Extremes, J. Adv. Model. Earth Sy., 16, e2023MS003984, https://doi.org/10.1029/2023MS003984, 2024. 

Liu, S. C., Trainer, M., Fehsenfeld, F. C., Parrish, D. D., Williams, E. J., Fahey, D. W., Hübler, G., and Murphy, P. C.: Ozone production in the rural troposphere and the implications for regional and global ozone distributions, J. Geophys. Res.-Atmos., 92, 4191–4207, https://doi.org/10.1029/JD092iD04p04191, 1987. 

Mathur, R., Xing, J., Gilliam, R., Sarwar, G., Hogrefe, C., Pleim, J., Pouliot, G., Roselle, S., Spero, T. L., Wong, D. C., and Young, J.: Extending the Community Multiscale Air Quality (CMAQ) modeling system to hemispheric scales: overview of process considerations and initial applications, Atmos. Chem. Phys., 17, 12449–12474, https://doi.org/10.5194/acp-17-12449-2017, 2017. 

Mathur, R., Kang, D., Napelenok, S. L., Xing, J., Hogrefe, C., Sarwar, G., Itahashi, S., and Henderson, B. H.: How Have Divergent Global Emission Trends Influenced Long-Range Transported Ozone to North America?, J. Geophys. Res.-Atmos., 127, e2022JD036926, https://doi.org/10.1029/2022JD036926, 2022. 

McDonald-Buller, E. C., Allen, D. T., Brown, N., Jacob, D. J., Jaffe, D., Kolb, C. E., Lefohn, A. S., Oltmans, S., Parrish, D. D., Yarwood, G., and Zhang, L.: Establishing Policy Relevant Background (PRB) Ozone Concentrations in the United States, Environ. Sci. Technol., 45, 9484–9497, https://doi.org/10.1021/es2022818, 2011. 

Murphy, B. N., Woody, M. C., Jimenez, J. L., Carlton, A. M. G., Hayes, P. L., Liu, S., Ng, N. L., Russell, L. M., Setyan, A., Xu, L., Young, J., Zaveri, R. A., Zhang, Q., and Pye, H. O. T.: Semivolatile POA and parameterized total combustion SOA in CMAQv5.2: impacts on source strength and partitioning, Atmos. Chem. Phys., 17, 11107–11133, https://doi.org/10.5194/acp-17-11107-2017, 2017. 

Price, C., Penner, J., and Prather, M.: NOx from lightning: 1. Global distribution based on lightning physics, J. Geophys. Res.-Atmos., 102, 5929–5941, https://doi.org/10.1029/96JD03504, 1997. 

Pye, H. O. T., D'Ambro, E. L., Lee, B. H., Schobesberger, S., Takeuchi, M., Zhao, Y., Lopez-Hilfiker, F., Liu, J., Shilling, J. E., Xing, J., Mathur, R., Middlebrook, A. M., Liao, J., Welti, A., Graus, M., Warneke, C., de Gouw, J. A., Holloway, J. S., Ryerson, T. B., Pollack, I. B., and Thornton, J. A.: Anthropogenic enhancements to production of highly oxygenated molecules from autoxidation, P. Natl. Acad. Sci. USA, 116, 6641–6646, https://doi.org/10.1073/pnas.1810774116, 2019. 

Qin, M., Murphy, B. N., Isaacs, K. K., McDonald, B. C., Lu, Q., McKeen, S. A., Koval, L., Robinson, A. L., Efstathiou, C., Allen, C., and Pye, H. O. T.: Criteria pollutant impacts of volatile chemical products informed by near-field modelling, Nature Sustainability, 4, 129–137, https://doi.org/10.1038/s41893-020-00614-1, 2021. 

Rickly, P. S., Coggon, M. M., Aikin, K. C., Alvarez II, R. J., Baidar, S., Gilman, J. B., Gkatzelis, G. I., Harkins, C., He, J., Lamplugh, A., Langford, A. O., McDonald, B. C., Peischl, J., Robinson, M. A., Rollins, A. W., Schwantes, R. H., Senff, C. J., Warneke, C., and Brown, S. S.: Influence of Wildfire on Urban Ozone: An Observationally Constrained Box Modeling Study at a Site in the Colorado Front Range, Environ. Sci. Technol., 57, 1257–1267, https://doi.org/10.1021/acs.est.2c06157, 2023. 

Sarwar, G., Gantt, B., Foley, K., Fahey, K., Spero, T. L., Kang, D., Mathur, R., Foroutan, H., Xing, J., Sherwen, T., and Saiz-Lopez, A.: Influence of bromine and iodine chemistry on annual, seasonal, diurnal, and background ozone: CMAQ simulations over the Northern Hemisphere, Atmos. Environ., 213, 395–404, https://doi.org/10.1016/j.atmosenv.2019.06.020, 2019. 

Sarwar, G., Hogrefe, C., Henderson, B. H., Mathur, R., Gilliam, R., Callaghan, A. B., Lee, J., and Carpenter, L. J.: Impact of particulate nitrate photolysis on air quality over the Northern Hemisphere, Sci. Total Environ., 917, 170406, https://doi.org/10.1016/j.scitotenv.2024.170406, 2024. 

Schwantes, R. H., Lacey, F. G., Tilmes, S., Emmons, L. K., Lauritzen, P. H., Walters, S., Callaghan, P., Zarzycki, C. M., Barth, M. C., Jo, D. S., Bacmeister, J. T., Neale, R. B., Vitt, F., Kluzek, E., Roozitalab, B., Hall, S. R., Ullmann, K., Warneke, C., Peischl, J., Pollack, I. B., Flocke, F., Wolfe, G. M., Hanisco, T. F., Keutsch, F. N., Kaiser, J., Bui, T. P. V., Jimenez, J. L., Campuzano-Jost, P., Apel, E. C., Hornbrook, R. S., Hills, A. J., Yuan, B., and Wisthaler, A.: Evaluating the Impact of Chemical Complexity and Horizontal Resolution on Tropospheric Ozone Over the Conterminous US With a Global Variable Resolution Chemistry Model, J. Adv. Model. Earth Sy., 14, e2021MS002889, https://doi.org/10.1029/2021MS002889, 2022. 

Shah, V., Jacob, D. J., Dang, R., Lamsal, L. N., Strode, S. A., Steenrod, S. D., Boersma, K. F., Eastham, S. D., Fritz, T. M., Thompson, C., Peischl, J., Bourgeois, I., Pollack, I. B., Nault, B. A., Cohen, R. C., Campuzano-Jost, P., Jimenez, J. L., Andersen, S. T., Carpenter, L. J., Sherwen, T., and Evans, M. J.: Nitrogen oxides in the free troposphere: implications for tropospheric oxidants and the interpretation of satellite NO2 measurements, Atmos. Chem. Phys., 23, 1227–1257, https://doi.org/10.5194/acp-23-1227-2023, 2023. 

Sindelarova, K., Granier, C., Bouarar, I., Guenther, A., Tilmes, S., Stavrakou, T., Müller, J.-F., Kuhn, U., Stefani, P., and Knorr, W.: Global data set of biogenic VOC emissions calculated by the MEGAN model over the last 30 years, Atmos. Chem. Phys., 14, 9317–9341, https://doi.org/10.5194/acp-14-9317-2014, 2014. 

Skipper, T. N., Hu, Y., Odman, M. T., Henderson, B. H., Hogrefe, C., Mathur, R., and Russell, A. G.: Estimating US Background Ozone Using Data Fusion, Environ. Sci. Technol., 55, 4504–4512, https://doi.org/10.1021/acs.est.0c08625, 2021. 

US EPA: Integrated Science Assessment (ISA) of Ozone and Related Photochemical Oxidants, Final Report, Feb 2013, U.S. Environmental Protection Agency, Washington, DC, EPA/600/R-10/076F, 2013. 

US EPA: Policy Assessment for the Review of the Ozone National Ambient Air Quality Standards, U.S. Environmental Protection Agency, Washington, DC, EPA-452/R-14/006, 2014. 

US EPA: Technical Support Document (TSD) Preparation of Emissions Inventories for the Version 7.1 2016 Hemispheric Emissions Modeling Platform, U.S. Environmental Protection Agency, https://www.epa.gov/sites/default/files/2019-12/documents/2016fe_hemispheric_tsd.pdf (last access: 16 November 2024), 2019a. 

US EPA: Technical Support Document (TSD) Preparation of Emissions Inventories for the Version 7.1 2016 North American Emissions Modeling Platform, U.S. Environmental Protection Agency, https://www.epa.gov/sites/default/files/2019-08/documents/2016v7.1_northamerican_emismod_tsd.pdf (last access: 16 November 2024), 2019b. 

US EPA: Policy Assessment for the Review of the Ozone National Ambient Air Quality Standards, U.S. Environmental Protection Agency, Washington, DC, EPA-452/R-20-001, 2020a. 

US EPA: Integrated Science Assessment (ISA) for Ozone and Related Photochemical Oxidants (Final Report), U.S. Environmental Protection Agency, Washington, DC, EPA/600/R-20/012, 2020b. 

US EPA: AQS Pre-Generated Data Files, US EPA [data set], https://aqs.epa.gov/aqsweb/airdata/download_files.html, last access: 11 February 2024. 

US EPA Office of Research and Development: CMAQ, Zenodo [code], https://doi.org/10.5281/zenodo.1079878, 2024. 

Wang, H., Jacob, D. J., Le Sager, P., Streets, D. G., Park, R. J., Gilliland, A. B., and van Donkelaar, A.: Surface ozone background in the United States: Canadian and Mexican pollution influences, Atmos. Environ., 43, 1310–1319, https://doi.org/10.1016/j.atmosenv.2008.11.036, 2009.  

Wiedinmyer, C., Akagi, S. K., Yokelson, R. J., Emmons, L. K., Al-Saadi, J. A., Orlando, J. J., and Soja, A. J.: The Fire INventory from NCAR (FINN): a high resolution global model to estimate the emissions from open burning, Geosci. Model Dev., 4, 625–641, https://doi.org/10.5194/gmd-4-625-2011, 2011. 

Wilkins, J. L., Pouliot, G., Pierce, T., Soja, A., Choi, H., Gargulinski, E., Gilliam, R., Vukovich, J., and Landis, M. S.: An evaluation of empirical and statistically based smoke plume injection height parametrisations used within air quality models, Int. J. Wildland Fire, 31, 193–211, https://doi.org/10.1071/WF20140, 2022. 

Wu, S., Duncan, B. N., Jacob, D. J., Fiore, A. M., and Wild, O.: Chemical nonlinearities in relating intercontinental ozone pollution to anthropogenic emissions, Geophys. Res. Lett., 36, L05806, https://doi.org/10.1029/2008GL036607, 2009.

Xing, J., Mathur, R., Pleim, J., Hogrefe, C., Wang, J., Gan, C.-M., Sarwar, G., Wong, D. C., and McKeen, S.: Representing the effects of stratosphere–troposphere exchange on 3-D O3 distributions in chemistry transport models using a potential vorticity-based parameterization, Atmos. Chem. Phys., 16, 10865–10877, https://doi.org/10.5194/acp-16-10865-2016, 2016. 

Yienger, J. J. and Levy, H.: Empirical model of global soil-biogenic NOx emissions, J. Geophys. Res.-Atmos., 100, 11447–11464, https://doi.org/10.1029/95JD00370, 1995. 

Zhang, L., Jacob, D. J., Downey, N. V., Wood, D. A., Blewitt, D., Carouge, C. C., van Donkelaar, A., Jones, D. B. A., Murray, L. T., and Wang, Y. X.: Improved estimate of the policy-relevant background ozone in the United States using the GEOS-Chem global model with 1/2° × 2/3° horizontal resolution over North America, Atmos. Environ., 45, 6769–6776, https://doi.org/10.1016/j.atmosenv.2011.07.054, 2011.

Zhang, L., Jacob, D. J., Yue, X., Downey, N. V., Wood, D. A., and Blewitt, D.: Sources contributing to background surface ozone in the US Intermountain West, Atmos. Chem. Phys., 14, 5295–5309, https://doi.org/10.5194/acp-14-5295-2014, 2014. 

Zhao, B., Zheng, H., Wang, S., Smith, K. R., Lu, X., Aunan, K., Gu, Y., Wang, Y., Ding, D., Xing, J., Fu, X., Yang, X., Liou, K.-N., and Hao, J.: Change in household fuels dominates the decrease in PM2.5 exposure and premature mortality in China in 2005–2015, P. Natl. Acad. Sci. USA, 115, 12401–12406, https://doi.org/10.1073/pnas.1812955115, 2018. 

Short summary
Chemical transport model simulations are combined with ozone observations to estimate the bias in ozone attributable to US anthropogenic sources and individual sources of US background ozone: natural sources, non-US anthropogenic sources, and stratospheric ozone. Results indicate a positive bias correlated with US anthropogenic emissions during summer in the eastern US and a negative bias correlated with stratospheric ozone during spring.