Interactive comment on “Assessment of bias-adjusted PM2.5 air quality forecasts over the continental United States during 2007” by D. Kang

GENERAL COMMENTS This paper evaluates a post-processing method that combines recent and new hourly PM2.5 predictions from a source-oriented air-quality modeling system (WRF-NMM/CMAQ) with recent hourly PM2.5 measurements from a network of monitoring stations in order to produce improved forecasts at the locations of those monitoring stations (though not elsewhere). A set of model PM2.5 forecasts and measurements from the North American AIRNow meta-network for a one-year period has been used to evaluate this method. The same Kalman Filter Predictor bias-adjustment approach has been used previously for ozone forecasts and was shown to


Introduction
Ozone (O3) and fine particulate matter (PM2.5; particles with aerodynamic diameters less than 2.5 µm) in the atmosphere have been a major concern because of their adverse effects on human and ecosystem health. Adverse health effects in humans have been shown to be associated with exposure to elevated ambient PM2.5 levels (e.g., NRC, 1998). O3 and PM2.5 are the two pollutants used in the US to compute the Air Quality Index (AQI), a standardized indicator of air quality conditions at a given location (http://www.airnow.gov); the current AQI standard in the United States is primarily based on daily maximum 8-h O3 and daily mean (24-h average) PM2.5 concentrations. Thus, to develop accurate AQI-based health advisories, it is desirable that air quality forecast systems at least be capable of forecasting these two species well. Real-time O3 forecasts using air quality models have been publicly available in the US for several years over different domains (McHenry et al., 2004; McKeen et al., 2005; Otte et al., 2005; Eder et al., 2006), while real-time PM2.5 forecasts are mainly in the developmental stage and not available to the general public. The National Air Quality Forecast Capability (NAQFC; Otte et al., 2005), developed by the National Oceanic and Atmospheric Administration (NOAA) and the US Environmental Protection Agency (EPA), couples NOAA's operational North American Mesoscale (NAM) weather prediction model (Black, 1994; Rogers et al., 1996; http://www.dtcenter.org/wrf-nmm/users) with EPA's Community Multiscale Air Quality (CMAQ) model (Byun and Schere, 2006). It has the capability to provide real-time forecasts for both O3 and PM2.5. The developmental-mode model predictions are available for the year 2007 over the continental US domain, providing a consistent and unique data set for performing comprehensive evaluations of bias-corrected pollutant fields.
While it is recognized that PM2.5 pollution results from both primary emissions and secondary formation through complex photochemical and heterogeneous chemical pathways, significant scientific and technical challenges surround the characterization of ambient PM2.5 distributions both through modeling and measurements (e.g., McMurry, 2000; Donahue et al., 2009). The emissions and the physical, chemical, and removal processes controlling day-to-day levels of ambient PM2.5 and precursor concentrations also exhibit seasonal variability, resulting in significant spatial and seasonal variability in ambient PM2.5 mass and its chemical composition. Current uncertainties in these individual components pose enormous challenges for developing accurate short-term PM2.5 forecasts (Mathur et al., 2008; Yu et al., 2008). Nevertheless, local air quality agencies need accurate forecasts of PM2.5 concentrations to alert sensitive populations to the onset and duration of unhealthy air associated with elevated PM2.5 levels. To address this need, the utility of PM2.5 forecast guidance obtained from comprehensive atmospheric models can, in the short term, be improved through post-processing of forecast output with bias-adjustment methods; this is the primary motivation for the analysis presented in this study. It should be noted that, despite decades of research to improve the formulations in meteorological models, post-processing bias-adjustment techniques are still routinely used in conjunction with numerical weather prediction models to develop more accurate forecast products (Glahn and Lowry, 1972; http://www.weather.gov/mdl/synop/products.php). Given the relatively early state of PM2.5 forecast models and the large uncertainties in process representations, the exploration of bias-adjustment techniques to improve the usefulness of PM2.5 forecasts is warranted.
Different bias-adjustment (also referred to as bias-correction) techniques have been used for improving surface O3 predictions in recent years (McKeen et al., 2005; Delle Monache et al., 2006; Wilczak et al., 2006; Delle Monache et al., 2008; Kang et al., 2008). Among these techniques, the Kalman Filter (KF) predictor (hereafter referred to as KF bias-adjustment or simply KF) forecast method yielded the most forecast skill improvement. Kang et al. (2008) presented the application of the KF technique to O3 forecasts over the continental US domain for a three-month period from July to September 2005. While the technique was found to improve the forecast skill for O3, it was not clear whether it would be readily applicable to PM forecasts and whether it would yield similar improvements in PM forecast skill. This is primarily because, unlike O3, elevated PM2.5 concentrations are encountered throughout the year, and significant seasonal biases exist in current models both in the representation of total PM2.5 mass and in its composition (cf. Mathur et al., 2008; McKeen et al., 2007; Appel et al., 2008). Additionally, the chemical constituent contributing to the bias could also vary both spatially and seasonally. Thus, for improved PM forecasts, the bias-adjustment techniques should be capable of correcting biases and errors that not only change with time, but that also may have widely varying sources of origin.
In this study, the KF bias-adjustment technique is applied to PM2.5 forecasts for the year 2007 over the continental US domain. To our knowledge, this is the first comprehensive assessment of the bias-adjustment technique for PM2.5 forecasts. Within the continental US domain, there are about 500 AIRNow sites that report hourly PM2.5 concentrations, which are measured using the Tapered Element Oscillating Microbalance (TEOM) method. The year-long forecast period over the continental US has provided a unique data set covering a wide range of atmospheric conditions and a broad PM2.5 concentration range to test the performance of the bias-adjustment technique for PM2.5 forecasts.
The objectives of this study are to: (1) apply the KF post-processing technique to improve the skill of real-time PM2.5 forecasts, (2) investigate the spatial and temporal characteristics of this technique when applied to PM2.5 forecasts, and (3) analyze the impact of bias adjustment on forecast errors of different types (e.g., systematic versus unsystematic). Section 2 describes the modeling system, the implementation of the KF bias-adjustment technique, observational data, and evaluation metrics. In Sect. 3 the performance evaluation results and discussions are presented. The results and conclusions are summarized in Sect. 4.

The NAQFC system
The NAQFC system consists of 3 primary components: (1) the National Weather Service's North American Mesoscale (NAM) model, based on the Weather Research and Forecasting non-hydrostatic mesoscale model (WRF-NMM) (http://www.dtcenter.org/wrf-nmm/users), which provides the meteorological and atmospheric dynamic conditions for the Air Quality Forecast (AQF); (2) the US EPA's Community Multiscale Air Quality (CMAQ) model (Byun and Schere, 2006), which simulates the transport, chemical evolution, and deposition of atmospheric substances; and (3) an interface component (PREMAQ) that processes both the meteorological and emission inputs to conform with the CMAQ grid structure, coordinate system, and input format. The WRF-NMM (version 2.0) covers 1/3 of the Northern Hemisphere, centered at 52° N, 106° W (southern central Canada), using 12 km horizontal grid spacing and a rotated latitude-longitude projection with Arakawa E-grid staggering. There are 60 vertical layers with the lowest interface at 38 m, and the model top is set at 2 hPa. The CMAQ (version 4.6) domain for PM2.5 forecasts covers the continental US (Fig. 1) using a 12-km horizontal grid spacing on the Lambert Conformal map projection and 22 vertical layers of variable thickness set on a sigma coordinate ranging from the surface to 100 hPa. Since the PM2.5 forecasts were in the developmental stage, changes or modifications to the AQF components were allowed in order to accommodate new developments reflecting evolving science. For instance, on 17 September 2007, the treatment of the PBL mixing height in CMAQ was changed from the Turbulence Kinetic Energy (TKE)-based method to the Asymmetric Convective Model-2 (ACM2)-based method, which on average decreased the PBL depth, helping reduce forecast errors for both O3 and PM2.5 in the Pacific Coast region. However, this study does not deal with the impacts of the various changes or modifications to the forecast model; rather, it focuses on how the bias-adjustment technique can improve the forecast results over the raw model forecasts. Since the bias-adjustment technique employed in this study is statistical in nature, it does not involve any modifications to the physical and chemical processes treated in the forecast model.
The emissions inventories used by the AQF system were updated from the US EPA's 2001 national emission inventory to represent the 2007 forecast year (Eder et al., 2009). The biogenic emissions were processed using the Biogenic Emission Inventory System (BEIS) version 3.13 (Schwede et al., 2005). Emissions from sea salt, wild fires, and wind-blown dust were not considered in the AQF system, which may contribute to the underestimation of PM2.5 forecasts under some circumstances. The Carbon Bond chemical mechanism (version 4.2) is used to represent the photochemical reactions, and the AERO3 aerosol module is used to represent aerosol formation and distribution. The chemical fields for CMAQ are initialized using the previous forecast cycle. The primary NAM-CMAQ model forecast for the next 48-h surface-layer PM2.5 is based on the current day's 06:00 UTC cycle, and this is the only cycle available for the developmental PM2.5 forecasts.

Observations
Hourly, near real-time PM2.5 measurements (µg/m³) obtained from EPA's AIRNow program are used in this study (http://www.epa.gov/airnow). All measurements are made using TEOM instruments, and concentrations are averaged over hourly intervals from the beginning of one hour to the next. It should be recognized that TEOM measurements are somewhat uncertain and are believed to be lower limits to a "true" value because of volatilization of semivolatile material (ammonium nitrate and organic carbon) in the drying stages of the measurement (Eatough et al., 2003; Grover et al., 2005). Nevertheless, the TEOM measurements are the only real-time hourly PM2.5 observation data available for the purpose of this study. About 500 PM2.5 monitoring stations are available within the continental US domain (Fig. 1) for the year 2007. For verification purposes and forecast products, the daily (24-h) mean PM2.5 concentrations are often used.

Implementation of the KF bias-adjustment method
The KF predictor bias-adjustment algorithm (Kalman, 1960) was described in detail by Delle Monache et al. (2006), and a concise description of its implementation was provided by Kang et al. (2008). The specification of the error ratio, a key parameter in the KF approach which determines the relative weighting of observed and forecast values, was previously investigated extensively for O3 forecasts. Even though the optimal error ratios were found to vary across space, the impact of using different optimal values over the model domain on the resultant bias-adjusted O3 predictions was insignificant when compared to using a representative fixed value across all locations (Kang et al., 2008). To test whether the same conclusion is valid for the PM2.5 forecasts, error ratios ranging from 0.01 to 0.10 were selected to perform PM2.5 forecasts for all the sites across the domain over the entire year, and RMSE values were calculated at each site to gauge the impact of different error ratio values on the forecast performance. As shown in Fig. 2, the impact of error ratio values ranging from 0.04 to 0.10 on the forecast performance is small; only with an error ratio of 0.01 were the RMSE values noticeably larger than with the other values. Hence, in this study, we used a single fixed error ratio value of 0.06 at all locations for developing bias-adjusted PM2.5 forecasts. In these plots (as in all the box plots in this paper), the metric is calculated at each site for the specific period and is then presented across all the sites within a region or the entire domain as a box plot, with the lower and upper borders of the box representing the first and third quartiles and the middle line representing the median value.
There are two steps to implement the KF bias-adjustment technique. First, the KF is initialized with the initial estimates of KF parameters as outlined in Kang et al. (2008) and with hourly observations and raw model predictions for the prior 2 days. Then the updated parameters and the third day's raw model forecasts are used to create bias-adjusted forecasts for the 3rd day. All updated KF parameters for each hour and at each site are saved into a file for use in the subsequent KF run. The KF runs then continue by reading the previous day's KF parameters and the observations and raw model predictions from the prior 2 days to generate the next day's bias-adjusted forecasts by combining them with the next day's raw model forecasts. Thus, in developing the daily KF forecasts, if data for two consecutive days are missing at a site, the KF will automatically drop this site from future bias-adjustment forecasts; however, if a new site with two consecutive days' data appears in the observation data set, the KF will initialize the site with initial values of KF parameters and generate bias-adjusted forecasts from then on. This implementation is adaptable in real time to the variable nature of the monitoring stations which report hourly observations to the AIRNow network and can be easily combined with the AQF system to produce real-time bias-adjusted forecasts.
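The recursive update at the heart of this procedure can be sketched as follows. This is a minimal illustration of a scalar Kalman filter bias predictor, not the operational code: the function names, the fixed observation-noise variance `sigma_eps2`, and the initial values are assumptions made here for illustration, and the published algorithm may estimate the noise variances adaptively, a step this sketch omits. One state (bias estimate and its error variance) would be kept per site and per hour of day, as described above.

```python
def kf_bias_update(bias, p, forecast_error, error_ratio=0.06, sigma_eps2=1.0):
    """One Kalman-filter step for the forecast bias at one site and hour of day.

    bias           -- previous bias estimate (forecast minus observation)
    p              -- expected error variance of that estimate
    forecast_error -- most recent raw forecast minus observation
    error_ratio    -- sigma_eta^2 / sigma_eps^2 (0.06 is the value used in the text)
    sigma_eps2     -- observation-noise variance, held fixed here (assumption)
    """
    sigma_eta2 = error_ratio * sigma_eps2      # process-noise variance
    p_prior = p + sigma_eta2                   # variance before the new data
    gain = p_prior / (p_prior + sigma_eps2)    # Kalman gain, between 0 and 1
    bias_new = bias + gain * (forecast_error - bias)
    p_new = p_prior * (1.0 - gain)
    return bias_new, p_new


def bias_adjusted_forecast(raw_forecast, bias):
    """Remove the predicted bias from the next raw forecast."""
    return raw_forecast - bias


# Usage: a raw model that is persistently 5 ug/m3 too high.
bias, p = 0.0, 1.0                             # hypothetical initial values
for _ in range(200):
    bias, p = kf_bias_update(bias, p, forecast_error=5.0)
adjusted = bias_adjusted_forecast(20.0, bias)  # close to 15
```

A larger error ratio makes the filter follow recent forecast errors more closely, while a smaller one smooths them more heavily, which is consistent with the weighting role of the error ratio described above.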

Verification statistics and spatial-temporal considerations
To assess the performance of the KF bias-adjusted forecasts, a variety of statistical metrics are used, including Root Mean Square Error (RMSE) and its systematic and unsystematic components, Normalized Mean Error (NME), Mean Bias (MB), Normalized Mean Bias (NMB), correlation coefficient (r), and Index Of Agreement (IOA). For a forecast product, it is also important to evaluate its performance over categorical forecasts (Kang et al., 2005). Two categorical metrics, False Alarm Ratio (FAR) and Hit Rate (H), are used in this study. Since the NAQFC domain covers the continental United States, and given the large region-to-region differences in the physical and chemical processes, the continental US domain is divided into six sub-regions following US state boundaries to facilitate the performance evaluations (Fig. 1). The four easternmost sub-regions, northeast (NE), southeast (SE), upper Midwest (UM), and lower Midwest (LM), are based on O3 and other chemical species climatology that identified areas of homogeneous variability using principal component analysis (Eder et al., 1993; Gego et al., 2005). The domain-wide statistics are calculated using all the observations available within the domain.
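For reference, the discrete metrics named above can be computed as in the sketch below. It uses the commonly cited definitions of these statistics (NMB and NME expressed in percent, IOA in Willmott's form); the paper does not list its formulas in this section, so the exact conventions and the function name `discrete_metrics` are assumptions.

```python
import numpy as np


def discrete_metrics(model, obs):
    """Common discrete evaluation statistics for paired forecasts and observations."""
    m = np.asarray(model, dtype=float)
    o = np.asarray(obs, dtype=float)
    diff = m - o
    o_mean = o.mean()
    # Denominator of Willmott's index of agreement.
    potential_error = np.sum((np.abs(m - o_mean) + np.abs(o - o_mean)) ** 2)
    return {
        "RMSE": float(np.sqrt(np.mean(diff ** 2))),
        "MB": float(diff.mean()),
        "NMB": float(100.0 * diff.sum() / o.sum()),          # percent
        "NME": float(100.0 * np.abs(diff).sum() / o.sum()),  # percent
        "r": float(np.corrcoef(m, o)[0, 1]),
        "IOA": float(1.0 - np.sum(diff ** 2) / potential_error),
    }


# Usage: a forecast that is uniformly 2 ug/m3 too high.
stats = discrete_metrics([12.0, 22.0, 32.0], [10.0, 20.0, 30.0])
```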
Figure 3 presents comparisons of time series of the domain-wide daily average observed, raw model forecast, and KF bias-adjusted forecast PM2.5 concentrations during 2007. As shown in Fig. 3, the raw model tends to over-predict during the cool season (before mid-April and after August) and under-predict during the warm season (mid-April to end of August) when compared with observations. To facilitate the temporal performance evaluations, the time series are divided into a cool season (from January to 20 April and from September to December) and a warm season (from 21 April to 31 August).
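The seasonal split defined above is simple to encode. The helper below is a hypothetical convenience function (its name is not from the paper) that labels a 2007 date using the stated boundaries.

```python
from datetime import date

WARM_START = date(2007, 4, 21)
WARM_END = date(2007, 8, 31)


def season_2007(day):
    """Return 'warm' for 21 April to 31 August 2007 (inclusive), else 'cool'."""
    return "warm" if WARM_START <= day <= WARM_END else "cool"
```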

General performance
It is evident in Fig. 3 that the raw model overestimated PM2.5 concentrations on average during the cool season, especially during the period from September to December. During the warm season, the raw model significantly underestimated concentrations, and the KF predictions were well above the raw model predictions and much closer to the observations. From late July to early September, the raw model underwent [...] ACM2-based PBL height is generally lower than the TKE-based PBL height. The under-prediction is thus reduced for the western region of the domain during the cool season, but the over-prediction is further aggravated for the eastern part of the domain during the same period. Nonetheless, the time series of the KF bias-adjusted predictions tracked the observed time series better than the raw model predictions.
To further investigate the performance of the KF bias-adjusted forecasts and compare it with that of the raw model forecasts, Fig. 5 displays scatter plots of forecast and observed values across various percentiles of the daily mean PM2.5 for all the stations within the continental US domain. Following Mathur et al. (2008), at each site the time series of both measured and model (or KF bias-adjusted model) daily mean PM2.5 over the entire year was examined, and percentiles of the distribution over the study period were computed for both modeled and observed values. Scatter plots of specific percentiles of the concentration distributions (e.g., median) of the model and observed time series are then examined to assess the ability of the model to capture the spatial variability in frequency distributions of PM2.5 concentrations across the sites (Mathur et al., 2008). As shown in Fig. 5, compared with the raw model forecasts (left), the KF bias-adjusted forecasts displayed a much better match with the observed distributions, as reflected by the reduced scatter about the 1:1 line, especially for the higher percentiles. The overall correlation between model forecasts and observations was greatly improved, with the value of R² increasing from 0.43 for the raw model forecasts to 0.90 for the KF bias-adjusted forecasts. Similar improvements in O3 forecasts after the application of the KF bias adjustment were previously reported in Kang et al. (2008).
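The per-site percentile computation behind such scatter plots might look like the sketch below. The chosen percentiles, the input layout (a dictionary of observed and modeled daily means per site), and the decision to use only days on which both values are present are all assumptions for illustration; the paper does not state these details.

```python
import numpy as np


def site_percentiles(series_by_site, percentiles=(5, 25, 50, 75, 95)):
    """Percentiles of observed and modeled daily means at each site.

    series_by_site maps a site id to (observed, modeled) sequences of equal
    length. Returns two arrays of shape (n_sites, n_percentiles).
    """
    obs_rows, mod_rows = [], []
    for obs, mod in series_by_site.values():
        obs = np.asarray(obs, dtype=float)
        mod = np.asarray(mod, dtype=float)
        valid = ~(np.isnan(obs) | np.isnan(mod))  # keep days with both values
        obs_rows.append(np.percentile(obs[valid], percentiles))
        mod_rows.append(np.percentile(mod[valid], percentiles))
    return np.array(obs_rows), np.array(mod_rows)


# Usage with two made-up sites; each column of the result is one scatter plot.
sites = {
    "A": ([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]),
    "B": ([10, 20, 30, float("nan")], [5, 10, 15, 99]),
}
obs_p, mod_p = site_percentiles(sites, percentiles=(50,))
```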
The ability of the KF bias-adjustment technique to improve the predicted PM2.5 concentration distributions is further illustrated in Fig. 6, which displays the histograms of observed daily mean PM2.5 concentrations along with the fitted probability density functions (PDFs) of daily mean PM2.5 concentrations for the observations, raw model forecasts, and KF bias-adjusted forecasts. Figure 6a presents the distribution for the entire domain during 2007, while Fig. 6b presents the distribution for the Lower Midwest during the warm season and Fig. 6c that for the Pacific Coast during the cool season, to typify the sub-regional and seasonal signals. As seen in Fig. 6, the KF technique brings the PDFs of forecast values much closer to those of the observations. The improvements are more pronounced in the sub-regional and seasonal distribution comparisons illustrated in Fig. 6b and c. The distributions of the raw model forecasts for both cases were out of phase compared to those of the observations, especially for the Lower Midwest during the warm season. The KF bias-adjusted forecasts were able to reproduce the observations very well in both cases.

Regional performance
Tables 1 and 2 present the domain and sub-regional summary of discrete statistics for the raw model and the KF bias-adjusted daily mean PM2.5 forecasts during the cool and warm seasons, respectively. Examination of Table 1 reveals that during the cool season, the RMSE values range from 7.2 to 11.4 µg/m³ for the raw model forecasts, and from 5.2 to 7.6 µg/m³ for the KF bias-adjusted forecasts; this translates to about a 20% reduction in RMSE as a result of the application of the bias adjustment. Similar reductions are also noted for the NME. The MB and NMB indicate that during the cool season, the raw model systematically over-predicted daily mean PM2.5 across all the sub-regions except the Pacific Coast, where it under-predicted. The KF bias-adjusted forecast reduced NMB values across all the sub-regions. Correlation coefficients also increased significantly across all the regions as a result of the bias adjustment, with the largest increase in the LM and RM regions. The summary statistics during the warm season (Table 2) indicate comparable improvement in the error statistics (RMSE and NME) for the KF bias-adjusted forecasts relative to the raw model. In contrast to the cool season, systematic under-predictions are noted in the warm season raw model PM2.5 forecasts (Mathur et al., 2008). The application of the KF bias adjustment helps reduce both the cool season high bias and the warm season low bias, and also results in consistently improved correlations with measurements across all seasons.
Figure 7 presents comparisons of the distribution of monthly RMSE values of daily mean PM2.5 for the raw model and KF forecasts for the different sub-regions. As seen in Fig. 7, the RMSE values are consistently lower for the KF forecasts than for the raw model across all sub-regions and months. In addition, the error distribution range (the size of the boxes) for the KF forecasts is also much smaller than that for the raw model forecasts. During October-December, the raw model forecasts exhibited large RMSE values for both the UM and LM sub-regions (partly attributed to the change in the PBL height parameterization discussed earlier). The KF bias adjustment was able to reduce these large RMSEs significantly. In making comparisons across different regions, it should be noted that the relatively larger spread in RMSE for the RM and PC regions, especially for the raw model forecasts, likely results from a combination of effects related to complex topography, land-sea breeze transitions in the PC region, greater spatial heterogeneity in emissions, and their impact on the chemistry leading to PM2.5 formation and distribution.
Figure 8 presents the spatial distribution of mean biases at each site within the modeling domain for both the cool and warm seasons. As illustrated in Fig. 8a, during the warm season, the raw model under-predicted at most sites (orange and purple squares) in the eastern part of the domain, over-predicted in the northwest regions, and exhibited both over- and under-predictions at sites in California. During the cool season, the raw model generally over-predicted (Fig. 8c) in the east, but under-predictions dominated at sites in western portions of the domain. The application of the KF bias adjustment effectively reduced these biases to less than 2 µg/m³ at more than 90% of the sites (Fig. 8b and d). Even at the sites where absolute values of mean biases were greater than 2 µg/m³ for the raw model, the magnitude of the bias was significantly reduced with bias correction.
The spatial pattern of forecast skill improvement of the KF forecasts over the raw model forecasts is further demonstrated by the IOA, as shown in Fig. 9. The average increase in IOA ranged from 8% (NE and UM) to 30% (PC) during the warm season (Fig. 9a) and from 15% (NE and SE) to 28% (RM) during the cool season (Fig. 9b). The domain-wide average IOA values increased by 13% and 19% for the warm season and the cool season, respectively.

Systematic/unsystematic errors and performance over concentration bins
The RMSE can be further decomposed into its systematic and unsystematic components (Willmott, 1981) based on the least-squares linear regression relationship between forecast values and observations (Kang et al., 2008). The box plots in Fig. 10 show the distribution of the RMSE and its systematic (RMSEs) and unsystematic (RMSEu) components of the predicted daily mean PM2.5 for the raw model and KF forecasts across all the stations within the continental US domain. Shown in the box plots are the first quartile (lower border of the box), the third quartile (upper border of the box), and the median (the central line) values of the distributions. The whiskers represent the 1.5 IQR (inter-quartile range). The decomposition of the RMSE displays different error characteristics for PM2.5 relative to those noted previously for O3 forecasts (Kang et al., 2008). First, for the raw model forecasts, while systematic errors were larger than the unsystematic components for O3, the converse is noted for PM2.5 forecasts. The larger contribution of unsystematic errors to the PM2.5 RMSE reflects not only the bigger uncertainty in the emissions inventory used and in our understanding of the relevant atmospheric processes, but also the local-level variability in the predominantly urban AIRNow measurement network. The application of the KF bias adjustment helps reduce both the unsystematic and systematic errors in PM2.5 forecasts.
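The decomposition can be sketched as follows, assuming the usual Willmott-style construction: forecasts are regressed on observations by ordinary least squares, the systematic component measures the distance of the regression line from the observations, and the unsystematic component measures the scatter of the forecasts about that line. With this construction the squared components sum to the squared RMSE. The function name is hypothetical.

```python
import numpy as np


def rmse_decomposition(model, obs):
    """Return (RMSE, RMSEs, RMSEu) from a least-squares fit of model on obs."""
    m = np.asarray(model, dtype=float)
    o = np.asarray(obs, dtype=float)
    slope, intercept = np.polyfit(o, m, 1)     # m_hat = intercept + slope * o
    m_hat = intercept + slope * o
    rmse = np.sqrt(np.mean((m - o) ** 2))
    rmse_s = np.sqrt(np.mean((m_hat - o) ** 2))  # systematic part
    rmse_u = np.sqrt(np.mean((m - m_hat) ** 2))  # unsystematic part
    return float(rmse), float(rmse_s), float(rmse_u)


# Usage with made-up values.
rmse, rmse_s, rmse_u = rmse_decomposition([8, 9, 16, 18, 30], [5, 10, 15, 20, 25])
```

A forecast with a pure constant offset has all of its error in the systematic component, which is the kind of error a bias adjustment is best placed to remove.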
To further examine the performance of the KF bias-adjustment technique over different concentration ranges, Fig. 11 displays the forecast RMSE and MB values as a function of observed concentrations for both the warm and cool seasons. During the warm season (Fig. 11a), when observed PM2.5 concentrations were less than 10 µg/m³, the KF bias-adjustment technique was unable to reduce RMSE values compared to the raw model forecasts, though the distributions were narrower. This may in part be attributed to the fact that during the warm season, the weather conditions tend to be more variable (more convective weather conditions) than during the cool season, and lower concentrations are often associated with precipitation processes; the raw model generally has difficulty accurately simulating these weather conditions, resulting in larger unsystematic errors in the prediction of PM2.5 concentrations. When the observed PM2.5 concentrations were larger than 10 µg/m³, the RMSE values associated with KF forecasts were much smaller in both the mean values and the distributions compared to the raw model forecasts. In contrast, during the cool season (Fig. 11b), the KF forecasts performed better than the raw model forecasts across all the concentration bins. Examination of the MB distributions over the observed concentration bins (Fig. 11c and d) reveals that the raw model over-predicted at lower concentrations and under-predicted at higher concentrations, which is similar to the raw model performance for O3 forecasts (Kang et al., 2008). The under-prediction at higher concentration bins for PM2.5 forecasts during the warm season was more severe than that during the cool season. In general, the KF forecasts were able to adjust the MB towards zero over all the concentration bins for both seasons.
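Grouping errors by observed concentration can be sketched as below. The bin edges are placeholders chosen for the example, not the bins used in Fig. 11, and the function name is hypothetical.

```python
import numpy as np


def stats_by_obs_bin(model, obs, edges):
    """RMSE and MB of the forecasts within each observed-concentration bin.

    edges is an increasing sequence; each bin is the half-open interval [lo, hi).
    Returns a list of (lo, hi, n, rmse, mb) tuples for the non-empty bins.
    """
    m = np.asarray(model, dtype=float)
    o = np.asarray(obs, dtype=float)
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (o >= lo) & (o < hi)
        if not in_bin.any():
            continue
        diff = m[in_bin] - o[in_bin]
        rows.append((lo, hi, int(in_bin.sum()),
                     float(np.sqrt(np.mean(diff ** 2))), float(diff.mean())))
    return rows


# Usage: made-up data that over-predict at low and under-predict at high values.
rows = stats_by_obs_bin([5, 7, 10, 12], [2, 4, 12, 14], edges=(0, 10, 20))
```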
the NAQFC system could simulate the occurrences of an exceedance or non-exceedance. Categorical evaluations for O3 forecasts have been extensively performed in the past (Kang et al., 2005; Eder et al., 2006, 2009), but similar assessments for PM2.5 forecasts have been limited. Figure 12 displays the false alarm ratio (FAR; also known as probability of false alarm) and hit rate (H; also known as probability of detection) (see Kang et al., 2005; Barnes et al., 2009) for the raw model and KF bias-adjusted daily mean PM2.5 forecasts for each of the sub-regions during both the warm and cool seasons. An exceedance threshold value of 35 µg/m³ for the 24-h mean PM2.5, based on the US National Ambient Air Quality Standard (NAAQS) for PM2.5, is used. As seen in Fig. 12, the FAR values associated with the raw model forecasts were similar (∼85%) for both seasons over the entire domain, but the H values varied from less than 10% during the warm season to greater than 30% during the cool season. For the KF forecasts, the FAR values were reduced by more than 20% during both seasons, and the H values more than doubled during the warm season and increased by about 20% during the cool season for the entire domain. Compared to the raw model forecasts, the KF forecasts reduced the FAR values across all the sub-regions, with differing magnitudes, and increased the H values for all the sub-regions except the LM and RM in the warm season and the UM in the cool season. In general, the H values were higher during the cool season than during the warm season for both the raw model forecasts and the KF forecasts, while the FAR values did not differ significantly.
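The two categorical metrics can be computed from a 2x2 contingency table, as in the sketch below. It assumes the common definitions FAR = false alarms / all forecast exceedances and H = hits / all observed exceedances, both in percent, and treats a value strictly above the threshold as an exceedance; the paper defers its exact definitions to Kang et al. (2005), so these conventions and the function name are assumptions.

```python
def categorical_metrics(forecast, observed, threshold=35.0):
    """False alarm ratio and hit rate (percent) for exceedance forecasts."""
    false_alarms = hits = misses = 0
    for f, o in zip(forecast, observed):
        f_exc, o_exc = f > threshold, o > threshold
        if f_exc and o_exc:
            hits += 1
        elif f_exc:
            false_alarms += 1
        elif o_exc:
            misses += 1
    forecast_exc = hits + false_alarms
    observed_exc = hits + misses
    far = 100.0 * false_alarms / forecast_exc if forecast_exc else float("nan")
    h = 100.0 * hits / observed_exc if observed_exc else float("nan")
    return {"FAR": far, "H": h}


# Usage with made-up daily means: 2 hits, 1 false alarm, 1 miss.
scores = categorical_metrics([40, 40, 30, 30, 50], [40, 20, 40, 20, 36])
```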

Summary
The Kalman filter bias-adjustment technique has been applied to post-process raw PM2.5 air quality forecasts over the continental US domain during the year 2007 at hourly PM2.5 monitoring sites. Though the application and analysis were conducted on archived PM2.5 model forecast output, the methodology is easily adopted for real-time applications.
To facilitate performance evaluation, the continental US portion of the domain was divided into six sub-regions and the year was split into a cool season and a warm season to examine spatial and seasonal characteristics of the performance of the method. The evaluation of raw model performance suggests that the daily mean PM2.5 concentrations were generally over-predicted over the eastern part of the domain during the cool season and under-predicted during the warm season. The opposite is true for the western part of the domain: the daily mean PM2.5 concentrations were typically under-predicted along the Pacific Coast during the cool season and over-predicted during the warm season. The Rocky Mountain region is an exception, where the daily mean PM2.5 concentrations were over-predicted throughout the year.
The KF bias-adjustment technique significantly improved the PM2.5 forecasts for locations with hourly PM2.5 monitors, as revealed by reductions in errors and biases and by higher correlation coefficients throughout the year and across the entire model domain. The analysis also shows that the KF bias adjustment can quickly respond to transitions from one regime to another, whether caused by the change of seasons or by model changes.
Analysis of RMSE and MB as a function of observed concentrations suggests that the KF method significantly reduces the raw model error and bias across all concentration ranges except at lower concentration bins during the warm season. However, the significant reductions in error and bias at the moderate-to-high concentration ranges help improve the ability to predict exceedances, a feature desirable for air quality forecasting. The effectiveness and benefits of bias adjustment of PM2.5 model forecasts are also reflected in the categorical evaluations; the KF bias-adjustment technique improved the categorical evaluation metrics significantly by reducing the false alarm ratio and increasing the hit rate for almost all the regions during both the cool and warm seasons.
It should be pointed out that the performance of biasadjusted forecasts is dependent on the performance of the raw model to which the bias-adjustment technique is applied.Because of the complexity in PM 2.5 composition, formation, and distribution, it is even more critical for the raw model to provide a stable and well-behaved basis to make biasadjusted forecasts more reliable.This bias-adjusted forecast study was based on the total mass of PM 2.5 .If the components of PM 2.5 could be bias-adjusted separately, the results may be further improved than those derived from the bias-adjustment of the total PM 2.5 mass performed in this study.However, the lack of real-time measurements of speciated PM 2.5 hampers the use of KF adjustments on individual species.Improvements in the representation of fine particulate matter emissions as well as physical and chemical processes regulating sources and sinks in atmospheric models are expected as a result of on-going research over the next several years.Nevertheless, our analysis indicates that despite the current uncertainties in the representation of atmospheric processes dictating the distribution of ambient PM 2.5 , the KF bias-adjustment techniques can be used to improve the reliability of short term PM 2.5 forecasts from such models and, consequently, help in issuance of air-qualitydegradation-related health advisories.
In this study, the KF bias-adjustment technique is applied only at discrete points, i.e., at the locations of the monitors. Further research is needed to extend this technique to the development of bias-corrected spatial maps (i.e., also at locations where no monitor information is available) of surface-level PM 2.5 distributions. Since surface-level PM 2.5 concentrations are influenced by local forcing associated with several meteorological drivers and by spatially heterogeneous emissions, information on the spatial representativeness of the individual measurements and, consequently, of the adjusted bias is critical to extending the method presented here to bias-adjusted spatial maps of PM 2.5 forecasts.

Fig. 2. Impact of error ratios on the performance (RMSE) of Kalman Filter adjusted forecasts for the daily mean PM 2.5 concentrations (µg/m³).

Fig. 6. The histogram of observed and the fitted Gaussian probability density function of observed, raw model forecast, and KF forecast daily mean PM 2.5 concentrations (µg/m³): (a) Domain over entire year, (b) LM during warm season, and (c) PC during cool season.

Fig. 7. Monthly box plots (only the 25th and 75th percentiles and median values are shown) of RMSE values of the daily mean PM 2.5 (µg/m³) for the raw model and KF bias-adjusted forecasts for all sub-regions. The horizontal axis is the month of the year.

Fig. 8. Mean Bias (MB, µg/m³) for daily mean PM 2.5 forecasts at each location within the continental U.S. domain: (a) raw model during warm season, (b) KF bias-adjustment during warm season, (c) raw model during cool season, and (d) KF bias-adjustment during cool season.

Fig. 9. Box plots of index of agreement (IOA) of daily mean PM 2.5 (µg/m³) for the raw model (MOD) forecasts and KF bias-adjusted forecasts over the domain (DM) and across all sub-regions during (a) warm season and (b) cool season.

Fig. 10.

Fig. 11. RMSE (a and b) and MB (c and d) values over observed daily mean PM 2.5 concentration (µg/m³) bins for the raw model and the KF bias-adjusted forecasts. The sample sizes for the bins, ordered from the lowest to the highest concentration, are 28203, 17526, 19855, 6855, 2716 (warm season) and 34072, 36402, 30697, 6771, 2502 (cool season).

Table 1. Regional summary of discrete statistics for raw model and KF bias-adjusted daily mean PM 2.5 forecasts during the 2007 cool season (n is the number of records).

Table 2. Regional summary of discrete statistics for raw model and KF bias-adjusted daily mean PM 2.5 forecasts during the 2007 warm season.

Fig. 12. False alarm ratio (FAR) and hit rate (H) for the daily mean PM 2.5 forecasts by the raw model and the KF bias-adjustment over the domain (DM) and all the sub-regions during (a) warm season and (b) cool season: FAR-MD, FAR associated with raw model forecasts; FAR-KF, FAR associated with KF forecasts; H-MD, H associated with raw model forecasts; and H-KF, H associated with KF forecasts.