CALIOPE-Urban v1.0: Coupling R-LINE with a mesoscale air quality modelling system for urban air quality forecasts over Barcelona city (Spain)

The NO2 annual air quality limit value is systematically exceeded in many European cities. In this context, understanding human exposure, improving policy and planning, and providing forecasts require the development of accurate air quality models at urban (street-level) scale. We describe CALIOPE-Urban, a system coupling CALIOPE, an operational mesoscale air quality forecast system based on the HERMES (emissions), WRF (meteorology) and CMAQ (chemistry) models, with the urban roadway dispersion model R-LINE. Our developments have focused on Barcelona city (Spain), but the methodology may be replicated for other cities in the future. WRF drives pollutant dispersion and CMAQ provides background concentrations to R-LINE. Key features of our system include the adaptation of R-LINE to street canyons, the use of a new methodology that considers upwind grid cells in CMAQ to avoid double counting traffic emissions, a new method to estimate local surface roughness within street canyons, and a vertical mixing parametrization that considers urban geometry and atmospheric stability to calculate surface level background concentrations. We show that the latter is critical to correct the nighttime overestimations in our system. Both CALIOPE and CALIOPE-Urban are evaluated using two sets of observations. The temporal variability is evaluated against measurements from five traffic sites and one urban background site for April-May 2013. While both systems show fairly good agreement at the urban background site, CALIOPE-Urban shows better agreement at traffic sites. The spatial variability is evaluated using 182 passive dosimeters that were distributed across Barcelona for two weeks in February-March 2017. In this case, the coupled system also shows a more realistic distribution than the mesoscale system, which systematically underpredicts NO2 close to traffic emission sources.
Overall, CALIOPE-Urban improves mesoscale model results, demonstrating that the combination of both scales provides a more realistic representation of NO2 spatio-temporal variability in Barcelona.

presented here to downscale mesoscale meteorology to street-scale describing wind conditions and atmospheric stability in each street can be a promising solution to drive dispersion models and vertical mixing.
Background concentrations can be obtained from observations or mesoscale models, which are commonly used in forecasting applications. However, coupling mesoscale and urban dispersion models can lead to a double counting of traffic emissions.
To avoid double counting, Arunachalam et al. (2014) multiply urban background site observations by an estimated ratio between two mesoscale air quality simulations: the first run contains all the emission sources and the second neglects traffic emissions. Lefebvre et al. (2011) and Stocker et al. (2014) first run the urban dispersion model at mesoscale grid resolution with only traffic emissions and subtract its result from the mesoscale model simulation, which includes all the emission sources. Then, street scale model outputs are added to the result from the prior computation at finer resolution. Although these methods avoid double counting emissions, they do not explicitly account for vertical mixing, a process that occurs at the intersection of regional and street scales. Urban air quality models such as SIRANE (Soulhac et al., 2011) have already implemented vertical mixing depending on local meteorology. In this study, we show that this process may be relevant and may explain some systematic errors found in the literature: nighttime NO2 concentration values tend to be overestimated and afternoon values tend to be underestimated in traffic areas (e.g., Hood et al., 2018). Further efforts are necessary to explicitly resolve processes happening across scales and to correct these biases in the mentioned periods of the day.

This work describes a methodology to couple the mesoscale air quality forecasting system CALIOPE (Baldasano et al., 2011; http://www.bsc.es/caliope/?language=en) with the Research LINE source dispersion model (R-LINE) and its evaluation over the city of Barcelona, Spain. In Barcelona, chronic NO2 exceedances have been recorded since the year 2000, and according to the local Public Health Agency about 68% of citizens were exposed to NO2 levels above the annual air quality limit value in 2016 (ASPB, 2017). Barcelona has a very high vehicle density (approx. 5500 vehicles km−2) and the majority of passenger cars are diesel (67%) (Barcelona City Council, 2017). Located in the north east of the Iberian Peninsula, Barcelona is surrounded by the Mediterranean Sea, two rivers and a mountain range. Due to its coastal location, during the warm season, transport and dispersion of air pollutants within the city are dominated by the breeze blowing in from the sea during daytime and from the land during nighttime. This pattern persists in the presence of high-pressure systems accompanied by clear skies and warm temperatures in the summer season. In contrast, the winter season is dominated by north-western advections that typically clean the atmosphere of the city (Jorba et al., 2011).

Our aim is to produce more accurate NO2 concentrations with CALIOPE-Urban, the coupled system, than with the mesoscale system alone, and to give a more realistic representation of the NO2 spatial distribution and temporal variability across the city. To achieve these objectives, a set of system enhancements has been implemented: an adaptation of R-LINE to dense urban areas (e.g. street canyons); a background model to estimate roof-level background concentrations; a parametrization of the vertical mixing to estimate background concentrations within the street that considers atmospheric stability and urban geometry; and a local surface roughness parametrization to estimate turbulent parameters within a street canyon. The mesoscale system has been executed using the operational forecast configuration. We compare the estimated temporal variability of NO2 concentrations from the coupled modelling system with those derived from CALIOPE and with ambient street-level measurements (i.e. five traffic sites and one urban background site) in April and May 2013. Its spatial variability is evaluated using a two-week measurement campaign that covered Barcelona with 182 NO2 passive dosimeters in February and March 2017.

Methods
CALIOPE-Urban estimates hourly NO2 concentrations by coupling the CALIOPE mesoscale air quality forecasting system, which provides background concentrations, meteorological data and road-link traffic emissions, with the R-LINE dispersion model adapted to street canyons. Here we introduce and describe the components of the coupled model as depicted in Fig. 1. Meteorology and background from WRF and CMAQ are combined with urban geometry to create inputs for R-LINE. The R-LINE dispersion calculation itself is left untouched; only its meteorology and surface roughness inputs are adjusted for local urban geometry.

Mesoscale air quality forecasting system: CALIOPE
CALIOPE (Baldasano et al., 2011) integrates the Weather Research and Forecasting model version 3 (WRF; Skamarock and Klemp, 2008), the High-Elective Resolution Modelling Emission System (HERMESv2.0; Guevara et al., 2013), the Community Multiscale Air Quality Modeling System version 5.0.2 (CMAQ; Byun and Schere, 2006) and the mineral Dust REgional Atmospheric Model (BSC-DREAM8b; Basart et al., 2012). The mesoscale system is run over Europe at a 12 km × 12 km [...]

Geosci. Model Dev. Discuss., https://doi.org/10.5194/gmd-2019-48. Manuscript under review for journal Geosci. Model Dev.

HERMESv2.0 implements a SNAP sector-dependent spatial, temporal and speciation treatment of the original annual EMEP gridded emissions (Ferreira et al., 2013). For Spain, the model uses a bottom-up approach for pollutant sources including point (e.g. power plants, industries), maritime (e.g. ports), air traffic (e.g. airports), agricultural machinery (e.g. tractors and harvesters) and road transport. For the rest of the pollutant sources, a combination of top-down approaches (i.e. residential/commercial combustion; energy consumption statistics combined with a population map) and downscaling methodologies (i.e. use of solvents, extraction and distribution of fossil fuels; specific spatial proxies and temporal profiles assigned to the Spanish National Emission Inventory by categories at third level of SNAP) is adopted. The results of the HERMESv2.0 model have been used to support several air quality evaluation and planning studies (e.g., Baldasano et al., 2014; Soret et al., 2014) as well as emission inventory intercomparison exercises (Guevara et al., 2017).

The chemical transport model used in the CALIOPE system is CMAQv5.0.2. It uses the CB05 gas-phase chemical mechanism, the AERO5 aerosol scheme, and an in-line photolysis calculation. CMAQ vertical levels are collapsed from the 38 WRF levels to 15 layers up to 50 hPa, with six layers falling within the PBL. Boundary conditions for the European domain are taken from MOZART-4.

Street scale dispersion model: R-LINE
R-LINE is a near-road Gaussian dispersion model that incorporates state-of-the-art Gaussian dispersion curves to simulate dispersion of road source emissions. The model resolves either numerically or analytically the integration of the contributions of point sources along a street segment. The numerical option is more accurate, whereas the analytical option spends less time computing dispersion. The analytical version is best suited for near-ground level sources and receptors. In order to estimate NO2 concentrations, R-LINE incorporates a chemistry module to resolve simple NO to NO2 chemistry with the Generic Reaction Set (GRS; Valencia et al., 2018). R-LINE has been applied to estimate exposure to traffic-related air pollutants in a large scale study in Detroit, United States (Isakov et al., 2014). However, to our knowledge it has not been applied to European cities, where street canyon morphology dominates. Hence, in order to apply R-LINE over Barcelona, its meteorology has been adapted to street canyons as described in Sect. 2.3.1, and the background concentrations are obtained from the CMAQ model considering local meteorology and urban geometry as described in Sect. 2.3.3.

Coupling CALIOPE with R-LINE
CALIOPE and R-LINE are coupled offline: first CALIOPE is run over Europe, the Iberian Peninsula and Catalonia, and then R-LINE is executed for Barcelona city. This approach presents two main challenges that have already been highlighted in the research literature: (1) downscaling regional meteorology to street scale to drive pollutant dispersion; and (2) obtaining background concentrations from the mesoscale model without double counting traffic emissions in the regional and street scale models. In addition to these challenges, we consider it relevant to couple meteorology and background concentrations in a consistent way, taking into account atmospheric stability and urban geometry when estimating the background contribution within urban streets.
Here we describe the methodology we use when coupling the models to mitigate these challenges.

Meteorology
WRF bottom layer results are assumed to represent over-roof wind and stability conditions because its mid-point height (20.3 m) is similar to the average building height (bh) in a typical neighbourhood of Barcelona (e.g. Eixample district; 20.7 m). WRF is executed consistently with the forecasting air quality system CALIOPE, giving a constant surface roughness (z0) equal to 1 m over the urban area. In order to apply R-LINE over Barcelona, its meteorology has been adapted to street canyons. We have developed a methodology to estimate a specific z0 based on urban geometry (e.g. building height, street width). Once z0 is adjusted, the displacement height (dispht), friction velocity (u*), convective velocity scale (w*), PBL height, and Monin-Obukhov length (L) are re-calculated (Cimorelli et al., 2005). The increase in z0 generally leads to a larger dispht, u*, w*, and PBL height. Therefore, L indicates less stable conditions and the atmosphere is more convective. Ultimately, these adjustments have an effect on the way the winds are profiled and on the rate of dispersion of the roadway emissions within the urban area.
The geometrical parameters used for the z0 calculation are divided into two categories: (1) parameters averaged over an area of 250 m × 250 m (planar building density, bd; average building height, bh; and building height standard deviation, bhdev); and (2) a specific aspect ratio (a_r) for each street segment, consisting of the street-averaged building height divided by the street width. The geometrical parameters are calculated from a Barcelona City Council dataset containing 2-D geometries and the number of floors for each building (Barcelona City Council, 2016), assuming a height of 3 m for each floor.
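As a small illustration of the street-specific parameter, the sketch below derives building heights from floor counts (3 m per floor, as stated above) and divides the street-averaged height by the street width. The function names and the input layout (a list of floor counts per street segment) are hypothetical and are not taken from the CALIOPE-Urban code.

```python
FLOOR_HEIGHT_M = 3.0  # height assumed for each floor, as stated in the text


def building_height(floors):
    """Building height in metres from its number of floors."""
    return floors * FLOOR_HEIGHT_M


def aspect_ratio(floors_along_street, street_width_m):
    """Street-averaged building height divided by street width.

    floors_along_street: floor counts of the buildings lining one street
    segment (hypothetical input layout chosen for this sketch).
    """
    heights = [building_height(n) for n in floors_along_street]
    mean_height = sum(heights) / len(heights)
    return mean_height / street_width_m


if __name__ == "__main__":
    # Three buildings of 6, 7 and 8 floors along a 20 m wide street.
    print(aspect_ratio([6, 7, 8], 20.0))
```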
To estimate a specific z0 for each street segment we propose a new morphometric method inspired by previous studies in the literature. z0 is composed of WRF's background roughness (z0_bg) and a locally estimated term (Eq. 1), which incorporates the influence of building height through the range parameter, scaled by two parabolic ratios based on aspect ratio (a_rr) and building density (bd_r). The range parameter (Eq. 2) and z0 increase with bh, following most morphometric methods (e.g. Macdonald et al., 1998). In addition, range and z0 increase with increasing bhdev. This assumption is based on Kent et al. (2017), who compared nine methods to estimate z0 and concluded that methods considering height variability through bhdev (i.e. a higher bhdev brings an increase of z0) provide better results (e.g. Kanda et al., 2013). The parameter C multiplying the equation for the range calculation is an empirical constant set to 1/20 after calibrating the system with the NO2 measurements used in this work for the CALIOPE-Urban evaluation. dispht is calculated following the R-LINE methodology given a factor of displacement height (fac_dispht) equal to 5 (Eq. 3).

To model the influence of building density and aspect ratio, we use the findings of Oke (1988), based on wind tunnel and experimental studies. Oke concluded that over-roof air roughness and satisfactory dispersion within the street canyon are maximum under similar geometrical conditions. Specifically, an a_r of 0.65 and a bd of 0.25 give maximum roughness for the overlying air and optimal dispersion conditions in the street canyon.
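The following is a minimal sketch of how these pieces could fit together; it is not the authors' implementation. Eqs. 1 to 5 are not reproduced in this text, so the functional forms used here are assumptions: z0 = z0_bg + range · a_rr · bd_r, range = C · (bh + bhdev), dispht = fac_dispht · z0, and each ratio as a two-sided parabola that equals 1 at the Oke (1988) optimum and is clipped at 0. The points where the parabolas reach zero (`ar_zeros`, `bd_zeros`) are purely illustrative placeholders.

```python
def parabolic_ratio(x, x_peak, x_left_zero, x_right_zero):
    """Two-sided parabola: 1 at x_peak, 0 at the chosen zero on each side.

    A different parabola is used left and right of the peak, and negative
    values are clipped to 0.
    """
    zero = x_left_zero if x <= x_peak else x_right_zero
    ratio = 1.0 - ((x - x_peak) / (zero - x_peak)) ** 2
    return max(0.0, ratio)


def local_roughness(z0_bg, bh, bhdev, a_r, bd,
                    c=1.0 / 20.0, fac_dispht=5.0,
                    ar_zeros=(0.0, 2.0), bd_zeros=(0.0, 1.0)):
    """Return (z0, dispht) for one street segment.

    ASSUMED forms (the original equations are not available here):
      range  = c * (bh + bhdev)
      z0     = z0_bg + range * a_rr * bd_r
      dispht = fac_dispht * z0
    ar_zeros and bd_zeros are hypothetical parabola end points.
    """
    rng = c * (bh + bhdev)
    a_rr = parabolic_ratio(a_r, 0.65, *ar_zeros)   # peak at a_r = 0.65
    bd_r = parabolic_ratio(bd, 0.25, *bd_zeros)    # peak at bd = 0.25
    z0 = z0_bg + rng * a_rr * bd_r
    return z0, fac_dispht * z0


if __name__ == "__main__":
    print(local_roughness(z0_bg=1.0, bh=20.0, bhdev=4.0, a_r=0.8, bd=0.3))
```

Under these assumptions the local term vanishes for very open or very dense geometries, so z0 falls back to the WRF background value there.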
In practice, z0 increases with increasing a_r up to a maximum at a_r = 0.65 and decreases for a_r > 0.65 (Eq. 4). Similarly, an increasing bd produces a higher z0 up to a maximum at bd = 0.25, and z0 decreases for higher bd (Eq. 5). We model these ratios using parabolic shapes ranging from 0 to 1. Both urban characteristics are modelled using one parabola to the left of the maximum and another to the right, due to the non-symmetrical distribution of the parameter values within Barcelona city (see Fig. A1 in Appendix A). The parabolic ratios are maximum (i.e. equal to 1) when the roughness effect is maximum, and they are prevented from taking negative values by setting a minimum of 0.

In addition to the z0 adjustment, we adjust the wind speed and direction to represent more closely the winds blowing down the street as constrained by the buildings, which is called "channelling" (similarly to Fisher et al., 2005). We have adapted R-LINE to incorporate the orientation of roadways (and thus the buildings), so that the wind direction follows the street direction.
This leads to a recalculation of the wind direction and speed for each roadway before emissions are dispersed within the city.
Wind speed channelling is parametrized following Soulhac et al. (2008), who showed that the mean velocity along a canyon for any wind direction is directly proportional to the cosine of the angle between the street direction and the over-roof wind direction (i.e. the angle of incidence).
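The cosine dependence can be sketched as follows. The speed function mirrors the expression given next in the text, including its floor of 0.1. The direction function is an assumption made for this sketch: it picks whichever of the two directions along the street axis is closer to the over-roof wind. Function names and the degree convention are hypothetical.

```python
import math


def channelled_wind_speed(ws_bh, wind_dir_deg, street_dir_deg, floor=0.1):
    """Roof-level wind speed projected along the street axis.

    ws_bh: over-roof wind speed (m/s), e.g. from the WRF bottom layer.
    The floor avoids an unrealistic zero speed for perpendicular winds.
    """
    theta = math.radians(wind_dir_deg - street_dir_deg)  # angle of incidence
    return ws_bh * max(floor, abs(math.cos(theta)))


def channelled_wind_direction(wind_dir_deg, street_dir_deg):
    """Street-aligned direction closest to the over-roof wind (assumption)."""
    theta = math.radians(wind_dir_deg - street_dir_deg)
    if math.cos(theta) >= 0.0:
        return street_dir_deg % 360.0
    return (street_dir_deg + 180.0) % 360.0


if __name__ == "__main__":
    # 5 m/s over-roof wind at 45 degrees to a street oriented at 90 degrees.
    print(channelled_wind_speed(5.0, 45.0, 90.0))
    print(channelled_wind_direction(45.0, 90.0))
```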
$$ \mathrm{ws}_{ch} = \mathrm{ws}_{bh} \cdot \max\left(0.1, \left|\cos\theta\right|\right) $$

where ws_ch is the channelled wind speed at roof level, ws_bh is the wind speed at roof level taken from the WRF bottom layer (in m/s), and θ is the angle of incidence. The minimum value of the right-hand factor is set to avoid an unrealistic zero wind speed. Its value of 0.1 is defined in line with Kastner-Klein et al. (2001), who showed in wind tunnel experiments that the minimum longitudinal mean flow velocity component at canyon top is equivalent to 0.12 times the above-canyon wind speed for perpendicular over-roof winds. Then, to estimate the wind speed at street level, a logarithmic profile incorporated within R-LINE and based on similarity theory (Monin and Obukhov, 1954) is used.

Emissions
We have estimated the NO2/NOX ratio following Carslaw and Beevers (2004), which produces an approximation to the primary NO2 contribution. This method relates total OX (NO2 + O3) to total NOX (NO2 + NO) at a traffic monitoring station, subtracting OX and NOX from a background site in order to remove the effect of the background and to calculate only the contribution at the traffic site. As the traffic station we used the Eixample site and as the urban background station Ciutadella Park (see Fig. 2), which is located upwind of the dominant wind direction. Figure 3 compares OX to NOX in Eixample after subtracting the background represented by Ciutadella, from the beginning of October to the end of February for the years 2012 to 2016. The photochemical season (April-September) is not used, to avoid the greater scatter compared with the winter months shown by Clapp and Jenkin (2001). The OX slope value of 18.9% is considered an estimate of the potential primary NO2 contribution from vehicles at the Eixample traffic station. This value is consistent with studies conducted in other cities with a high share of diesel vehicles (e.g., Carslaw et al., 2016; Wild et al., 2017) and is assumed to represent the NO2/NOX ratio in Barcelona in the present work.

Background concentrations are required at each receptor in CALIOPE-Urban. Urban dispersion models are typically run at a very high spatial resolution (e.g. 20 m × 20 m). Running the UBS every 20 metres would have a high computational cost due to its spatial computations, and background concentration values are not expected to vary substantially over tens of metres because CMAQ produces results at 1 km × 1 km spatial resolution. Hence, we first run the UBS to produce background concentration values at CMAQ grid cell centroids, and then apply a bilinear interpolation method to provide the background at very high spatial resolution.
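A generic bilinear interpolation from grid cell centroids to a receptor can be sketched as below. This is a textbook formulation under simplifying assumptions (a regular grid stored as nested lists, receptors outside the centroid lattice clamped to the nearest edge); the function name and argument layout are hypothetical and do not describe the actual CALIOPE-Urban code.

```python
def bilinear(x, y, x0, y0, dx, dy, grid):
    """Interpolate a gridded field at receptor (x, y).

    grid[j][i] holds the value at centroid (x0 + i*dx, y0 + j*dy).
    Receptors outside the centroid lattice take the nearest edge value.
    """
    fx = (x - x0) / dx
    fy = (y - y0) / dy
    # Index of the lower-left centroid of the enclosing cell, kept in range.
    i = min(max(int(fx), 0), len(grid[0]) - 2)
    j = min(max(int(fy), 0), len(grid) - 2)
    # Fractional position inside the cell, clamped to [0, 1].
    tx = min(max(fx - i, 0.0), 1.0)
    ty = min(max(fy - j, 0.0), 1.0)
    bottom = grid[j][i] * (1.0 - tx) + grid[j][i + 1] * tx
    top = grid[j + 1][i] * (1.0 - tx) + grid[j + 1][i + 1] * tx
    return bottom * (1.0 - ty) + top * ty


if __name__ == "__main__":
    # Background NO2 at four centroids 1 km apart (illustrative numbers only).
    background = [[10.0, 20.0],
                  [30.0, 40.0]]
    print(bilinear(250.0, 750.0, 0.0, 0.0, 1000.0, 1000.0, background))
```

Because the interpolation is cheap, the costly background computation only has to be done once per 1 km cell rather than once per receptor.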
In addition to the UBS, we implement a background decay method to calculate the surface level background concentrations, assuming that the UBS provides the concentration at rooftop level. The relationship between rooftop and surface level concentrations is assumed to depend on atmospheric stability, localized surface roughness and urban geometry. The ratio of wind speeds at surface and rooftop levels (ws_sfc/ws_bh), estimated by R-LINE using similarity theory (Monin and Obukhov, 1954), is used as a proxy for the vertical mixing. Using this ratio, we calculate fac_bg, the dimensionless vertical mixing factor by which the rooftop background concentration is multiplied to obtain the surface level background concentration at a given height. In order to diminish the effect of afternoon underestimations from the regional system near traffic, background levels under convective situations are enhanced. We consider the upward heat flux at the surface (hflux) as representing convective conditions for values higher than 0.30. This value is set to exclude slightly stable night hours with low positive hflux values mainly caused by the urban heat island (Barcelona city has been found to be 2.9 °C warmer than its periphery by Moreno-Garcia, 1994). The following parametrization is used for cases with bd higher than 0.1, where F = m + abs(0.25 − bd), m being an empirical parameter set to 0. 35 [...] (1999), who set it as a lower limit for real cities and show that below this value an isolated flow regime governs.
Within this regime, street level and over-roof air is well mixed due to the low building density. Hence, for cases with bd equal to or lower than 0.1, fac_bg tends linearly to 1: Eqs. 8 are linear variations between the point at bd = 0, where fac_bg = 1, and the point at bd = 0.1 with the corresponding fac_bg value from Eq. 7.

We have run CALIOPE-Urban for receptors as far as 250 metres from roads with sufficient Annual Average Daily Traffic [...] spatial evaluation runs, we locate receptors at the specific coordinates of the measurement sites.

Execution setup
To obtain high resolution concentration maps for the entire city, we set the spatial context as the minimum rectangle containing Barcelona municipality, extended by 250 m buffers that include the highways surrounding the city. The context is covered by a regular receptor grid of 10 metre resolution. R-LINE execution loops over each hour, road and receptor to estimate the contribution from each source to each receptor.

Aiming to understand the impact on accuracy of the local parametrization for background and meteorology and the impact of using the analytical approach for dispersion, we have run CALIOPE-Urban with different configurations. In Table 1 [...]

Experimental campaign sites are considered traffic sites in this work because they are exposed to similar AADT and vehicles km−2 compared to official traffic sites, as shown in the table below. We apply Eq. (9) to obtain vehicles km−2, a variable that describes traffic density in an area of 1 km2. To obtain the number of vehicles per second, AADT is divided by 3600 × 24 and multiplied by a temporal factor (i.e. 1.47) representing a typical factor for the morning traffic peak in Barcelona. Length is the street length in metres, and st is the number of streets within the circular area of 1 km2 centred on the measurement site.

[...] two-week passive dosimeter campaign described in Sect. 3.2. Model performance is quantified using performance measures as described by Chang and Hanna (2004) and using assessment target plots (defined in the FAIRMODE initiative; Janssen et al., 2017). The performance statistics used here are the geometric mean bias (GeoMean), the fraction of model results within a factor of two of observations (FAC2), the geometric standard deviation (GeoSD), the correlation coefficient (R), the mean bias (MB) and the root mean square error (RMSE). The mathematical expressions of these statistics can be found in Appendix C.
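For orientation, four of these statistics can be computed as in the sketch below, which uses their standard textbook definitions rather than the exact expressions of Appendix C (not reproduced here). The geometric statistics are omitted because their precise form is not given in this text. The function name and the treatment of non-positive observations in FAC2 are choices made for this sketch.

```python
import math


def performance(model, obs):
    """Return MB, RMSE, FAC2 and R for paired modelled/observed values."""
    n = len(obs)
    pairs = list(zip(model, obs))
    mb = sum(m - o for m, o in pairs) / n
    rmse = math.sqrt(sum((m - o) ** 2 for m, o in pairs) / n)
    # Fraction of pairs with 0.5 <= model/obs <= 2 (non-positive obs excluded).
    fac2 = sum(1 for m, o in pairs if o > 0 and 0.5 <= m / o <= 2.0) / n
    mean_m = sum(model) / n
    mean_o = sum(obs) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in pairs)
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    r = cov / math.sqrt(var_m * var_o)
    return {"MB": mb, "RMSE": rmse, "FAC2": fac2, "R": r}


if __name__ == "__main__":
    modelled = [10.0, 30.0, 50.0]   # illustrative values only
    observed = [20.0, 30.0, 40.0]
    print(performance(modelled, observed))
```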

Temporal variation of NO2 concentrations within urban streets
The scatter plots of Fig. 6 compare CALIOPE and CALIOPE-Urban outputs with observations based on hourly, daily mean and maximum modelled concentrations at the six sites described in Sect. 3.1 for April and May 2013. In general, CALIOPE-Urban shows greater agreement for hourly, daily mean and maximum concentrations but tends to underpredict daily peak concentrations at sites not exposed to very high traffic intensity (i.e. sites where the urban background contribution predominates, like Gràcia-Sant Gervasi). During the study period most daily maxima (i.e. 56 %) occur at morning or evening traffic peak times (i.e. 6-7 or 18-20 UTC), when atmospheric conditions are typically stable and traffic intensity is high.

Table 3 shows the model performance statistics computed with hourly data, including the CALIOPE-Urban-nl run. We compare CALIOPE-Urban and CALIOPE-Urban-nl to assess the difference in performance derived from the use of the local developments described in Sect. 2.3. All systems perform well at urban background sites and only CALIOPE-Urban gives good agreement with observations at traffic sites. The greatest difference between CALIOPE and CALIOPE-Urban performance is produced at the 455 Valencia Street site due to its street canyon morphology (a_r = 0.86). At this site, the mean transport is well resolved by the channelled winds, and its high AADT produces a large increase in traffic emissions within R-LINE.
CALIOPE-Urban-nl largely overestimates NO2 concentrations at this site for several reasons: it uses the output of the UBS directly for background, instead of applying the vertical mixing that reduces background at street level, especially under stable conditions; z0 is given the WRF value (z0 = 1.0), which is much lower than its locally estimated value (i.e. z0 = 2.2, see 2) that enhances dispersion, decreasing concentration levels; lastly, pollutant dispersion is not channelled within the street, so higher contributions from nearby streets may be expected. On the other hand, CALIOPE-Urban underestimations at 213 and 309 Industria Street and Gràcia-Sant Gervasi may be due to an unrealistically low AADT level on the street segment close to the site. We work with AADT data based on the outputs of the traffic model used by Barcelona City Council, which may underestimate traffic. Another explanation may be an underestimation of local background levels within the area, mostly during the afternoon. The afternoon underestimations in the mesoscale system could be caused by an overestimation of the mixing that produces too low a background NO2 concentration. This issue is difficult to correct because the background concentrations used in the system depend on mesoscale concentrations, which are underestimated during daytime. In Table B1 in Appendix B, the same statistics are computed for daily mean results, with findings similar to those of the hourly analysis. In addition, the analytical version of CALIOPE-Urban is shown to produce similar results for hourly concentrations to the numerical version in Table B2 in Appendix B. This result may be of interest for forecasting applications at urban scale that require high resolution, because the analytical dispersion algorithm spends approx. half the time computing in comparison to the numerical dispersion algorithm, as shown in Table 1.
Considering all sites, CALIOPE-Urban shows a much better correlation coefficient than CALIOPE (0.70 vs. 0.36) due to its good performance at traffic sites. Its correlation is similar to that of CALIOPE-Urban-nl. If we consider only urban background sites, CALIOPE shows a greater correlation coefficient than CALIOPE-Urban (0.66 vs. 0.54) and an MB closer to 0.
In addition, CALIOPE-Urban-nl gives a better correlation than both systems. A potential explanation for this result is related to the error compensation shown in the temporal evaluation (Sect. 4.1). CALIOPE and CALIOPE-Urban-nl may compensate the underestimation during daytime with the overestimation during nighttime. In contrast, CALIOPE-Urban may not compensate the daytime underestimations with overestimated night values because the background is reduced by the low vertical mixing during nighttime (stable) hours. An enhanced daytime NO2 background contribution would improve CALIOPE-Urban accuracy at urban background sites.

For traffic sites, CALIOPE shows a strong underestimation (MB = -25.57 µg m−3) and CALIOPE-Urban gives MB levels closer to 0. CALIOPE-Urban underestimations may be influenced by afternoon underestimations and a misrepresentation of traffic emissions in some areas of the city. In contrast, CALIOPE-Urban-nl gives a high MB and the highest RMSE among the three systems. This tendency of CALIOPE-Urban-nl to overestimate near traffic may be due to the reasons stated in Sect. 4.1. In general, close to intense traffic CALIOPE-Urban is very sensitive to emissions and its dispersion characterizes well the spatial variability for the study period. Reproducing spatial gradients near intense traffic is crucial in a city like Barcelona given its high vehicle density and NO2 concentration levels.

Figure 9 shows the difference between CALIOPE and CALIOPE-Urban results and measurements (top panels) and scatter plots at all sites (bottom panels), distinguished with colours by site type (e.g., traffic site, urban background site). In Fig. 9a the concentration difference map of CALIOPE shows an overall underestimation, represented by blue dots. This underestimation is found to be systematic at traffic sites in the scatter plot of Fig. 9c (purple dots), where modelled values barely exceed 50 µg m−3 while most of the observed values at traffic sites are above that value. In contrast, the CALIOPE-Urban difference map (Fig. 9b) shows a more mixed picture with a broader representation of white dots (bias close to 0) but also more [...] (Fig. 9d). In CALIOPE-Urban's difference map, we see a spatial pattern with average bias close to 0 in the city centre, where traffic is denser, and close to the highways surrounding the city. The appearance of red dots may indicate that CALIOPE-Urban overestimates close to highly trafficked areas while CALIOPE underestimates in these areas. This may be due to an overestimation of traffic emissions or background concentrations in these areas. In contrast, in locations where traffic is not very intense (see Fig. 2 for NOX emissions) CALIOPE-Urban shows systematic underestimations. This result may be derived from the systematic underestimation of midday NO2 concentrations in low traffic areas, as shown in Sect. 4.1.

Major uncertainty sources
Here we discuss potential sources of error in our model by analyzing episodes when the model was skillful compared with episodes when the model was not. Our analysis solely considers the meteorological and background concentration inputs as potential sources of error. While road traffic emission estimates may introduce large errors, we lack observations of traffic counts at the measurement site locations to properly assess them.
We calculated the daily RMSE of the hourly modelled NO2 concentrations versus the observed values at the six sites described in Sect. 3.1 during April and May 2013. For each site we picked the ten days with the highest RMSE and the ten days with the lowest RMSE as potential candidates. We conducted this analysis for both CALIOPE and CALIOPE-Urban, finding that both systems share to a large extent the days with skill (4 out of 5 days) and without (3 out of 5). This result shows that the coupled system performance is highly dependent on the mesoscale model performance. To explore errors potentially caused by R-LINE inputs, in Fig. 10 we compare the five days with less skill (i.e. 11, 16, 17 April and 7, 8 May) and the five days with more skill (i.e. 7, 20 April and 18, 19, 25 May) with observations for wind speed (ws), street level NO2 and background NO2.
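The day selection described above can be sketched as follows: compute one RMSE per day from the hourly pairs, sort the days, and keep the n best and n worst. The record layout (day label, modelled value, observed value) and the function names are hypothetical and chosen only for this illustration.

```python
import math
from collections import defaultdict


def daily_rmse(records):
    """Map each day to the RMSE of its hourly (modelled, observed) pairs.

    records: iterable of (day, modelled, observed) tuples.
    """
    squared_errors = defaultdict(list)
    for day, modelled, observed in records:
        squared_errors[day].append((modelled - observed) ** 2)
    return {day: math.sqrt(sum(v) / len(v))
            for day, v in squared_errors.items()}


def rank_days(records, n=10):
    """Return (n days with lowest RMSE, n days with highest RMSE)."""
    rmse = daily_rmse(records)
    ordered = sorted(rmse, key=rmse.get)          # ascending RMSE
    return ordered[:n], ordered[-n:][::-1]        # best first, worst first


if __name__ == "__main__":
    hourly = [("2013-04-07", 40.0, 42.0), ("2013-04-07", 55.0, 50.0),
              ("2013-04-11", 30.0, 70.0), ("2013-04-11", 35.0, 80.0)]
    print(rank_days(hourly, n=1))
```

Running the same ranking for two systems and intersecting the resulting day lists gives the kind of overlap count reported in the text.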
On skillful days, winds are relatively strong and well represented in WRF (Fig. 10a). Poor skill appears when the observed wind speed is low. Because WRF largely underestimates wind speeds (Fig. 10b) and NO2 concentrations are underestimated under calm conditions (Fig. 10d), other processes (e.g. atmospheric stability) may have a greater importance in this case. In our coupling, under very stable atmospheric situations, dispersion is reduced and background injection from the overlying atmosphere is limited. This control mechanism adapts the system to specific street conditions, regulating dispersion and background injection. For these days, an extended observational dataset would be needed to better understand the model behaviour.

To analyze the background concentrations from the mesoscale simulation as a potential error source, we compared NO2 observations from the Ciutadella urban background station with hourly modelled concentrations averaged over the six sites.
The results shown in Fig. 10e,f represent concentrations provided by upwind CMAQ grid cells depending on wind speed and direction (blue), as described in Sect. 2.3.3, and the same concentrations downscaled to surface level using the vertical decay method (green). As expected, observed NO2 concentrations on days with calm conditions, and therefore poor skill, are higher than on those with enhanced ventilation and better skill. The background model reproduces well the variation during both types of days but overestimates concentrations during nighttime (hours 19-22), particularly during days with calm conditions. This problem is partially corrected by using the background vertical decay method, as seen in Fig. 10f.

Hourly variation of street NO2 concentrations
Hourly street NO2 concentrations are expected to vary spatially and temporally, with higher values close to intense traffic sites during rush hours. Figure  intensity is higher at these hours of the day and the atmosphere tends to be stable, making pollutant dispersion more difficult.
On the other hand, lower concentrations are found at 0 UTC due to the lower traffic intensity and at 12 UTC. At 12 UTC traffic intensity is considerably higher than at 0 UTC but the atmosphere is more convective and pollutant dispersion is enhanced.

Conclusions
This study describes the development of a coupled regional to street scale modelling system, CALIOPE-Urban, which provides high spatial and temporal resolution (up to 10 m × 10 m, hourly) NO2 concentrations for Barcelona. It couples the mesoscale air results. For traffic sites, the coupled system shows better agreement in highly trafficked areas where local dispersion plays a crucial role. Regarding the average diurnal cycle at the observation sites, both systems follow the overall daily cycle in the observations, but CALIOPE-Urban predicts the morning peaks better and corrects the afternoon levels at traffic sites as well as the systematic nighttime overestimation produced by the regional system. The vertical mixing of rooftop background concentrations to surface levels based on atmospheric stability and urban geometry appears to be a good method to correct the strong positive bias of the mesoscale model under stable atmospheric conditions during the evening.
Spatially, CALIOPE-Urban performs better than CALIOPE at the dosimeters located close to traffic. This is because R-LINE explicitly resolves the dispersion of road traffic emissions, simulating the high gradients of observed NO2 levels that occur within a mesoscale system grid cell. CALIOPE-Urban overestimates more close to highly trafficked areas. This behaviour may be caused by an overestimation of traffic emissions on these roads or by an underestimation of dispersion. For dosimeters located more than 10 m away from traffic, both systems perform reasonably well. The higher the traffic in the surrounding area, the better CALIOPE-Urban performs compared to the regional system.
When exploring the main error sources, overall both systems produce results that are either accurate or inaccurate on the same days. This suggests that the coupled system results are highly influenced by the regional system results. low PBL height than under more convective conditions, with stronger winds and higher PBL heights. Another potential source of uncertainty is the integration within HERMESv2.0 of COPERT IV instead of COPERT V, which considers the diesel NOx exceedances revealed by Dieselgate for EURO 5 and EURO 6 diesel cars (Brown et al., 2018). In future work, we plan to update HERMESv2.0 with the new emission factors released with COPERT V and examine the influence of traffic emissions on CALIOPE-Urban results.

For high resolution air quality forecasts, we show that CALIOPE-Urban gives good results using either the numerical or the analytical dispersion algorithm. However, a system execution for the entire city using the analytical configuration takes approximately half the time of the numerical one. Hence, the analytical dispersion algorithm may be a suitable option for forecasting applications when sources, such as roadways, and receptors are located near the ground.
We show that traffic monitoring stations in Barcelona do not represent the highest NO2 concentrations in the city. We find the highest levels in heavily trafficked street canyons that are not well ventilated and near highways in the city surroundings.
As a consequence, we consider that additional monitoring sites located in these areas may better characterize the range of NO 2 concentration levels in Barcelona and give a better representation of human exposures.
This study has demonstrated that CALIOPE-Urban improves the accuracy of modelled NO2 concentrations in Barcelona compared to CALIOPE. The methodology is replicable in cities where a mesoscale chemistry transport model provides NO2 simulations, if urban geometrical data are available. The next step is to implement CALIOPE-Urban in the operational forecasting system for Barcelona to provide NO2 concentrations at street level, and to explore the impact of improved NOx emission estimates.
Code availability. CALIOPE-Urban source code is available for non-commercial use. Contact Oriol Jorba (oriol.jorba@bsc.es) and Jaime Benavides (jaime.benavides@bsc.es) for agreement details. Observational data in this work has been provided by co-authors from Institute

Appendix B: Extended performance evaluation

Appendix C: Description of model evaluation statistics
Here we define the model evaluation statistics used to compare observed measurements (obs) with modelled concentrations (mod): the geometric mean bias (GeoMean), the fraction of model results within a factor of two of observations (FAC2), the geometric standard deviation (GeoSD), the correlation coefficient (R), the mean bias (MB) and the root mean square error (RMSE).

Vardoulakis, S., Fisher, B., Pericleous, K., and Gonzalez-Flesca, N. "Modelling air quality in street canyons: A review". Atmospheric Envi-