An urban large-eddy-based dispersion model for marginal grid resolutions: CAIRDIO v1.0

The capability for high spatial resolutions is an important feature of accurate numerical models dedicated to simulate the large spatial variability of urban air pollution. On the one hand, the well established mesoscale chemistry transport models have their obvious short-comings attributed to their extensive use of parameterizations. On the other hand, obstacle resolving computational fluid dynamic models, while accurate, still often demand too high computational costs to be applied on a regular basis for entire cities. The major reason for the inflated numerical costs is the required horizontal resolution to meaningfully 5 apply the obstacle discretization, which is most often based on boundary-fitted grids, like e.g. the marker-and-cell method. Here we present a large-eddy-simulation approach that uses diffusive obstacle boundaries, which are derived from a simplified diffusive interface approach for moving obstacles. The diffusive interface approach is well established in two-phase modeling, but to the author’s knowledge has not been applied in urban boundary layer simulations so far. Our dispersion model is capable of representing buildings over a wide range of spatial resolutions, including grid spacings equal or larger than typical building 10 size. This opens up a very promising opportunity for application of accurate air quality simulations and forecasts on entire mid-sized city domains. Furthermore, our approach is capable of incorporating the influence of the land orography by the additional optional use of terrain-following coordinates. We validated the dynamic core against a set of numerical benchmarks and a standard high-quality wind-tunnel data set for dispersion-model evaluation.


2011)
. In contrast to geometry-aligned discretizations, which preserve a high degree of accuracy near obstacle walls but require high resolutions, this approach suits more to the integral aspect of building shapes and configurations at marginal resolutions.
However, by increasing the grid resolution, the approach transforms seamless into a traditional obstacle resolving Cartesian approach, as the interface eventually becomes sharply defined and imposes Neumann-boundary conditions on the pressure.
The paper is organized as follows: Section 2 provides a detailed model description, including the spatial discretization. Section 100 3 is dedicated to the evaluation of the model using idealized test cases and a more complex and realistic wind-tunnel dispersion experiment. Finally, in Section 4 the benefits and limitations of the approach are summarized and concluded by an outlook for potential future applications. The physical domain consists of an interspersed, stationary solid phase representing the buildings, and a mobile fluid phase.
The governing equations for the mobile phase are deduced from a simplified two-phase model neglecting restoring forces. In a two-phase model, phase-fraction functions α 1 , α 2 with 0 ≤ α 1 ≤ 1 and α 1 + α 2 = 1 are used to formulate the set of equations of both phases individually. The equation of motion of an incompressible phase is adopted from Drew (1983) (eq. 41, 45 therein). By setting the interfacial force density and surface tension to zero, the simplified momentum equation of the mobile 110 phase (indicated with α 1 ) is written as: ∂ t (α 1 ρu) = −∇ · (α 1 ρu ⊗ u) − α 1 ∇ (α 1 p) + p I ∇α 1 + ∇ · (α 1 T) + α 1 ρb (1) p I is the interface pressure, reflecting Newton's third law of motion near a fixed wall. As in Kemm et al. (2020), it will be assumed that it is in equilibrium with the surrounding fluid pressure p. T is the stress tensor, which in LES averaging contains the contributions from subscale and surface fluxes. In this model, the implicit approach is used for spatial filtering 115 (Schumann, 1975). Viscous stresses are neglected due to the high Reynolds numbers typically encountered in atmospheric flows. b corresponds to the sum of external body forces, which also contains inertial forces in a rotating frame of reference.
The interface in our case is static, as buildings are assumed not to be affected by the flow. This makes it possible to multiply the equation of motion with α −1 1 ρ −1 , to obtain the tendency equation in single-flow denotation. The density is not allowed to change in time and is denoted with ρ ref for the reference state, which makes sense for an incompressible fluid. The full set of 120 model equations reads: α −1 1 ∇ · (α 1 u) = 0. (3) In the momentum equation (eq. 2), the body-force term was replaced by the Coriolis term, containing the Coriolis parameter f for a mean geographic latitude, and with the buoyancy term using the Boussinesq approximation. g is the gravity-acceleration vector in the local frame of reference, Θ v the virtual potential temperature defined by Eq. 6, and the overbar denotes for the horizontal mean state. Eq. 3 is the continuity equation derived by the same arguments from the original equation in Drew (1983). 130 The transport equation for scalars, like Θ v and specific humidity Q v , contains source terms from parameterized surfaces fluxes.
These have to be multiplied with α −1 1 . The scalar field k h is the eddy-diffusion for heat, which arises from the spatial filtering. Note that in our case, the combinations α −1 1 ∇ · α 1 and α −1 1 α 1 ∇ can be identified as particular versions of the divergence and gradient operator, which incorporate diffusive boundaries. It is also interesting to mention that the stencil of α 1 (face-centered) differs from α −1 1 (volume-centered) on a staggered grid.

Computation grid and diffuse boundaries
The computation uses a structured hexaedral grid, with the option for local grid stretching in all 3 dimensions. The velocity components are defined at the cell faces, whereas scalar fields are cell centered, classifying the grid structure as Arakawa-C type. Vertical coordinate transformation allows for a curve-linear grid in the physical space, which can be used to follow a smoothly varying terrain function: z is the mean-sea level height and h(x, y) the terrain-height function. The first grid level matches the terrain function, and all elevated levels are given by adding a constant vertical increment to the first level.
The pressure gradient in terrain-following coordinates expands to: Using Eq. 9, the advective tendency of a scalar q can be written as: Similarly, the divergence-free criterion follows in terrain coordinates: Combining Eq. 11 and 8, the metric terms in the Laplace operator are obtained, which is needed in the pressure equation: − ∂ z (∂ y h∂ y ) + ∂ z (∂ x h) 2 ∂ z + ∂ z (∂ y h) 2 ∂ z

160
As u, v, and w are maintained as the prognostic model variables and g is invariant under the given coordinate transformation, no metric terms arise in the buoyancy term. However, the horizontal averaging requires a remapping of Θ v fromz = const to z = const.
The spatial discretization uses a finite volume method, which allows a consistent treatment of the diffusive obstacle interface. 165 The conservation of a scalar q under phase partitioning is formulated using the Gauss theorem: Here, subscript m refers to the mobile phase and subscript s to the solid phase for the building-interior. χ m is the volumefraction function of the mobile phase and η m the area-fraction function over which the flux of the mobile phase is integrated.
As already mentioned, the stationary solid phase is dropped, as it is ∂ t χ m = 0, ∂ t q s = 0 and u s = 0. This simplifies the conservation form and finally leads to the particular differential form of the transport equation for the mobile phase: The subscript m was only briefly introduced here and will be dropped again, as only one phase, the mobile phase, is of interest.

175
Using the Cartesian grid structure, the flux-divergence can be put in a discrete form: ∆V and ∆A x,y,z are the cuboid volumes and face areas, respectively. The superscripts R and L refer to the left and right cell face for each dimension, respectively. Note that for f z , the contra-variant flux has to be used.
The pressure gradient components on the cell faces are discretized with second-order accuracy. For example, the gradient components on the x-orientated cell faces, and z-orientated faces are: where L z→x is the linear interpolation operator from a z-face to an x-face, and ∆ x is the cuboid size used for differencing the terrain function.

190
For a grid-conforming surface, the area fractions η x,y,z = 0, and it follows that the required Neumann-boundary condition is satisfied. Furthermore, if a grid cell is surrounded by grid-conforming surfaces, the pressure value inside is decoupled from any exterior grid cell, which matches the physical meaninglessness of such a value. For partially open semi-permeable cell-faces the boundary condition is imposed only on a fraction of the cell face area.
The scaling fields for the cell volumes and face areas intersected by buildings are derived from geometric building data sets. As an alternative to terrain-following coordinates, it is possible to use diffuse boundaries for the terrain, in which case, the subsurface is represented by an obstacle too. Per definition, the volume-fraction field χ is expressed as the fraction of the building-free volume in each grid cell. For numerical reasons, however, χ is limited to a value of about 0.01. For well resolved buildings, the area-scaling fields η x,y,z would be derived by calculating the intersections of the buildings with the grid-cell 200 faces. For under-resolved buildings, this method is very sensitive to the grid alignment, and in certain cases, buildings are missed, if they do not intersect with the cell faces, but nevertheless block the flow. Figure 1a) shows two possible scenarios, where grid cells are intersected by a building, which obviously blocks the flow in one dimension entirely but does not intersect with the cell faces of the blocked flow direction. In order to reliably capture such bottlenecks, a modified method to calculate the area fractions is used. The grid-cell volume is partitioned in slices, with the slicing planes displaced along the dimension 205 considered (e.g. x−dimension for the yz−faces). The minimum value of the free-volume fractions of these slices defines a so-called cell-area-scaling factor, which is assigned to the face being closer to the obstacle. For a more robust capture of nonparallel building walls, the slicing can be repeated several times with a slight rotation of the plane normal. If some cell faces are assigned to both values from the adjacent left and right grid cell, respectively, the minimum of both values is taken. For the cell-faces left without assignment, the geometric intersection with buildings is calculated as suitable for the resolved case.
210 Figure 1b shows on a practical example that the method preserves the blocking effect of a triangular-shaped building, even at spatial resolutions, which would be too coarse to resolve the building walls. Scaling factors for a complete building, now depicted as fields calculated for 3 different grid resolutions. χ is the volume-scaling field, and ηx, ηy are the horizontal components of the area-scaling field.

Advection scheme
The main task in the discretization of the advection term in a finite-volume framework is the reconstruction of the cell-averaged fluxes on the faces to calculate the balance according to Eq. 15. It is noted, however, that even in the incompressible case, the 215 flux-balance form differs from the required advective form, as there is a residual divergence originating from the approximate solution of the pressure equation.
The reconstruction for a scalar reduces to several high-order 1-dimensional reconstructions of the cell-averaged value q on the cell faces. The fluxes are then obtained by multiplication with the exact momentum components, and are of the same spatial 220 accuracy as the reconstructed scalars. For the reconstruction itself, upwind-biased stencils are used, requiring two reconstructions on each cell face for the two flow directions. Since the grid-spacing is not equidistant, the reconstruction coefficients have to be calculated in advance of the simulation as follows: Requiring the reconstruction to be n th -order accurate, n cell averaged values are needed. Assuming the wind is blowing from the left (positive direction), the set of values encompasses q(j − r + s), s ∈ [0, n − 1], with n mod 2 = 1, r = (n − 1)/2 + 1 and j being the target cell-index for the flux-balance calculation. For the 225 negative direction, it is r = (n − 1)/2. In total, n + 1 different values are needed.
The reconstruction itself is performed using the spatial derivative of the Lagrange polynomial, which fits the primitive function evaluated at the cell faces (see e.g. Shu, 1998): The coefficients have to be evaluated at the cell faces j − 1/2 for the positive upwind direction, and j + 1/2 for the negative 230 one. The differences in parenthesis can be expressed in terms of the grid spacing. Instead of the geometric grid spacing, the effective grid spacing ∆ ef f x , which is derived from the scaling fields, is used as a pseudo grid. This ensures a decreased weighting of interpolation nodes within volume-compromised cells and prevents the scheme from interpolating the degenerated values inside closed cells.
x j−1/2 − x j−r+p−1/2 =: While the formulas permit reconstructions of arbitrary order, a 5 th -order reconstruction was found to give superior resolution to a 3 rd -order one, while still being computationally efficient. For any increase in the order, additional ghost layers need to be communicated. Additionally, limiting of the reconstructions is applied using a total-variation diminishing method of Sweby (1984). This will ensure monotonicity for scalar advection.

245
The two limited reconstructions q + and q − are finally combined to give the numerical flux based on the propagation velocity u: For momentum transport, the routine for scalar cell-centered advection is re-used to advect a left-faced u l and right-faced value u r for each momentum component (Hicken et al., 2005;Jähn et al., 2015), resulting in a total of 6 routine calls. In 250 contrast to scalar advection, monotonicity is generally not desired, as it has a detrimental impact on energy conservation and will interact with the subgrid model. To prevent wiggles near building walls and to increase numerical stability, only local limiting is applied at obstacle boundaries. For a reconstruction face at index position j located near a diffusive boundary, a local weight can be taken as α + x = |(η x (j − 1) − η x (j)| for the positive upwind direction, and α − x = |(η x (j + 1) − η x (j)| for the negative direction. If u + Lim is the limited reconstruction of u + , using the aforementioned weight α + x , the following merging 255 results in a weighted limiting of the velocity component perpendicular to an adjacent building wall: The scheme, if necessary, degrades only to first-order accuracy at obstacle boundaries, while in the free boundary layer, no additional numerical dissipation is introduced.

260
The final momentum tendencies are obtained by interpolation of two centered tendencies on the face: Note that it is combined the right-faced advective tendency from the left grid cell and the left-faced tendency from the right grid cell (superscripts L and R, respectively). Spatial accuracy of momentum advection is limited to the order of this interpolation procedure, which is of second order. In the incompressible-flow equations, the pressure gradient term is not directly coupled to a prognostic pressure equation.
The projection method of (Chorin, 1968) is used to split the solution procedure in two steps. The first step integrates all the momentum tendencies, except the stated pressure gradient term, explicitly in time to obtain a predicted velocityũ: The final velocity after one integration step u t1 has to fullfill the continuity equation: The not yet known corrective tendency is the neglected pressure gradient ∂ tũ = − 1 ρ ref ∇p| t1 . Applying the divergence operator on both sides gives:

275
Since, as mentioned, the left side has to be zero, one obtains the well-known Poisson equation for pressure, which is to be solved algebraically: The final state is now composed of the fractional tendencies: Equation 30 is also an example of a first-order accurate in time Euler method. Higher-order multistage Runge-Kutta (RK) methods have the advantages of increased temporal accuracy and larger time stepping. The numerical stability of the integration in most practical examples is constrained by the advective and pressure-gradient tendency. Therefore, all the remaining terms are considered as the minor tendencies, and are evaluated only at every first stage. Different RK methods were tried in our model framework. It was found that the 2 nd -order midpoint rule (MP-RK2) and the strong-stability preserving 3 rd -order with one final pressure solve. Based on this, SSP-RK3 will be used for all of the applications in this paper and in future.

Pressure solution
The discretization of the Laplace operator P in the Poisson equation (Eq. 29) is obtained by the product of the divergence matrix D and gradient matrix G defined by the discrete forms. D is defined by the flux-balance in Eq. 15. Combining the operators to the pressure equation, the following sparse linear system is obtained: u, p and b are the one-dimensional expanded arrays of the corresponding structured fields. Eq. 31 is solved with a geometric multigrid method in parallel using domain-decomposition in two dimensions. The multigrid algorithm consists of applying a smoothing method of choice which is accelerated by coarse-grid corrections. Therefore, a hierarchy of coarse grids is employed (Brandt and Livne, 2011). 3-d coarsening of the grid is carried out by agglomeration of 8 grid cells to form a coarse grid cell 300 of the next-level grid. For odd grid sizes, a plane of grid-cells with respective orientation is left uncoarsened. This particular multigrid used in combination with finite volume discretizations is often referred to cell-centred multigrid in literature (Mohr and Wienands, 2004).
Particular challenges to the multigrid algorithm include grid stretching and non-smoothly varying coefficients associated with the diffusive boundaries, both resulting in coefficient anisotropy. An odd grid size results in coefficient anisotropy of the coarse 305 grid operators. Furthermore, Neumann boundary conditions result in a reduced smoothing efficiency near boundaries. In such cases, plane smoothers are much more adequate than their point-wise pendants (Llorente and Melson, 2000). Nevertheless, most of the difficulties were overcome by applying less elaborated methods in the current model.
Based on smoothing analysis, Larsson et al. (2005) give a condition for the optimal location of an uncoarsened plane in the case of odd grid sizes. For an odd grid size in any direction, the Galerkin coarse-grid approximation (GCA) is much more 310 robust than re-discretization of the Poisson equation. In GCA, the coarse-grid operator is formed algebraically by applying interpolation operators from the left (restriction) and right (prolongation) on the original matrix. The original stencil size of 7 points (without using terrain-following coordinates) is only preserved for piece-wise constant interpolation for both restriction and prolongation. In this case, it is necessary to multiply with a factor of 1/2 to compensate for the poor approximation quality (Braess, 1995). Trilinear prolongation would give an adequate approximation quality for constant-coefficient operators, but it 315 failed in our case of highly discontinuous media. As a compromise, the trilinear interpolation operator can be modified to approximate piece-wise constant interpolation near diffusive boundaries. As a drawback of trilinear interpolation, the coarse-grid stencil encompasses 27 points which increases numerical costs of coarse-grid corrections.
The smoothing method employed on each grid has to be efficient for moderate coefficient anisotropies. Yavneh (1996) found that successive over-relaxation (SOR) is generally superior to Gauss Seidel smoothing (even for isotropic coefficients), and he 320 also derived approximately optimal over-relaxation factors for SOR with red-black ordering applied in multigrid for the solution of anisotropic elliptic equations. An advantage of the red-black ordering is the complete vectorization and parallelization of the algorithm, while a disadvantage is the not optimal cache efficiency, as the computation grid is accessed twice in each iteration (Di et al., 2009). A more cache efficient SOR method provides the standard lexicographic ordering, which however Suitable alternatives for anisotropic problems to SOR are sparse-approximate inverse (SPAI) matrices as smoothers (Tang and Wan, 2000;Bröker and Grote, 2001;Sedlacek, 2012). Depending on the approximation quality of SPAI, which can be controlled by the sparsity pattern and consequently the number of allowed non-zeros, the efficacy of the smoother can be flexibly enhanced in order to tackle the difficult Poisson problem with varying coefficients. Since the Poisson matrix is not time dependent, it pays off to invest computational resources in a good approximation, as it is only necessary once at the beginning of the 330 simulation. x

Lateral boundary conditions
Before each pressure correction step, global boundary conditions for the next time step are already updated for the intermediate 335 velocity fieldũ. This approach naturally leads to a homogeneous-zero Neumann boundary condition for the pressure, which is shown in the following. We require that the updated velocity field satisfies global mass conservation: Using Eq. 32 as the source term of the pressure equation (Eq. 29) and applying Stokes theorem, one arrives at a Neumann condition for the pressure: Here n is the unit normal vector on the boundary surface. Since the tendencies of the already updated boundary values are zero, no projection is applied, which corresponds to the homogeneous-zero condition: Note that by already updating the boundary condition forũ, the pressure boundary condition was homogenized by attributing 345 the in-homogeneity to the source term ∇ ·ũ. An important remark to Eq. 34 is concerning the special case, when terrainfollowing coordinates are used. By requiring ∂ t ω = 0 for the contra-variant vertical velocity, the pressure boundary condition for the top-and bottom-domain boundaries is given by: In this most general case, the projection still allows variations in the velocity, and it requires the solution of an additional 350 linear equation (or an additional grid plane in the pressure equation). In order to keep the boundary velocity vector fixed and to prevent numerical expenses, it is used the trivial condition at the top and bottom boundaries: In the discrete gradient operator, homogeneous Neumann boundary conditions are implemented by setting all coefficients associated with the node where the condition applies to zero.
The not yet specified boundary condition for the velocity field has to be compatible with a dynamic mesoscale forcing and also with the integrability condition of Eq. 32. The workflow is therefore to first dynamically determine inflow and outflow 360 regions, impose separate appropriate boundary conditions, and then add a correction flux to the outflow regions to balance the total inflow-region flux. The outflow regions are determined by C ⊥ > 0, where C ⊥ is a normalized convective transport speed out of the domain. C ⊥ is evaluated at interior grid points and time step t. For simplicity, it is taken C ⊥ = ∆t ∆x u · n, however more elaborate formulations exist. The transport velocity is further bounded to 0 ≤ C ⊥ ≤ 1. This ensures numerical stability of the prognostic radiation equation applied to outflow ghost cells. As proposed by Miller and Thorpe (1981), first 365 order upwinding is used: The order of the spatial indexing l is in normal direction to the boundary and not to confuse with the standard interior indexing. The index l + 1 corresponds to the first ghost cell. Figure 3 shows that the outflow boundary condition with the proposed convective transport speed is well suited to our incompressible model even for highly unsteady flows, like the depicted vortex 370 street in the wake of a cylinder. Individual vortices are not visibly reflected at the boundary, and also in the temporal mean, based on Fig. 3b, the influence of the boundary is not discernible.
Instead of the flexible inflow-outflow condition, a Rayleigh damping layer can be used as another prognostic tendency near the domain top: Eq. 39 can be applied to gradually relax any prognostic variable to a prescribed horizontal mean state at the top boundary.
R is a ramp function with values between [0, 1], τ the damping time scale and q 0 the prescribed boundary value.
The final step is to impose global mass conservation, for which a correction fluxṁ d is integrated for each dimension d 380 separately: where A DIR is the surface area on which a Dirichlet condition is imposed, and A RAD the remaining surface area with the radiation condition imposed. It is further d A d DIR + A d RAD = ∂V the total domain-bounding area. The correction fluxes are converted to a corresponding correction speed u dṁ , which is added to the boundary-perpendicular velocity component on 385 A RAD . The mass flux is distributed over the radiation-condition area by using for example C ⊥ in a weighting function: Attention must be paid to the sign of the correction velocity, as Eq. 41 is multiplied by a negative sign for right-hand sided boundaries. When using terrain-following coordinates, it is the contra-variant vertical flux perpendicular to the bounding surface and used in Eq. 40. The correction speed is simply added to the covariant component w.

390
The flexible boundary condition can principally be applied to any other scalar quantity, however, specifying Dirichlet conditions for advected scalars was found to be sufficient. For numerical simplicity and efficiency, a static Smagorinsky subgrid model is used (Deardorff, 1970). Since the atmospheric model is operated in the limit of infinite Reynolds numbers, the principal purpose of the subgrid model is to stabilize the numerical simulation through an additional amount of energy dissipation at the shortest wavelengths. Generally speaking, the magnitude of the subgrid fluxes shall be only a small fraction of those of the resolved fluxes, so that any potential non-physical assumptions in the subgrid model have little influence on the resolved fields. This is a prerequisite for large eddy simulations, 400 and is one of the main reasons, why the static Smagorinsky model, despite its obvious short comings, is still a popular choice.
The Smagorinsky model relates the subgrid turbulent fluxes to the resolved rates of strain s x,y : The subscripts x, y here refer to the spatial components. 405 The terms in Eq. 42 and Eq. 43 are cell averaged values, and overbars are neglected for convenience. The often additionally mentioned anisotropic residual-stress tensor is ignored in the given incompressible case. x,y is the eddy viscosity, which for numerical efficiency, is also diagnosed from the rates of strain instead of solving an additional prognostic equation for the turbulent kinetic energy. x,y is denoted in tensor form, as an anisotropic mixing length l x,y is used to reflect grid anisotropy: |S| is the Frobenius norm of the stress-rate tensor. c s is the Smagorinsky constant, which is fixed in a static model. Tests with a boundary layer simulation revealed that in combination with the 5 th -order upwind discretization values of 0.1 < c s < 0.15 give good results. Finally, the anisotropy enters via ∆ x,y which is related to the grid spacing and modified by the mean distance to walls: h x is half of the mean distance between walls orientated in the x-direction.
The function f s introduces the influence of the stratification on the eddy viscosity. It is assumed 420 with the Richardson number Ri For scalar diffusion the eddy viscosity is divided by the turbulent Prandtl number, which is assumed to be P r = 2/3 here.
If not mentioned otherwise, the appearing spatial derivatives are discretized with second-order differences. To obtain the The diffusive fluxes are given by: For diffusion of the u-component, the fluxes in the 3 spatial directions are f x,x , f x,y and f x,z . Those of the other components are obtained by permuting the subscripts. For a scalar quantity q, the required fluxes are f x,x , f y,y and f z,z , wherein the rates of strain are replaced by the gradient components.
The vertical transfer coefficient C z is given by transforming the logarithmic-wind law: k = 0.4 is the Von Kármán constant, z 0 the surface roughness length, and z the height difference from the modeled surface to the grid level where the parameterization is evaluated. In order to alleviate the often encountered problem with the log-layer 440 mismatch (Yang et al., 2017), it is advantageous to take the second or third grid level above respective surface, since close to the surface turbulence is not adequately resolved.
The momentum sinks from horizontal surfaces are: Analogously, the source terms for heat and moisture from horizontal surfaces are: Θ s is the surface potential temperature, and Q s v the surface specific humidity. A z is the total exposed horizontal surface 450 within the grid cell, and ∆V the effective cell volume. f m and f h are stability functions, and z 0 the surface roughness length.
f m is based on the Bulk Richardson number, which is the fraction of buoyant to shear energy production and calculated as: In stable conditions, RiB > 0, and the stability functions are empirically determined by Doms (2011) for land surfaces: Conversely, in unstable conditions RiB < 0, and Sources and sinks from vertical building walls are treated similarly, with the exception that the stability function is set to 460 unity. For x-orientated surfaces, z is replaced by half of the average distance between surfaces in the equation for the transfer coefficient. Analogously, A z is replaced by A x for the total projected surface area with x-orientation.
The surface fields Θ s and Q s v can be considered as part of the external forcing and have to be provided either by the hosting mesoscale model or field-interpolated measurements.  (2017) gives an overview of various turbulence generation methods to provide turbulent inflow conditions. Among the different methods, a turbulence recycling scheme was implemented in the model, as it is computationally efficient and can be applied to a wide range of different domains and flow types. One peculiarity of the used scheme is that the recycling plane can be placed at an arbitrary distance to the inflow boundary within the computation domain, and all 4 horizontal boundaries are 470 considered as potential inflow boundaries, unless periodic boundary conditions are specified. The only requirement is that the plane distance is at least several integral length scales to prevent a spurious periodicity of the recycled turbulent features. In some cases, it may even practical to place the recycling planes near the respective opposite boundaries, if the inflow conditions shall be those from an urban boundary layer. In this case, no extra development fetch is needed.
At each model time step, a horizontal filter is applied on the velocity components within the recycling plane. By setting 475 the characteristic filter width w r to the integral length scale, the mesoscale variations are spared, which is important in case of spatially and temporally varying boundary conditions. Vertical filtering is not feasible due to the strong vertical wind shear within the boundary layer. The filtered velocity component is subtracted to obtain the small-scale fluctuation component: Equation 59 assumes a boundary perpendicular to the flow in x-direction. Next, the turbulent intensity is re-scaled to the 480 target value, after which the fluctuation field can be added to the inflow boundary field u in of the large scale flow u ls .
u in (z, y) = u ls (z, y) + min a max , ||u tar || 2 (z) ||u || 2 (z) u (z, y) a max is used to limit the artificial amplification of turbulence shortly after model initialization.
The filtering operation, as well as the calculation of the turbulent intensities require communications in the parallel im-485 plementation. In order to keep a maximum degree of parallelism in the model, decentralized filtering is used, instead of the much simpler method of transferring all the data to a root node. Before filtering, additional ghost cells are exchanged, whose communication effort can be substantial for large filter widths. In this regard it is advantageous and adequate to use an efficient box filter. A final communication is potentially necessary to transfer the data to the inflow boundary, if the recycling plane is located on another subdomain. Overall, despite the communication overload, the computational costs of the recycling scheme 490 were found to be only 1% to 2% of total costs on average.

Model evaluation
A series of simulations of different complexity is carried out in order to assess the model accuracy and demonstrate the capabilities of the numerical core. The first model experiment is an advection test taken from Calhoun and LeVeque (2000).
A tracer is advected within a rotating annulus. Due to to the large number of intersected grid cells in manyfold ways, this 495 test can be considered as challenging for the empirically determined order of convergence. In the second test, which is also reported in Calhoun and LeVeque (2000), a wave of a test tracer is advected through an irregular obstacle field. The stationary wind field is a numerically-approximated potential-flow solution obtained with a single projection step. A sensitivity study is performed to determine the grid-spacing sensitivity of the advection routine in combination with the projection method. The third basic example to evaluate the dynamic core is the rising bubble simulation of Wicker and Skamarock (1998). While the 500 benchmark simulation is compressible, it will be determined how well the anelastic approximation in this model compares to it. The actual evaluation part is completed with the simulation of the wind-tunnel experiment "Michelstadt" (Berbekar et al., 2013). It provides a high-quality standard dataset for LES-type dispersion model evaluation. As a last example, which is orientated towards a more practical application, the models capability is demonstrated to simulate meteorology and airpollution dispersion under non-neutral stratification over non-uniform terrain.

Annulus advection test
In the advection test described in Calhoun and LeVeque (2000), a circumferential flow field inside an annulus with the inner radius R 1 = 0.75 and outer radius R 2 = 1.25 is defined by the following potential-flow function: r = x 2 + y 2 , and (x, y) are the coordinates of the grid-corner points. By differencing Ψ, the velocity components on the 510 grid edges are obtained: Analogous to the 3-D case, η x ∆y and η y ∆x are the effective lengths of the cell edges normal in x-direction and y-direction, respectively. It can be easily shown that the resulting velocity field satisfies ∇ · u = 0. The initial concentration of the advected 515 tracer is given by φ is the azimuth defined in a mathematically positive sense. One revolution of the tracer takes 5 s of simulation time, after which the analytical solution is identical to the initial state.
The integration step size is halved at each doubling of the resolution, starting at dt = 0.016 s.  Table 1. L ∞ is the maximum error magnitude, which convergences with an average rate of R ∞ = 0.8. In contrast, Calhoun and LeVeque (2000) did not observe convergence in the L ∞ norm for the Péclet numer P e = ∞ case. In the L 1 norm (mean absolute differences), the schemes accuracy is closer to second order with an average convergence rate R 1 = 1.73. The convergence rate is not significantly affected by the order of accuracy of the reconstructions, which is 5 th by default. This suggests that the flux limiting has a profound impact on the results of this test case.   Fig. 4

Advection trough an obstacle field
This 2-D test initially consists of an approximation of a potential flow solution for an irregular obstacle field. Therefore, one step of the explained projection method is applied on the initial wind field defined by u = 1 and v = 0. The resulting potential flow field is used to advect a test-tracer front, which is characterized by the left inflow boundary condition: For the transversal-flow direction, periodic boundary conditions are used.
The reference simulation is carried out on a domain with 200×100 grid cells and a uniform grid spacing of 1 m. The obstacles are circular with radii ranging from 5 m to 10 m. The simulation is repeated on coarser grids with dimension sizes of 100 × 50, 50 × 25, and 25 × 13, respectively. Figure 5 shows the simulation results at t = 150 s. The obstacles are well resolved on the 540 grid with 1 m spacing, but become more and more diffuse with decreasing grid resolution towards 8 m for the coarsest grid.
The initially planar wave is delayed and deformed by the obstacles. The qualitative impression is that the shape of the wave is not really sensitive to the grid resolution. Even in the most diffuse case, the position and shape of the wave matches that of the higher resolved simulations well. The wave dispersion can be quantified by considering the washout curves shown in Fig. 6, which are the spatially averaged concentrations at the outflow boundary versus time. For the finest grid, the concentrations at 545 the outflow boundary start to rise after t = 145 s and peak at about t = 185 s. At t = 250 s most of the wave is advected out of the domain. Remarkably, as already found by (Calhoun and LeVeque, 2000), the curve is not sensitive to the grid resolution up to 4 m, which is at the transition where obstacles start to become diffuse. At the coarsest resolution of 8 m, the peak is slightly broader, peak concentrations are lower and the peak occurs earlier by about 5 s. At this grid resolution, the intrinsic diffusion of the advection scheme becomes important, as the resolution capability is around 6 grid points which is barely enough to resolve 550 the wave.

Rising thermal
In the rising thermal simulation described in Wicker and Skamarock (1998) Wicker and Skamarock (1998). However, in our simulation the thermal is more compact, as 565 it is confined between x = ±2300 m and the peak height is at about 8100 m. In the original simulation it is confined between about x = ±2600 m and below 8500 m. Whether this slight discrepancy stems from the Boussinesq approximation or the different numerical schemes used, can not be finally clarified. By scale analysis, the magnitude of the buoyant acceleration is about one or two orders less than that of the inertial acceleration in the given example, so the Boussinesq approximation may indeed apply well here. The 5 th -order upwind scheme used here is much less diffusive than the 3 rd -order scheme used by

Michelstadt wind-tunnel experiment
The wind tunnel experiment "Michelstadt", which was carried out in the wind tunnel WOTAN (Lee et al., 2009) of the University of Hamburg, is used to evaluate the model accuracy and reliability against a physically based dataset. The benefits of using wind tunnel data for numerical model evaluation are the accurately controlled and determined approaching-flow conditions, the 580 high spatial and temporal resolution of measurements, and the high statistical significance of the data compared to field data (Schatzmann et al., 2017). the horizontally and temporally average state.
A first precursor simulation is run with periodic boundary conditions to obtain laminar and turbulent inflow profiles for the full vertical domain extend used to drive the actual experimental simulations containing the buildings and test-tracer release points. In this precursor simulation, which uses a shorter domain length of about only 1 km, buildings are not present and the entire rigid bottom-domain boundary is covered with roughness elements, with the arrangement adapted from the experiment.

615
In order to model the effect of the lateral wind-tunnel confinement, rigid walls are placed at the flow-perpendicular boundaries.
The modeled statistics of the established neutrally-stratified boundary layer flow are re-scaled to the reference wind speed of 6 m s −1 at 50 m height. The obtained horizontally averaged vertical profiles are vertically interpolated for the coarse-grid simulations. In the experimental simulations containing the model city, turbulent approach-flow conditions with the correct target intensities are generated using the turbulence-recycling scheme. The recycling plane is placed well downstream of the 620 model city and after a short pattern of roughness elements near the outflow boundary (Fig. 8a). This allows the application of a much shorter additional domain fetch for turbulence recycling, as the total recycling fetch can be much larger using the upstream domain. This particular positioning of the recycling plane can be justified by the similar parametric roughness length of the elements and the model city. Apart from that, the extracted fluctuation intensities are always re-scaled to the values of the target wind field.
3.4.1 Inflow profile Figure 9 shows the modeled and re-scaled horizontally averaged statistics of the boundary-layer flow using the precursor periodic domain. The experimentally obtained profiles are included for comparison. In the modeled mean horizontal wind speed, the roughness layer extends up to 20 m above ground, and is proceeded by the logarithmic Prandtl layer. The slope of the modeled wind profile in the semi-logarithmic depiction matches the observed profile well. The modeled turbulent intensities

Horizontal wind evaluation 635
The modeled horizontal wind is evaluated on the reference grid with 5 m horizontal resolution. Figure 8b gives a first qualitative impression of the agreement, as it shows the time-averaged horizontal wind vectors both for the model and the measurements within the plane of 2 m height. Overall, the model is capable of reproducing the measured flow pattern. However, the modeled wind direction does not always match the corresponding measured vector well, most notably near some intersections. The scatter plots for wind speed (Fig. 10) and wind direction (Fig. 11) give a more quantitative and conclusive picture. The accuracy 640 of modeled wind speed is high, when taking the normalized mean standard error (NMSE) as a proxy, whose value is consistently below 0.1. The fractional bias (FB) shows only a very slight underestimation of wind speeds (0.02 -0.07). This is not very surprising, since the model resolution of 5 m is not high enough to fully resolve the small-scale circulations within the urban canopy.  (c) (d) Figure 11. Scatter plot of modeled vs. measured horizontal wind direction within planes at different heights.

Dispersion simulation evaluation
The dispersion of point-source emissions is evaluated in the continuously emitting mode. The resulting time-averaged modeled concentrations are interpolated to the detector sides and paired with corresponding measurements. As proxy for the quality of model results in comparison to the measurements, the normalized mean standard error (NMSE), the fractional bias (FB) and the fraction of within factor 2 (FAC2) are calculated. Based on the guidelines presented in Baumann-Stanzer et al. (2015), acceptance criteria for a valid simulation are NMSE < 6, |FB| < 0.67 and FAC2 > 0.3.

650
The simulations with the approach flow direction of 0 • are used to optimize the model configuration with respect to these test criteria, while the reverse flow direction is used to validate the robustness of model results using the same parameter configuration. One important model tuning parameter for example is the Smagorinky constant, which was set to c s = 0.15 for all simulations. However, for the simulations with 40 m and 80 m resolution the vertical mixing length was increased to 20 m within the urban canopy to balance the underrepresented building-induced vertical mixing at such coarse resolutions.

655
Among the tested sources, S2 is the only one emitting into a quite open area, whereas all other sources are placed within street canyons or courtyards. Thus, S2 is probably the least difficult to simulate. Figure 12 gives an impression of the simulated time-averaged plumes resulting from S2 at the height of detectors. Figure 12a shows the reference simulation with the highest grid resolution. The overall qualitative agreement with measurements seems very good, except for a street canyon in roughly   for all other cases simulated. Based on it, it can be concluded that while moving to ever coarser grid resolutions the quality of model results generally declines (mostly evident in NMSE value), but not to an extend to compromise model reliability 680 at 40 m resolution, where buildings are represented diffusely. Only one source located within a courtyard proofed to be more problematic at 40 m grid spacing, which is due to the difficulty in modeling diagonally orientated building walls as impermeable as they are. Using the 80 m grid, the model still performs acceptable for some of the sources. Figure 14 shows scatter plots of the data pairs collected from all simulated cases with a given resolution, whose statistical results are again summarized in Tab where it is much larger (FB= −0.35). In this regard, the increased vertical mixing length showed to be an effective tuning option in combination with the 40 m resolution to keep the bias comparatively low (FB= −0.17). It has to be kept in mind that this test uses isolated point sources. When applied to a more realistic scenario with traffic emissions modeled for example by 690 emission lines, it can be hypothesized that scattering of the data would be of less concern, since the pollution is more widely distributed horizontally. Therefore, ultimately, the most important reliability measure in our view is the FB value, as it is a proxy whether the model will under-or over-estimates air pollution. In this regard, the 80 m resolution is the least accurate,  and it is currently not aspired without the use of a more sophisticated mixing parameterization. Finally, when comparing all model results with those from the 180 • -approach flow test only, it was found that this test with the model parameters adopted 695 from the 0 • -approach flow runs proofed to be not significantly less reliable. This is largely attributed to the simplicity of the model, as it requires little free parameter tuning.

Urban area within idealized basin
In this simulation, additional model features are combined for a more realistic test case, compared to the previous ones. For simplicity, this test is quasi 2-D. The third dimension is needed only for a realistic 3-D turbulent mixing, as turbulent mixing 700 behaves fundamentally different using two dimensions only.
The domain spans 2880 m in the lateral direction (x-dimension), 640 m in the streamwise direction (y-dimension) and 840 m in the vertical one (z-dimension). The horizontal grid spacing is 20 m and grid stretching is applied in the vertical dimension.  The grid spacing for the lower most levels is 7 m, and is increased beyond with approaching 50 m for the upper most levels. An urban area represented by rectangular buildings is placed at the bottom of an idealized basin whose cross-section is modelled 705 by the following analytic expression for the terrain-height function: The terrain-height function is constant in the streamwise direction, and also the building pattern is repeated in this direction.

710
For all rigid boundaries, including the diffusely discretized buildings, a surface roughness length of z 0 = 0.1 m is assumed. The initially neutral stratification of the atmosphere (Θ v = 280 K) is forced towards stable or unstable conditions by prescribing a constant surface temperature, which, in relation to the initial state of the atmosphere, is cooler or warmer by 5 K. For the heating case, also vertical building walls and roofs have an increased surface temperature, whereas for the cooling case, their temperature is kept at Θ v = 280 K. The wind field is initialized with an uniformly constant horizontal wind speed of v =  Figure 15 shows maps of Θ v and the tracer concentration fields for both the convective and stable case at the instantaneous times of 6 h and 18 h, respectively. The fields are averaged along the y-dimension to depict the basin cross section. In the convective case, 4 pronounced rotors are symmetrically aligned across the basin, with three main near-surface convergence zones: One 725 located directly over the city, and the remaining two on the basin edges. Heat transfer is locally enhanced either by higher surface wind speeds over the basin slopes or through the additional surface area of the buildings within the city. As a result, a pronounced heat island can be observed over the city, which is responsible for the lifting of the air pollution plume originated from the city inside the strong updraught. The air pollution eventually mixes throughout the height range of the simulated boundary layer. Near-surface air pollution in the city remains comparatively low and concentration gradients are weak. In the 730 stable case, only a shallow boundary layer develops. However, the basin is filled with cooler air and a pronounced inversion layer is present. Below the inversion layer, an erratic shallow circulation pattern with weak horizontal winds is present. Also noticeable are weak katabatic winds blowing down-slope towards the city, where the coldest air gathers. The inversion layer keeps air pollution trapped, and therefore the concentrations increase with advancing simulation time. This phenomenon of increased air pollution inside city basins is commonly observed during winterly high-pressure periods. While the air pollution 735 within the basin is distributed evenly horizontally it decreases with height, the highest concentration being present at street level in the city.
In this example, it is demonstrated that the effects of a wavy surface orography interact in a complex way with the thermal effects resulting from surface-heat transfer. This is especially true for the convective case, where both the urban heat island effect and the lee effect of the slope have a large influence on the location of the zones of convergence. In any case, the 740 incorporation of terrain effects from the surroundings is crucial for other more accurate dispersion simulation, which highlights the importance of a holistic simulation approach. From a numerical point of view, the use of a curve-linear grid is advantageous over a standard Cartesian approach, as the vertical grid stretching can be applied just in the same way as for a flat domain. This results in a much lower number of grid boxes for the same near-surface resolution.

745
In this paper, a new large-eddy based dispersion modeling approach for urban application was presented. The model uses diffusive obstacle boundaries in the framework of a finite volume discretization to represent building walls at a wide range of spatial resolutions. Diffusive obstacle boundaries allow for a consistent implementation of buildings in the model code, as they are essentially described by a scalar field for the volume-scaling factors and a vector field for the area-scaling factors. Using these fields to discretize the differential operators, boundary conditions are incorporated naturally and the governing equations in particular the diffusive obstacle treatment. When compared to data of the "Michelstadt" wind tunnel experiment, the model 765 simulated reliably complex wind fields and embedded tracer dispersion. As a result, even for spatial resolutions beyond 20 m, at which buildings can only be represented as increasingly diffuse features, the sensitivity study researching grid-spacing of the dispersion test showed promising results for a future study with more realistic emission distributions and real mid-sizes cities.
In near future, also the coupling with mesoscale meteorology will be addressed. From previous and accompanying air-quality studies, simulations with the regional CTM COSMO-MUSCAT are available for different German cities, including Berlin and 770 Leipzig, for which also comprehensive measurement data are available for model evaluation. In this framework, a promising application could be a more comprehensive and holistic model evaluation with field data, as additionally to airy monitoring, mobile measurements may become available for the city of Leipzig. Potential model improvements worth addressing in the future are the parameterization of air pollution sinks and the implementation of a simple urban atmospheric chemistry. Alternatively, it would be interesting to include diffusive obstacle boundaries in MUSCAT and to investigate whether there is a benefit 775 in the application on urban air pollution modeling.
Code and data availability.  Table A2. Statistical results derived from the combined data of all simulated sources at the given spatial resolution. The cases superscripted with a star are the combined results of the 180 • approach-flow cases only.