Mesoscale nesting interface of the PALM model system 6.0

In this paper, we present a newly developed mesoscale nesting interface for the PALM model system 6.0, which enables PALM to simulate the atmospheric boundary layer under spatially heterogeneous and non-stationary synoptic conditions. The implemented nesting interface, which is currently tailored to the mesoscale model COSMO, consists of two major parts: (i) the preprocessor INIFOR, which provides initial and time-dependent boundary conditions from mesoscale model output and (ii) PALM’s internal routines for reading the provided forcing data and superimposing synthetic turbulence to accelerate 5 the transition to a fully developed turbulent atmospheric boundary layer. We describe in detail the conversion between the sets of prognostic variables, transformations between model coordinate systems, as well as data interpolation onto PALM’s grid, which are carried out by INIFOR. Furthermore, we describe PALM’s internal usage of the provided forcing data, which besides the temporal interpolation of boundary conditions and removal of any residual divergence includes the generation of stability-dependent synthetic turbulence at the inflow boundaries in 10 order to accelerate the transition from the turbulence-free mesoscale solution to a resolved turbulent flow. We demonstrate and evaluate the nesting interface by means of a semi-idealized benchmark case. We carried out a large-eddy simulation (LES) of an evolving convective boundary layer on a clear-sky spring day. Besides verifying that changes in the inflow conditions enter into and successively propagate through the PALM domain, we focus our analysis on the effectiveness of the synthetic turbulence generation. By analysing various turbulence statistics, we show that the inflow in the present case is fully adjusted after having 15 propagated for about 2 to 3 eddy turn-over times downstream, which corresponds well to other state-of-the-art methods for turbulence generation. Furthermore, we observe that numerical artefacts in the form of grid-scale convective structures in the mesoscale model enter the PALM domain, biasing the location of the turbulent upand downdrafts in the LES. With these findings presented, we aim to verify the mesoscale nesting approach implemented in PALM, point out specific shortcomings, and build a baseline for future improvements and developments. 20


Introduction
The simulation of urban flows under realistic conditions poses a multiscale problem where evolving synoptic scales interact with building-and street-size scales. While the continuing growth of available computational resources enables large-eddy simulation (LES) to be applied to more and more realistic scenarios at regional scales (Schalkwijk et al., 2015), it still remains unfeasible to simulate mesoscale processes at resolutions fine enough to represent small-scale turbulence generated by urban is not clear what happens e.g. for diagonal inflows, making the turbulence recycling difficult to apply for evolving inflow conditions.
In contrast to recycling methods, volume-forcing approaches do not necessarily require homogeneous inflow conditions.
To accelerate the spatial development of a turbulent flow, Muñoz-Esparza et al. (2014) developed and implemented the socalled cell-perturbation method into the Weather Research and Forecasting Model (WRF, Skamarock et al., 2008), where 5 box-like perturbations are added onto the potential temperature within a defined region close to the inflow boundary. This was further developed by Muñoz-Esparza et al. (2015) and Muñoz-Esparza and Kosović (2018) by scaling the thermal perturbation amplitude depending on the stability regime. With this approach, Muñoz-Esparza and Kosović (2018) could significantly reduce the required fetch length to obtain fully adjusted turbulence, even under neutral and stable stratifications. The cell-perturbation method has shown promising results when applied in nested WRF-LES simulations of a full diurnal cycle for a real-world 10 setup , as well as in simulations for ocean-island interactions (Jähn et al., 2016) and coastal sea breeze events (Jiang et al., 2017). Furthermore, prescribing WRF output data as boundary conditions in a PALM simulation, Lee et al. (2019) demonstrated the ability of the cell-perturbation method in a densely built-up urban environment, where the required buffer zones could be significantly reduced compared to a non-perturbed simulation. To test a more direct method of turbulence generation , Mazzaro et al. (2019) extended the original cell-perturbation method by adding scaled perturbations 15 onto the velocity components at the near inflow region. This approach showed a comparable performance compared to the original version with a faster spatial development close to the inflow boundary but a longer adjustment fetch required to achieve an equilibrium state.
An alternative to volume-forcing approaches are so-called filtering approaches, where spatially and temporally correlated perturbations are imposed only onto the velocity components at the lateral boundaries (e.g. Klein et al., 2003;Xie and Castro, 20 2008). Gronemeier et al. (2015) have originally implemented the synthetic turbulence generation method by Xie and Castro (2008) into PALM and found good agreement of the turbulent flow development in an urban environment compared to using the turbulence recycling method according to Kataoka and Mizuno (2002). The main challenge of this approach, however, is to adequately infer Reynolds-stress components, as well as turbulent length and time scales of the flow to generate appropriate inflow turbulence. 25 Beside the necessity to add perturbations at the boundaries, modellers should also be aware that numerical artefacts from the mesoscale model may propagate into the LES; e.g. Mazzaro et al. (2017) and Muñoz-Esparza et al. (2017) showed that under-resolved flow structures propagate from a mesoscale WRF simulation into the LES. Honnert et al. (2011) found that in convection-permitting mesoscale simulations, resolved-scale convection on the grid scale can develop when the boundarylayer depth approaches the horizontal grid resolution. When boundary-layer convection is explicitly resolved in mesoscale 30 models, this is often referred to as the grey zone, or terra incognita (Wyngaard, 2004), where both, mesoscale and LES model assumptions break, i.e. the grid spacing is already too small compared to the dominant length scales of the flow to justify usage of fully parameterized boundary-layer schemes but still too large to reliably resolve convective structures. Ching et al. (2014) and Zhou et al. (2014) showed that the strength and spatial scale of the resolved-scale convection strongly depends on the horizontal grid resolution, while Shin and Dudhia (2016) also confirmed a dependence on the applied boundary-layer 35 scheme. With a grid nesting of a turbulence-resolving WRF simulation into a mesoscale WRF simulation, Mazzaro et al. (2017) showed that such under-resolved flow structures propagate into the LES, delaying the spatial development of turbulence. For a strongly convective case, Mazzaro et al. (2017) further showed that first-order statistics in the LES are not significantly affected by imposed under-resolved convection from the parent mesoscale simulation when the flow has been fully adjusted, though variances, turbulent vertical fluxes, and length scales in the LES tend to become larger for stronger under-resolved 5 mesoscale convection. Further, they showed that the signals of the imposed up-and downdrafts from the mesoscale model vanish after about 40 km downstream of the inflow boundary, even though they also noted that under less convective conditions the signals may even persist longer. However, this implies that in case of under-resolved convection in the mesoscale model, the turbulent transport in the LES as well as the location where up-and downdrafts occur, depend on the mesoscale model setup, i.e. horizontal resolution, boundary-layer parameterization, etc. In our test scenario we also found under-resolved roll-like 10 convective structures that propagate into the LES domain.
Another issue that emerges when nesting LES in mesoscale models concerns the representation of the atmospheric boundary layer. Due to different treatment of turbulent transport, i.e. boundary-layer parameterizations in the mesoscale model versus an explicit representation of turbulent eddies in the LES model, the vertical transport of energy, water, and momentum may differ considerably. In situations where this is the case, the mean state of the LES solution, which is generally more credible due 15 to a wider range of explicitly resolved turbulent scales, will be pushed towards the mesoscale solution including any possible model biases.
In this paper, we present the mesoscale nesting interface of the PALM 6.0 model system. It provides time dependent spatially heterogeneous boundary conditions for PALM obtained from the mesoscale model COSMO (see for instance Baldauf et al., 2011) and includes a synthetic turbulence generation method to accelerate generation of turbulent fluctuations at the 20 model boundaries. COSMO has been developed by the Consortium for Small-scale Modeling 1 and currently serves as the operational regional weather forecasting model at the German Meteorological Service (Deutscher Wetterdienst, DWD). PALM's mesoscale-nesting interface consists of two major parts: (i) the preprocessor INIFOR, which derives initial and boundary conditions from mesoscale model output, and (ii) PALM's internal routines for reading these forcing data and superimposing synthetic turbulence. In particular, we impose synthetic turbulent structures at the lateral boundaries following Xie and Castro 25 (2008), while the required turbulence statistics are parameterized based on mesoscale model output. At the moment, INIFOR is tailored towards the COSMO model, but extensions to WRF and ICON (Zängl et al., 2015;Reinert et al., 2020) are planned in the future. Note that the scope of this paper is on the description of our nesting approach and on the demonstration of its effectiveness. We defer in-depth analyses and comparisons to other methods to further publications.
This approach provides a one-way nesting capability of PALM into a mesoscale simulation, where boundary conditions 30 are only set for child model. At this point, we want to distinguish the mesoscale nesting interface from PALM's self nesting capabilities . While self-nesting allows a two-way coupling of a PALM child domain within a parent PALM domain, the mesoscale nesting interface presented in this paper realizes a one-way or offline nesting of PALM within a mesoscale model. That means, while PALM obtains time-dependent boundary conditions from the mesoscale model, informa-1 http://cosmo-model.org This paper is structured as follows. We describe the mesoscale nesting approach in Sect. 2, including the relevant mesoscalemicroscale model differences, the resulting transformation and interpolation methodology implemented in INIFOR, as well as the synthetic inflow turbulence generation method with its underlying turbulence parametrizations. We demonstrate and verify 5 our mesoscale nesting approach in a semi-idealized benchmark simulation of a convective boundary layer under evolving synoptic conditions. We describe the setup in Sect. 3 and analyse the case thereafter in Sect. 4. We conclude this paper with a summary of our findings and an outlook to future developments in Sect. 5.

Mesoscale nesting interface
The PALM model is nested into the mesoscale model by prescribing initial conditions and time-dependent Dirichlet boundary 10 conditions derived from output of the parent mesoscale model. Boundary conditions for PALM are given for the top and lateral domain boundaries. The boundary conditions at the surface are provided by PALM's urban-and land-surface model, which are initialized from the mesoscale data.
The boundary conditions enter PALM via the mesoscale-nesting interface, which consists of two major components: (i) the pre-processor INIFOR and (ii) PALM's internal boundary condition routines. The workflow of a model run using the 15 mesoscale-nesting interface is illustrated in Fig. 1. First, the forcing data are interpolated in a pre-processing step using INIFOR and stored in a netCDF driver file. In analogy to the static driver , which contains all static geospatial information such as topography, building and surface parameters, etc., we refer to this forcing file as the dynamic driver. During the simulation, PALM successively reads and processes the dynamic driver data. This includes temporal interpolation of the boundary data, removal of any residual divergence, as well as the superposition of synthetic turbulent fluctuations (see Sect. The required prognostic variables for which the dynamic driver provides initial and boundary conditions are listed in Tab. 1 next to their equivalents in the COSMO model output. Note that we use upper-case letters to denote COSMO's dependent  (u, v, w, θ, and q v ) at model start, which can be supplied either as one-dimensional vertical profiles (level-of-detail, LOD = 1) or as threedimensional fields (LOD = 2). Since the initial atmospheric data is already interpolated onto the PALM's Cartesian grid by INIFOR (see Sect. 2.2), it can be directly copied onto the respective internal PALM arrays after it is read from the dynamic driver. Further, the dynamic driver contains the initial state of soil moisture (m soil ) and temperature (t soil ), again either as 5 one-dimensional profiles (LOD = 1) or as horizontally heterogeneous three-dimensional data (LOD = 2). To allow for different number of soil layers in the PALM domain depending on the local soil type, we decided to linearly interpolate the provided soil data during soil-model initialization rather than in a preprocessing step done by INIFOR as it is done for the initial atmospheric quantities. At this point, we note that the provided initial soil data only contains values aggregated over a mesoscale grid cell, which in reality may feature surfaces with various soil types and different land use across which soil moisture and temperature 10 can vary significantly.
Hence, we recommend to run the soil-model spin-up mechanism as described in Maronga et al. (2020) to obtain individual soil moisture and temperature profiles that are in equilibrium with the local conditions at each model surface. In case of self nesting, where fine-resolution child domains are nested within a coarser-resolution outer parent domain, it is sufficient to provide initial mesoscale model data for the outermost parent domain only, while the respective initial data is propagated 15 to the nested child domains as described in Hellsten et al. (2020). However, the user may also provide a separate dynamic driver for PALM to initialize atmosphere and soil quantities in the respective child domain, which is useful, for instance, if high-resolution initial soil data for a limited area is available.
In addition to the initial state, the dynamic driver provides the time-dependent boundary conditions for PALM's atmospheric prognostic variables (u, v, w, θ, and q v ) at the top and the four lateral boundaries at certain points in time (hourly data is 20 provided from COSMO output). These time-dependent boundary data are read from the dynamic driver and are linearly interpolated onto the respective model time level, while the data is copied onto the respective model boundaries. In order to save memory, only the boundary data at LES time levels t i and t i+1 are read, with t i ≤ t s < t i+1 , while t s being the actual simulation time in the model. The boundary data can be provided either as one-dimensional vertical profiles (one value for the top boundary) that are identical at each of the lateral boundaries (LOD = 1), or as individual two-dimensional x-z (north and south lateral boundary), y-z (east and west boundary) and x-y (top boundary) cross-sections.
The velocity boundary conditions and the associated mass-flux fields obtained from a compressible mesoscale model such as COSMO do not generally satisfy the divergence-free condition of incompressible models such as PALM. To overcome this, 5 we correct the velocity w t b at the top boundary similar to the approach described by Hellsten et al. (2020). The correction is calculated from the integrated mass flux through the lateral and top boundaries as where u b denotes the velocity vector at the boundary, n the boundary normal vector and Ω the surface area of the model boundaries. We obtain divergence-free boundary conditions by using the corrected vertical velocity instead of the preliminary boundary condition w t b (x, y) at the top boundary. With this correction, we satisfy the mass-flux continuity globally. Local continuity is continuously maintained by the pressure correction step (cf. Appx. A).
We do not currently use any damping zones near the lateral boundaries to relax the solution towards the boundary conditions as is done for instance in the WRF model. There, a relaxation term according to Davies and Turner (1977) is added in the 15 momentum equations near the boundaries to suppress reflections of waves. For the present study, we tested the effect of such damping zones on the flow (not shown) but we could not observe any significant differences in the results nor any wave reflection. However, this may change in the future when PALM gains support for a compressible system of equations, which would add sound waves to the solution. 20 In the following, we describe the relevant model properties and point out the relevant differences, which yield the conceptual steps that need to be carried out by INIFOR. Here, we omit in-depth descriptions of both models and refer the reader to additional publications. In particular, more information about the formulation and numerics of COSMO can be found in the model documentation by Doms and Baldauf (2018). Details and verification studies of COSMO-DE -a particular model configuration used at DWD -have been published by Baldauf et al. (2011). For details about the PALM model system, please 25 see the descriptions by Maronga et al. (2015Maronga et al. ( , 2020 and the publications cited therein.

Model differences
PALM and COSMO differ in a number of ways, between which INIFOR needs to translate in order to derive PALM forcing data. The first difference lies in the physical formulation of the models. COSMO is a non-hydrostatic limited-area atmospheric model. It is based on fully compressible equations, which are formulated in terms of the three spherical wind components, absolute temperature and pressure, density and multiple water phases. PALM, on the other hand, solves incompressible equations for the moist atmosphere, where either the Boussinesq or an anelastic limit of the Navier-Stokes equations may be used. The model is formulated in terms of the three Cartesian wind components, the potential temperature and the water vapor mixing ratio. The continuity equation in the anelastic and Boussinesq approximations reduces to divergence constraint ∇ · (ρv) ≡ 0.
This restriction is not present in COSMO's compressible formulation and this difference is accounted for in PALM's side of the nesting interface by Eqs. (1) and (2). Furthermore, at horizontal grid spacings of several kilometres, turbulence in COSMO is fully parameterized such that its flow fields are essentially free of turbulent fluctuations. PALM, on the other hand, explicitly resolves the energetic part of the turbulent spectrum.

5
Secondly, owed to their different domain extents, both models use different approximations of Earth's surface and, as a result, use different coordinate systems. COSMO represents the planet as a perfect sphere with radius R = 6371.229 km and terrain layered on top of it. Consequently, it uses a spherical coordinate system, in particular a rotated-pole system as depicted in Fig. 2a. The origin of COSMO's coordinate system is rotated to the region of interest in order to minimize grid heterogeneity.
The rotation is defined in terms of the location of the rotated North Pole with the restriction that the prime meridian continues 10 to intersect with Earth's axis of rotation and, thus, with the geographical North and South Pole. In contrast, PALM is designed as a tool for simulating the atmospheric boundary layer, which implies domain sizes that are small compared to Earth's radius.
Hence, Earth's surface is represented as a tangential plane and the governing equations are formulated in a Cartesian frame of reference with the z coordinate aligned with the uniform gravitation vector field and the y coordinate facing north.
Lastly, COSMO and PALM use different numerical grids, which requires interpolation. Both models discretize their respec- Both are based on the Arakawa-C-type grid structure, where scalars are defined at the mass points at the cell centre and velocity components are staggered one half grid cell. In the vertical, both models allow for grid stretching. COSMO uses a hybrid zcoordinates, with levels in the lower region following the terrain and gradually approaching an upper region with horizontally 20 homogeneous spacings. The grid is constructed starting with the staggered velocity points, the so-called half layers. The full layers, where mass points are located, are defined as the arithmetic mean of two neighbouring half layers. PALM, on the other hand, uses a horizontally uniform grid that may contain both, parts with vertically stretched as well as equidistant grid spacings.
With PALM, typical grid spacings are on the order of 100 m to 1 m, while COSMO is designed for horizontal grid spacings on the order of 10 to 1 km. terrain. The particular rules governing the vertical grid generation can be found in DWD's database manuals (Baldauf et al., 2011;Baldauf et al., 2018) and in the COSMO model documentation (Doms and Baldauf, 2018), but in the context of data interpolation on that grid, knowledge of the underlying rules is not necessary. Rather, the vertical grid is completely defined by 10 the three-dimensional field of the half-layer heights, which is static in time and available in the model output.

INIFOR preprocessing
PALM and COSMO differ in their physical formulation, i.e. their prognostic and available output variables, the representation of Earth's surface, the coordinate systems, and the structure and resolution of the numerical grids used. To translate those differences, INIFOR needs to carry out the following conceptual steps: PALM rotated-pole system (λ p , ϕ p , z p ) COSMO rotated-pole system (λ r , ϕ r , h) Figure 3. Coordinate systems used in INIFOR. The PALM grid coordinates are first projected onto the PALM rotated-pole system (see Eq. (5)), which are then transformed to the COSMO rotated-pole system. In the following sections, we describe how INIFOR addresses each of these steps in detail.
Note that the data interpolation could be carried out in different coordinate systems. With INIFOR, we decided to interpolate in COSMO's rotated-pole system where the required interpolation neighbours are located on a rectangular grid leading to 5 simple and efficient interpolation rules. We obtain the COSMO coordinates for the PALM grid points using a two-step transformation as shown in Fig. 3. First, we project the PALM grid onto COSMO's geoid (see Sect. 2.2.2), resulting in a rotated-pole system similar to COSMO's but with a different rotated North Pole. Then, we transform from the rotated PALM system to the rotated COSMO system.

Conversion of prognostic variables
10 Differences in the model formulations require conversions between the sets of prognostic equations. In our case, this includes the computation of the potential temperature and the volumetric soil moisture. INIFOR converts both quantities before interpolating them onto the PALM grid.
As for the air temperature preprocessing, INIFOR replaces the absolute temperature T provided in the COSMO output by the potential temperature given by Here, P is the corresponding mesoscale pressure, p ref = 10 5 Pa is the reference pressure, and R d = 287 J kg −1 K −1 and c p = 1005 J kg −1 K −1 are the ideal gas constant and specific heat at constant pressure of dry air, respectively.

5
For soil data, preprocessing is slightly more involved because on sea or inland water cells, COSMO's soil data are not defined. Due to the coarser grid resolution shorelines or inland lakes do not necessarily correspond to the high-resolution surface input required by PALM. In order to provide soil data at each PALM grid point, the missing information is iteratively generated by horizontal averaging of soil data from neighboring land cells. Every iteration of this procedure generates new virtual land cells. By repeating this procedure using both the original and newly generated virtual cells, virtual shoreline moves 10 one COSMO cell per iteration. This procedure is currently repeated five times, before the field is used for interpolation. After the data extrapolation on the COSMO grid, the units of COSMO's soil moisture are converted to PALM's formulation. COSMO provides soil moisture as vertically integrated water density while PALM requires the volumetric water content. The conversion is given by 15 where WS k and ∆d k are the column-integrated soil moisture and thickness of the k-th soil layer in COSMO, respectively, and ρ w = 1000 kg m −3 is the approximate density of water.

Inverse map projection
There are multiple ways as to how the differences in the representation of Earth's surface can be resolved. Two options are illustrated in Fig. 4: (i) by a direct spatial transformation between coordinates of the rotated-pole coordinates and the tangential 20 Cartesian system, or (ii) by projecting the Cartesian system onto COSMO's geoid surface. The first approach, however, implies a change in the gravitational field: While the isosurfaces of the gravitational potential are concentric spheres (gravitation vectors point towards the geoid center), they are parallel planes in PALM's Cartesian system. As a result, a balanced stratified atmosphere in COSMO would produce baroclinic instabilities if it was directly transformed into PALM's Cartesian system.
With INIFOR, we avoid this effect by choosing the second approach and project PALM's system onto COSMO's geoid. This 25 corresponds to the inverse of a map projection.
In particular, we use an inverse plate carrée projection which linearly maps between the Cartesian coordinates (x, y, z) and the spherical coordinates (λ p , ϕ p , z p ) on a sphere of Radius R according to where the superscript p refers to PALM. This projection defines a rotated-pole system, the equator and prime meridian of which 30 pass through PALM's Cartesian origin (see Fig. 2c). By requiring the y-axis to point towards geographical North, we obtain a rotated-pole system of the same kind as COSMO's rotated-pole system where the prime meridian also intersects with the Earth's North Pole.

Coordinate transformation
When transforming the PALM to the COSMO rotated-pole coordinates, we consider the PALM system a rotated-pole system relative to the COSMO rotated-pole system, the same way the COSMO system is a rotated-pole system relative to the geo-5 graphical system. Because, as we discuss below, the definition and evaluation of the transformation from PALM's to COSMO's coordinates involves forward and backward transformations between rotated systems, we begin with the general forward and backward transformations. The forward transformation from a geographical (λ, ϕ) to a rotated-pole system (λ r , ϕ r ) is obtained from spherical geometry as (Baldauf et al., 2014) 10 ϕ r (λ, ϕ) = arcsin sin ϕ sin ϕ N + cos ϕ cos ϕ N cos(λ − λ N ) , where (λ N , ϕ N ) are the geographical coordinates of the rotated pole. The inverse transformation is given by ϕ(λ r , ϕ r ) = arcsin sin ϕ r sin ϕ N + cos ϕ r cos λ r cos ϕ N .
The definition of the PALM rotated-pole system starts with the specification of its origin in terms of its geographical coordi- In order to define the transformation to the rotated COSMO system, we need to translate the PALM origin into the corresponding rotated North Pole in the COSMO system. We do this by first computing the location of the rotated North Pole (λ N , ϕ N ) in the geographical system as and then, using the forward transformation in Eqs. (6) and (7), we obtain the rotated North Pole coordinates λ r N = λ r (λ N , ϕ N ) and ϕ r N = ϕ r (λ N , ϕ N ) in COSMO's frame of reference. Now the horizontal transformation between the PALM and COSMO is fully defined and we can transform PALM rotated-pole coordinates to COSMO's rotated-pole system using the backward transformation in Eqs. (8) and (9) using the PALM coordinates (λ p , ϕ p ) as the rotated-pole coordinates (λ r , ϕ r ) and COSMO's rotated-pole coordinates (λ r , ϕ r ) as the geographical ones (λ, ϕ).

5
Finally, the PALM domain may generally be shifted above sea level by specifying a non-zero domain base z 0 in order to vertically align COSMO and PALM orography. The COSMO heights (above sea level) of the vertical PALM levels at z p are then given by

Spatial interpolation 10
Having the COSMO rotated-pole coordinates for each PALM grid point available, we can interpolate COSMO fields by locating the appropriate interpolation neighbours and by computing the corresponding interpolation weights. We use the coordinate symbols laid out in Tab. 3 to describe the interpolation methodology. In particular, the COSMO rotated-pole coordinates are denoted by (λ r , ϕ r , h) while the Cartesian PALM coordinates are x = (x, y, z). Grid points are referenced with the indices i, j, k for the PALM grid while points on COSMO's grid are denoted by an additional hat. 15 Using this convention, a general interpolation scheme for an arbitrary scalar s on the COSMO grid can be formulated in Here, the indicesî l ijk ,ĵ l ijk , andk l ijk identify the l-th neighbour on the COSMO grid for the PALM grid point at x ijk and W l ijk are the corresponding interpolation weights, which satisfy N l l=1 W l ijk = 1. Since the scalar's values on the mesoscale grid are 20 known, the remaining task is to compute the values of those four fields. In INIFOR, we use bilinear and trilinear interpolation requiring only the four or eight closest neighbours, respectively, but the approach may be extended to higher-order schemes by including more points. INIFOR separates horizontal and vertical interpolation, which (i) simplifies the treatment of COSMO's terrain-following vertical grid and (ii) enables us to adapt the horizontal scheme to other grid structures in the future, such as the triangular horizontal grid of ICON, the Icosahedral Nonhydrostatic Model. (As of writing this paper, ICON is being used 25 as the operational global weather prediction model at DWD and ICON-LAM -its limited-area model variant -is designated to supersede COSMO as the regional model. For a description of ICON's grid see for instance the paper by Wan et al. (2013).)

Two-dimensional horizontal interpolation
In the case of bilinear interpolation, Eq. (12) reduces to two dimensions and we can drop the vertical indicesk and k and N l = 4. The indicesî l ij , j l ij of the four neighbours l ∈ {1, 2, 3, 4} arê The reference COSMO indicesÎ ij andĴ ij mark the bottom left neighbour point (see Fig. 5) and are obtained from the rotated-5 pole coordinates of the PALM grid point according tô where λ r 0 and ϕ r 0 mark the lowest longitude and latitude of the COSMO grid and ∆λ r and ∆ϕ r are the equidistant grid spacings in the respective directions.

10
Using the location of the neighbour grid points, we can compute the corresponding bilinear interpolation weights based on the nondimensional coordinates along the COSMO cells faces. The bilinear interpolation weights at the four neighbour points are given by 5 which lets us interpolate scalars using to Eq. (12).

Three-dimensional interpolation
where h ijk are the COSMO grid levels and l ∈ [1, 4] iterates over the four neighbouring COSMO columns. In the second step, the interpolated valuesŝ are interpolated vertically from the intermediate to the PALM target grid Below the lowest intermediate grid levelh ij1 of each column, all variables are extrapolated downwards as a constant. Beyond that, there is currently no further terrain adaptation made to blend the terrain from the coarse mesoscale to the fine microscale resolution.

Interpolation of velocities
The transformation in Eqs. (8) and (9) between the two rotated-pole systems (see Fig. 2b) 20 involves a rotation as the meridians of the original and rotated system are generally not parallel. This deviation angle, the so called meridian convergence, increases as we move away from the reference meridian and its distribution is given by Baldauf We obtain the Cartesian velocity components in the PALM system by rotating COSMO's spherical velocity components by 25 the local meridian convergence according to Since on the staggered Arakawa-C grid U and V are not defined at the same location, INIFOR first interpolates horizontal velocities onto COSMO's mass points and then rotates the interpolated velocity vectors using Eq. (20). Apart from this preprocessing, velocities are interpolated the same way scalars are. The resulting interpolation neighbours and weights for velocities however do differ from those of scalars because u and v on PALM's staggered grid are in turn defined half a grid cell away from PALM's mass points.

5
Averaging of profile data INIFOR provides the option to initialize and force PALM with three-dimensional atmospheric data (LOD = 2) or with averaged profiles (LOD = 1). The latter has the advantage that, for large-setups, INIFOR preprocessing is easier to handle in practice because less memory is required on the preprocessing machine and the resulting dynamic driver is greatly reduced in size because three-dimensional arrays are omitted. INIFOR produces profile data by first averaging along COSMO levels and then interpolating in the vertical direction.

10
Concretely, this is done carrying out the following steps. We first define the averaging region as a horizontal box bounded by the minimum and maximum rotated longitude and latitude of the PALM domain. Once all COSMO columns in the region are identified, we compute the average vertical grid levels of the terrain-following COSMO grid and then compute the vertical neighbours and weights for every PALM level relative to the averaged COSMO levels. The average profile (denoted in the following by the double bar) is then formed by scanning through the N c columns of the averaging region and adding the 15 vertically interpolated values on every PALM level according to with (î c ,ĵ c ) being the indices of the N c COSMO columns in the averaging region.

Program structure
INIFOR's program structure is organized around the set of netCDF variables that are to be computed for the dynamic driver. 20 The dynamic driver contains individual netCDF variables for each combination of prognostic variable and model boundary, e.g. netCDF variables for the u velocity component at the east, the south, the west boundary and so forth. Internally, INIFOR maintains a list with representations of each of those netCDF variables. Each is associated with the netCDF metadata required to handle data input and output, the computational task -averaging profiles or interpolating fields in 2D or 3D -as well as with the appropriate interpolation grids. The latter contain both grid point coordinates and interpolation neighbors and 25 weights. Generally, different variables that are defined at the same grid point type also share the interpolation parameters.
For instance, the horizontal (intermediate) interpolation grid for scalars is shared among netCDF variables for the initial soil moisture and temperature fields as well as the top boundary conditions for w, θ and q v . Consequently, interpolation grids with their corresponding interpolation parameters are computed once and reused each time step and shared among multiple output variables. As input data, INIFOR reads hourly netCDF files containing COSMO analyses. Each hourly input is processed separately and translated into one instantaneous boundary condition in the dynamic driver. Input data is not interpolated temporally in INIFOR but rather in PALM during the simulation as described in Sect. 2.

Superposition of boundary conditions with synthetic turbulence
With the generation of time-dependent boundary conditions from mesoscale model output in a preprocessing step and online processing of the boundary data, PALM is enabled to simulate more realistic scenarios considering time-evolving synoptic conditions. However, due to the nature of RANS models, turbulence is parametrized and thus the boundary values are free of any turbulent fluctuations. Mirocha et al. (2014) showed that without adding perturbations the turbulent flow needs several tens 15 of kilometers to sufficiently develop. In order to accelerate the spatial development of turbulence in PALM in our mesoscale nesting approach, we employed the synthetic turbulence generator by Xie and Castro (2008) where perturbations are added onto the u, v, w -components imposed at the lateral boundaries. In the following, the preliminary boundary values without any perturbations added are indicated by an overbar.
To obtain turbulent flow components u i,b on the lateral boundaries, spatially and temporally correlated disturbances u i are imposed onto the preliminary velocity components u i,b , a is the amplitude tensor that is calculated from the Reynolds stress tensor r. To consider cross-correlations between the velocity components, Lund et al. (1998) suggested a Cholesky decomposition to compute a recursively by Depending on characteristic length scales L and time scales T of the flow, which are defined individually for each velocity component in each spatial direction, u i computes as with ∆t being the actual LES time step and the two-dimensional spatially correlated disturbances The subscripts m and l indicate grid positions at the lateral boundary, N i = 2L i /∆x i , with ∆x i being the grid spacing. ζ i indicates a set of equally-distributed random velocities with zero mean and unit variance that are individually computed for 15 each velocity component. Finally, the spatial filter function computes as . With this approach, the imposed u i reflects the prescribed Reynolds stress as well as the spatial and temporal correlation according to L i and T i , respectively. At this point we want to note that Xie and Castro (2008) assumed an exponentially decaying autocorrelation function in Eq. 24, as well as for the formulation of the filter coefficients 20 b * i . Mordant et al. (2001) have shown that this is a valid approach for shear-driven flows. To our knowledge, it has not been proved yet whether an exponentially decaying autocorrelation function is an appropriate choice for stratified flows. Due to the lack of universally valid alternatives, however, we employed the formulation described above also for stratified flows despite its possible limitations.
From a mathematically point of view, the imposed fluctuations should have zero mean. Due to a finite sample of random 25 numbers and the finite number of discrete grid points, however, the fluctuations have mean values slightly different from zero in practice. In order to overcome this, Kim et al. (2013) proposed a correction for the boundary normal flow component in order to maintain constant mass flux through the boundary. In order to avoid that the perturbations imposed onto the non-normal components may have non-zero mean too, e.g. non-zero mean w− and v−component at the western model boundary, we correct the imposed turbulent velocity components as 5 with S being the surface area of the respective lateral boundary.
For non-stationary flows, an inflow boundary can become an outflow boundary and vice versa. Hence, the turbulence generator is applied at each lateral boundary simultaneously, while at opposite boundaries (west and east, as well as north and south) we use the same Ψ i and thus the same set of random numbers (velocities). By doing so, we save computational resources because the same set of Ψ i is already available on the west/east and north/south boundary according to our parallelization In order to overcome this and to balance the computational load over various processes, we parallelized the synthetic turbulence generator using the Message Passing Interface (MPI). To achieve this, we made use of the 2d domain decomposition used in PALM (Maronga et al., 2015). The imposed disturbances are computed locally on each MPI-process that belongs to 20 a lateral boundary. In order to avoid that only processes residing at the domain boundaries execute the computationally heavy code of Eqs. (25) and (26), while other processes are on hold, the computation of the filtered random numbers is divided in vertical direction by the number of processes in normal direction to the inflow boundary (e.g. on the left boundary, computation is distributed over npex parts where npex is the number of sub-domains, or MPI processes, along x), while the filtered random numbers are gathered on the boundary process, subsequently. In case of large L i /∆x i , this significantly reduces the required : nxr B +N x , meaning that on process A and B random numbers partly overlap, e.g. within the index range nxr A -N x +1 and nxl B -N x . In order to obtain the same ζ within the overlapping index range, we have to assure that the set of random numbers do not depend on the horizontal domain decomposition. This is achieved by linking the seed of the employed random-number generator to the grid index which is then independent on the domain decomposition. With respect to the computation of ζ, MPI-communication can thus be reduced to only exchange data to compute its mean and variance. Note that according to Eq. 25, random numbers are also required for locations outside the model domain to allow for the computation of the spatial 5 correlations, especially near the domain boundaries and for larger values of L i /∆x i . Therefore, the required random-number arrays are allocated with an offset so that all required values fit into the respective arrays. In case L i /∆x i increase or decrease during the simulation, the respective arrays are resized.
In order to create time-and height dependent synthetic turbulence, respective information about the Reynolds stresses, as well as turbulent length and time scales for the velocity components are required. For stationary flows these information can 10 be deduced from observations or from cyclic precursor simulations (Xie and Castro, 2008). However, for non-stationary flows with pronounced diurnal cycles and/or changing synoptic conditions running precursor simulations is practically not feasible.
Also, to take these information from the mesoscale-model output is also not possible since these detailed information are often neither available nor part of the operational output. Hence, to allow for an adjustment of the synthetic inflow turbulence to changing atmospheric conditions, we parametrize the Reynolds stresses based on the time-dependent mesoscale inflow profiles. We follow the set of parametrizations presented by Rotach et al. (1996) which they employed in stochastic dispersion modelling. Please note, the following set of parametrizations refer to the stream-and spanwise components of the Reynolds stress that are not necessarily parallel to the xor y-axis, respectively. In order to emphasize this, we indicate stream-and spanwise components with a tilde in the following. Rotach et al.'s (1996) parameterizations are based on parametrizations Brost et al. (1982) derived from observations in 20 stratocumulus-topped marine boundary layers, which often differ in their vertical structure and turbulence production compared to boundary layers over land. However, since Rotach et al. (1996) have successfully validated the set of parametrizations against observations over land for a wide range of stability regimes, we are confident that the chosen set of parametrizations can be universally employed. Based on the original formulation by Brost et al. (1982), the variance of the streamwise flow component r 11 is parametrized following Rotach et al. (1996) 25 who added a correction term (first term in Eq. 28) proposed by Gryning et al. (1987) to account for unstable near-surface stratification. Here, u * is the friction velocity, κ = 0.4 the von-Kármán constant, L o the Obukhov length, z i the boundary-layer depth and z being the height above ground. For neutral and stable situations the first term is ignored. Similarly, we estimate the variance of the spanwise flow component by adding a correction term to the original formulation proposed by Brost et al. 30 (1982): The profile of vertical velocity variance is taken from Gryning et al. (1987) as with w * being the convective velocity scale. The vertical transport of horizontal streamwise and spanwise momentum is estimated by Brost et al. (1982) as respectively. To our knowledge, there exist no comparable formulation to estimate r 21 in the literature. Hence, we decided to simply set r 21 = r 2 31 + r 2 32 , assuming isotropy of horizontal and vertical transport of horizontal momentum. To estimate the boundary-layer depth for a wide range of stability regimes, including buoyancy-and purely shear-driven boundary layers, we 10 calculated z i from a bulk Richardson number criterion according to Heinze et al. (2017) based on the bulk Richardson number Starting at the surface, z i is defined as the height where Ri b first exceeds the critical bulk Richardson number Ri b,c = 0.25, which revealed to be a robust criterion to estimate the depth of the layer with significant turbulent transports caused by the presence of the surface (Heinze et al., 2017). Here, u h denotes the horizontal wind speed from mesoscale model input, θ v the 15 virtual potential temperature, θ v,s the virtual potential surface temperature inferred from the second prognostic level above the surface, following Heinze et al. (2017), and g is the acceleration of gravity. In case of LOD = 1 input, z i is determined based on the mean profiles prescribed at the lateral boundaries, while in case of LOD = 2 input (xz− and yz− slices of boundary data), z i is determined locally at each (x, y)-boundary grid point and averaged horizontally afterwards.
In contrast to z i , which can be well estimated from the mean boundary-layer profiles, the friction velocity u * , used in 20 Eqs. (28)  for positive values of the mean surface sensible heat flux H 0 , else it is set to zero so that r 3,3 remains defined. Even in neutral or stable boundary layers a vertical velocity variance can be observed. Hence, in order to account the Reynolds stress parametrization for a wide range of stability regimes, we followed Rotach et al. (1996) who replaced w * in Eq. 30 by the mixed velocity scale w m (Troen and Mahrt, 1986) with with β = 0.6 according to Holtslag and Boville (1993).
Note that Eqs. (28), (29), (31) and (32) describe the flow characteristics in the streamwise and spanwise framework, which 5 is indicated by the tilde. In a mesoscale nesting with changing wind directions, however, the streamwise and spanwise flow directions do not necessarily coincide with the Cartesian grid axis which the prognostic velocity components relate to. Hence, the individual components of r are projected back onto the Cartesian grid by rotation about the vertical axis by the rotation angle defined by arctan (v/u).
Further, the synthetic turbulence generation requires information about the turbulent length and time scales. The turbulent 10 time scale of the flow is estimated according to Brost et al. (1982) with Parametrizations of turbulent length scales exist for the lower part of the boundary layer, (e.g. Flay and Stevenson, 1988;Salesky et al., 2013;Li et al., 2017;Emes et al., 2019), however, no parametrization of turbulent length scales that cover the entire depth of the boundary layer and all stability regimes exists to date to our knowledge. Hence, we calculate turbulent 15 length scales of the flow components according to Tennekes and Lumley (1972) using the parametrized Reynolds stress and the timescale: with i ∈ 1, 2, 3 indicating the streamwise, spanwise and vertical direction, respectively.
Assuming that turbulence is only present within the boundary layer, r, T , and L are faded for z > z i with Here, the fading function is designed so that Φ(z) rapidly decreases above the boundary layer.
In case of non-stationary flows, the turbulence parameters r, T , and L are adjusted hourly by default, but the frequency can be also modified by the user. We note that updating the turbulence parameters violates the temporal correlation expressed in Eq. (24). Hence, we performed a test simulation where we omitted the first term in Eq. (24) after updating the turbulence pa-25 rameters and we found no difference with respect to the spatial development of the flow (not shown), indicating that occasional violations of the temporal correlation have no significant effect on the development of the flow.

Simulation setup and statistical analysis
In order to test the implemented mesoscale nesting approach, we selected a particular weather scenario with a developing daytime convective boundary layer (CBL) that features advective conditions with moderate wind speeds and changing mean-30 wind direction. Moreover, the scenario is characterized by little to no cloud cover which is attributed to the fact that PALM We assumed a horizontally homogeneous and flat surface instead of the particular terrain, land use, and buildings at the 10 Berlin area. We made this idealization in order to be able to determine adjustment fetch lengths under time-dependent inflow conditions. Surface heterogeneities in terms of land use or terrain would modify the turbulent flow field locally, making it impossible to disentangle changes in the turbulent flow due to adjustment effects and surface heterogeneity. We employed the embedded land-surface model (see Maronga et al., 2020;Gehrke et al., 2020) to obtain sensible and latent heat fluxes at the surface, which we assumed to be fully covered with short grass. We chose this setup as a trade-off roughly reflecting the 15 prevailing land use in this area with distributed farm-and grassland, forest patches and urbanized environments. The soil was initialized with horizontally homogeneous profiles of soil temperature and soil moisture taken from COSMO. Since the soil properties in COSMO are aggregated over various surfaces and soil types, the soil conditions are not necessarily in equilibrium with the assumed grass surfaces as well as selected atmospheric conditions. Hence, we ran a two-day surface spinup as a precursor to the 3D simulation  in order to bring the soil into equilibrium with the atmospheric conditions 20 and to avoid spinup effects that may lead to varying heat fluxes at the beginning of the simulation. The incoming short-and longwave radiation was modelled using the Rapid Radiative Transfer Model for Global Models (RRTMG, Clough et al., 2005).
The simulated domain is located at 52.5 • N, 13.7 • E (PALM origin) and extends over 43.2 × 43.2 × 4.7 km 3 in the x-, y-, and z-direction, respectively, with an isotropic grid spacing of 25 m. Above z = 3 km -approximately 600 m above maximum boundary layer depth -the vertical grid was successively stretched up to 50 m vertical grid spacing. This allows for proper resolution of the convective boundary layer, though we note that the nighttime stably-stratified boundary layer is only poorly represented with the chosen grid spacing. The advection terms were discretized with a fifth-order upwind scheme (Wicker and 5 Skamarock, 2002); for the time-stepping we applied a third-order Runge-Kutta method according to Williamson (1980). We performed two simulations with different lateral boundary conditions. A first simulation, hereafter referred to as REF, where the boundary conditions were given as LOD = 1, i.e. horizontally-averaged profiles. Unless not further noted, we refer to this simulation in the following analysis. And a second simulation, where heterogeneous boundary values were prescribed (LOD = 2). This second simulation will be used to check whether the LES simulation results depend roll-convection artifacts 10 in the convection grey zone.
Hence, we calculated resolved-scale variances of the velocity components by Please note, in this study we will mainly focus on convective conditions, especially with respect to the spatial development of the flow. The nighttime stable flow is only poorly resolved at the given grid spacing, making it difficult to make reliable 25 conclusions concerning the flow adjustment. Here we will refer to follow-up studies.

Results
In the following section we show results from a mesoscale nested LES for a diurnal cycle. In order to better guide the reader through this section, we will first give a short outline of what to expect: First, we describe the boundary-layer structure and its development over the diurnal cycle in the LES as well as in COSMO. Subsequently, we discuss differences between the LES 30 and COSMO with respect to the boundary-layer representation and its implications in a nested simulation. In the following, triggered by imposed time-dependent synthetic turbulence, we focus on the spatial development of the turbulent flow within the  (André et al., 1978). where the turbulent flow is spatially fully developed. The averaging region for the inner-domain profiles is indicated in Fig. 12a.   Fig. 7 and 9), whereas PALM still simulates a positive sensible heat flux of > 100 W m −2 . In contrast, the surface latent heat flux in COSMO shows larger values in the afternoon compared to PALM, meaning that the bulk of the available energy is partitioned into the surface latent heat flux being not available for heating the boundary layer.
In addition, we performed a simulation where we prescribed the diurnal cycles of H 0 and LE 0 with values taken from 5 COSMO as shown in Fig. 10 rather than computing them using the land-surface model. Hereafter we refer to this simulation with PSF. This test was motivated to check whether the discrepancy between the inner-domain and the COSMO inflow profile is attributed to a possible misrepresentation of the surface energy balance attributed to the idealized setup with a homogeneous grass surface rather than a more realistic surface. However, except for minor differences with a slightly cooler boundary layer in the morning (see Fig. 9), a slightly shallower boundary layer at noon, and a slightly cooler boundary layer in the afternoon, As the COSMO profiles are mapped onto the inflow boundary, the more rapid evolution or the earlier stabilization of the boundary layer at 10 UTC and 16 UTC, respectively, also propagate into the PALM model domain. Figure 11 shows vertical cross-sections of the potential temperature and corresponding boundary-layer height. At 10 UTC and 13 UTC, according to the higher potential temperature at the inflow boundary compared to the inner part of the domain as shown in Fig. 9, the potential 20 temperature within the boundary layer and the boundary-layer height decrease with increasing distance to the inflow boundary.
This indicates that at these points in time, a deeper and warmer boundary layer is advected into the PALM domain. In contrast, at 16 UTC where the inflow potential temperature profile (see Fig. 9) already indicates a stable inflow with lower values of potential temperature, the potential temperature and boundary-layer height increase with increasing distance to the inflow boundary (Fig. 11c,d). Especially the spatial gradient of the boundary-layer height close to the inflow boundary indicates that 25 a significantly shallower boundary layer is advected into the PALM domain which further propagates downwind (Fig. 11d) later on. Also, with increasing distance to the inflow boundary, more and deeper convective updrafts, indicated by higher values of potential temperature, can be observed, while close to the inflow boundary only shallow convective updrafts occur.
This suggests that the stable stratification near the inflow boundary suppresses convection in the later afternoon. In particular the horizontal difference in the boundary-layer structure at 16 and 16:30 UTC shows that temporal changes on the inflow 30 temperature (and humidity, not shown) reach the inner part of the model domain with a time-lag. In this setup, it takes about 1 to 1.5 hours until the signals imposed at the inflow boundary reach the outflow boundary, meaning that the boundary layer becomes horizontally heterogeneous during the transition phase. We note that this is in contrast to the large-scale forcing approach by Heinze et al. (2017) where large-scale advection and nudging terms were considered at each location at the same  Richardson bulk criterion. The inflow boundary is on the right. Please note, the inflow direction is from the south-east, meaning that the xaxis does not correspond with the distance to the inflow boundary. The cross-section is taken at y = 32200 m (indicated by the black dashed line in Fig. 12 b) where the potential temperature is not affected by advection from the southern inflow boundary. Also note the different contour levels and colour bars in each panel, which we set to emphasize the horizontal heterogeneity of the simulated boundary layer at the different times of the day.  In contrast, at 16 UTC, where a more shallow and stable boundary layer accompanied with only small synthetic disturbances (see Fig. 13) is advected into the model domain, the amplitude of the up-and downdrafts is only small near the inflow boundary and increases farther downwind.

Characteristics of imposed turbulence
The parametrized Reynolds-stress components depend on the inflow profiles obtained from the mesoscale model input, i.e. 10 z i , as well as on u * , w * and H 0 , which were obtained from horizontal averaging over the entire model domain. While z i determines the depth of the boundary layer and, thus, the relevant window for the Reynolds stresses, the latter three determine the amplitudes of their components. becomes active (see the kink in the profiles), the Reynolds-stress components and length scales rapidly approach zero. For instance, at 13 UTC at about z = 2700 m the length scales approach zero, while the amplitude of the imposed disturbances are already nearly zero, meaning that only small perturbations are added above the boundary-layer top, which will be quickly dissipated. Qualitatively, the parametrized Reynolds-stress components resemble the variance profiles shown in Fig. 8, though r 11 (r 22 ) are under-(over)estimated compared to u u (v v ) around noon, respectively, and do not account for the secondary peak near the boundary-layer top. Furthermore, at 10 UTC (and 16 UTC) it strikes that the Reynolds-stress components 5 indicate a deeper (shallower) boundary layer compared to the velocity variances, respectively, which is in accordance with the horizontally heterogeneous structure of the boundary layer during transition periods (see Sect. 4.1), which again is owed to the fact that the onset and offset of convection is shifted between COSMO and PALM.
In summary, the parametrized Reynolds-stresses resemble the variances profiles created by the LES itself reasonably well during the course of the day. However, we emphasize that the imposed turbulence is only considered to be a rough description 10 to resemble the second-order statistics of a fully adjusted flow, while its spectral distribution, or higher-order moments are not accounted for. Figure 14 shows the spatial development of the resolved-scale TKE depending on the distance to the inflow boundary at different heights and points in time. Please note, the upper abscissa depicts the dimensionless distance to the inflow boundary using 15 convective scaling, which was originally developed by Willis and Deardorff (1976) to scale Lagrangian dispersion experiments.

Spatial development of the flow
Here, we use this to scale the travel-time d/u h of a signal imposed at the lateral boundary with the eddy-turnover time z i /w * in the CBL, where d indicates the distance to the inflow boundary in wind direction and u h indicates the mean boundary-layer wind speed (averaged over the depth of the boundary layer). The use of this scaling is in accordance with Muñoz-Esparza and Kosović (2018), who argue that the flow development primarily scales with u h /w * , describing well the dominant transition 20 mechanism. At 10 UTC, the TKE peaks at about 3 km and 5 km downstream of the inflow boundary within the lower and upper part of the CBL, respectively. This can also be observed visually in Fig. 12 where the amplitude of the up-and downdrafts close to the inflow boundary appears stronger. This TKE overshoot is a result of the boundary-layer adjustment process, where the peak location corresponds to the distance where the bulk of the initially uprising thermals reaches the inversion layer, starting to entrain warmer air from the free atmosphere into the boundary layer. This entrainment then slightly stabilizes the boundary  At 16 UTC the situation becomes qualitatively different. Even though the TKE within the lower part of the CBL peaks again close to the inflow boundary and approaches a nearly constant value farther downstream, it gradually increases starting at about 20 km downstream. Within the middle and upper part of the CBL, the TKE is close to zero near the inflow boundary, according to the only small amplitude of the imposed synthetic turbulence (see Fig. 13), and gradually increases towards the outflow boundary. This can also be observed in Fig. 12 where the up-and downdrafts near the inflow boundary are only weak 5 and become stronger further downstream, indicating that turbulence first needs to develop spatially. Due to the horizontally heterogeneous boundary layer with already weakly stable boundary conditions near the inflow boundary and still convective conditions farther downstream, the TKE does not reach an equilibrium value and a spatial adjustment length cannot be accurately determined for this point in time.
In order to investigate how the structure of the turbulent flow develops, Fig. 15 shows the corresponding horizontal profiles  Finally, we would like to note that we also simulated different scenarios with higher (2UH) and lower (05UH) wind speeds, as well as with increased (15RS) and reduced (075RS) shortwave solar radiation, in order to test the convective scaling. Even though the peak amplitudes in the horizontal TKE profile are different due to the modified forcing (see Fig. 17), the peak locations coincide with respect to dw * /(u h z i ) at the respective height levels. This, in turn, indicates that the required distance    needed to allow the flow to spatially develop is not an universal number but scales with u h z i /w * in a convective boundary layer, meaning that with higher wind speeds or less surface heating the required adjustment fetch increases (see Tab. 4).
Summarized, under convective conditions the turbulent flow is fully developed within the boundary layer after about 2.0 u h z i /w * and further adjustment effects farther downstream are only small. However, we note that the absolute distance required to allow for fully spatially developed turbulence is still in the order of kilometers, meaning that the model domain 5 should be sufficiently large to place the region of interest sufficiently apart from the inflow boundaries. . Previous studies with the WRF model revealed that these kind of up-and downdrafts are a numerical artefact rather than a realistic feature of the boundary layer when the boundary layer depth is within the range of the horizontal grid spacing (Ching et al., 2014;Zhou et al., 2014;Shin and Dudhia, 2016). Even though results from mesoscale WRF simulations are not necessarily transferable one to one to COSMO, we nevertheless assume a similar behaviour here. With a boundary-layer depth of ≈ 2.4 km at 12:00 UTC (see Fig. 7), 15 the dominant length scales of the flow approach the horizontal grid spacing, meaning that convection can partly be resolved on the COSMO grid. Figure 18a turbulence behind the turbulence-adjustment region, though they might affect the rate of evolution of turbulence near the inflow boundaries. This might be attributed to the different coupling time steps. In this study, the coupling time step was one hour, while Mazzaro et al. (2017) used one minute. Since the under-resolved roll-like structures are not necessarily stationary, the horizontal movement can be well considered with a one-minute coupling, while with only one-hourly coupling these imposed temporary structures are present over a longer time interval at the lateral boundaries so that the under-resolved structures 5 can effectively propagate into the model domain. As Fig. 18a indicates, this may introduce a location bias to the turbulent flow, so that the PALM solution becomes dependent on the presence of unrealistic flow structures in the mesoscale model. In such a case, modellers may consider using homogeneous boundary conditions which, as Fig. 18b shows, avoid the artificial generation of persisting structures. However, one should be aware that, especially for large LES domains approaching the size of mesoscale structures, large-scale gradients vanish and the mean mesoscale boundary-layer development may not necessarily 10 represent local conditions any more.

Implications near the inflow and outflow boundaries
In this section, we focus on the flow near the inflow and outflow boundaries. Due to different model representations of surface processes and different surface input data, the mesoscale near-surface wind and temperature profiles can deviate from the one the LES would simulate. As an example, Fig 19b shows   Here, the horizontal flow needs to be decelerated which causes a mean updraft close to the outflow boundary. We note that these mean up-and downdrafts are most pronounced at nighttime, ranging between 0.1-0.4 m s −1 in this setup. Even though at daytime these mean up-and downdrafts cannot be detected visually in the instantaneous vertical wind speed (see e.g. Fig. 12), we also found mean up-and downdrafts near the in-and outflow boundary but with a lower amplitude in the order of 0.05-0.1 m s −1 . Note that these near-boundary adjustment effects cannot be avoided by the mass-flux correction in Eq. 1 and 2, 25 which only acts on the global scale and ensures divergence-free boundaries, but are a result of the interconnection between the surface friction and the pressure solver to maintain incompressibility on all considered scales.

Computational efficiency of the synthetic turbulence generator
With respect to the mesoscale nesting, the generation of synthetic turbulence represents the computationally most expensive part, while setting the boundary conditions and enforcing a divergence flow field is computationally much less expensive. In 30 order to examine the efficiency of the turbulence generator implementation and estimate its computational cost, we have carried out a performance test. shows corresponding vertical profiles of the horizontal wind speed averaged over a 10 × 10 km area located at the domain center, as well as the lateral inflow profile from the COSMO solution. The wind blows from the east.
The most expensive part of the turbulence generation is the computation of the filtered random numbers, see Eqs. (25) and (26), which depend on the turbulent length scales. Therefore, based on the described setup in Sect. 3, we ran a set of idealized simulations where we varied the turbulent length scales. The length scales were set to a vertically constant value up to z = 2500 m, and to zero above. We performed 100 time integrations with a time step that was held constant at 0.5 s for all simulations. The simulations with different length scales were performed twice, one set where the computation of the 5 filtered random numbers is distributed over all available MPI-processes (hereafter referred to as distributed), and a second set where the filtered random numbers are only computed on boundary MPI-processes (hereafter referred to as non-distributed) and the rest of the MPI processes is on hold. The number of MPI-processes used for this scaling test was n = 1296. Figure 20 shows the consumed CPU time by the synthetic turbulence generator for different length scales. As expected, the consumed CPU time increases with increasing length scale. For small L i /∆x i the consumed CPU time for both, non-distributed and 10 distributed computation, is below 1% of the CPU time consumed in the time-stepping (see red lines) and no significant difference between both cases can be observed (black lines); this also shows that the additional MPI-communication required Finally, we note that for the simulation covering an entire diurnal cycle, the length scales vary significantly with lower values at nighttime and larger values around noun. For these simulations the turbulence generation consumed about 2.5% of the CPU time.

Summary and conclusions
In this paper we presented a mesoscale-nesting interface for the PALM 6.0 model system that extends PALM's capabilities to simulate atmospheric boundary layers under evolving synoptic conditions. The mesoscale-nesting interface, which currently relies on output of the COSMO model, consists of two components: (i) the preprocessing interpolation tool INIFOR which provides initial and boundary conditions as a netCDF file, and (ii) PALM's internal boundary condition routines which 5 read and process the initial and boundary conditions as well as imposed additional synthetically turbulent fluctuations. We described INIFOR's interpolation methodology in detail, beginning with the relevant model differences between PALM and COSMO, leading to the conceptual steps needed to interpolate COSMO model output onto the PALM grid. Since the interpolated mesoscale boundary conditions are essentially free of turbulent fluctuations, the flow first needs to spatially develop before the turbulent transport of momentum, energy and water can be analysed. In order to minimize the extent of development 10 zones near the lateral inflow boundaries which the LES model would otherwise require to generate turbulence via shear and convective instabilities by itself , we employed a synthetic turbulence generation method according to Xie and Castro (2008). Using this approach, spatially and temporally correlated fluctuations of all three velocity components are generated based on parametrizations of the Reynolds stresses as well as turbulent length-and time-scales.
We demonstrated the nesting interface and the effectiveness of the synthetic turbulence generation using a semi-idealized 15 benchmark case: we simulated a convective boundary layer developing near Berlin, Germany, on a clear-sky spring day using initial and boundary conditions derived from DWD's operational COSMO-DE analysis. For the sake of analyzing the spatial development of the flow, the case was idealized in that we assume flat terrain with homogeneous grass land instead of using more realistic land-surface heterogeneity, in order to disentangle turbulence built-up due to the synthetic turbulence generation and convective and shear instabilities from effects of the particular surface heterogeneities of the Berlin area. We found that the 20 flow rapidly develops up-and downdrafts, whereas the adjustment of the TKE takes a longer distance of about 2 − 3 u h z i /w * , meaning that the turbulent flow needs a fetch length that corresponds to at least two eddy turn-over time to be fully adjusted.
Even though the adjustment distance could be significantly reduced, it is still in the order of several kilometers, which means that significant parts of the computational resources are still required only for the spatial development of the flow. To further reduce the computational effort, an alternative could be to combine the mesoscale nesting together with PALM's self-nesting 25 , i.e. a relatively coarse grid resolution in the outermost parent domain and finer grid resolutions within the nested child domains. Another worthwhile branch of development would be the implementation of the the cell-perturbation method by Muñoz-Esparza and Kosović (2018) in PALM as an alternative to the synthetic turbulence generation to profit from recent promising developments in this regard (Mazzaro et al., 2019;Muñoz-Esparza and Kosović, 2018). For a stationary boundary layer, Muñoz-Esparza et al. (2015) showed that the cell-perturbation method requires shorter fetches compared to 30 synthetically generated turbulence according to Xie and Castro (2008).
In our benchmark case, the boundary-layer in the mesoscale COSMO model does not develop synchronously with the boundary-layer in the LES. For example, in COSMO the boundary layer develops more rapidly before noon and the evening transition starts earlier compared to the LES simulation. As the signals due to non-synchronous boundary-layer development are imposed to the inflow boundary, these propagate through the LES domain creating a horizontally heterogeneous boundary layer during the morning and evening transition phase. Furthermore, we observed under-resolved convective rolls emerging in the mesoscale model that, similar to Mazzaro et al.'s (2017) findings, propagate into the LES domain. In the present study, we eliminated these roll-like structures by averaging over the inflow boundary, being aware that especially for larger domains synoptic-scale horizontal gradients or effects of mesoscale topography cannot be considered (Mazzaro et al., 2019). To elim- Overall, especially the non-synchronous boundary-layer development and the imposed roll-like convection emphasize that the representation of the boundary layer in the LES and accompanied vertical gradients of wind velocity, potential temperature, etc.
depend on the boundary-layer representation in the mesoscale model. Suppose the boundary layer is not well captured in the mesoscale model, e.g. due to misrepresented convection and turbulent mixing, cloud cover, or atmosphere-surface exchange, 15 the boundary layer resolved in the LES will also be affected by this. In such cases, the physically more credible LES solution (with respect to the boundary-layer representation) will be continuously pushed towards the mesoscale solution. Here, further research is required to better understand the causes for such model discrepancies, under which circumstances they arise, and what the implications are for the representation of the turbulent boundary-layer flow.
Further branches of future development will be to enable INIFOR to also process WRF (Skamarock et al., 2008) and 20 ICON (Zängl et al., 2015;Reinert et al., 2020) output, as well as to add further prognostic quantities to the mesoscale nesting interface, e.g. chemical compounds, aerosols, liquid and frozen water. This is especially important to properly consider clouds and precipitation in the LES, which in turn also affect the surface net radiation and thus the entire boundary-layer development.
However, we expect that in many future applications with mesoscale nesting the outermost parent domain will only run with relatively coarse grid resolution, so that cloud physics will not be captured well in the LES, especially for high-altitude clouds. 25 Hence, we also plan to enable INIFOR to also provide incoming short-and longwave radiation fluxes, which, to date, PALM is already enabled to consider either from observations or manually from observations or mesoscale model output.
The main focus of this study was to demonstrate the capability of the mesoscale nesting approach and to confirm the effectiveness of the synthetic turbulence generation to reduce the fetch length needed for the model to develop balanced turbulence characteristics. Dedicated evaluation runs of the PALM 6.0 model system including the mesoscale nesting interface Code and data availability. The PALM model system 6.0 is freely-available at http://palm-model.org and distributed under the GNU General Public License v3 (http://www.gnu.org/copyleft/gpl.html). The preprocessor INIFOR is included in the PALM software repository as a utility and is currently available at https://palm.muk.uni-hannover.de/trac/browser/palm/trunk/UTIL/inifor. The simulations documented in the present article were performed using revision 4564 of the PALM model system, which includes INIFOR version 1.4.15. A complete archive of the software used for this publication, including the input data used, analysis and plotting scripts, as well as a step-by-step reproduction guide is available at https://doi.org/10.25835/0084787 (Kadasch and Sühring, 2020).
Appendix A: A note on large-scale pressure forcing 5 As opposed to our comment in the PALM 6.0 Overview paper , the geostrophic wind forcing is not required in the present mesoscale nesting interface, which we think, warrants further explanation. In idealized model setups, the assumption of periodicity is often used. However, using periodic boundary conditions prevents the model from developing any mean horizontal pressure gradient. Thus, if large-scale pressure gradients are important to a given problem, they need to be externally prescribed. This is often done by using an equivalent geostrophic wind profile that is obtained from the mesoscale 10 pressure and density fields P and ρ, respectively, which enters the model in the form of the additional forcing tendency in the horizontal momentum equations. With this external mesoscale forcing, even an atmosphere initially at rest, will eventu-15 ally develop a mean horizontal flow representative of the real conditions. In the case of inflow Dirichlet boundary conditions, the situation is reversed: the dynamic pressure develops a mean horizontal gradient as a result of internal forces in order to maintain continuity under the prescribed inflow boundary conditions.
In incompressible formulations, the pressure solution is obtained by constructing and solving a Poisson-type equation which can be obtained by applying the divergence operator to the momentum equation. The equation is simplified by exploiting the 20 incompressible continuity equation which represents a divergence constraint on the mass flux ρv. As a result, the pressure solution of the Poisson equation acts as to enforce this divergence constraint onto the flow field. In the case when Dirichlet boundary conditions are used for the velocity, any divergence resulting from the tendencies in the momentum equation, will be compensated by a corresponding gradient in the dynamic pressure solution. For instance, mean friction and mean Coriolis forces will result in pressure gradients opposing those effects.