The Radiative Transfer Model (RTM) is an explicitly resolved three-dimensional multi-reflection radiation model integrated into the PALM modelling system. It is responsible for modelling complex radiative interactions within the urban canopy. It is a key component in modelling energy transfer inside the urban layer and, consequently, in PALM's ability to provide explicit simulations of the urban canopy at metre-scale resolution. This paper presents RTM version 3.0, which is integrated into the PALM modelling system version 6.0. This version of RTM has been substantially improved over previous versions. A more realistic representation is enabled by the newly simulated processes, e.g. the interaction of longwave radiation with the plant canopy, evapotranspiration and latent heat flux, calculation of mean radiant temperature, and bidirectional interaction with the radiation forcing model. The new version also features novel discretization schemes and algorithms, namely the angular discretization and the azimuthal ray tracing, which offer significantly improved scalability and computational efficiency, enabling larger parallel simulations. It has been successfully tested on a realistic urban scenario with a horizontal size of over 6 million grid points using 8192 parallel processes.

Accurate representation of spatio-temporal radiative exchange processes is essential for realistic modelling of the atmospheric boundary layer, especially the urban boundary layer. These processes determine the energy budget of the surfaces and thus strongly affect boundary-layer dynamics as well as the spatio-temporal distribution of temperature, moisture and other scalar variables. In contrast to synoptic-scale and mesoscale atmospheric models, microscale and building-resolving models face considerable challenges in modelling such processes accurately because of their fine spatial resolution and the heterogeneity of urban environments.

Many urbanized mesoscale models consider the vertical and horizontal radiative exchange within the urban canopy layer, including shading and multiple reflections, by assuming a strongly simplified urban surface structure

Consequently, the method and the sophistication of modelling the radiative exchange within the urban boundary layer vary in microscale atmospheric models (

After the original submission of this paper.

However, some of the microscale models with more explicit radiative modelling have limitations in simulating realistic urban domains. For example, some models use only the Reynolds-averaged Navier–Stokes (RANS) method for simulating the airflow, which is not always suitable for geometrically complex and highly heterogeneous urban environments. Also, some of the mentioned models are not designed and parallelized to run on high-performance supercomputers (HPCs) with hundreds or thousands of CPU cores. This makes the design and implementation of the explicit 3-D radiative exchange easier, but it severely limits the size and resolution of the modelled domains.

The RTM version 1.0

Radiation processes are traditionally modelled in PALM by a one-dimensional radiation model with simulation of vertical radiation exchange without any lateral interactions

This one-column approach, however, is not sufficient to model the surface energy balance inside the urban canopy layer. This area is typically characterized by complex geometry of terrain, buildings and vegetation, for which the radiative transfer processes in all directions cannot be neglected. Therefore, the PALM modelling system includes the Radiative Transfer Model (RTM) as part of PALM-4U (PALM components for urban modelling). This model takes the radiation from the PALM radiation model, e.g. RRTMG, as input and calculates the radiation processes taking place inside the urban canopy layer explicitly in a fully 3-D geometry. Through this, RTM provides the radiative fluxes and the surface net radiation including its components on the 3-D geometry, which are then used to model the surface energy balance, evapotranspiration in the plant canopy, and biometeorological quantities (see Sect.

The main goals of the RTM development were to create a computationally efficient model which simulates all substantial radiative processes taking place inside the urban canopy and which is fully integrated with the rest of the PALM model and its components, in particular

the spatial discretization of the domain matches the discretization of other parts of the PALM model;

RTM is executed as part of the PALM programme, and it utilizes its parallelization scheme; and

RTM utilizes PALM data structures and subroutines as much as possible and provides its results directly back to PALM and its modules.

The paper describes version 3.0 of RTM, which is part of PALM version 6.0. This paper is a follow-up paper to

RTM version 1.0, as part of the PALM-USM module, has been evaluated with respect to performance and accuracy on a small urban scenario in Prague–Holešovice

New discretization schemes for direct solar irradiance and for the sky view, which includes diffuse solar irradiance, longwave irradiance from the sky and reflection as well as emissions from surfaces towards the sky (see Sect.

A new discretization scheme for the reflected and emitted radiation between surfaces (Sect.

The novel azimuthal ray-tracing algorithm (Sect.

Plant canopy interaction with LW radiation (absorption and emission, Sect.

Evapotranspiration and latent heat flux in the plant canopy (Sect.

Bidirectional integration with the radiation forcing model (e.g. RRTMG, Sect.

Calculation of mean radiant temperature for selected levels aboveground with provision of radiant fluxes for the biometeorology module (Sect.

Integration of RTM within the PALM radiation module and coupling to all surface modules (Sect.

Multiple improvements, bug fixes and changes in interfaces with other PALM modules.

In order to quantify the differences brought by the new simulated processes and improved discretization schemes, a comparison study has been performed on the same scenario using PALM version 6.0 with RTM 3.0 with different sets of newly available simulated processes enabled. The results of this comparison are available in

The paper is organized as follows: Section

The PALM model discretizes the modelled domain using a regular three-dimensional grid. The model supports arbitrary rotation of the grid around the vertical axis, with the default having the

The

For each horizontal coordinate

This limitation was present at the time of the original submission of this paper, but since then the fully 3-D geometry has been implemented in RTM version 4.1 starting from r4671;

The RTM considers two spectral ranges of electromagnetic radiation independently: shortwave (SW) visible solar radiation and longwave (LW) thermal radiation. The modelled radiation originates from the sun, the atmosphere and all the modelled surfaces. The result of RTM is the amount of absorbed, reflected and emitted radiation for every face (both horizontal and vertical) and the amount of absorbed and emitted radiation for each grid box containing resolved plant canopy (plant canopy grid box, PCGB). The model follows the radiation as it spreads from sources and as it propagates through the urban canopy layer and reflects off individual faces, taking into account model geometry, shading and mutual visibility between the faces, partial transparency and/or opacity of the plant canopy, and reflective properties of the individual faces. Figure

Radiative processes simulated by RTM version 3.0.

To limit the computational effort to a reasonable level, some less important processes have to be simplified or disregarded. These are the following.

The discretization of RTM uses the same Cartesian grid as the rest of the PALM model. Each radiative quantity is modelled as a singular value per surface discretization unit (face), and the propagation of radiation is described as interactions between mutually visible faces.

The model considers all reflections and emissions to be Lambertian (i.e. ideally diffuse), following Lambert's cosine law whereby the amount of radiation leaving the surface in one direction is proportional to the cosine of the angle

For any two mutually visible faces

Calculation of the view factor between surface

The value of

This formula allows for the description of the face

The view factor values carry all the information about the geometry of the urban layer necessary for calculating propagation of reflected and emitted light among surfaces. Once they are known, calculation of the instantaneous fluxes can be reduced to simple vector multiplication. Determining the view factor values consists of multiple steps.

The second step is implemented in RTM using a

Due to this complexity, the ray-tracing task takes place during the model initialization phase before the actual simulation of time steps begins. The values representing the view factors and other relevant data are precomputed, exchanged among the parallel processes and stored in such a way that the number of calculations and MPI communications performed during computation of time steps is minimized.
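The split between initialization and time stepping can be illustrated with a minimal sketch. It assumes a dense matrix `vf` in which `vf[i, j]` is the fraction of the view from face `i` occupied by face `j`; the function name, the array layout and the fixed number of reflection passes are illustrative assumptions and do not reproduce the RTM data structures.

```python
import numpy as np

def reflect_step(vf, albedo, incoming, n_reflections=3):
    """Propagate shortwave radiation among faces with precomputed view factors.

    vf[i, j]  : fraction of the view from face i occupied by face j
                (hypothetical dense layout, computed once at initialization)
    albedo[j] : reflectivity of face j (Lambertian reflection assumed)
    incoming  : irradiance of each face from the sun and the sky (W m-2)
    """
    irradiance = incoming.astype(float).copy()  # accumulated irradiance per face
    received = irradiance.copy()                # irradiance received in the last pass
    for _ in range(n_reflections):
        outgoing = albedo * received            # radiation reflected by each face
        received = vf @ outgoing                # one matrix-vector product per pass
        irradiance += received
    absorbed = (1.0 - albedo) * irradiance      # share of the irradiance kept by the face
    return irradiance, absorbed
```

For two faces that see only each other, with an albedo of 0.5 and 100 W m-2 arriving at the first face, two reflection passes give irradiances of 125 and 50 W m-2. No geometry is evaluated inside the loop, which is the point of precomputing the view factors.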

RTM version 3.0 offers two selectable methods for simulation of the irradiance of each face by providing two different schemes for discretization of the view from each face, which is represented by a set of irradiance factors. The legacy discretization scheme (originally introduced in RTM version 1.0; see

The current angular discretization scheme uses a different simplification with a better trade-off between complexity and accuracy and a guaranteed worst-case total number of view factors of

Illustration of the legacy view discretization scheme. For the highlighted face, ray tracing is performed between its centre and the centres of its visible faces, creating a set of its view factors.

While establishing the mutual visibility between two faces, the path between the faces is represented by a single ray connecting their centres. This is in accordance with the general principle of discretization by a rectangular grid, on which the area or volume covered by each face or grid box is represented by a single scalar value and the resolution can be increased for more spatial precision at the expense of computational resources (see Sect.

Together with simplifying the ray tracing, the view factor value calculation is also simplified. Instead of solving the full integral in Eq. (

The induced error is smaller for very distant faces and larger for faces close to each other. This error is considered acceptable within the resolution of the model, as it can always be reduced by increasing the resolution. The more important issue of this approach is that the sum of the approximate view factor values is no longer guaranteed to equal 1. Because of this, the modelled system could artificially gain or lose energy and possibly even diverge exponentially in time. To guarantee the conservation of energy, the normalization of the approximate view factor values is used in order to maintain Eq. (
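A minimal sketch of such a normalization follows. It assumes that the sky is treated as one additional imaginary face and that all entries of a face, including its sky-view factor, are scaled by the same factor; whether RTM scales the sky share in this way is an assumption of the example, and the function name is hypothetical.

```python
import numpy as np

def normalize_view_factors(vf_approx, svf):
    """Rescale approximate view factors so that each face's view sums to 1.

    vf_approx[i, j] : approximate view factor from face i towards face j
    svf[i]          : sky-view factor of face i (sky as an imaginary face)
    """
    vf_approx = np.asarray(vf_approx, dtype=float)
    svf = np.asarray(svf, dtype=float)
    total = vf_approx.sum(axis=1) + svf         # equals 1 only for exact values
    if np.any(total <= 0.0):
        raise ValueError("every face must see at least one face or the sky")
    # Assumption: faces and sky are scaled by the same per-face factor.
    return vf_approx / total[:, None], svf / total
```

After this step the shares of every face add up to 1, so no energy is created or lost in the exchange between faces, at the cost of slightly redistributing the error among the entries.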

The asymptotic complexity and scalability of the RTM can be evaluated using two different approaches: considering either a domain growing in size horizontally, while the vertical size and typical shapes of obstacles are kept constant, or considering a gradually increasing resolution for the same domain, which increases the amount of discretized data in each dimension.

The complexity and scalability for the latter case can be determined exactly. The number of faces increases proportionally with the surface area. For a domain with a size of

In order to analyse the scalability of the algorithm, assume that the number of processes used for the calculation grows by the same factor as the size of the 3-D grid, i.e. by

The situation is better in the first case in which the domain of size

To reduce the high number of view factors with the legacy discretization scheme, RTM allows for the exclusion of some view factors that are considered less important. First, a minimum value

The normalization described in Sect.

The asymptotic complexity of the legacy scheme does not allow simulations of very large domains with horizontal sizes of the order of millions of grid boxes or more. Furthermore, if the view from some face is composed of both very close and very distant faces, the computational resources are used unevenly: proportionally fewer resources are spent on close faces, each of which represents a higher share of the face's view and also a potentially greater share of its irradiance, while more resources, often a majority, are spent on less relevant distant faces. Neglecting the smaller view factors as described in Sect.

Thus, we introduce a novel angular discretization scheme for reflected and emitted radiation. The general motivation for this approach is based on the observation that the properties of most surfaces are smooth in space, and thus two faces next to each other tend to have similar properties and radiate similarly more often than two generic unrelated faces. This consideration leads to the idea of representing a target face's irradiation from multiple neighbouring distant faces by a single view factor that uses the radiation from one of them, but its view factor value represents all of them. This approach allows for the use of the computational resources more efficiently.

The angular discretization scheme divides the view from each face into a fixed number of directions specified by uniformly distributed azimuth and elevation angles, as opposed to the uneven set of directions towards the centres of every other visible face in the legacy discretization scheme. Ray tracing is performed towards this fixed set of directions with considerable optimization due to the fact that multiple rays of this set share an identical horizontal direction (i.e. azimuth; see Sect.

A 3-D representation of the angular discretization scheme for a horizontal face

This approach is equivalent to the ray-tracing algorithm used in computer graphics; the only difference is that the ray directions in computer graphics correspond to individual pixels of the simulated camera's sensor and often some super-sampling is used for anti-aliasing. This similarity demonstrates that the result of this ray-tracing arrangement represents a reasonable simplification of view from the selected target and also that the accuracy can be improved as needed by increasing the angular resolution, i.e. the number of discretized azimuth and elevation angles.
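As an illustration, the sketch below builds such a set of directions for an upward-facing horizontal face and assigns each angular section its analytic Lambertian share of the view. The zenith angle is measured from the face normal, and the function name and return layout are assumptions of this example rather than the RTM implementation.

```python
import numpy as np

def horizontal_face_sections(n_azimuth, n_zenith):
    """Angular sections of the view from an upward-facing horizontal face.

    Returns the azimuth and zenith centres (rad) of the sections and an
    array vf[a, z] with the analytic view-factor share of each section.
    """
    az_edges = np.linspace(0.0, 2.0 * np.pi, n_azimuth + 1)
    zen_edges = np.linspace(0.0, 0.5 * np.pi, n_zenith + 1)
    az_centres = 0.5 * (az_edges[:-1] + az_edges[1:])
    zen_centres = 0.5 * (zen_edges[:-1] + zen_edges[1:])
    # Integral of cos(zenith) * sin(zenith) over each zenith band, times 2:
    band = np.sin(zen_edges[1:]) ** 2 - np.sin(zen_edges[:-1]) ** 2
    # Share of the full circle covered by each azimuth sector:
    sector = np.diff(az_edges) / (2.0 * np.pi)
    vf = np.outer(sector, band)
    return az_centres, zen_centres, vf
```

The shares sum to 1 for any angular resolution, so increasing `n_azimuth` or `n_zenith` refines the view without changing the total.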

An additional benefit of the angular discretization is the fact that the view factor values, if calculated analytically, always add up exactly to 1 and there is no need for normalization. A single face often represents an obstacle detected in more than one direction. In such cases, the respective view factors are aggregated to save resources. For faces very close to each other, the sum of the view factor values representing those directions is typically more precise than the normalized approximate value calculated using Eq. (

With angular discretization, the view from the centre of each face is divided into sections, each of which is bounded by azimuth angles

The section of view between

For a horizontal face, the normal angle

In the case of a vertical face, the calculation depends on the orientation of the face. The calculation is presented for a northward-oriented face, for which the face normal

The angular discretization scheme greatly improves scalability, which can be demonstrated by following the two scaling approaches introduced in Sect.

The CPU time and interprocess communication demands for ray tracing are slightly higher than that because the average separation distance (i.e. ray-tracing length) grows with increasing resolution. For horizontal domain enlargement, only some ray-tracing directions will have greater distances, while for increasing resolution, all distances will be proportionally longer. In both cases, the demands are of the order of

The direct and diffuse components of the incoming solar radiation and the thermal radiation from the sky towards surfaces are represented using the sky-view factor (SVF). It represents the portion of view from individual faces towards the sky which is not occupied by other faces. If the sky is viewed as an imaginary face, SVF makes the system of faces enclosed as specified in Eq. (

The radiant fluxes from the sky propagate through the urban layer similarly to the reflected and emitted radiation from the faces with the exception that the source lies outside the urban layer. As the intention of the design is to avoid ray tracing during model time stepping for the reasons explained in Sect. 2.1, all ray tracing representing these rays is also done in advance during the initialization phase of the model just like with the other rays.

RTM version 3.0 represents the sky by a single SVF. The value of

When plant canopy simulation is disabled, the only information necessary to calculate the SVF for a specific location is the horizon height in each discretized azimuth direction, as described in detail in Sect.

The calculation of direct solar radiation during the model initialization phase is complicated by the fact that the apparent position of its source, the sun, and therefore the geometry of all rays, changes throughout the day, while all the other radiation sources in the model have fixed positions and geometries and only the values of their radiant fluxes change in time. RTM solves this problem by discretization of the apparent solar positions and performing ray tracing between these predetermined apparent solar positions and corresponding faces during the model initialization.
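The discretization of apparent solar positions can be sketched as follows. The example assumes a uniform grid of directions located at the centres of azimuth and elevation bins, so that the nearest discretized direction is the bin containing the solar position; the function names and the use of degrees are illustrative choices.

```python
def discretize_solar_positions(positions_deg, n_azimuth, n_elevation):
    """Map apparent solar positions to unique discretized directions.

    positions_deg : iterable of (azimuth, elevation) pairs in degrees,
                    one per radiation time step
    Returns the sorted list of unique (azimuth bin, elevation bin) indices.
    """
    d_az = 360.0 / n_azimuth
    d_el = 90.0 / n_elevation
    bins = set()
    for az, el in positions_deg:
        if el <= 0.0:
            continue                             # sun below the horizon
        i_az = int((az % 360.0) // d_az)
        i_el = min(int(el // d_el), n_elevation - 1)
        bins.add((i_az, i_el))
    return sorted(bins)

def bin_centre(i_az, i_el, n_azimuth, n_elevation):
    """Direction (azimuth, elevation) in degrees at the centre of a bin."""
    return ((i_az + 0.5) * 360.0 / n_azimuth,
            (i_el + 0.5) * 90.0 / n_elevation)
```

Only the directions returned by `discretize_solar_positions` need to be ray traced during initialization, which is usually far fewer than the number of radiation time steps.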

For a typical simulation which spans times of the order of hours or days, there is a fixed number of apparent solar positions (at most the number of radiation time steps), which is further reduced by discretization of azimuth and elevation angles using the nearest discretized direction. RTM uses the discretized directions that are already used for calculation of

For each discretized direction

The resolved plant canopy in RTM is represented as a 3-D discretized field of leaf area density. RTM simulates the absorption of SW and LW radiation from the sun, the sky and modelled surfaces (i.e. shading by plants), as well as the thermal emission of LW radiation from the plant canopy towards the sky and the surfaces (see Fig.

The main objects of radiative modelling in RTM are surfaces; the plant canopy is part of the process, but the focus remains on its interaction with surfaces. The data structures are organized accordingly and the ray-tracing algorithm is adapted to that as well. This arrangement allowed LW plant canopy emission and absorption to be added to RTM with no data overhead and a negligible increase in computational time.

This section describes the attenuation by the plant canopy of all the rays that are simulated by the ray-tracing algorithm – it applies to rays between faces, but also to the rays representing the diffuse and partially the direct solar radiation; the absorption of direct solar radiation is described in Sect.

As the ray-tracing algorithm follows the ray from the source to the target, the attenuation is quantified for each PCGB that the ray intersects. Since the ray tracing is performed during the initialization phase of the simulation, the actual radiant flux carried by the ray is not yet known, but the attenuation can be expressed as the absorbed fraction of the flux that enters the PCGB. This fraction remains constant in time and independent of the absolute value of the radiant flux, as long as the leaf area density, on which the optical density of the plant canopy is based, remains constant. For this reason the RTM currently does not allow changing the LAD values during simulation time, which is usually not a problem for typical simulations lasting several days.

The plant canopy within the volume of each discrete PCGB is considered homogeneous, and the leaves are assumed to be randomly oriented. In reality the distribution of leaf orientation may be non-uniform, but this also depends on the tree species, the season, sun direction, and wind speed and direction. As some of these are non-constant during the simulation, and also the effect on absorption is less important than the distribution of LAD within the tree crown

The ratio of the flux

The exponential attenuation with respect to depth matches the Beer–Lambert law. As a continuous model of a discrete sub-grid process, it would correspond to an idealized case with non-transparent and non-reflective leaves wherein all leaves are homogeneously and randomly distributed in the volume of the grid cell; in that case, the only radiative flux passing through the cell would be the free rays that intersect no leaves. In reality, the transmittance of the tree crown is higher than that – the leaves themselves are semi-transparent and some further light is transmitted due to multiple reflections at the surfaces of the leaves. However, the attenuation with semi-transparent leaves is still exponential with respect to depth, and even the measured attenuation in homogeneous LAD media is close to exponential
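The exponential attenuation can be illustrated for a ray that crosses several PCGBs in sequence. In the sketch below, the extinction coefficient `ext_coef` and its default value are illustrative assumptions, as are the function name and the representation of the ray as lists of leaf area densities and path lengths.

```python
import math

def absorbed_fractions(lad, path_lengths, ext_coef=0.6):
    """Absorbed fractions along one ray that crosses several PCGBs.

    lad          : leaf area density of each crossed grid box (m2 m-3)
    path_lengths : length of the ray segment inside each box (m)
    ext_coef     : extinction coefficient (illustrative value, an assumption)
    Returns the fraction of the original flux absorbed in each box and
    the total transmittance of the ray.
    """
    transmittance = 1.0                # share of the flux that is still travelling
    fractions = []
    for density, length in zip(lad, path_lengths):
        box_abs = 1.0 - math.exp(-ext_coef * density * length)  # Beer-Lambert
        fractions.append(transmittance * box_abs)
        transmittance *= 1.0 - box_abs
    return fractions, transmittance
```

The absorbed fractions and the final transmittance add up to 1, and they depend only on the leaf area density and the geometry, which is why they can be computed once during initialization.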

For a ray that passes sequentially through PCGBs

The total flux

Modelling of the plant canopy thermal emission follows the concept outlined earlier. The emission from the plant canopy is considered from the target face's point of view, while the internal LW radiation exchange inside the plant canopy (among individual PCGBs and intra-grid exchange) is omitted.

Due to the reciprocity property of view factors, the CVF actually represents the fraction of view from face

This enables straightforward modelling of the thermal emission originating from the leaves in PCGB

Thermal emission from the plant canopy towards the sky has to geometrically match the absorbed LW radiative flux from the sky in order to avoid biases in the total energy budget of the modelled domain. It is computed in a similar manner using the special CVF entries, which have the sky as the source instead of a face (their calculation is described later in Sect.

As described in Sect.

In order to determine the direct radiative flux entering each PCGB with respect to shading by obstacles and partial shading by other PCGBs, RTM performs a separate ray-tracing procedure starting backwards from the centre of each PCGB towards the discretized apparent solar directions. For this ray-tracing process, no canopy-view factors are stored, and only the total ray transmittance is determined and stored for each PCGB

During time stepping, the transmittance of the corresponding ray

RTM uses a sub-grid discretization model which sends an array of

Mean radiant temperature (MRT) at a certain point in space is defined as the temperature of an imaginary object at which that object would be in radiative equilibrium with its surroundings, which means that the absorbed irradiance would be equal to the emitted radiant exitance. Calculation of the MRT is closely related to the radiative processes in the RTM, and it is therefore advantageous to implement it inside this module. This allows for the use of a similar approach and reuse of existing routines; it also ensures that MRT is calculated with the same discretization scheme as the scheme used in RTM for the calculation of LW and SW radiation, which allows users to avoid some highly simplified yet common approaches. Calculated MRT values are available directly in RTM in the form of PALM output variables, and they are provided to the biometeorology module for calculation of biometeorological quantities related to human thermal comfort (see Sect.

Considering both LW and SW radiant fluxes for a hypothetical object with emissivity

The calculation of the MRT utilizes a similar concept as the calculation of irradiance with angular discretization. For each point at which MRT would be simulated, the MRT factors are calculated during the initialization phase of the model run. MRT factors are the equivalent of the average irradiance factors for the whole surface of the hypothetical sphere, which means that there is no dependence on the direction of irradiance. For face

The MRT factors are precalculated using the 2-D ray-tracing algorithm with angular discretization of the whole view, together with MRT sky-view factors and direct solar irradiance transmissivities for each MRT point. Depending on configuration, the MRT can be calculated for the centre of every grid box in the first layer above a terrain or even in multiple vertical layers.
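Once the mean SW and LW irradiances of the hypothetical body are known, the radiative-equilibrium definition of MRT reduces to a single expression. The sketch below assumes that the LW absorptivity of the body equals its emissivity (Kirchhoff's law) and that the irradiances are already averaged over the body surface; the function and parameter names are hypothetical.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant (W m-2 K-4)

def mean_radiant_temperature(e_sw, e_lw, albedo=0.0, emissivity=1.0):
    """Temperature (K) at which emission balances absorbed SW and LW radiation.

    e_sw, e_lw : mean shortwave and longwave irradiance of the body (W m-2)
    albedo     : shortwave reflectivity of the body
    emissivity : longwave emissivity, assumed equal to the LW absorptivity
    """
    absorbed = (1.0 - albedo) * e_sw + emissivity * e_lw
    # Equilibrium: emissivity * SIGMA * T**4 = absorbed
    return (absorbed / (emissivity * SIGMA)) ** 0.25
```

For a black body that receives only the LW irradiance of surroundings at 300 K, the function returns 300 K; any absorbed SW irradiance raises the result above that value.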

The pure physical MRT value is usually defined with respect to a spherical black-globe thermometer. On the other hand, the biometeorology applications require the MRT value related to a clothed human body, which is tall and narrow, and it is therefore affected by radiation from its sides proportionally more than by radiation directly from above. It is modelled in RTM as a configurable asymmetrical generic object with specified albedo, emissivity and aspect ratio. These MRT values for the hypothetical human body are then provided to the biometeorology module. Further details are described in Sect.

The basic ray-tracing algorithm, which is used in RTM when the legacy discretization of view is enabled, was first implemented in RTM 1.0 as part of PALM-USM 1.0, and it is carried over from previous versions of RTM with minor changes like the addition of options

With the introduction of the angular discretization in RTM version 2.0, a new variant of the ray-tracing algorithm was developed, which was highly optimized for this angular discretization. This algorithm is further called 2-D ray tracing.

This is a novel algorithm which takes significant advantage of the specific data representation and parallelization of the PALM model. The 3-D fields in PALM are represented as arrays for which the

The algorithm utilizes the following feature of the 2.5-D geometry: for every point of view and for every azimuth there is a distinct horizon (

The core of the 2-D ray-tracing algorithm works by following a discrete set of azimuths from point

To determine partial shading by the plant canopy, RTM needs to track more than just the horizon angle for each traced azimuth. The plant canopy may have a diverse vertical structure; thus, an evenly discretized set of elevation (zenith) angles is tracked for each azimuth. This forms a uniform, regular set of directions, which is used for all types of radiative processes; it is used for calculation of the sky-view factors, direct irradiance transmittance and also for the angularly discretized view factors towards other surfaces. This way, a single 2-D ray-tracing routine computes all the respective values at once without any overhead.
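The horizon-tracking idea can be sketched for the simpler case without plant canopy, in which the horizon elevation in each azimuth is the only information needed for the sky-view factor of a horizontal face. The example samples a 2.5-D height field at fixed steps along each azimuth instead of computing exact grid-column crossings, and it uses no MPI; all names, the sampling step and the height-field layout are assumptions of this illustration.

```python
import math
import numpy as np

def horizon_elevation(height, x0, y0, z0, azimuth, dx=1.0, step=0.5):
    """Largest obstacle elevation angle (rad) seen from a point in one azimuth.

    height[iy, ix] : top of terrain or building in each grid column (m)
    x0, y0, z0     : position of the viewpoint (m)
    azimuth        : direction in rad, clockwise from north (+y)
    """
    ny, nx = height.shape
    origin = (math.floor(x0 / dx), math.floor(y0 / dx))
    horizon = 0.0
    dist = step * dx
    while True:
        ix = math.floor((x0 + dist * math.sin(azimuth)) / dx)
        iy = math.floor((y0 + dist * math.cos(azimuth)) / dx)
        if not (0 <= ix < nx and 0 <= iy < ny):
            break                                  # ray left the domain
        if (ix, iy) != origin:
            horizon = max(horizon, math.atan2(height[iy, ix] - z0, dist))
        dist += step * dx
    return horizon

def sky_view_factor_horizontal(height, x0, y0, z0, n_azimuth=36, dx=1.0):
    """Sky-view factor of an upward-facing face from its horizon angles."""
    azimuths = (np.arange(n_azimuth) + 0.5) * 2.0 * np.pi / n_azimuth
    horizons = np.array([horizon_elevation(height, x0, y0, z0, a, dx)
                         for a in azimuths])
    # Lambertian share of the sky above the horizon in each azimuth sector
    return float(np.mean(np.cos(horizons) ** 2))
```

Over flat terrain the horizon is zero in every azimuth and the sky-view factor is 1; any obstacle lowers it according to the elevation angle it subtends.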

During tracing of each ray, the information about LAD all along the ray path is needed. This information is distributed among the individual MPI processes and needs to be obtained by means of MPI communication. In order to reduce fragmentation of one-sided MPI operations, the 2-D ray tracing requests all LAD data for all applicable PCGBs belonging to the whole half-plane cross section (one discrete azimuth) in all required vertical levels at once. When these data are retrieved from all involved MPI processes, the RCSFs are generated in a two-pass calculation for each discrete azimuth – from point

The 2-D ray-tracing algorithm needs to determine the complete information for the time-stepping radiation calculation, including the index of the opposing face, the view factor value and the total transmittance of the connecting ray, for each discrete direction under the horizon angle. As specified in Sect.

The index of the opposing face has to be determined using MPI one-sided communication request (MPI

Obstacle identification algorithm (vertical cross section).

For each new grid column processed during ray tracing, there may be at most one new horizontal face and zero or more vertical faces identified as new opposing faces. The identification algorithm can be demonstrated with an example shown in Fig.

The generated VF entries for the opposing faces are sorted and aggregated for each ray-tracing origin (after ray tracing towards all discretized azimuth angles), creating at most a fixed number of entries that do not need to be normalized, as described in Sect.

RTM radiation interaction is called in PALM after every call of the one-column radiation scheme (e.g. RRTMG), which is applied in regular configurable intervals. For each radiation step, the radiative fluxes on the top of the urban canopy layer are updated first, and then the RTM calculates the fluxes within the urban layer using inputs from its top border.

The implementation of the time-stepping part of RTM is straightforward, and its changes since RTM 1.0 are only related to the addition of newly simulated processes, like the plant canopy LW interaction and the calculation of MRT. The current implementation in RTM 3.0 is fully described in Sect. S1.2.

This section presents the parts of the RTM module that are responsible for interaction with other modules in PALM, together with the respective parts in those modules that were added by the authors of this paper in order to enable the coupled simulation of the described processes.

As described in Sect.

The implementation is based on calculating effective radiation surface parameters for the radiation model: an effective surface emissivity

For LW radiation, the lower boundary condition of the forcing radiation model can be expressed as

Here,

The standard choice that the normalizing area

For SW radiation, the lower boundary condition of the forcing radiation model can be expressed as

Radiative transfer between the atmosphere and surfaces as well as among surfaces themselves depends on the surface temperature, which is the result of the surface energy balance calculated in the surface modules. However, one of the components in the surface energy balance is the surface net radiation, which is calculated in the RTM. The exchange of information between the surface modules and the RTM is therefore mutual.

PALM includes two surface modules: the land surface module (LSM) for natural-like surfaces, such as vegetation-covered, water and pavement surfaces, and the building surface module (BSM) for building surfaces such as walls, windows and roofs. Both modules solve the energy balance for each surface, partitioning the available net radiation into ground–wall heat fluxes, as well as sensible and latent heat fluxes. For a detailed description of LSM and BSM see

Each of the discrete surfaces may have distinct soil or wall material properties, such as heat capacity or conductivity, as well as distinct surface properties such as albedo, thermal emissivity and roughness length. In the LSM a face (i.e. surface element in LSM and BSM terminology) is assumed to be either vegetation, water or pavement, while in the BSM a surface element is further divided fractionally into walls, windows and green surfaces. Each fraction exhibits distinct radiative properties. For performance optimization reasons, the corresponding properties and state variables for the surfaces are stored within a dynamic data structure, which encompasses arrays for various surface variables. Each type of surface with a different orientation has its own derived data structure defined; e.g. northward- and southward-facing BSM surfaces can be accessed individually without further if–else conditions. This way of representing the surface allows for the execution of surface-energy-related code in a consecutive manner without hampering loop vectorization. However, RTM solves interactions between all surfaces and it thus needs, again for optimization reasons, one single array of surface properties and state variables. Hence, surface information from the respective arrays of the derived data structure is gathered into a single linear array before the RTM code is executed. This is done for the surface temperature, albedo and emissivity. For fractional surfaces, these values are calculated as the weighted average of the different fractions (wall, window and green fractions).

After the radiation interactions are performed in the RTM, the resulting LW and SW radiation fluxes at the surfaces are distributed back onto the surface-type data structure. Subsequently, the updated radiation fluxes at the surfaces are supplied to LSM and BSM.
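The gathering of surface properties into one linear array and the subsequent redistribution of fluxes can be sketched as follows. This is a minimal Python illustration, not the PALM implementation: the dictionary layout, the function names `gather` and `scatter`, and all numerical values are hypothetical, and the fractions of each surface element are assumed to sum to one.

```python
import numpy as np

# Hypothetical per-orientation containers (illustrative names, not PALM
# identifiers). Columns hold the wall, window and green fractions.
surfaces = {
    "north": {
        "frac":   np.array([[0.7, 0.2, 0.1], [0.5, 0.5, 0.0]]),
        "albedo": np.array([[0.30, 0.10, 0.20], [0.25, 0.12, 0.20]]),
    },
    "south": {
        "frac":   np.array([[1.0, 0.0, 0.0]]),
        "albedo": np.array([[0.35, 0.10, 0.20]]),
    },
}


def gather(surfaces, name):
    """Collect one fraction-weighted value per face into a linear array."""
    chunks, offsets, start = [], {}, 0
    for key in sorted(surfaces):
        s = surfaces[key]
        # Weighted average over the fractions of each surface element.
        weighted = np.sum(s["frac"] * s[name], axis=1)
        chunks.append(weighted)
        offsets[key] = (start, start + weighted.size)
        start += weighted.size
    return np.concatenate(chunks), offsets


def scatter(values, offsets):
    """Distribute a linear per-face array back onto the orientation types."""
    return {key: values[a:b] for key, (a, b) in offsets.items()}


albedo, offsets = gather(surfaces, "albedo")
# Placeholder for the radiative fluxes that a radiation step would return.
fluxes = scatter(np.zeros_like(albedo), offsets)
```

Keeping the offsets from the gather step makes the scatter step a set of plain array slices, so no per-face branching on orientation is needed in either direction.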

An important process associated with plant canopies is transpiration of water vapour from the green parts of plants. It is actively controlled by plants by opening and closing stomata and thus changing the resistance of the leaf surface against the evaporation of the leaf water. This process is mainly affected by the incoming SW radiation, air temperature, air humidity and the soil water content

Calculation of the plant canopy transpiration rate is based on the Jarvis–Stewart model in the form described in

The resulting latent heat fluxes and humidity gradients then enter the prognostic equations of humidity and potential temperature

The biometeorology module in PALM

The calculation of MRT is closely related to the RTM radiative processes. This fact allows for the calculation of MRT inside RTM with little additional effort and overhead utilizing the existing RTM routines (see Sect.

The RTM provides the MRT values for the BIO module in the form of separate SW and LW mean irradiance for each simulated MRT box. This approach allows the BIO module to process the incoming fluxes independently and to apply the radiative properties of the human body (albedo and emissivity) inside the BIO module. The shape of the simulated body, however, affects the MRT factors, and thus it needs to be defined inside the RTM. The current version of RTM contains three selectable types of MRT body geometries: sphere (simulated globe thermometer), ellipsoid and a simple human body parameterization, with the possibility to supplement other arbitrary geometries. The ratio of the major and minor axes of the elongated shapes is configurable in RTM with the default of
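How separate SW and LW mean irradiances can be combined with the radiative properties of the body is sketched below. This is a hedged illustration based on the commonly used Stefan–Boltzmann relation for MRT, not the code of the BIO module; the function name and the default albedo and emissivity are assumptions chosen only for the example.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant in W m-2 K-4


def mean_radiant_temperature(sw_irradiance, lw_irradiance,
                             albedo=0.3, emissivity=0.97):
    """Return MRT in K from mean SW and LW irradiance (W m-2) of a body.

    The default albedo and emissivity are assumed, typical textbook values
    for a clothed human body; they are not taken from the model.
    """
    absorbed = (1.0 - albedo) * sw_irradiance + emissivity * lw_irradiance
    return (absorbed / (emissivity * SIGMA)) ** 0.25


# Consistency check: pure LW irradiance of a 300 K black enclosure.
t_mrt = mean_radiant_temperature(0.0, SIGMA * 300.0 ** 4)
```

Because the body properties enter only in this last step, the irradiances delivered per MRT box can be reused for different body parameterizations without repeating the radiative transfer calculation.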

This section presents an evaluation of the convergence and computational performance of the current RTM implementation. A validation of the whole PALM model with RTM in a realistic urban environment against a comprehensive set of observations for a large scenario in Prague–Dejvice is presented by

The simulations presented in Sect.

The surface geometry and properties used in RTM are available with a certain level of detail and discretized by a regular grid. Hence, a natural expectation would be that decreasing the grid spacing below a certain level would not introduce new information and that the RTM would converge to one solution. With RTM, increased model resolution leads to a higher number of finer faces and PCGBs. In order to investigate how sensitive the resulting radiative fluxes are to model resolution, multiple simulations have been performed for the small urban scenario with resolution halved iteratively from 8 m down to 0.5 m. Because only radiative fluxes are of concern in this experiment, only one daytime time step was compared.

The finest simulation with a resolution of 0.5 m is taken as the base case, and other scenarios are compared to it by radiative fluxes at the matching surfaces. Finer resolutions mean increased detail in the 3-D structure of model surfaces; therefore, not all surfaces represented in the finer-resolution scenario correspond to the coarser-resolution scenario. In this experiment, around 70 %–80 % of fine-resolution faces could be matched to respective coarse-resolution faces. The results are shown in Fig.

On double-logarithmic scales the radiant flux errors decrease almost linearly. The largest errors can be observed in the SW fluxes, with deviations to the reference case of almost

Double-logarithmic presentation of mean deviations of surface SW and LW irradiance as well as net radiant flux against the finest-resolution case of 0.5 m.
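A nearly linear decrease on double-logarithmic axes corresponds to a power-law dependence of the error on the grid spacing, and the slope of the fitted line gives the exponent. A minimal sketch of such a fit is given below; the error values are synthetic and purely illustrative (exactly proportional to the grid spacing), not results from the experiment, and the function name is hypothetical.

```python
import math


def loglog_slope(x, y):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    den = sum((a - mean_x) ** 2 for a in lx)
    return num / den


# Grid spacings compared against the 0.5 m base case.
spacings = [8.0, 4.0, 2.0, 1.0]
# Synthetic errors, assumed proportional to the grid spacing.
errors = [0.4 * dx for dx in spacings]
slope = loglog_slope(spacings, errors)  # close to 1 for this synthetic input
```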

The angular resolution of the angular discretization scheme (see Sect.

The angular discretization resolution also controls the discretization of direct solar irradiance; therefore, different angular resolutions also lead to a different number of discrete apparent solar positions throughout the day. For this experiment, a 1 d long simulation was performed with five different angular resolutions: 18

Table

Scaling of angular resolution.

The results for the convergence are shown in Fig.

Double-logarithmic presentation of mean deviations of surface SW and LW irradiance as well as net radiant flux for different angular resolutions. Mean deviations are shown relative to the finest angular resolution of 1.125

In order to quantify the appropriate number of reflections for typical urban scenarios, a simulation of the small urban scenario was performed for one time step of a summer daytime simulation with a different number of reflection steps. To evaluate deviations in net radiant flux values, the reference scenario was simulated with an excessive 300 reflection steps, for which all remaining unreflected flux values are almost zero, i.e. below the lowest positive value of the floating-point numerical representation.

A double-logarithmic presentation of potential and actual errors in SW and LW radiation caused by an insufficient number of reflections. The maximum and mean of the remainder of unreflected radiation per surface are shown as lines; the absolute discrepancies of net radiant flux compared to a perfectly reflected scenario are shown by individual points (maximum, 95th percentile and mean). The net flux errors above 15 reflections are zero (below the floating-point resolution), and so are the 95th percentiles of LW error above 8 reflections.

The results are shown in Fig.

These results support the recommendation to use the default RTM configuration value of three reflection steps for most scenarios. Considering that with the default radiation update interval of 60 s, the RTM uses only a small fraction of time-stepping computational time, the number of reflection steps can be increased to e.g. five with negligible computational costs.
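The rapid decay of the unreflected remainder with the number of reflection steps can be illustrated with a small toy system. The sketch below is not the RTM algorithm: the three faces, their albedos, the transfer matrix and the initial irradiance are all hypothetical values, and the faces are assumed to have equal area.

```python
import numpy as np


def reflect(incoming, albedo, transfer, n_reflections):
    """Return (absorbed, remainder) after a fixed number of reflection steps.

    transfer[i, j] is the assumed share of the flux reflected by face j that
    reaches face i; column sums below one mean that part escapes to the sky.
    """
    pending = np.asarray(incoming, dtype=float).copy()
    absorbed = np.zeros_like(pending)
    for _ in range(n_reflections):
        absorbed += (1.0 - albedo) * pending
        pending = transfer @ (albedo * pending)
    absorbed += (1.0 - albedo) * pending  # absorb the flux that arrived last
    remainder = albedo * pending          # flux that is left unreflected
    return absorbed, remainder


# Hypothetical toy configuration with three mutually visible faces.
incoming = np.array([600.0, 200.0, 400.0])
albedo = np.array([0.20, 0.30, 0.15])
transfer = np.array([[0.0, 0.3, 0.2],
                     [0.3, 0.0, 0.2],
                     [0.2, 0.2, 0.0]])

for n in (1, 3, 5):
    _, remainder = reflect(incoming, albedo, transfer, n)
    print(n, remainder.sum())
```

In this toy case each step multiplies the remaining flux by at most the product of the largest albedo and the largest column sum of the transfer matrix, so the remainder shrinks roughly geometrically.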

To verify model scalability, a horizontal scaling experiment was performed on the Salomon supercomputer at IT4Innovations National Supercomputing Centre

The original model domain was doubled iteratively in both the

For each domain size, a short 10 min simulation was performed, and the durations of individual tasks from model initialization and model time stepping were recorded together with the number of view factor data entries as a measure of memory complexity. The radiation update interval was 60 s.

Scaling of the number of view factor entries.

Table

A double-logarithmic presentation of the computation time spent for different sub-tasks while simulating progressively larger domains (by the means of horizontal quadruplication). Each simulation uses a constant number of processes per horizontal tile. The sub-tasks shown are RTM initialization and time stepping along with time stepping of the rest of the model as a reference. Time-stepping time is shown for a 1 d long simulation as extrapolated from the 10 min test simulations.

The computational time measured by the scalability test is presented in Fig.

The temporal scaling of the time-stepping phase of RTM is shown together with the time stepping of the rest of the model as a reference. The RTM calculation takes between 2 % and 5 % of the time-stepping phase. The largest simulated domain, with 8192 parallel processes running on 342 individual nodes, displays a slight worsening of the scaling curve for both RTM and the rest of the model, probably due to the growing complexity of interprocess data exchange. Future versions of RTM may be improved for the largest domains thanks to planned optimization of the amount of exchanged radiative flux data (see Sect. 6).

A double-logarithmic presentation of computational time versus the number of processes for a small scenario, typically suitable for 16–32 processes.

Figure

We can see that between 1 and 16 processes the parallelization of both the initialization and time-stepping phases is very good, even though the radiative interactions have very strong spatial interdependency, meaning significant mutual data exchange between subdomains. As the number of processes increases further, both the RTM initialization and the RTM time stepping become less efficient. This is attributed to the relative increase in costs for MPI communication compared to the cost of computations performed on each process, which is in accordance with Amdahl's law of strong scaling. In other words, when the subdomains become too small, the speed-up gained from additional processes diminishes.
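The diminishing returns described by Amdahl's law can be made concrete with a short sketch. The parallel fraction of 0.95 used here is a hypothetical value chosen for illustration only; it is not derived from the measurements.

```python
def amdahl_speedup(n_processes, parallel_fraction):
    """Ideal speed-up for a fixed problem size according to Amdahl's law."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processes)


def parallel_efficiency(n_processes, parallel_fraction):
    """Speed-up divided by the number of processes."""
    return amdahl_speedup(n_processes, parallel_fraction) / n_processes


# Hypothetical parallel fraction, for illustration only.
for p in (1, 4, 16, 64):
    print(p, amdahl_speedup(p, 0.95), parallel_efficiency(p, 0.95))
```

The efficiency drops monotonically with the number of processes because the non-parallel share of the work, to which communication overhead effectively adds, does not shrink with the subdomain size.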

A 3-D representation of instantaneous net SW

This paper gives a description of the significantly updated and extended model RTM 3.0 in PALM. It focuses on new and redesigned features in comparison with RTM 1.0, which was described in

Also, sensitivity tests on performance-affecting configuration options (spatial model resolution, resolution of angular discretization and the number of reflection steps) are presented in this study, supporting their recommended configuration for typical urban scenarios. Finally, the applicability of RTM to large real-life scenarios is presented, demonstrating that the computational demands of RTM are in line with other components of the PALM model with respect to domain size.

Model validation on large scenarios and long-term experience with various realistic simulations have also identified specific weak points in model representativity and RTM's potential for further improvements

Several modules in the PALM model, as well as the model core, now support fully 3-D structures with downward-facing faces, e.g. at bridges or lateral openings to courtyards. However, in many real-world scenarios overhanging structures are infrequent and occur only at a small number of grid points. The current ray-tracing algorithm takes advantage of the 2.5-D geometry to improve computational efficiency. Hence, the proposed update will still use the simplifications made for the 2.5-D geometry while enabling full 3-D support only at grid points where it is required.

This feature was not available at the time of the original submission of this paper, but since then it has been implemented as described in RTM version 4.1 starting from r4671:

For now, the representation of obstacles in PALM is fully based on the Cartesian grid; i.e. a grid box is either fully obstacle or fully atmosphere. As a consequence, surfaces that are slanted in reality, such as roofs and natural slopes, are represented as step-like surfaces. Besides its implications for the microscale flow (e.g. biased surface friction), such a step-like representation increases the total surface area in the model, which affects the amount of radiative flux and adds artificial shading and reflections. Future developments of PALM include the implementation of the immersed boundary method (IBM)

All reflections are treated as Lambertian, i.e. fully diffuse, in the current version of RTM. Surfaces with mainly specular reflections, such as glass and polished metal surfaces, thus cannot be represented realistically. Multiple ways to implement specular reflections in RTM have been considered, but the feature would be of limited use with a strictly Cartesian grid; therefore, the decision on how to implement specular reflections is being postponed until after the implementation of immersed boundary conditions.

In the current parallelization of ray tracing, the rays are traced as a whole in the process which owns the subdomain of the ray's target. This has many computational advantages (see Sect.

A substantial change in the ray-tracing algorithm is being considered, whereby each ray would be divided among segments belonging to individual subdomains. The process owning the subdomain of the ray's target would successively ask the respective processes that own other segments of the ray to perform the ray tracing of those segments, and it would aggregate the results. However, this algorithm could be significantly slower for small domains. The advantages and disadvantages need to be verified, and the new algorithm can be implemented as an optional alternative to the current ray-tracing algorithm, possibly with automatic switching.
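The idea of dividing a ray into segments by subdomain ownership and aggregating the partial results can be sketched as follows. This is a serial toy illustration of the concept only, not the considered implementation: the regular horizontal decomposition, the function names and the per-cell transmittances are hypothetical, and the communication between processes is replaced by a plain loop.

```python
from itertools import groupby
from math import prod


def owner_rank(cell, sub_nx, sub_ny, procs_x):
    """Rank owning a horizontal grid cell in a regular 2-D decomposition."""
    i, j = cell
    return (j // sub_ny) * procs_x + (i // sub_nx)


def split_ray(cells, sub_nx, sub_ny, procs_x):
    """Group the ordered cells crossed by a ray into per-owner segments."""
    def key(cell):
        return owner_rank(cell, sub_nx, sub_ny, procs_x)
    return [(rank, list(group)) for rank, group in groupby(cells, key=key)]


def trace_ray(cells, transmittance, sub_nx, sub_ny, procs_x):
    """Aggregate the transmittance of the segments into a whole-ray value."""
    partial = [
        prod(transmittance.get(cell, 1.0) for cell in segment)
        for _, segment in split_ray(cells, sub_nx, sub_ny, procs_x)
    ]
    return prod(partial)


# Hypothetical ray crossing two subdomains of 4 x 4 columns each.
cells = [(1, 1), (2, 1), (3, 2), (4, 2), (5, 3)]
transmittance = {(2, 1): 0.5, (4, 2): 0.8}  # only cells with plant canopy
segments = split_ray(cells, 4, 4, 2)
total = trace_ray(cells, transmittance, 4, 4, 2)
```

Since transmittances multiply along the ray, each owner only needs to return one number per segment; the extra cost of the approach lies in the request and reply messages, which is why it could be slower for small domains.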

In the current implementation, the leaves in the plant canopy are considered randomly oriented, as discussed in Sect.

Another considered change is related to absorption of direct solar radiation. In the current implementation, the radiative fluxes absorbed in the plant canopy are discretized differently for the direct solar radiation and for other radiative fluxes (diffuse, reflected and emitted radiation). A separate ray-tracing cycle is performed to calculate ray transmittances for the discretized apparent solar positions for each PCGB and a sub-grid model used for the direct solar irradiance (see Sect.

The proposed change also uses a similar approach for the direct irradiance of the plant canopy, using only the attenuation of the rays to and from faces. This approach has multiple benefits: it avoids extra ray tracing, which can take a significant amount of time for large domains with a lot of plant canopy; it unifies the discretization for all radiative fluxes in the plant canopy; and it guarantees that the total plant canopy heat flux from direct solar irradiance equals the sum of the irradiance deficit at surfaces caused by partial shading from the plant canopy. However, it neglects the absorbed fluxes from rays that would pass through the domain without striking any surface. This is only relevant for plant canopy near domain boundaries; on the other hand, such areas always suffer from a lack of simulated radiative interaction with elements outside the domain, and they cannot be considered representative anyway. Another potential problem is the risk of a Moiré effect in the spatial distribution of plant canopy heat flux, which needs to be examined in realistic scenarios.
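The conservation property mentioned above, i.e. that the flux absorbed in the canopy equals the irradiance deficit at the target face, follows directly from attenuating a single ray box by box. The sketch below assumes Beer–Lambert attenuation with an extinction coefficient of 0.5, a common choice for randomly oriented leaves; the function name and the leaf area density and path length values are hypothetical.

```python
import math


def canopy_absorption(flux_in, lad, path_lengths, ext_coef=0.5):
    """Attenuate one ray through successive plant canopy grid boxes.

    flux_in      -- radiant flux carried by the ray before the canopy (W)
    lad          -- leaf area density of each crossed box (m2 m-3)
    path_lengths -- length of the ray inside each box (m)
    ext_coef     -- assumed extinction coefficient (0.5, random leaves)
    """
    absorbed = []
    remaining = flux_in
    for density, length in zip(lad, path_lengths):
        transmittance = math.exp(-ext_coef * density * length)
        absorbed.append(remaining * (1.0 - transmittance))
        remaining *= transmittance
    return absorbed, remaining


# Hypothetical ray crossing three canopy boxes before reaching a face.
absorbed, at_face = canopy_absorption(100.0, [0.8, 1.2, 0.5], [1.0, 1.4, 0.6])
deficit = 100.0 - at_face  # shading deficit at the face
```

By construction the sum of the per-box absorbed fluxes matches the deficit at the face up to rounding, which is the budget closure that a separate ray-tracing cycle with a different discretization does not guarantee.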

In addition to that, the magnitude of the error induced by neglecting SW radiation scattered by plant leaves towards directions other than the direction of the incoming radiation is being studied. The actual significance of this is strongly dependent on the structure of the plant canopy and its surroundings. Possible improvements of the model in this regard are being considered, and the impact on model performance versus the significance of the error is being evaluated.

The current implementation of interprocess data exchange in time stepping uses the MPI

Two different approaches are currently being considered to improve the scalability of this particular code. The first one takes advantage of the fact that typical simulations are performed on clusters with many CPU cores per node; selected arrays can be allocated in shared memory with local access for all MPI processes running on the particular node, avoiding the need to allocate identical global arrays for each process and reducing intra-node communication.

The other considered approach involves creating a face visibility mapping among MPI processes; each process allocates an array of visible faces from other subdomains, grouped and ordered by MPI process rank, and exchanges a minimum amount of radiosity data using the MPI
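The bookkeeping behind such a visibility mapping can be sketched without reference to any particular MPI routine. The example below only builds the rank-ordered list of visible remote faces together with per-rank counts and displacements, which is the kind of layout that variable-size collective exchanges typically require; the function name and the face identifiers are hypothetical.

```python
def build_exchange_map(visible_faces):
    """Order visible remote faces by owner rank and derive the buffer layout.

    visible_faces -- iterable of (owner_rank, face_id) pairs; duplicates
                     are allowed and are removed here.
    Returns the ordered face list, the face count per rank and the
    displacement of each rank's block within a contiguous buffer.
    """
    ordered = sorted(set(visible_faces))
    counts = {}
    for rank, _ in ordered:
        counts[rank] = counts.get(rank, 0) + 1
    displacements, offset = {}, 0
    for rank in sorted(counts):
        displacements[rank] = offset
        offset += counts[rank]
    return ordered, counts, displacements


# Hypothetical faces seen from one subdomain: (owner rank, face id).
faces = [(2, 11), (0, 5), (2, 7), (0, 5)]
ordered, counts, displacements = build_exchange_map(faces)
```

With this layout each process would need to receive radiosity values only for the faces it actually sees, instead of holding a copy of the data for all faces in the domain.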

RTM 3.0, as part of the PALM model, is free software. Its source code is distributed under the GNU General Public License version 3 (

The supplement related to this article is available online at:

PK and JR are the core authors of the methods, algorithms and implementation of the RTM, including the coupling to the BSM, PCM and BIO modules. MS is the author of RTM integration to the PALM radiation module as well as the coupling to the BSM and LSM modules. SeS and MHS created and validated the RTM coupling with the forcing radiation model. VF is the author of the evapotranspiration and latent heat flux model. All authors contributed to the text of the article, as well as to debugging, validation and maintenance related to the RTM.

The authors declare no competing interests.

The simulations were performed on the HPC infrastructure of the Institute of Computer Science of the Czech Academy of Sciences (ICS) supported by the long-term strategic development financing of the ICS (RVO:67985807) and partly in the IT4I supercomputing centre, which was supported by the Ministry of Education, Youth and Sports from the Large Infrastructures for Research, Experimental Development and Innovations under project “IT4Innovations National Supercomputing Center – LM2015070”.

Financial support was provided by the Operational Program Prague – Growth Pole of the Czech Republic under the project “Urbanization of weather forecast, air-quality prediction and climate scenarios for Prague” (CZ.07.1.02/0.0/0.0/16_040/0000383), which is co-financed by the EU. The co-authors Matthias Sühring, Sebastian Schubert and Mohamed H. Salim were supported by the Federal German Ministry of Education and Research (BMBF) under grant 01LP1601 within the framework of Research for Sustainable Development (FONA;

This paper was edited by Leena Järvi and reviewed by two anonymous referees.