Interactive comment on “Sensitivity analysis of the PALM model system 6.0 in the urban environment”

General comments

This manuscript presents a series of sensitivity simulations of the PALM model in a real urban area in Prague, Czech Republic. The sensitivity tests were conducted in two ways: by changing the model's physical properties and by changing urban morphological characteristics. All the simulations were conducted for a specific hot day over a small block area, and their results were compared in terms of the influence on the UHI and air quality. This study presents a potential capability of the PALM model as a tool for urban climate research, and it seems that intensive computational work has been done for the sensitivity experiments. However, this study has some significant drawbacks that should be addressed. First, it is not clear what this study contributes, because its purpose is not clearly presented in a scientific sense. There have been many studies that investigated the sensitivities of urban climate models' input parameters and UHI mitigation scenarios. Though the previous studies did not use the PALM model, their contributions might be summarized and compared with this study. Unfortunately, I cannot find something new in this study conducted with the LES model. A clear scientific reason for conducting the sensitivity experiments using the PALM model needs to be presented first. Second, it is not clear why the two types of sensitivity experiments were organized. Besides, what is the reason for selecting the scenarios and model parameters used in the experiments? Before the sensitivity simulations, the reference simulation should be done with an optimized setting that can simulate the actual (measured) meteorological conditions well. The sensitivity experiments presented in this study have been performed in many studies, and significant results have been presented. Actually, the results of this study do not go beyond previous studies. Overall, this manuscript should be further improved by setting up more specific scientific questions and reanalyzing the sensitivity simulation results with a focus on the scientific purpose.
First of all, we would like to thank the reviewer for the many interesting suggestions. We went through all the comments carefully and hope to present a much-improved version of the manuscript.
Regarding the state of the art, we agree with the reviewer that many existing studies have been performed, especially for what we call B-type scenarios; however, not many have been presented in this systematic manner. By that, we mean studies that systematically test a larger set of parameters in the models. The studies that we are aware of usually dealt with one or two of these parameters (typically albedo and emissivity in the "synthetic" scenarios).
In fact, when we were preparing the simulations, we found out that setting up the model with real-life parameter values was a very complicated task: the values are either described in engineering tables within a certain range or only estimated. Furthermore, assigning one value to a specific surface itself brings a level of uncertainty. Our motivation (for the A-type scenarios) was then to assess how much this uncertainty in input parameters can influence model results. We have not found a single study using an LES model (or a climate model with an urban parameterization, for that matter) that tests such an extensive set of parameters; therefore, we are confident that our study brings a significant amount of new findings.
The reason we limit the comparison to LES models is mainly that simplified radiation models or climate models employing urban parameterizations do not give the same answers as a very local CFD model (see e.g. the discussion in a recent metastudy by Krayenhoff et al., 2021). Even at a very high resolution of a few hundred meters, an urban parameterization is based on a simplified representation of cities, typically a street-canyon model with only a schematic representation of real conditions. Radiation models, at the other end of the spectrum, are useful for local temperature assessment, but allow only limited inclusion of street-scale flow characteristics (typically for calculations of thermal comfort) and are not useful for air quality studies, for which the dynamical part is essential. In that regard, we feel that from a model development point of view, comparisons need to be made within the same group of models. Of course, this does not limit the use of various model types in real-life applications. In fact, this analysis was performed within the framework of a larger project, URBI PRAGENSI, in which mesoscale models were used alongside PALM, each in its respective field of expertise.
We reformulated the introduction of the study to better show the motivation for the analysis. We also added citations of several recent papers and two comprehensive reviews to the state-of-the-art section and the discussion.
Regarding the evaluation of the base-case simulation, we need to stress that in our case the focus is on model sensitivity in realistic conditions, not necessarily real conditions. This semi-real setup allows us to perform sensitivity tests in better-controlled conditions than a fully real setup, while still providing realistic conditions. Preliminary tests were nevertheless performed, and the simulation was shown to capture realistic conditions. A thorough evaluation is clearly beyond the scope of this manuscript; however, validation of the previous model version in the same area was performed in Resler et al. (2017), and the latest version has been evaluated (albeit in a different area) and described in another manuscript, also under revision for this special issue (https://doi.org/10.5194/gmd-2020-175).
Specific comments

L46-47: It should be focused on specific scientific problems. What do the 'systematic' sensitivity studies mean?

Reformulated.
L76-77: The boundary meteorological conditions might be critical in determining the meteorological conditions over the target domain, so more description of the boundary conditions will be helpful. How is the WRF output fed to the PALM model? What is the feeding time step?

L90: A more specific or technical description might be helpful for the mesoscale and microscale coupling strategy. How frequently is the WRF output provided?

L154-155: Please add a reference paper that explains the WRF and PALM coupling strategy. If this was done in this study, more description would be informative and useful.
Section 2.2.2 (WRF configuration) was extended with brief information about the offline coupling procedure.
The detailed procedure is beyond the scope of this manuscript and has not been published yet. Detailed information about the principle of the PALM mesoscale coupling was published in Kadasch et al., 2020 (https://gmd.copernicus.org/preprints/gmd-2020-285/), and the details of the processing of WRF outputs into PALM inputs are described in Resler et al., 2020 (https://doi.org/10.5194/gmd-2020-175). Our software used for this processing is part of the PALM SVN distribution under the UTIL/WRF_interface directory, together with a brief description of its utilization. We have added these references to the manuscript text.
The 1-hourly 3-D fields from the WRF outputs (T, Q, U/V/W) were horizontally and vertically interpolated (in that order) to the PALM model grid. Because the PALM model used a higher-resolution terrain that could differ from the coarse terrain in WRF by as much as tens of meters, the vertical interpolation had to include stretching of the atmospheric columns.
At the bottom, the atmospheric columns were shifted to match the PALM terrain, so there were no missing data below the original terrain and the surface effects from WRF were preserved. However, at higher altitudes, the atmospheric columns could not be shifted by the same amount, as that would introduce unrealistic horizontal gradients mimicking the terrain shift below. To avoid this, the atmospheric columns were stretched heterogeneously. The WRF model uses either sigma or hybrid vertical coordinates; our simulations use the hybrid option, in which the lowest level is terrain-following and the highest level is isobaric. For each column, the geopotential height of each level in the WRF data was recalculated using the same formula and parameters used in WRF for calculating the heights of the hybrid levels, but with the surface pressure altered to match the PALM terrain. The recalculated level heights were then used for linear vertical interpolation into the PALM Cartesian vertical coordinate system.
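The column stretching and interpolation described above can be illustrated with a much-simplified sketch. Here the terrain-induced shift is assumed to decay linearly with height (the actual procedure instead recalculates the WRF hybrid level heights with an altered surface pressure); all function names are illustrative, not part of the PALM/WRF-interface code.

```python
import numpy as np

def stretch_column_heights(z_wrf, terrain_shift, z_top):
    """Shift one column's level heights so that the bottom level follows
    the new high-resolution terrain while the top level stays fixed,
    with the shift decaying linearly with height (deliberate
    simplification of the hybrid-coordinate recalculation)."""
    weight = np.clip(1.0 - z_wrf / z_top, 0.0, 1.0)
    return z_wrf + terrain_shift * weight

def interpolate_column(z_palm, z_wrf_stretched, values):
    """Linear vertical interpolation of one stretched WRF column onto
    the PALM Cartesian grid levels."""
    return np.interp(z_palm, z_wrf_stretched, values)
```

For example, a 50 m terrain difference shifts the lowest level by the full 50 m, intermediate levels by progressively less, and the topmost level not at all, so no artificial horizontal gradients are introduced aloft.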
The interpolated 3-D fields were used as initial conditions for the first timestep, and their top and lateral boundaries were used as boundary conditions for all timesteps. For the velocity fields, the total volumetric flux imbalance was calculated for each timestep as the sum of the volumetric inflow minus outflow over all boundaries. This residual volumetric flux was then divided by the total area of the five boundaries and subtracted from the respective inward-directed velocity component on each boundary in order to make the inflow and outflow perfectly balanced, as required by the incompressible equations used in PALM.
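The balancing step amounts to a uniform velocity correction. A minimal sketch of that idea (illustrative names only; not the actual interface code):

```python
import numpy as np

def balance_inflow(u_normal, areas):
    """Remove the net volumetric flux through the domain boundaries.

    u_normal : list of arrays of inward-directed normal velocities on
               each boundary (positive = into the domain)
    areas    : matching arrays of boundary cell-face areas

    The residual flux (inflow minus outflow) divided by the total
    boundary area gives a uniform velocity correction, which is
    subtracted from every inward-directed component so that the net
    flux becomes exactly zero."""
    net_flux = sum(float(np.sum(u * a)) for u, a in zip(u_normal, areas))
    total_area = sum(float(np.sum(a)) for a in areas)
    correction = net_flux / total_area
    return [u - correction for u in u_normal]
```

Subtracting the same correction from each inward-directed component reduces the total inflow by exactly correction × total area, i.e. by the residual, which is why the corrected fields satisfy the incompressibility constraint.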
L87: Is there any reason to select PM10 for analysis rather than NOx or PM2.5? If vehicular emission is estimated, more relevant species might be NOx and PM2.5 rather than PM10.
We acknowledge this suggestion, and for the revised version we chose PM2.5 for the analysis rather than PM10. On the other hand, considering the methodology of the emission calculation, the emissions of the different species differ only by a multiplicative constant. As there is no chemistry and the pollutants are treated as passive tracers, the results (in a relative sense) would be the same for any of the three pollutants. We extended the emission description paragraph (2.6) to clarify this.
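The "same results up to a constant" argument follows from the linearity of passive-tracer transport. A toy 1-D diffusion solver (a stand-in for the transport scheme, not the PALM code) makes this concrete:

```python
import numpy as np

def toy_tracer(emission, steps=50, k=0.2):
    """Toy 1-D periodic passive-tracer model: explicit diffusion with a
    constant emission source added each step.  The solver is purely
    linear in the emission field (illustrative only; not the PALM
    transport scheme)."""
    c = np.zeros_like(emission, dtype=float)
    for _ in range(steps):
        c = c + k * (np.roll(c, 1) - 2.0 * c + np.roll(c, -1)) + emission
    return c
```

Because the operator is linear, scaling the emissions by any constant scales the whole concentration field by the same constant, so relative patterns and scenario differences are identical for PM10, PM2.5, and NOx when their emissions differ only by a factor.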
L93: Why didn't you use urban parameterization in the WRF simulation? Generally, the use of urban parameterization in WRF can give better simulations than the NoahLSM bulk urban parameterization. Providing realistic meteorological boundary conditions to the PALM model might be critical in the simulation over the target area.
This analysis focuses on model sensitivity in realistic conditions, not necessarily real conditions. Additionally, for this particular case study, no detailed measurements are available against which the WRF simulation could be tuned for this particular location. The reason for using this case study is to stay comparable to our previous study (Resler et al., 2017).
Another manuscript, also under revision for this special issue (https://doi.org/10.5194/gmd-2020-175), describes a validation experiment with similar settings. That study also discusses the reasons why no urban parameterization was used. Briefly, in our extensive preliminary testing within the local URBI PRAGENSI project (http://www.urbipragensi.cz, only in Czech), WRF quite surprisingly performed better without urban parameterization in comparison with background synoptic stations in Prague, mainly in terms of the vertical temperature profiles, which are fundamental for providing boundary conditions to PALM (in contrast to surface values).
L103-105: which parameters were measured? What are the methods to get the parameters from the site? More specific descriptions will be necessary.
The parameters were collected in a specialized data collection campaign carried out in the framework of the URBAN-ADAPT project (financed by the Technology Agency of the Czech Republic) by a project partner. Details can be found in Resler et al. (2017) (https://doi.org/10.5194/gmd-10-3635-2017), as referenced in the manuscript.

L112-114: any reference for Prague 3D model?
Added a reference to the Prague OpenData portal (description only available in Czech) and basic information about the dataset.

L121-124: It seems that this spin-up result can influence the analyses of the sensitivity simulation results.
The spin-up period itself is not included in the analysis. Some influence may persist in the first hours of the "real" simulation after the spin-up. Given the enormous computational costs, it was not feasible to extend the simulation by another day. However, we performed preliminary tests which showed that a longer simulation time does not change the results significantly. Moreover, as the results show, for most of the scenarios the differences between the respective simulations show up mostly after sunrise, which gives the model enough time to adjust. Therefore, we think this approach is correct.
Added some clarification in the text.
L125: does the 'simulation' mean sensitivity or spin-up simulation?
By simulation we refer to the complete simulation, starting with the spin-up part and then continuing with the full LES run.
L134-135: Despite the low vegetation fraction, this study says that vegetation is the most important factor in this area. Are there any studies that evaluate the physical parameterization of vegetation in the PALM model? If so, add the papers as references.
The paper describing land-surface modeling in the PALM model is currently also under review for GMD (https://doi.org/10.5194/gmd-2020-197, added to the model introduction section).

L172: Why was the heatwave episode selected? I guess the series of sensitivity simulation results might depend on the case selection.
Certainly, the results would be different. The reason for choosing specifically a heatwave episode was to stay in line with the typical application of UHI mitigation studies (as represented by the B-type scenarios in the study: increase of highly reflective surfaces, tree shading, thermal comfort, etc.). Additionally, the simulations performed in 2018 and 2019 were, and frankly still are, so computationally expensive that repeating this exercise for more cases (e.g. for winter) was technically unfeasible within the time frame of the URBI PRAGENSI project, in which they were performed.
L179: It seems that the PALM model does not cover the LLJ in the vertical direction.
The parent domain, with a vertical extent of 2.5 km, covers the LLJ at 640 m.
L187: Please check the emission unit in Fig. 3. It is difficult to read the emission intensity from Fig. 3. What are the emission fluxes of NOx, PM2.5, and PM10 used in the simulation? How reliable are the estimated emissions? The primary pollutants emitted from passenger cars are NOx, CO, VOCs, and small particulate matter fractions. For PM10, blown dust might be a major primary source on roads.
We thank the reviewer for pointing out the discrepancies in emission fluxes in Fig. 3, we corrected the values in the Figure, and in accordance with the other comment we changed the displayed pollutant to PM2.5. We also added information about the ranges of emission fluxes of the other two pollutants to the text.
Regarding the question of the reliability of the emissions: it is hard to estimate the uncertainty of the emissions, as the uncertainty of some input parameters is unknown. One parameter is traffic intensity, which is based on annual census data. Although the exact vehicle numbers for the modelled days are unknown, we think this can serve as a good estimate of the typical traffic load in the locality for the given period. The second possible source of error is the assumption that only passenger cars are present. Based on the traffic census data there are only 2

L191-192: More explanation will be needed of why the sensitivity tests are needed. What should be the base simulation in this study? What do you mean by 'real values'? How did you select the model parameters?
All simulations use the same dynamical inputs; the baseline differs only by using real values (i.e., values measured or estimated based on the actual materials used in the real city), while the scenario simulations change them. We tried to clarify this better in the manuscript text. The sensitivity tests are described in detail in the following two sections, together with the reasoning for why they are needed and how they were selected (A-type scenarios = typically material constants = assessment of the uncertainty in model outputs due to uncertainty in model inputs; B-type scenarios = typical UHI adaptation measures).
L200-201: I think that the necessity does not warrant publication.
Reformulated.

L212: The Results section should be significantly revised. Tables 3 and 4 are too busy but show little in the way of differences. Many studies have reported similar sensitivity results; please compare your results with them. Fig. 4: at which level are the variables plotted? Define the surface temperature. Please show how MRT and PET are calculated from the model results. The PM concentration field looks unrealistic in magnitude and spatial distribution. Compare also with Fig. 3. Is there any comparison against measurements?
We agree that the tables were a bit messy; a color code would certainly help, but it is not possible to use colors in the GMD LaTeX tables. Since both tables were in the supplement (Table S02), together with the rest of the variables and locations, we decided to remove these two tables from the manuscript and only reference the supplement. We added some more references to existing studies in the discussion, mainly two large