Comment on gmd-2021-337

The calibration is mostly related to the numerical discretization of the momentum and continuity equations. Emission of dust in numerical models depends on the discretization of surface winds. The surface winds are inferred from the pressure-level wind vectors derived by numerically solving the momentum equations. The numerical discretization of these equations is affected by the model resolution. Higher resolution will obviously resolve sharp topographic variations, with stronger downslope winds. On the other hand, flat terrain without roughness elements will generate stronger gustiness at low resolution. So, changing model resolution has a non-trivial effect on surface winds. Concerning dust emission, the flux depends on the cubic power of surface winds (see Equation 2), which will amplify any wind bias related to model resolution. This implies that “tuning” dust emission is required to simulate scale-aware tracers with a numerical model. This is also true for other tracers, such as sea salt emitted from the oceans.
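The amplification can be illustrated with a short numerical sketch. It assumes only the cubic scaling stated above (flux proportional to the cube of the surface wind speed) and ignores any emission threshold; the function name and the bias values are illustrative and not taken from the manuscript.

```python
def relative_flux_bias(wind_bias):
    """Relative bias of a flux that scales as u**3, for a relative wind bias.

    Assumption: flux is proportional to u**3, with no threshold term.
    """
    return (1.0 + wind_bias) ** 3 - 1.0


# Illustrative wind biases of -20 %, -10 %, +10 % and +20 %.
for bias in (-0.2, -0.1, 0.1, 0.2):
    print(f"wind bias {bias:+.0%} -> flux bias {relative_flux_bias(bias):+.1%}")
```

Under this simplified scaling, a 10 % overestimate of the wind speed gives about a 33 % overestimate of the flux, and a 20 % underestimate of the wind speed gives about a 49 % underestimate of the flux.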

compared with one another and the AEM and TEM results. It is concluded that DOD was not suitable for evaluation/calibration of dust models.
All these aspects are interesting and worthwhile to explore; however, the listed points do not support the authors' conclusions, because a) the correctness/completeness of a formulation of streamwise saltation flux is independent of the use of albedo-based or another roughness representation, b) a dynamical representation of surface roughness can also be used with TEMs using methods other than the proposed albedo-approach, c) the albedo-approach is also calibrated. The authors aim to prove the advantages of the AEM using uneven comparisons with what they describe as a TEM: AEM uses dynamical vegetation, while TEM does not; AEM uses an updated streamwise flux formulation, while TEM does not; AEM uses variable soil texture/clay content, according to the proposed conversion of streamwise saltation to vertical dust emission flux, while the TEM does not (about the latter, there is contradictory information in the manuscript).
To evaluate the calculated dust emissions, the authors use two observational data sets: MODIS DOD at a spatial resolution of 1 degree and MODIS-based DPS at a spatial resolution of 250 m for a region in the southwestern USA. Again, the conclusion of DOD not being a suitable reference is not supported by this comparison due to the substantial difference in spatial resolution (approximately 400-fold higher resolution for DPS compared to DOD) in the presented comparison, and due to the definition of DOD used in the study, which seems to be simply a grid-point selection of AOD for DPS locations and likely includes aerosol other than dust as well.
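The size of this resolution gap can be checked with a rough calculation. The sketch below assumes a spherical Earth with a mean radius of 6371 km and a representative latitude of 35° N for the southwestern USA; both values, and the function name, are assumptions made for illustration only.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def degree_extent_m(lat_deg):
    """Return (north-south, east-west) extent in metres of a 1-degree cell."""
    north_south = math.radians(1.0) * EARTH_RADIUS_M
    east_west = north_south * math.cos(math.radians(lat_deg))
    return north_south, east_west


DPS_PIXEL_M = 250.0
ns, ew = degree_extent_m(35.0)  # 35 N is an assumed representative latitude

linear_ratio_ns = ns / DPS_PIXEL_M
linear_ratio_ew = ew / DPS_PIXEL_M
area_ratio = (ns * ew) / DPS_PIXEL_M ** 2

print(f"linear ratio: {linear_ratio_ew:.0f} to {linear_ratio_ns:.0f}")
print(f"DPS pixels per DOD cell: {area_ratio:.0f}")
```

Under these assumptions the linear ratio lies between roughly 360 and 445, consistent with the approximately 400-fold figure above, and each 1-degree DOD cell covers on the order of 10^5 DPS pixels.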
Besides these aspects, the description of a TEM seems inaccurate and outdated in many places, and is often not supported by recent references.
In light of these aspects, I cannot recommend publication of this manuscript. Additional comments are added below.
* L56-57 "total wind friction velocity ustar created by all scales of roughness" sounds as if surface roughness alone were sufficient to describe ustar. Shouldn't the atmospheric flow be an even more important aspect?
* L64 There is no "reanalysis model". Reanalyses are based on model runs and observations.
* L67-76 This discussion is repeated from Webb et al. (2020). A reference to Kok et al. (2014), who have previously brought up this issue, should be added. It is not clear why, if this problem has been identified, the authors are not simply using the updated expression for both estimates with AEM and TEM, as this issue has nothing to do with the proposed albedo-approach. If the authors wish to test the sensitivity to the streamwise saltation flux formulation, this test needs to be performed holding all other settings constant.
* L80 Why is it that the correct values of R are not known for every pixel and every time step? Parameterizations of drag partition also rely on satellite data as input, as does the presented albedo-approach. While each data set comes with uncertainties and missing data, I do not see a fundamental difference in the knowledge obtainable about surface roughness from satellite data for use with one approach or another.
* L82 What is meant by the statement that values of z0 are pre-tuned and tend to maximize dust emission? No reference is provided for this statement. Values in the aforementioned references were obtained based on satellite data and ground-based measurements, as the authors also mention in the following sentences.
* L85 The use of preferential source areas, e.g. as in Ginoux et al. (2001), in some models has the purpose of specifying soil erodibility in general terms and circumventing the need to prescribe detailed soil-surface properties. It is not specific to a description of surface roughness or z0.
* L95 Again, the "correct" Equation (3) is not related to the use of albedo to describe roughness. Hence a comparison of the albedo-approach and Eq. (3) with another approach and Eq. (2) is inconsistent.
* L101-102 Why would the albedo-approach be inconsistent with a grain-scale entrainment threshold? Is that because the albedo-derived u* is also resolution-dependent?
* L116-117 If the calculation of dust emission flux from streamwise sediment flux depends on %clay, then why is a fixed clay content used in the TEM (L537), but a spatially variable clay content used in the AEM (L586-587)? This, again, is an uneven comparison, which is unjustified. Then in L200, you claim that a soil clay content map was used with both models. In the appendix again, it is claimed that soil clay content was fixed in the TEM. Please explain.
* L120-121 Marticorena and Bergametti (1995) use the adjustment of dust emission according to the bare soil fraction in addition to a drag partition scheme. In the implementation of a TEM in the present study, this adjustment is used alone. Why is no drag partition applied, in combination with dynamic surface roughness as for the AEM? This would provide much better insight into the performance of the albedo-roughness approach.

* L130 Whether E includes brown vegetation depends on what data is used to define it.
* L131 This sentence is lacking foundation and in my humble opinion also dispassion.
* L135 Not clear which pre-tuning is meant.
* L140-142 The authors claim that using AOD for evaluation or calibration of a dust model includes the assumptions that 1. dust in the atmosphere represents the dust emission process, and that 2. the spatial variation of magnitude and frequency of modeled dust emission is correct. This is incorrect. First, for model evaluation/comparison, observed AOD (better DOD, dust optical depth) is compared with modeled AOD/DOD, and not with modeled emissions. Second, the goal of comparing modeled with observed atmospheric dust fields is to determine how well the observed fields can be reproduced with the model. Indirectly, but not unambiguously, this also sheds light on modeled dust emissions. No direct observations of dust emission or surface dust concentration are available on a global scale; hence this indirect evaluation is made. If emissions were assumed to be correct, model evaluation would only test dust transport and deposition processes and their parameterizations, which is not the case.
* L150 Do I understand correctly that DOD was obtained from AOD by selecting pixels which coincided with a DPS? Over North America, I expect that even at DPS locations, this DOD contains a significant contribution of other aerosol, leading to a larger value and therefore a higher frequency of DOD > 0.2 than from dust alone.
* L155 I would see it the other way round: The correct probability of occurrence of (any) sediment flux depends on the correct (magnitude and) frequency of dust emission.
* L160 How are these assumptions circumvented? Do you mean that you evaluate the frequency of emission instead of the magnitude?
* L168 How is the aggregation performed? Are the frequencies aggregated or the original data? What is meant by "normalizing the results to the lowest resolution data"?
* L169 Please provide more detail about how the DPS observations have been obtained.
* L197 Do I understand correctly that the us*/u10 from albedo is completely decoupled from the atmospheric model and that, to calculate dust emission, you calculate us* by multiplying us*/u10 with the ERA5 winds, which correspond to a totally different modeled u* calculated with a different roughness? If so, I very much wonder about the consistency of the obtained us* with both the original albedo-approach and the atmospheric model. If the us*/u10 does not need ancillary data to calculate dust emissions, then, if used in an ESM, I wonder about consistency with the model winds responsible for dust transport after emission. It is also mentioned that the albedo-based us*/u10 from polar-orbiting MODIS data has incomplete coverage. I believe you argued earlier that, for this same reason, the R ratio for use in TEMs cannot be estimated accurately. Does this then also apply to the albedo-based us*/u10?
* L214-215 If I understand correctly, wind speeds are obtained from ERA5 and in the AEM combined with the albedo-based us*/u10 ratio. Do I also understand correctly that the AEM obtains one us*/u10 value each day, from which dust emission is estimated in combination with the hourly ERA5 input? How are the different temporal resolutions treated? What is the motivation to select a narrow wind speed range between 8.5 and 9.5 m/s?
* L224-232 This paragraph seems redundant as there is no noteworthy drag partition used in the TEM as described here, hence (vegetation) roughness is not considered, but only surface coverage. The same applies to the discussion in L254-256.
* L238-239 I agree that the interplay of friction velocity and roughness is critical for dust emission, but this can easily be implemented for the TEM as well. One option is described in the appendix, but not used.
* Fig. 2 What does "changed" mean in the axis titles? Does this refer to a difference or normalization? I also assume that uf should be u10 in the x-axis.
* L295 I am impressed that despite the severe limitations in the presented TEM implementation, it gives a similar (actually higher) R^2 than the AEM when compared with DPS.
* L316 The pattern similarity between TEM and u10 is most likely a direct result of how u* was calculated and of not including roughness in the TEM implementation.
* L322 Do the daily maxima used in both models refer to wind speed data?
* L355 Please provide a reference for this statement.
* L358-362 Please also give the RMSE in Fig. 4.
* L362-364 I do not see how the use of a fixed z0 can be called a tuning of the TEM. Most importantly, the calculation of us*/u10 is also calibrated/tuned, so there is no difference between AEM and TEM in that regard.
* Table 1 The left and center columns contain a large amount of overlap. Conceptually, it is not clear to me why u*ts at the grain scale should be inconsistent with the albedo-approach unless us* is not correctly retrieved. This may be related to resolution as indicated by the authors, a problem similar to the model calculation of u*. At the same time, the albedo-approach is claimed to be scale-invariant, which appears contradictory. The authors also suggest that modeled u10 may be too large. While this can be the case, models typically underestimate strong winds, in particular with decreasing resolution. In line 4 of the table, it is noted that DPS may not include all dust emissions. Shouldn't this also be a reason why the modeled dust emission frequency is higher? Finally, in the last line of the table, the calibration (tuning) of the albedo-approach is questioned. Unfortunately, the assessment of this issue in the center column is not clear. It is also not clear why research on the applicability of the approach for a range of conditions is of low priority. This should be the first priority when proposing a new parameterization.
* L393 It seems that the authors consider the dependence of dust emissions in the TEM on u10 (or better u*) negative, and the uncertainty arising from it more problematic than the fact that the albedo-based us*/u10 ratio was calibrated only against a data set covering a very limited range of conditions.
* L431 While the works of Marticorena and Bergametti (1995) and Shao et al. (1996) have certainly been major advances, there have been many additional advances since. Generally, and in contrast to what is described in this paragraph, the importance of vegetation dynamics for dust emission has been well recognized for a long time and has been implemented in several global models/ESMs, also for climate simulations (see also previous comments).