Comment on gmd-2020-415

-line 2, "dioxides" should be "dioxide"
-line 4-5, “After NOx from different sources being emitted...” is grammatically incorrect and awkward
-line 12, "estimating" should be "estimation"; "observation" should be "observations"

context of the manuscript page numbers. I cannot make this decision for the authors, but I would ask that they consider actually combining this work with Fang & Michalski 2020: better digesting the work in both papers to focus on the most salient model results, actually quantifying the sensitivity of results to changes in emissions versus changes in transport within the model domain, and then reflecting on how d15N-NOx and NO3- are currently being interpreted in the literature. This would represent a major step toward being able to use this tracer quantitatively. As it stands, the premise that transport somehow would not matter for setting the distribution and seasonality in d15N-NOx is not appropriate. It is well known from a concentration point of view that transport matters, and the authors themselves point out on page 4 (lines 11-15) the "unsatisfactory" nature of the 2020 paper's approach, so why is it justified to publish that separately?
It is of critical importance to address the significant overlap between this work and the cited companion paper of Fang & Michalski (2020). The "companion paper" (hereafter referred to as FM2020) specifically concludes that the model is biased due to the lack of fractionation associated with chemistry. Here the authors focus on the role of transport and mixing in changing the distribution of the N isotopic composition of NOx, and conclude that the model is biased due to the lack of fractionation associated with chemistry (i.e. this is the same conclusion as FM2020). Additionally, there is significant overlap and direct plagiarism of much of the text of FM2020 in this manuscript (much of the introduction and Figure 1 are verbatim from the other manuscript), which must be modified if this manuscript is to stand alone. Why are these works separated? I think it would make much more sense to combine the two manuscripts. Much of the work under review here is repetitive with FM2020, and the real power of the results in this manuscript is limited in its current form. While I value the need to focus on detailing the model specifics in a journal such as GMD, the manuscript lacks any real interpretation, quantification of the sensitivity of the output to the model parameters, or consideration of the implications of the predicted values compared to previous studies in the literature that interpret the isotopic composition of NOx and nitrate. From my point of view, the only purpose of a "no transport" case study should be to 1) quantify the impact of assuming that everything is locally derived, 2) show how "wrong" that assumption is, and 3) reflect on the implications of this for previous interpretations (such as Elliott et al. (ES&T, 2007), who assume a direct 1:1 relationship of d15N-NOx to d15N-NO3-, or for considering the utility of isoscapes such as that presented in Walters et al. (ES&T, 2015)).
Second, the model is compared with one set of observations of "d15N-NOx" in Indiana (within the domain of the model runs). The domain of the model is never justified in the manuscript (i.e. why focus on the Midwest?). The 2-day transport lifetime against "loss" or conversion is fair (based on Levy et al., 1997, which should be cited), but this should be brought up early in the paper to justify the domain of the study. Additionally, the measurements used for comparison are specifically d15N-NO2 and not d15N-NOx. This is important because of the equilibrium fractionation that can occur between NO and NO2 in the atmosphere. While fractionation associated with chemistry will apparently be discussed in a future manuscript, the authors must be precise in their comparison since it is used to validate the model (here and in FM2020). Walters et al. (2018) calculate that the fractionation between NO and NO2 under the conditions where NO2 was captured is likely small, but it is not small enough to make no difference in the comparisons being made here. If the authors feel it is a small enough difference, then this should still be stated, and it should be made clear that a direct comparison between d15N-NOx and d15N-NO2 is acceptable in this case. This needs to be considered and discussed in the manuscript to make the comparison between the model and observations robust.
Third, the sensitivity to the starting emissions values should be evaluated. Table 2 shows that there are very large changes in d15N-NOx when just assuming different emission rates (i.e. NEI 2002 versus 2016). The emission "source" value measurements, used as input for the model, report large ranges and have analytical error associated with them. It appears median values are used (though this is not specifically stated in this manuscript). Shouldn't a range of values (or at least a high and a low value) be tested to establish the sensitivity to changing these source values? Shouldn't the error associated with the observations be considered in terms of what types of changes in d15N-NOx can actually be detected and therefore be considered important? The use of specific values should be better justified in the manuscript. The on-road values come from Walters et al., 2015 and are calculated based upon tailpipe measurements for a handful of specific vehicle types that yield a huge range. Yet observations of on-road vehicles are available for the Midwest and northeastern US and, if anything, much better constrain this emission source value, to -4.7 +/-1.7 in Miller et al., 2017 (which is cited in the manuscript but never seems to be used; note too that Miller et al., 2017 contains measurements of roadside d15N-NOx that could be compared with the model in future studies that expand the domain to include the northeastern US). Similarly, Yu and Elliott, 2017 and Miller et al., 2018 are cited in the introduction (in a sentence that is exactly the same in FM2020), but the values from these papers are not being used in Table 2.
Fourth, one of the key conclusions of this work is that changes in the planetary boundary layer (PBL) are critical to transport and dispersion of NOx, such that the pattern of d15N-NOx changes substantially with the PBL height. I'm not convinced the results shown support this conclusion.
Figure 6 shows the difference in d15N when using the same emissions (NEI) but different meteorology; in other words, how much difference do changes in meteorology make when holding emissions "constant"? Figure 7 shows the change in the PBL for these same runs. The pattern of important changes in the PBL does not at all match the pattern of important changes in d15N, so how can the change in the PBL be the major driver? In fact, the changes in d15N between the two meteorological conditions are really between -1.25 and +1.25 per mil except in the westernmost edge of the domain during summer, and on the whole the best represented value of change geographically is actually zero. So what are we really learning? And how is the PBL change important if its greatest change is in the spring and summer in the center and eastern domain, where the change in d15N is essentially negligible? Then in Figure 12, it ends up seeming that because the shape of PBL change looks like the shape of change in d15N-NOx simulated by CMAQ, we can conclude that it is the driver. Why not look at PBL changes within the time frame of the only observations used for comparison? There is a lot of variability in the d15N-NO2 data shown in Figure 11, and some significant day/night differences. If the PBL changes followed those changes in d15N, then following that up with Figure 12 would be logical and more convincing of the potential for the PBL to be the driver of the d15N changes.
Fifth, "the role of deposition" section and the comparison of d15N-NOx with d15N-NO3- seem out of place in this work. It is clearly shown in the literature, and in much of the author Michalski's own work, that fractionations associated with NOx cycling and the formation of NO3- are important for setting the d15N-NO3-. The only reason to include this section in the manuscript would be to show the distribution of d15N-NOx if one were to assume d15N-NOx = d15N-NO3- and reflect on the studies that have done this (seemingly incorrectly). The comparison of d15N-NOx and NO3- does reveal that some of the seasonality of the d15N-NO3- is dictated by the sources of NOx, but the values for d15N-NO3- are clearly largely offset from the d15N-NOx because of chemistry and partitioning between the gas and particle phases of NO3-. This would be better addressed, on the whole, as part of the next manuscript, where runs with and without chemistry could show the change in d15N-NO3- rather than comparing with d15N-NOx. Additionally, lifetime matters more in the case of NO3-, which is often stated to have an average lifetime of 3-5 days (see for instance Xu and Penner, ACP, 2012), so transport from outside the domain would need to be considered to make a comparison with d15N-NO3- observations fruitful. Finally, it needs to be addressed why in this work there are only 8 NADP sites being compared with, while it appears that 82 measurement sites are included in FM2020.
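To illustrate the kind of sensitivity test requested in the third comment, here is a minimal sketch. It assumes that the d15N of emitted NOx in a grid cell can be approximated as a flux-weighted mean of the source values, which is my simplification and not necessarily how the authors' model computes it. The emission shares and the power plant and soil values are hypothetical placeholders; only the two on-road values are taken from this review (-2.7 per mil as used in the manuscript, -4.7 per mil from Miller et al., 2017).

```python
def mixed_d15n(fluxes, d15n):
    """Flux-weighted mean d15N (per mil) of NOx emitted by several sources.

    Linear mixing in delta space is an approximation, adequate for a
    first-order sensitivity check.
    """
    total = sum(fluxes.values())
    return sum(fluxes[s] * d15n[s] for s in fluxes) / total


# Hypothetical NOx emission shares for one grid cell (arbitrary units).
fluxes = {"on_road": 50.0, "power_plant": 30.0, "soil": 20.0}

# Source d15N values in per mil. The on-road entries are the two values
# quoted in this review; the other two are placeholders, not literature values.
base = {"on_road": -2.7, "power_plant": 15.0, "soil": -30.0}
alt = dict(base, on_road=-4.7)

# Shift in the emitted d15N-NOx caused by changing only the on-road value.
shift = mixed_d15n(fluxes, alt) - mixed_d15n(fluxes, base)
print(f"{shift:+.2f} per mil")  # -1.00 per mil (50% share times a 2 per mil change)
```

With these illustrative shares, swapping the on-road value alone moves the emitted d15N-NOx by about 1 per mil, which is the same magnitude as the meteorology-driven differences discussed under Figure 6. Repeating such a test with the authors' actual emission shares, and with low and high bounds for every source, would show which model differences are large enough to matter.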

Additional detailed comments:
There are a number of language errors that should be addressed to improve the clarity of the paper; for instance, "disperse" throughout the manuscript should be "dispersion", "medium" should be "median", and "closed" should be "close". A number of symbols (such as HNO3 and HNO4 in the introduction) do not print out correctly for some reason.
In general, for the figures (in the manuscript and in the supplement), the captions need to be improved. Most captions are extremely similar, with perhaps a word or two different, making the figures difficult to compare and interpret. I would suggest specifying the comparison being made (for instance, the caption of Figure 6 could specify that this is a comparison of the impact of meteorology using the same NEI 2002 emissions, so the reader knows how to interpret the figure). In addition, the authors should consider adding average values over the domain to each figure. The text often specifies the average value over the domain for a specific season, but this would be much easier to compare across the seasons if the average or median value were placed in the upper right corner of each figure. For the map domain figures the y-axis is never defined and should be removed or defined. Additional specific comments on some of the figures are included below. Lastly, consider the use of significant figures throughout the manuscript: the observations are typically reported to one decimal place, so why report model results to 3 decimal places?
Title: Is it necessary to have the CMAQ, SMOKE and WRF versions as part of the title? The title is a bit unwieldy; these versions could be specified in the abstract and manuscript; "plays" should be "play" and I would argue that atmospheric "processes" are not really being tested here, it's really transport or meteorology.
-p4, line 29, remove "of" before meteorological
-p4-5, line 1, "(nitrate)" maybe should be "(e.g., nitrate)" since NOx can also be removed as other products. Also, this removal as nitrate comes up without much discussion of the chemistry that converts NOx to NO3-, or appropriate references within the Introduction to characterize this well, so that the deposition simulations can be followed by a more novice reader.
-p5, line 1, remove "of" before chemistry
-p5, lines 5-8: throughout here "meteorology" should be "meteorological"; "value" should be "values"; "condition impacts" should be "conditions impact"
-p5, line 17, add degree symbols to the latitudes and longitudes
-p6, lines 26-29, I think it would be more understandable to a more general audience if 0.0036 was clarified as the natural abundance of 15N. This is not technically the ratio of 15N/14N of air N2 as stated. It would also be clearer if it was stated that d15N-N2 is, by definition, 0 per mil. (The statement here becomes confusing, especially on p11 when the "15NOx concentration equals 0.0036 of…14NOx".)
-p7, line 10, area is not stationary sources; the definition in Eq 5 includes mobile sources.
-p8, line 23, "were selected based on [the] same timeframe as selected NOy d15N measurements." The justification for the years selected should probably be presented sooner. There should be justification for the domain being focused on the Midwest. The NOx measurements compared to the model are d15N-NO2, not d15N-NOy, and I'm not sure it is relevant to compare with d15N-NO3- for this study.
-p11, line 28, Canada, in reality, is not an "emission-free zone" and transport from this region is certainly expected within the 2-day time frame. This is another example where the sensitivity to this assumption should be stated, or where it should be made clear how it affects the results and conclusions.
Results and Discussion:
-p14, lines 9-10, here it is stated that on-road vehicles have a d15N = -2.7 +/-0.8 per mil, yet Miller et al. 2017 report consistent observations from the Midwest and northeast of -4.7 per mil on average.
-this section (3.1) of the results would be improved if it were discussed in the context of how isoscapes of NOx or NO3-, purely based on the assumption that local emissions = a local signal, are not correct, and that transport and mixing play a key role in setting the distribution predicted by the model
-this section (3.1) also does not discuss the potential for NO2 to be deposited quickly near roads. This is something that is hypothesized in Elliott et al. (2007) and is part of why they draw the conclusion that stationary NOx sources in the Midwest explain the d15N-NO3- signals on the east coast (versus vehicles on the east coast, which are the dominant local source). Also, consider that this is suggested by the study of Redling et al. (Biogeochemistry, 2013).
-p15, line 8-9, given that this is a model, it should be better explained why the grids located in the suburbs of major cities are dominated by on-road vehicle emissions. Is this because of transport from the cities? Or is this because urban sources decrease and vehicles become the major local source in the suburbs?
-Figure 3: the scale doesn't appear to cover the entire range, i.e. aren't there predicted values as low as -34.3 per mil and as high as -20 per mil? Also, it should be made clear what the box in (b) represents. Same question about the scales in Figure 4: do they cover the entire predicted range?
-p17, the interpretation here would be improved if it reflected on the implications for conclusions drawn in the published literature, such as Elliott et al. (2007), isoscapes and similar studies that do not consider realistic transport of NOx and changes in the d15N-NOx that are clearly reflected in this model study.
-p18, lines 13-14. Please provide more context here. What measurements were made in China? How do they correspond to what is being stated here? How do these measurements indicate/agree with the evolution being seen in the model?
-p21, Figure 7, given the argument above that the PBL changes themselves do not appear directly linked to important changes in d15N-NOx, this figure does not seem useful.
-p23, lines 1-3. The statement here about emission control technologies 1) should be referenced and 2) is incorrect. Felix et al. demonstrate that emission control technologies lead to HIGHER values of d15N from smokestacks, not LOWER. This needs to be addressed.
-the most dramatic changes in d15N seem to be associated with changes in emissions. Could the authors detail what significant changes there are in emissions between 2002 and 2016 (and why), and then more directly discuss the impact on the d15N compared to quantifying the impact of meteorology alone?
-p26, line 5-7, the latter half of the sentence starting with "The predicted d15N of atmospheric NOx over the nested domain…" is confusing; please rephrase or change to two sentences.
-p26, line 14, "closed" should be "close"
-p26, what are we actually learning from the nested-grid simulation? This section discusses what changes in the predicted values between the nested domain and the full domain, but does not really discuss how/why this matters. Figure 10 is pretty unimpressive given that the changes are less than 1 per mil, with the most change in the range of >0.5 per mil such that this would never actually be detected in observations. This figure does not seem necessary for the main text, and would again be more interesting in the context of quantifying the changes in d15N that occur with changes in emissions, changes in meteorology and changes in domain.
-Figure 11 should cite the reference the observations are from; see also the comments above regarding the observations being d15N-NO2, not d15N-NOx
-Figure 12, see the more significant comments above. The link with PBL height is not very convincing over the whole domain based on the comparison of the earlier figures, and Figure 12 cannot really be considered in the context of the observations in Figure 11 since the observations were only taken during summer. So why not plot the seasonality in the d15N and PBL from several locations in the domain to make the link with the PBL more concrete? There's also no discussion of how the measurements were originally interpreted and how this review using the model changes or enhances the previous interpretation. For the figure caption, add "Left axis" after the square and circle in the parentheses to be consistent with the "(x, right axis)". [Also, are there not any other observations within the domain to compare with? Elliott's group at Univ of Pittsburgh has published at least passive sampler d15N-NO2 in Redling et al. and in Felix and Elliott 2014 in the Midwest. Why are these not considered for comparison?]
-p30, line 9-16 do not seem necessary given that all of this information is in the table.
-p30, line 18-19, "In general, the CMAQ simulations of d15N(NOx) under most of the scenarios conducted in this study, except the simulation based on NEI-2016 and 2016 meteorology, perform better than the SMOKE simulation of d15N(NOx), which only take the variability of NOx emission source into account (Table 2)"… how is "perform better than" defined? By matching the median of the West Lafayette observations, the entire range, or something else? This is confusing because the 2016 simulation compares best with the observations, so the "except" here seems out of place. It also seems somewhat obvious that the 2016 simulation with real meteorology, compared to the 2016 observations, would yield the best result. Maybe reporting the SMOKE results in Figure 13 would help with this too (or better yet, combine with the SMOKE results for one manuscript). It is also important to consider again whether there are any other observations within the model domain that could be compared with.
-p31, Figure 13, change "dots" to "open circles" in the caption; do the x's reflect the mean? Include this in the caption.
-p32, line 5, why the ~ signs? Try to make clear in the text here when observations are being discussed and when model output/predictions/simulated values are being discussed.
-p32, Figure 14, what is the number of measurements being represented here? Can an "n=" be included? Why the choice of only a few sites for comparison here? The seasonality based on the 8 locations is much more muted than the seasonality presented across 82 sites in FM2020. See above in the major comments with regard to issues with the comparison of d15N-NOx and d15N-NO3-, the lifetime/transport distance that might be associated with NO3- compared to NOx, and whether the comparison here is justified. Note too that Aug, Sep and Oct are particularly different between the model and observed wet deposition. What is the cause of this?
-p32-33, "The difference between CMAQ simulated δ15N values of NOx and measured δ15N values of NO3-is caused by the following two factors: a). the mixture of isotopically lighter NOx from the surrounding area discussed in section 3.2, and b). the net N isotope effect during the conversion of NOx to NO3-, which will be addressed in future work." These conclusions do not seem supported. The mixing in of lower d15N-NOx from the surrounding area should be reflected in the real d15N-NO3- and in the d15N-NOx, i.e. the simulations with transport from the time period when the observations were taken should be more realistic and better reflect what is observed. The second point is the same conclusion drawn in FM2020, i.e. we need to assess the impact of chemistry because this is important. It makes all of the simulations here difficult to interpret, and it is unclear whether they are worthwhile to interpret without the context of whether NOx and NO3- can be realistically simulated without chemistry effects.
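The point about p6, lines 26-29 (isotope ratio versus natural abundance, and d15N of air N2 being 0 per mil by definition) can be made concrete with a short sketch. It uses the standard delta definition, d15N = (R_sample / R_air - 1) x 1000, and assumes the commonly cited 15N/14N ratio of air N2 of 0.0036765; the function names are mine and are not taken from the manuscript.

```python
# 15N/14N of atmospheric N2; commonly cited reference value, assumed here.
R_AIR = 0.0036765


def delta_to_ratio(delta_permil, r_std=R_AIR):
    """Convert d15N (per mil, relative to air N2) to a 15N/14N ratio."""
    return r_std * (1.0 + delta_permil / 1000.0)


def ratio_to_delta(ratio, r_std=R_AIR):
    """Convert a 15N/14N ratio to d15N (per mil, relative to air N2)."""
    return (ratio / r_std - 1.0) * 1000.0


def atom_fraction(ratio):
    """15N / (14N + 15N), i.e. the abundance, for a given 15N/14N ratio."""
    return ratio / (1.0 + ratio)


print(ratio_to_delta(R_AIR))            # 0.0: air N2 is the zero point of the scale
print(round(atom_fraction(R_AIR), 6))   # 0.003663: abundance, not the ratio
print(delta_to_ratio(-4.7))             # ratio for an on-road value of -4.7 per mil
```

The ratio and the abundance differ only in the fifth decimal place, but stating which one is meant, and that the reference has d15N = 0 per mil by definition, would remove the ambiguity in how the 15NOx emissions are derived from 14NOx.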

Conclusions:
-"The simulation indicates that the PBL height is the key driver for the mixture of anthropogenic and natural NOx emission, which deepens the gap between d15N of atmospheric NOx and NOx emission." -> is this really proven? There is not enough evidence presented to really draw this conclusion. And it's not really the gap between the d15N-NOx and NOx emission; it's whether or not the d15N-NOx in the atmosphere is determined by local sources alone versus a more realistic treatment including transport.
-"The performance of CMAQ simulated d15N(NOx) is better than SMOKE d15N(NOx)…" -> based on what metric? And it might make more sense to make clear here that simulations that include transport (i.e. CMAQ) reveal …., while those that are based on local emission sources alone (i.e. SMOKE) reveal…