the Creative Commons Attribution 4.0 License.
Simulation Model of Reactive Nitrogen Species in an Urban Atmosphere using a Deep Neural Network: RND v1.0
Junsu Gil
Jeonghwan Kim
Gangwoong Lee
Joonyoung Ahn
Abstract. Nitrous acid (HONO), one of the reactive nitrogen oxides (NOy), plays an important role in the formation of ozone (O3) and fine aerosols (PM2.5) in the urban atmosphere. In this study, a simulation model of Reactive Nitrogen species using a Deep neural network (RND) was constructed to calculate HONO mixing ratios through a deep learning technique using measured variables. A Python-based Deep Neural Network (DNN) was trained, validated, and tested with HONO measurement data obtained in Seoul during the warm months from 2016 to 2019. A k-fold cross validation and test results confirmed the performance of RND v1.0 with an Index Of Agreement (IOA) of 0.79 ~ 0.89 and a Mean Absolute Error (MAE) of 0.21 ~ 0.31 ppbv. RND v1.0 adequately represents the main characteristics of HONO and thus is proposed as a supplementary model for calculating the HONO mixing ratio in a high-NOx environment.
Status: final response (author comments only)
-
RC1: 'Comment on gmd-2021-347', Anonymous Referee #1, 13 Feb 2022
Review of gmd-2021-347
Simulation Model of Reactive Nitrogen Species in an Urban Atmosphere using a Deep Neural Network: RND v1.0
by Junsu Gil et al.
General comments:
This manuscript describes a new application of a simple feed-forward neural network model to calculate HONO mixing ratios based on a set of other measured variables. While this is an interesting and worthwhile application, the paper lacks the necessary details in the description of the deep learning model and contains no ablation studies, which are needed to lend credibility to the results. I also question the validity of the cross validation and test cases that are discussed, because I doubt that these test cases are truly independent data samples. There is no proof of the generalisation capability of the model, so it may well be that this model fails completely if it were applied to measurement data obtained under different conditions.
In summary, this manuscript falls between "major revisions" and "reject". In computer science conferences it would be ranked "weak reject", which means the paper could be saved if the authors invest substantial work in rerunning their model several times and improving the text.
Specific comments:
Abstract: Confusing sentence after "In this study,". After reading 3 times I understood that you are resolving the acronym RND here, but this is well hidden. Suggestion: "In this study, a new simulation approach to calculate HONO mixing ratios using a deep learning technique based on measured variables was developed. The 'Reactive Nitrogen species Deep neural network' (RND) has been implemented in Python. It was trained, ..."
Abstract: Why should RND be called a *supplementary* model? What does it supplement?
l.35: too vague "observational constraints on individual species". Does this refer to NOy compounds or any species involved in the tropospheric ozone production cycle?
l.40: NOy has been the focus of attention already in the 1990s. See for example papers by Sandy Sillman et al. You may say "renewed attention".
l.43: to the uninitiated reader it might not be clear what heterogeneous reactions have to do with NOy and ozone chemistry. This would merit one or a few more general sentence(s) to describe NOy chemistry. If this text will get a little longer, please consider summarizing the HONO/NOy chemistry in a supplement and refer to it. Nevertheless, one or two sentences will be needed here.
l.44: you could add https://doi.org/10.5194/acp-18-3147-2018 to the list of references here.
l.52/53: it would be useful to know if there is general agreement among these different measurement methods or if they haven't reached a satisfactory level of consistency yet. In the following sentence, please provide some order of magnitude numbers of observed versus simulated HONO levels (or a value range).
l.57: the recent adoption of machine learning techniques in atmospheric sciences is more general than "multi layer artificial neural network". In this context, it suffices to say that "machine learning" has been adopted. Then, in a following sentence you can narrow this down to the employment of deep (artificial) neural networks, which have a capability to learn more complex non-linear relations in data, but also require larger amounts of data for training. The selection of references appears a bit arbitrary. For example, there is a whole special issue in Philosophical Transactions A () on machine learning for weather and climate. Indeed, you may want to first provide two or three general references for ML in atmospheric science (with cf.), then write a sentence which refers specifically to atmospheric chemistry/atmospheric composition and provide some more references there.
l.59-62: the description why deep learning might be useful for the analysis of atmospheric chemical measurements remains vague and superficial. You should state explicitly that neural networks learn relations in data (similar to function fitting) and you should state in what way NNs may improve on numerical simulations (I guess you refer to the fact that they are inherently bias-free?).
l.62/63: introduction of the model acronym: difficult to disentangle the sentence - see comment on abstract above.
l.67: as this is supposed to be a manuscript for the special issue on "machine learning methods and benchmark datasets", you should add a statement here that the code and training data can be downloaded from ... (you can of course also refer to the code and data availability section here). Re-usability of your model is a key aspect for this special issue (and for GMD in general).
l.70: the steps which are described don't guide the development of RND, but describe the typical machine learning workflow.
l.77: similar issue - this reads as if every user of RND will first have to perform measurements for her/himself. Please separate the dataset preparation from the model development. The model should be generalizable, i.e. be independent of the specific set of measurements which you describe in the paper.
l.105: "wind direction should be converted..." - please describe what you did, not what should be done.
l.106: "missing values" same as above. Did you filter or interpolate?
l.107: what is an "array of measurement data"? Also, what is missing is a description of the time resolution of the measurements and how many independent samples were prepared for the machine learning. How was the train-test-val split done? Have you checked the frequency distributions of the (normalized) variables? Have you considered log transform for non Gaussian variables? How many time steps are included in each sample?
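For illustration, the preprocessing steps this comment asks about (circular encoding of wind direction, normalization, and a leakage-free train/validation split) could be sketched as follows. All variable names and the synthetic data are placeholders; only the sample counts (1636 total, 1122 for training) are taken from the discussion.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1636                               # total sample count quoted in the reviews
wind_dir_deg = rng.uniform(0, 360, n)  # stand-in wind direction in degrees
other_vars = rng.normal(size=(n, 8))   # stand-ins for the other 8 input variables

# Circular encoding removes the artificial 359 deg -> 0 deg discontinuity.
wd_sin = np.sin(np.deg2rad(wind_dir_deg))
wd_cos = np.cos(np.deg2rad(wind_dir_deg))
X = np.column_stack([wd_sin, wd_cos, other_vars])

# Z-score normalization using training-set statistics only, so that no
# information leaks from the validation samples into the training set.
n_train = 1122                         # train/validation split quoted by Referee #2
mu, sigma = X[:n_train].mean(axis=0), X[:n_train].std(axis=0)
X_norm = (X - mu) / sigma

X_train, X_val = X_norm[:n_train], X_norm[n_train:]
```

A chronological (rather than random) split like this is what would keep a held-out month truly independent.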
Section 2.3: there is a lot of information missing from the network description: how many nodes per layer? What is the learning rate? How many epochs were trained? Did the learning rate change during training? Did you try out different numbers of layers and nodes per layer to determine the optimum model? Did you perform a hyperparameter search? Also, what exactly is the input data and what exactly is the target output? Loss function... Those things are standard in the machine learning literature and should be adhered to. I see some of this information appears in the figures and the following section (varying the number of nodes), but this belongs in the model description text.
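To show the kind of explicit description asked for here, a minimal feed-forward regressor can be written in a few lines of numpy, with layer sizes, learning rate, epoch count, and loss function all stated. Everything below (one hidden layer of 32 tanh nodes, MSE loss, a toy target) is an illustrative assumption, not the authors' actual configuration.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(512, 9))                       # 9 input features, as in the paper
y = np.tanh(X[:, :3].sum(axis=1, keepdims=True))    # toy nonlinear target

# One hidden layer of 32 tanh nodes; MSE loss; plain full-batch gradient descent.
W1 = rng.normal(scale=0.1, size=(9, 32)); b1 = np.zeros(32)
W2 = rng.normal(scale=0.1, size=(32, 1)); b2 = np.zeros(1)
lr, epochs = 0.1, 300

def forward(X):
    h = np.tanh(X @ W1 + b1)
    return h, h @ W2 + b2

_, pred0 = forward(X)
mse_init = float(((pred0 - y) ** 2).mean())

for _ in range(epochs):
    h, pred = forward(X)
    err = pred - y                                  # dMSE/dpred up to a factor of 2
    n = len(X)
    gW2 = h.T @ err / n; gb2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1.0 - h ** 2)              # backprop through tanh
    gW1 = X.T @ dh / n; gb1 = dh.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

_, pred1 = forward(X)
mse_final = float(((pred1 - y) ** 2).mean())
```

Stating each of these hyperparameters in the model description (and how they were chosen) is what the machine learning literature expects.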
l.136: if June 2018 has been used in the training already, then this month is not an independent test dataset any more.
l.154: does this mean that you always used the same number of nodes in each layer? And you did not try to reduce the number of layers? 1600 samples appears rather small for a network with 5 layers.
l.160: I don't understand this. First you train the network for 2016 to 2019, then you run it again to obtain HONO results? You already have them from the training(?).
l.167: I doubt that the inability of the model to capture minima and maxima is due to the limited amount of data. This is a general aspect of regression models and extensively discussed in Kleinert et al. (2021): https://doi.org/10.5194/gmd-14-1-2021
l.205 and following: this discussion of atmospheric chemistry doesn't belong in a section describing the application of the model. Is this supposed to be a general discussion section, comparing RND to other (CTM) models?
l.235: Finally, here is a list of the input variables. But it has not been discussed which variable has which influence on the results. I have a suspicion that the network really makes use of only 3 or 4 of the 9 variables it is given. See Kleinert et al. (2021) for a way how this can be tested with bootstrapping.
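The permutation-style test referred to here can be illustrated in a few lines: shuffle one input feature at a time and measure how much the error of a fitted model grows; a large increase marks an influential variable. The "model" below is a plain linear fit on synthetic data, purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(400, 4))
# Only feature 0 actually drives the target in this synthetic example.
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=400)

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
base_mse = float(((X @ coef - y) ** 2).mean())

importance = []
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])    # break feature j's relation to y
    importance.append(float(((Xp @ coef - y) ** 2).mean()) - base_mse)
```

Features whose permutation barely changes the error are candidates the network may effectively be ignoring.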
l.250/251: the ML model doesn't gain any physical understanding of the HONO chemistry, so it cannot be used to test the existing knowledge. You could use such a tool to forecast HONO levels, for example to determine if it might be worthwhile conducting HONO measurements at a specific location or during a specific time period. You may also be able to use the tool in the context of quality controlling the measurements: any strong disagreement would raise a warning that measurements should be checked with extra care.
Also, you can of course use it to estimate HONO concentrations when these were not measured in order to then perform 0D model runs, as you show in Figure 8.
And in this light, I would agree with the statement that RND is a "supplementary tool".
l.262: please provide an explicit URL here (you can still add the reference)
Technical corrections:
l.55: related to the comment on l.43: you presume that the reader is familiar with the basics of HONO chemistry, but this cannot be taken for granted.
l.30 play instead of plays
l.34 and *it* determines...
Citation: https://doi.org/10.5194/gmd-2021-347-RC1 -
AC1: 'Reply on RC1', Junsu Gil, 02 Apr 2022
The comment was uploaded in the form of a supplement: https://gmd.copernicus.org/preprints/gmd-2021-347/gmd-2021-347-AC1-supplement.pdf
-
RC2: 'Comment on gmd-2021-347', Anonymous Referee #2, 14 Feb 2022
In this paper, a deep neural network based model is used to calculate nitrous acid (HONO) mixing ratios, based on an analysis using HONO measurement data from Seoul between 2016 and 2019. Since I am not an expert in atmospheric sciences, but in data and computer science, I will in my review focus on the computational method used and its validity given the size and type of the data.
The paper is generally well written and makes an effort to document the use of the suggested model. The citation for code availability is missing a DOI (and one has to go over to Zenodo to locate the code).
The approach taken is motivated by the success of deep learning based methods in various areas. However, here (as often elsewhere) it is not taken into account that deep learning is most useful in situations in which there are massive amounts of training data — which is not the case here. There are nine input features and there are 1636 data items (1122 for training and 514 for validation). Hence, the data is not really massive, and because the amount of interactions is limited (only nine input variables), it is quite likely that more traditional machine learning methods would work well (e.g., ordinary linear regression could be used to provide a baseline (and could even suffice); then one could see how, e.g., a support vector machine or random forest would work). In the paper, the use of deep neural networks is argued for by them being more useful than traditional models, because they are able to handle large amounts of data. For the data used, there is no reason to assume that it could not also be handled using some of the traditional methods; in particular, when the data is small, more complicated models are quite prone to overfitting.
Suggestion for improvement 1: Test different ML models to be able to properly evaluate the usability of the suggested model.
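For illustration, the ordinary-least-squares baseline suggested above takes only a few lines. The data here is synthetic, with sizes mirroring those quoted in the review (9 features, 1636 samples); on the real measurements one would compare the resulting error against the DNN's.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(1636, 9))            # stand-in for the 9 measured inputs
true_w = rng.normal(size=9)
y = X @ true_w + rng.normal(scale=0.2, size=1636)   # synthetic linear target

X1 = np.column_stack([X, np.ones(len(X))])  # add intercept column
w, *_ = np.linalg.lstsq(X1, y, rcond=None)  # ordinary least squares fit

pred = X1 @ w
r2 = 1 - ((y - pred) ** 2).sum() / ((y - y.mean()) ** 2).sum()
```

If such a baseline already reaches a comparable IOA/MAE, the added complexity of a five-layer network would be hard to justify.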
My second concern is the feature selection, or the lack of it. The model blindly uses the nine input variables from the data. This kind of "taking an ML model off-the-shelf" very rarely produces the best possible results and can seriously affect the performance of the model. In addition to feature selection, it might also be possible to compute some surrogate features that, e.g., provide information about dependencies in the modelling domain, reducing the need for the ML models to explicitly model these dependencies.
Suggestion for improvement 2: Use feature selection (for all the models) to search for the best possible set of input features.
Finally, the testing of the model using data from April 2019 shows some of the limitations of the developed model. It seems that there is an occurrence of concept drift (when the distribution of the data changes, the model does not work well anymore). Also, the error might increase due to overfitting of the model. This aspect should be studied further; in particular, it would be important to be able to provide the region in which the model's accuracy is at an acceptable level. There is a rich body of literature on detecting concept drift (for a survey, see, e.g., Zliobaite I., Pechenizkiy M., Gama J. (2016) An Overview of Concept Drift Applications. In: Japkowicz N., Stefanowski J. (eds) Big Data Analysis: New Algorithms for a New Society. Studies in Big Data, vol 16. Springer, Cham. https://doi.org/10.1007/978-3-319-26989-4_4).
Suggestion for improvement 3: Analyse the region in which the proposed model can be expected to work; at least provide some discussion on the effect of overfitting and concept drift and how these affect the usability of the model.
Based on these observations, I would reject the paper in its current form, with the encouragement to resubmit, taking the suggestions for improvement into account.
Citation: https://doi.org/10.5194/gmd-2021-347-RC2 -
AC2: 'Reply on RC2', Junsu Gil, 02 Apr 2022
The comment was uploaded in the form of a supplement: https://gmd.copernicus.org/preprints/gmd-2021-347/gmd-2021-347-AC2-supplement.pdf