Downscaling Atmospheric Chemistry Simulations with Physically Consistent Deep Learning
- 1Pacific Northwest National Laboratory, Richland, WA, USA
- 2University of Southern California, Los Angeles, CA, USA
- 3ClimateAi, Inc. San Francisco, CA, USA
Abstract. Recent advances in deep convolutional neural network (CNN) based super resolution can be used to downscale atmospheric chemistry simulations with substantially higher accuracy than conventional downscaling methods. This work both demonstrates the downscaling capabilities of modern CNN-based single image super resolution and video super resolution schemes and develops modifications to these schemes to ensure they are appropriate for use with physical science data. The CNN-based video super resolution schemes in particular incur only 39 % to 54 % of the grid-cell-level error of interpolation schemes and generate outputs with extremely realistic small-scale variability based on multiple perceptual quality metrics while performing a large (8 x 10) increase in resolution in the spatial dimensions. Methods are introduced to strictly enforce physical conservation laws within CNNs, perform large and asymmetric resolution changes between common model grid resolutions, account for non-uniform grid cell areas, super resolve log-normally distributed datasets, and leverage additional inputs such as high-resolution climatologies and model state variables. High-resolution chemistry simulations are critical for modeling regional air quality and for understanding future climate, and CNN-based downscaling has the potential to generate these high resolution simulations and ensembles at a fraction of the computational cost.
Andrew Geiss et al.
Status: open (until 24 Jun 2022)
-
CC1: 'Comment on gmd-2022-76', Patrick Obin Sturm, 24 Mar 2022
The authors in this work develop a neural network (NN) approach for super-resolution, allowing atmospheric chemistry simulations to be run at coarser grid resolutions for computational efficiency. The convolutional NN super resolution is in better agreement with fine resolution output than other interpolation methods. Particularly interesting is the incorporation of scientific fundamentals in the NN, by using a layer in the NN that enforces physical consistency. Specifically, a conservation layer in the NN ensures that the mixing ratio of chemical species in super-resolved regions agrees with the mixing ratio of the corresponding coarse grid cells.
Physically consistent machine learning (ML) tools in the field of atmospheric chemistry modeling are quite new, and this study seems to me to make a novel and potentially valuable contribution in a relatively underexplored area. That being said, I’d like to bring to the authors’ attention some prior related work that develops physically consistent machine learning tools for use in atmospheric chemistry modeling. This manuscript shares goals and has some conceptual overlap with recent work published and under review in GMD:
Sturm, P. O. and Wexler, A. S.: A mass- and energy-conserving framework for using machine learning to speed computations: a photochemistry example, Geosci. Model Dev., 13, 4435–4442, https://doi.org/10.5194/gmd-13-4435-2020, 2020.
Sturm, P. O. and Wexler, A. S.: Conservation laws in a neural network architecture: Enforcing the atom balance of a Julia-based photochemical model (v0.2.0), Geosci. Model Dev. Discuss. [preprint], https://doi.org/10.5194/gmd-2021-402, in review, 2021.
The neural network used in this work has some similarities to the physics-constrained NN architecture developed in Sturm and Wexler (2021) in that a layer at the end of a neural network enforces conservation laws. The approach in Sturm and Wexler (2021) has some similarities to the Beucler et al. (2021) approach already cited in line 86, but with a specific focus for conserving mass in atmospheric chemistry. This work is distinct in its application (super-resolution in a spatial domain) and for that reason strictly conserves mixing ratio in space.
The work in this manuscript additionally develops a way of re-dimensionalizing quantities before enforcing the conservation of mixing ratios. This allows the CNN to operate on normally distributed data, which has been shown to improve convergence when training neural networks. The re-dimensionalized data is then passed to a layer that enforces hard constraints (conservation of mixing ratio of chemical species). This is an interesting method that allows for normalization while still enforcing conservation in a dimensionalized space; this approach could also be used to improve upon the physics-constrained NN developed in Sturm and Wexler (2021) in future studies.
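Concretely, such a constraint layer can be sketched as a post-processing step that rescales each block of super-resolved cells so its mean matches the corresponding coarse cell. The following is a minimal NumPy illustration, assuming uniform grid-cell areas, strictly positive fields, and a single square upsampling factor (the function name is illustrative, not taken from the paper's code):

```python
import numpy as np

def enforce_conservation(coarse, fine, factor=8):
    """Rescale each high-res patch so its mean equals the coarse cell value.

    coarse : (H, W) coarse-resolution field (e.g. a mixing ratio)
    fine   : (H*factor, W*factor) super-resolved candidate from the CNN
    Assumes uniform grid-cell areas and a strictly positive field.
    """
    H, W = coarse.shape
    # Mean of each (factor x factor) patch of the fine field
    patches = fine.reshape(H, factor, W, factor)
    patch_means = patches.mean(axis=(1, 3))          # shape (H, W)
    # Multiplicative correction so each patch mean becomes the coarse value
    scale = coarse / np.maximum(patch_means, 1e-30)  # guard divide-by-zero
    return fine * np.repeat(np.repeat(scale, factor, 0), factor, 1)
```

A multiplicative correction preserves non-negativity and the within-patch spatial pattern; for non-uniform cell areas the patch mean would become an area-weighted mean.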
If the authors think these recent papers are relevant, a short discussion of them might fit after the sentence on lines 81-83 “Several studies have addressed this problem by adding terms to the neural network’s loss function that nudge it towards better agreement”, or perhaps after the motivation for NN tools that contain “internal representations of known physical properties of the underlying system” in line 86. Sturm and Wexler (2020) contributed the first framework towards strict conservation laws in ML surrogate models of atmospheric chemistry. The physics-constrained NN in Sturm and Wexler (2021), despite not being a convolutional NN, is a type of NN architecture that contains an internal representation of the system (atom flux continuity/ a stoichiometric balance) which is applied to strictly conserve atoms in a neural network surrogate model of atmospheric chemistry.
-
AC3: 'Reply on CC1', Andrew Geiss, 06 May 2022
Thank you for your feedback. I was not previously familiar with these two papers, but they are definitely relevant, and I plan to add references to both in the final revision of the manuscript. I am glad to see more work on integrating physical constraints directly into ML architectures. As you say, the method developed in these two papers, our super-resolution method, and the method from Beucler et al. (2021) each differ in their exact approach, but are closely related in that they all enforce physical constraints within the neural network architecture itself.
-AG
-
CEC1: 'Comment on gmd-2022-76', Juan Antonio Añel, 25 Apr 2022
Dear authors,
After checking your manuscript, it has come to our attention that it does not comply with our Code and Data Policy.
https://www.geoscientific-model-development.net/policies/code_and_data_policy.html
You have archived your code on GitHub. However, GitHub is not a suitable repository. GitHub itself instructs authors to use other alternatives for long-term archival and publishing, such as Zenodo (GitHub provides a direct way to copy your project to a Zenodo repository). Therefore, please, publish your code in one of the appropriate repositories.
Accordingly, you must include in any revised version of your manuscript a modified 'Code and Data Availability' section, including the DOI of the repository. Also, there is no license listed in the repository of your code. If you do not include a license, the code remains your property and cannot be used by others, despite any statement about being free to use. Therefore, when uploading the model's code to the repository, you may want to choose a free software/open-source (FLOSS) license. We recommend the GPLv3. You only need to include the file 'https://www.gnu.org/licenses/gpl-3.0.txt' as LICENSE.txt with your code. You can also choose other options; for example, Zenodo provides the GPLv2, Apache License, MIT License, etc.
Please reply as soon as possible to this comment with the link to the new repository so that it is available for the peer-review process and the Discussions stage, as it should be.
Juan A. Añel
Geosci. Model Dev. Executive Editor
-
AC1: 'Reply on CEC1', Andrew Geiss, 29 Apr 2022
I have added a GNU GPL 3.0 license file to the GitHub repository and generated a Zenodo link and DOI for the project code: 10.5281/zenodo.6502896
-AG
-
AC2: 'Comment on gmd-2022-76', Andrew Geiss, 29 Apr 2022
I have fixed an incorrect label in the video supplement. Here is an updated youtube link for easier viewing (I recommend setting HD playback): https://youtu.be/QL_onStfd90
I also generated a permanent Zenodo link to the video supplement: https://doi.org/10.5281/zenodo.6506306
These will be added to a future version of the manuscript.
-AG
-
RC1: 'Comment on gmd-2022-76', Anonymous Referee #1, 12 May 2022
Review of “Downscaling atmospheric chemistry simulations with physically consistent deep learning”
Geiss et al. present an interesting use case of upsampling/downscaling CNNs for producing higher-resolution chemistry simulations from coarse process-model forecasts. I think the main interest, and perhaps the novelty, is in their adaptation of widely used CNNs to ensure that the high-resolution output is physically realistic and consistent. I am not aware of this being done before, so I think this is of interest to the community and therefore GMD.
Overall, I thought the message of the paper was clear and that it tells a compelling story as to how these upsampling methods can be useful. I particularly liked how the introduction contains all the relevant details, though I think some terminology could use more explanation (comments about this are later in the review). I have no major concerns about this manuscript, and my minor comments and thoughts are included below. Thank you for a nice and interesting paper!
Major comments
Section 1.1 and 3.1: I think the introduction of the technical language could be improved a bit. GMD will have readers who don’t understand what CNNs are or why we’d want to use one for image processing. I think perhaps a couple of plain text sentences about CNNs and why they’re useful would benefit this section. There’s also language used without introduction such as 3-layer, vanishing gradients, convolutional kernels, and deeper CNNs, that could be explained more. I don’t disagree at all with what you’ve written, I just think the audience (GMD) and the manuscript will benefit from a little more explanation.
Minor comments
L23: It’d be worth clarifying that you mean small ‘spatial’ lengthscales as small temporal lengthscales are equally important.
L92: It’s not immediately apparent to me why being able to train a CNN on log-normally distributed data is a result of your work. Wouldn’t a standard approach be to scale your data before training?
L160: What’s a spatial chip?
L174&177: There are numerous other loss functions that would account for large concentrations dominating the loss (negative log-likelihoods, normalised losses, etc.)
L201: A unit for this value (4x10^6) would be useful?
L203: Neither a ReLU nor an ELU will enforce non-negative outputs. I believe this sentence to be incorrect and it should be updated.
L271: What was the motivation for using MAE for evaluation? I personally would have thought MSE alongside some evaluation of fractional errors would be more informative.
L275: This could be a limitation in my understanding, but I thought SSIM values are between 0 and 1, not -1 and 1.
Table 1: What’s the intuition behind CO and its fairly similar performance (for LOG-SSIM, but not MAE) across the downscaling methods?
Figure 2 and others: Representing ship tracks only really shows that the CNN learns that these are stationary features, right? If we moved a ship track elsewhere, I’d imagine the CNN wouldn’t upscale it as well. Is this right?
Figure 6: About the ‘ringing artifacts’: should we be concerned that the upsampling methods (particularly the CNN) are producing output that contains these harmonic artifacts?
Sec5: What’s the additional computational cost of using VSR methods as opposed to SISR?
Typographical comments
L114 (and throughout): I don’t think chemicals should be in LaTex math mode. If using LaTex, try using the chemformula package and \ch{} command.
L159: Mention explicitly that SLP is sea level pressure?
L415: As far as I can tell, GCM hasn’t been introduced.
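For the L114 comment, the suggested fix might look like the following LaTeX sketch (package and command as named above; the species shown are illustrative):

```latex
% preamble
\usepackage{chemformula}

% body: upright chemical formulae instead of math mode, e.g.
... the \ch{O3} and \ch{NO2} concentration fields ...
```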
Other thoughts
L111: I really appreciated the forethought in scaling to a fairly standard (non-square model resolution)
Sec3.4: I think this is all very sensible and a nice solution to the conservation problem
I really appreciated the availability of the code and the video supplement. I thought these were useful additions.
Model code and software
Atmos. Chem. Downscaling CNN Andrew Geiss, Sam J. Silva, Joseph C. Hardin https://github.com/avgeiss/chem_downscaling
Video supplement
Ozone Super Resolution Andrew Geiss, Sam J. Silva, Joseph C. Hardin https://youtu.be/JPJX1k-5yew