the Creative Commons Attribution 4.0 License.
FastCTM (v1.0): Atmospheric chemical transport modelling with a principle-informed neural network for air quality simulations
Abstract. Chemical transport models (CTM) have wide and profound applications in air quality simulations and managements. However, its applications are often constrained by high computational burdens. In this study, we developed a neural network based CTM model (FastCTM) to efficiently simulate ten air pollutant composition variables, including major PM2.5 species of SO42-, NO3-, NH4+, organic matters and other inorganic components, coarse part of PM10, SO2, NO2, CO and O3. The FastCTM has a principle-informed structure by explicitly encoding atmospheric physical and chemical processes in a basic simulator. Specifically, in the simulator, five neural network modules are proposed to respectively represent five major atmospheric processes of primary emissions, transport, diffusion, chemical reactions and depositions. Given 1-hour initial condition data, the FastCTM is able to simulate future 24-hour concentrations of the ten air pollutants with corresponding meteorology fields and emissions as input. The FastCTM is trained with operational forecast data from a numerical CTM model named Community Multiscale Air Quality (CMAQ) in 2018–2022. The well-trained FastCTM is evaluated comparing to the long-term CMAQ forecast in an independent year 2023, and achieves high agreements with mean RMSE values of 9.1, 11.9, 4.4, 4.0, 48.9 and 10.9 μg/m3 and R2 values of 0.8, 0.81, 0.8, 0.83, 0.9 and 0.7 for PM2.5, PM10, SO2, NO2, CO, and O3. Besides, assessed against hourly site observations of six criteria pollutants, the RMSE values of FastCTM have small relative differences of 4.3 %, 4.2 %, -2.8 %, -1.7 %, -0.3 % and -3.2 % compared to that of CMAQ. The FastCTM model also exhibited reasonable responses of air quality to meteorological variables of air temperature, wind speed and planetary boundary layer height, as well as to input pollutant emissions. 
Furthermore, due to the principles-oriented structure, internal process analysis could be performed by FastCTM to quantify the specific contribution from each of the five processes for hourly air pollutant concentration changes. In a nutshell, FastCTM has multi-functional advantages in air pollutant concentration simulations, sensitivity analysis and internal process analysis with high computation efficiencies on GPU and accuracy.
Status: final response (author comments only)
RC1: 'Comment on gmd-2024-198', Anonymous Referee #1, 10 Feb 2025
The authors describe a new, neural network-based reduced-order model of atmospheric chemistry and transport (FastCTM) which has been trained using an extensive dataset of output from CMAQ. FastCTM uses a novel and interesting approach, building physics-informed networks for five separate operators. The authors show that FastCTM is able to reproduce the general patterns of concentrations calculated by CMAQ for an out-of-training-data year (2023), and that the sensitivities of FastCTM to key meteorological variables or nation-wide changes in emissions mostly follow expected patterns. If FastCTM can be shown to be reliable in policy-relevant contexts then it could be a very useful tool.
This approach to modelling is interesting, and this methodological advance has the potential to significantly accelerate air quality scenario analysis. A CTM which can respect key physical constraints (e.g. mass conservation) while also accurately reproducing the effect of different perturbations to emissions and meteorological fields would have great value. However, the manuscript as written does not quite live up to this promise. Along with some minor concerns, the key challenge is that the authors do not show evidence that this new model can fulfil the roles of a CTM and produce accurate results for one of the most common use cases (i.e. understanding the effects of different perturbations). I explain this concern in more detail below, and until this concern is addressed I do not believe the manuscript should be accepted for publication by GMD.
Major comments
The most significant concern relates to the validation/evaluation of the model. The authors appear to have trained the five physical operators based on several years of output from the CMAQ chemistry transport model. While I have some questions regarding the training process, I will take it as read for the moment that the training was done in such a way as to avoid overfitting. However, the verification of the model rests on its ability to predict, from the 2018-2022 data, the performance in 2023. This approach is inadequate for two reasons. First, the authors do not compare the performance of the model to simpler approaches with the same data such as generalized additive models, gradient boosting, or linear regression with land use (see e.g. Wong et al., 2021 and Cheng et al., 2021). Without such a comparison to evaluate how such models would have performed in predicting 2023, it is difficult to say what the magnitude of FastCTM’s advance is. This is exacerbated by the relatively shallow quantitative assessment in section 3.1.1. RMSE and R2 values are provided, but it is not clear how these were calculated; given that these are calculated as a function of time, are these calculated based on the difference in each of the 158,742 grid cells between CMAQ and FastCTM? A deeper analysis which investigates how model performance varies between (e.g.) rural and urban areas, coastal and inland areas, winter and summer, and so on would provide a much more robust test of the model’s ability to predict the effect of changing meteorology. This could be informed by (e.g.) taking the difference of FastCTM for 2023 against CMAQ for 2023, and comparing that to the difference between CMAQ for 2023 and the average of CMAQ from 2018 to 2022. This would at least demonstrate whether FastCTM provides more explanatory power for the mean atmospheric state than taking the average concentration from the previous five years.
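A minimal sketch of the climatological baseline comparison described in this comment is given here. It assumes gridded fields stored as arrays of shape (time, grid cell); the function names, array names and synthetic data are hypothetical illustrations and are not taken from the manuscript.

```python
import numpy as np

def rmse(pred, ref):
    """Root-mean-square error pooled over all times and grid cells."""
    return float(np.sqrt(np.mean((pred - ref) ** 2)))

def skill_vs_climatology(emulator_2023, cmaq_2023, cmaq_clim):
    """Skill score 1 - RMSE(emulator) / RMSE(climatology).

    Positive values mean the emulator is closer to CMAQ 2023 than the
    multi-year CMAQ mean is; zero means no added value over that mean.
    """
    return 1.0 - rmse(emulator_2023, cmaq_2023) / rmse(cmaq_clim, cmaq_2023)

# Synthetic stand-in data (assumption): 24 hours x 100 grid cells.
rng = np.random.default_rng(0)
cmaq_clim = rng.uniform(10, 50, size=(24, 100))            # 2018-2022 mean
cmaq_2023 = cmaq_clim + rng.normal(0, 5, size=cmaq_clim.shape)
emulator_2023 = cmaq_2023 + rng.normal(0, 2, size=cmaq_clim.shape)

score = skill_vs_climatology(emulator_2023, cmaq_2023, cmaq_clim)
print(f"skill relative to climatology: {score:.2f}")
```

The same score could be computed separately for subsets of cells or hours (urban versus rural, winter versus summer) by masking the arrays before calling the function.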
Second, and perhaps more importantly, the function of FastCTM is to reproduce the results of high-fidelity CTMs at a fraction of the computational cost – specifically to support air pollution simulations, sensitivity analysis, and internal process analysis (abstract lines 30-32). The comparison to 2023 only tells us that FastCTM can reproduce the general pattern of air pollution in 2023, but does not tell us whether FastCTM will accurately predict the effect of interventions. The sensitivity tests in section 3.2 have no basis for comparison, and are in any case so broad (representing nationwide changes in temperature, PBL height, or emissions) that they are a limited test of the CTM’s capabilities. At the very minimum, an evaluation is needed which shows that FastCTM’s trends actually match the underlying trends in WRF-CMAQ; this should be straightforward for the emissions cases. Since the goal of FastCTM is to reduce computational costs, it is critical that FastCTM be shown to be faithful to its parent model for realistic applications such as projecting the impact of a change in emissions. Going further and comparing sensitivities for local or single-sector emissions changes would provide even more powerful proof, and I strongly recommend that the authors consider such a comparison.
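One way such a response comparison might be quantified is sketched below, assuming a base run and a perturbed-emissions run are available from both the parent model and the emulator on the same grid. All names and the synthetic data are hypothetical; the 20 % and 15 % reductions are arbitrary illustration values.

```python
import numpy as np

def response_agreement(parent_base, parent_pert, emu_base, emu_pert):
    """Compare the change (perturbed - base) predicted by the two models.

    Returns the through-origin regression slope of the emulator response
    on the parent response, and the Pearson correlation between them.
    A slope near 1 and a correlation near 1 indicate a faithful response.
    """
    d_parent = (parent_pert - parent_base).ravel()
    d_emu = (emu_pert - emu_base).ravel()
    slope = float(np.dot(d_emu, d_parent) / np.dot(d_parent, d_parent))
    r = float(np.corrcoef(d_emu, d_parent)[0, 1])
    return slope, r

# Synthetic stand-in data (assumption): the parent model responds with a
# 20 % decrease, the emulator with a 15 % decrease (an under-response).
rng = np.random.default_rng(1)
parent_base = rng.uniform(10, 50, size=(50, 20))
parent_pert = 0.80 * parent_base
emu_base = parent_base + rng.normal(0, 1, size=parent_base.shape)
emu_pert = 0.85 * emu_base

slope, r = response_agreement(parent_base, parent_pert, emu_base, emu_pert)
print(f"slope = {slope:.2f}, r = {r:.3f}")
```

In this synthetic case the patterns correlate well while the slope falls below 1, which is the kind of under-response that agreement in absolute concentrations alone would not reveal.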
Without these kinds of quantitative comparisons I can only judge the model’s success based on data such as Figure 3, where I am concerned because the patterns do not – speaking qualitatively – appear to match that well between CMAQ and FastCTM. I am particularly concerned that the model may be mostly reproducing emissions maps and historical scalings, rather than accurately representing chemistry and transport (especially given that transport is 2-D only). A more critical, quantitative analysis of the model’s strengths and weaknesses would be necessary before I would recommend its use in a scientific or regulatory context.
Minor comments
The description of the five operators does not quite seem to verify that physical constraints are being satisfied but this may simply be a misinterpretation on my part. For example, can you confirm that the method you used to generate the convolution kernels (Eq 5-7) for transport ensures mass conservation? This seems to depend on Ci being in units of molec/cm3 rather than ppbv, but the units of Ci are not clearly specified.
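The kind of check meant here can be illustrated with a generic stencil. The sketch below is not the kernel of Eq. 5-7, which is not reproduced in this comment; it assumes a first-order upwind stencil on a uniform grid with periodic boundaries and concentrations expressed per unit volume, and all names are hypothetical.

```python
import numpy as np

def apply_kernel(conc, kernel):
    """Apply a 3x3 stencil with periodic boundaries.

    kernel[1 + di, 1 + dj] is the weight given to the neighbour located
    at offset (di, dj) from each cell.
    """
    out = np.zeros_like(conc)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            # np.roll with shift -di brings the value at index i + di to i.
            out += kernel[1 + di, 1 + dj] * np.roll(conc, shift=(-di, -dj), axis=(0, 1))
    return out

def upwind_kernel(cx, cy):
    """Upwind stencil for non-negative Courant numbers cx, cy (assumption)."""
    k = np.zeros((3, 3))
    k[1, 1] = 1.0 - cx - cy   # mass staying in the cell
    k[0, 1] = cx              # mass arriving from the neighbour at offset (-1, 0)
    k[1, 0] = cy              # mass arriving from the neighbour at offset (0, -1)
    return k

rng = np.random.default_rng(2)
conc = rng.uniform(0, 10, size=(16, 16))   # mass per unit volume, uniform cells
kernel = upwind_kernel(0.3, 0.2)

transported = apply_kernel(conc, kernel)
print(np.isclose(transported.sum(), conc.sum()))   # weights sum to 1

leaky = apply_kernel(conc, 1.1 * kernel)           # weights sum to 1.1
print(leaky.sum() / conc.sum())
```

On such a grid the domain total is multiplied by the sum of the kernel weights, so conservation holds only when the weights sum to one and the transported quantity is a mass or number density. A mixing ratio such as ppbv would additionally require weighting by air density.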
A related concern is that surface layer winds are used and treated conservatively, which neglects the fact of rapid vertical mixing. Can the authors provide evidence that the surface winds (which would be expected to be slower than the mean wind speed in the boundary layer) are accurately predicting pollutant motion? It seems that any model which is designed to predict transport using only the horizontal near-surface winds will underestimate overall transport. Should the model not be using the PBL-averaged horizontal winds in Eq. 4 instead of the surface winds at 10-meter height (lines 83-84)?
It would be helpful to get more detail on how components such as the diffusion encoder were trained. Currently the manuscript states that 5 years of data (2018 – 2022 inclusive) were used in training, but not how the five different models were trained using that data. A naïve assessment would assume that all five sub-models were trained based simply on hour-to-hour pollutant concentrations, but that would suggest that the models were each trying to represent all atmospheric processes simultaneously.
Line 172 says that the reaction encoder in Equation 12 “has the same structure as that of reaction and deposition encoder models (Eq. 10)”. This is recursive, but also Eq. 10 refers to the diffusion module?
On line 194, “We did not use the fixed area as that in the previous studies (Xing et al., 2022)” – can you elaborate? It was not clear to me what this meant.
The y-axis labels on Figure 5 say “Percentage”, but from context it appears these must really be the factor difference from the baseline (as all cross at 1.0).
Finally, there are numerous minor grammatical errors (e.g. L14: simulations and managements; L67: interpretations of the FastCTM are also widely vowed; L70: including and major; and so on). This is not important for judgment of the paper’s appropriateness for publication, but I recommend the authors take another look at the paper to correct such minor issues.
Citations
Wong et al., "Using a land use regression model with machine learning to estimate ground level PM2.5." Environmental Pollution, 2021.
Cheng et al., "Influence of weather and air pollution on concentration change of PM2.5 using a generalized additive model and gradient boosting machine." Atmospheric Environment, 2021.
Citation: https://doi.org/10.5194/gmd-2024-198-RC1
RC2: 'Comment on gmd-2024-198', Anonymous Referee #2, 01 Mar 2025
Lyu et al. put forth the 'FastCTM' model which seems to be a reduced complexity model that discretizes changes in concentrations for 10 air pollution species. Though interesting, the presentation of methods, results, and context of the study needs to be heavily refined before being accepted. The details of the study are currently not sufficient as they stand.
Introduction:
"The air pollutant and species concentrations can be then calculated by solving these complicated equations with numeric methods (Byun and Schere, 2006), which is often time-consuming and requires intense computational resources." --> This thought is not very well fleshed out. A single reference from 2006 does not detail at all what makes these models computationally expensive.
"Quantifying the contributions of individual processes would provide fundamental explanations for a model's predictions, and therefore is also useful in identifying potential sources of error in the model formulation or its inputs (Liu et al., 2010)." --> I find this introduction quite poor. The authors provide minimal examples of emulating entire CTMs but give no examples of using ML to emulate and replace CTM model components, of which there are many for chemistry, photolysis, deposition, etc. This needs much greater discussion, as it shows a lack of awareness by the authors of what currently exists. The following are only a few examples:
Krasnopolsky, V. M., Fox-Rabinovitz, M. S., and Chalikov, D. V.: New Approach to Calculation of Atmospheric Model Physics: Accurate and Fast Neural Network Emulation of Longwave Radiation in a Climate Model, Monthly Weather Review, 133, 1370–1383, https://doi.org/10.1175/MWR2923.1, 2005.
Kelp, M. M., Jacob, D. J., Lin, H., and Sulprizio, M. P.: An Online-Learned Neural Network Chemical Solver for Stable Long-Term Global Simulations of Atmospheric Chemistry, Journal of Advances in Modeling Earth Systems, 14, e2021MS002926, https://doi.org/10.1029/2021MS002926, 2022.
Xia, Z., Zhao, C., Du, Q., Yang, Z., Zhang, M., and Qiao, L.: Advancing Photochemistry Simulation in WRF-Chem V4.0: Artificial Intelligence PhotoChemistry (AIPC) Scheme with Multi-Head Self-Attention Algorithm, https://www.authorea.com/users/816476/articles/1217166-advancing-photochemistry-simulation-in-wrf-chem-v4-0-artificial-intelligence-photochemistry-aipc-scheme-with-multi-head-self-attention-algorithm, 2024.
Zhong, X., Ma, Z., Yao, Y., Xu, L., Wu, Y., and Wang, Z.: WRF–ML v1.0: a bridge between WRF v4.3 and machine learning parameterizations and its application to atmospheric radiative transfer, Geoscientific Model Development, 16, 199–209, https://doi.org/10.5194/gmd-16-199-2023, 2023.
Silva, S. J., Heald, C. L., Ravela, S., Mammarella, I., and Munger, J. W.: A Deep Learning Parameterization for Ozone Dry Deposition Velocities, Geophysical Research Letters, 46, 983–989, https://doi.org/10.1029/2018GL081049, 2019.
"process analysis" --> I don't know what this means.
"Interpretations of the FastCTM are also widely vowed to improve deep learning model applications in earth system science and climate studies." --> Not sure how you can claim this given no evidence; it is more aspirational than substantive.
"The FastCTM is currently configured to simulate hourly concentrations of 10 pollutant variables, including and major species of PM2.5 (SO4 2−, NO3 −, NH4+, organic matters and other inorganic components, coarse part in PM10, CO, NO2, SO2 and O3." --> Not sure how many atmospheric chemists and climate scientists want a CTM with only ten species. This needs much more motivation. Even small chemical mechanisms in operational use have around 70 species.
Methods:
"CMAQ structures" --> I don't know what "structures" means here.
-Is this predicting only surface-level concentrations?
-I would not really call this model a CTM, this feels more like a reduced order model. There are potentially hundreds of chemical species that lead to the formation of PM2.5, O3, etc. And yet you do not mention the chemical mechanism at all in the WRF-CMAQ model. This work is basically mapping emissions to concentrations in a fairly naive way.
" A detailed description of CMAQ principles is available elsewhere (Byun and Schere, 2006) " --> I find this lazy. This paper is 20 years old and I do not know what you would like the reader to find in it.
"Chemical Reaction Module" --> This just sounds like a first order approximation using idealized rate constants. There is a very rich and long history of using ODE solvers to get the solution to complex chemical mechanisms. There really is not enough discussion of this module (or really any of the preceding module sections). You are highly simplifying each of these processes without an underlying discussion of why you are doing so. There already exist data-driven and reduced complexity modeling systems that accomplish similar air quality regulation goals (e.g., InMAP, EASIUR, APEEP).
-I don't explicitly understand how this is a machine learning model. You describe a sequence-to-sequence modeling framework reminiscent of an LSTM, but there is no mention of memory or hyperparameters in general. The inclusion of these equations may seem more like a symbolic regression kind of ML framework, but the details are sparse and lack substance. Are all the modules trained jointly so that their errors influence each other? Chemistry is constantly affected by other modules (and vice versa) yet these interaction terms can't be considered during training at all. That is, how does error propagate from one time step to the next in training? Are the underlying WRF-CMAQ simulations two-way coupled, such that weather influences chemistry and chemistry feeds back via aerosol effects to influence the weather? There are not enough details on the underlying simulations or the joint training of modules. There are examples of this kind of offline training here:
Kelp, M. M., Jacob, D. J., Kutz, J. N., Marshall, J. D., and Tessum, C. W.: Toward Stable, General Machine-Learned Models of the Atmospheric Chemical System, Journal of Geophysical Research: Atmospheres, 125, e2020JD032759, https://doi.org/10.1029/2020JD032759, 2020.
Yang, X., Guo, L., Zheng, Z., Riemer, N., and Tessum, C. W.: Atmospheric chemistry surrogate modeling with sparse identification of nonlinear dynamics, https://doi.org/10.48550/arXiv.2401.06108, 2024.
Liu, Z.-S., Clusius, P., and Boy, M.: Neural Network Emulator for Atmospheric Chemical ODE, https://doi.org/10.48550/arXiv.2408.01829, 2024.
"The main objective of our study is to build and validate a principles-guided neural network based FastCTM that could simulate spatial-temporal fields of hourly concentrations of major air pollutant species like a traditional CTM. Besides, the FastCTM could model individual contributions from each of the atmospheric processes of transport, diffusion, deposition, reaction and emission." --> This should be stated earlier. The term "principles-guided" is vague, and I don't really consider this 'like a traditional CTM'. You discretize the potential processes that affect air quality outputs, but this is more like a traditional reduced complexity model approach. I think a deeper review into the literature would help the authors situate their work in this established landscape.
"Furthermore, CMAQ and FastCTM forecasts were both evaluated by hourly observations from national monitoring sites (as shown in Figure S5 in the supplementary material) for six criteria pollutants (PM2.5, PM10, SO2, NO2, CO, and O3)." --> What is the point of this if CMAQ is your ground truth?
Results:
"Besides, since the FastCTM is a 2-D model only considering atmospheric processes within the boundary layer, lower consistency with the CMAQ model during daytime could be due to more active vertical turbulence which is not fully represented." --> Isn't the point of having this process-based emulation the ability to attribute errors to processes? This sounds hand-wavy and does not explain the variability very well.
"It is important to note that the relatively low R2 values observed for NH4+ can be attributed to the fact that it is the sole cation included in the FastCTM model without a corresponding acid-base balance, which may affect the model's predictive accuracy." --> I don't see how this is the reason. WRF-CMAQ has many base pairings that can neutralize NH4+ that are not represented here. I don't recall conservation of mass as a constraint in your chemical module. Furthermore, how do you know that NH4+ does not precipitate out, as it is very hydrophilic?
-I actually believe it is quite concerning that the RMSEs vary diurnally. You should also plot the WRF-CMAQ and FastCTM time series against each other. A diurnal error may actually suggest that you are not correctly learning the atmospheric dynamics of the system. You may be predicting an average concentration across all time and that's why you see a diurnal error profile.
"FastCTM forecasts using zero values as input air quality data were almost the same as that using ordinary input in the long leading hours, indicating that FastCTM simulations in long leading hours are not affected by initial conditions, just like deterministic numeric CTMs (such as CMAQ)" --> This is hard to conclude, you need to plot actual concentration time series instead of RMSEs. It seems like the error is always the same, this could mean the FastCTM always predicts the same values given the time of day. More results need to be presented.
Figure 3 is unwieldy. There are 60 multiplots, and they are not well labeled on the figure. Here you should show spatial differences in terms of both absolute and relative error. It seems like FastCTM does not capture the highest concentration values, which is concerning given that these have the largest impact on health and climate. It is hard to have any substantive discussion of results without any quantitative measure regarding Figure 3.
Section 3.1.2. Again, I don't see why this comparison makes sense. You do not incorporate any station data, so why would you make comparisons against it? The WRF-CMAQ model is the ground truth here.
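For the Figure 3 comparison, absolute and relative error fields plus a simple high-end diagnostic could be computed along the following lines. This is only a sketch: the function names, the `floor` threshold, the choice of the 95th percentile and the small example arrays are all hypothetical.

```python
import numpy as np

def error_maps(emulator, reference, floor=1.0):
    """Return (absolute error, relative error in %) for each grid cell.

    Cells where the reference concentration is below `floor` receive NaN
    relative error, to avoid dividing by near-zero values.
    """
    abs_err = emulator - reference
    rel_err = np.full(reference.shape, np.nan)
    valid = reference >= floor
    rel_err[valid] = 100.0 * abs_err[valid] / reference[valid]
    return abs_err, rel_err

def peak_bias(emulator, reference, q=95):
    """Difference between the q-th percentiles of the two fields.

    A negative value means the emulator underestimates the high end.
    """
    return float(np.percentile(emulator, q) - np.percentile(reference, q))

# Tiny illustrative fields (assumption), in ug/m3.
reference = np.array([[0.5, 20.0], [40.0, 100.0]])
emulator = np.array([[0.6, 22.0], [36.0, 80.0]])

abs_err, rel_err = error_maps(emulator, reference)
print(abs_err)
print(rel_err)
print(peak_bias(emulator, reference))
```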
Sections 3.2: These don't have much meaning if we do not understand how the FastCTM model behaves in relation to the parent model
Figure 8: These color bars are difficult to discern changes in concentrations. Does adding d through h yield panels a or b? Again, individual contribution doesn't matter if we don't know how the model actually behaves.
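The closure question for Figure 8 could be answered with a check of the following form. It is a sketch assuming that hourly per-process contribution fields and the concentration fields before and after the step are available as arrays; the dictionary keys follow the five processes named in the abstract, while the function name and synthetic data are hypothetical.

```python
import numpy as np

def closure_residual(conc_before, conc_after, contributions):
    """Net concentration change minus the sum of per-process contributions.

    A residual close to zero everywhere means the process terms add up to
    the simulated change.
    """
    net_change = conc_after - conc_before
    total = sum(contributions.values())
    return net_change - total

# Synthetic stand-in data (assumption) on a 10 x 10 grid.
rng = np.random.default_rng(3)
shape = (10, 10)
conc_before = rng.uniform(10, 50, size=shape)
contributions = {
    name: rng.normal(0, 1, size=shape)
    for name in ("emission", "transport", "diffusion", "reaction", "deposition")
}
conc_after = conc_before + sum(contributions.values())

residual = closure_residual(conc_before, conc_after, contributions)
print(float(np.max(np.abs(residual))))
```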
Citation: https://doi.org/10.5194/gmd-2024-198-RC2