the Creative Commons Attribution 4.0 License.
The SAPRC atmospheric chemical mechanism generation system (MechGen)
William P. L. Carter
Jia Jiang
Zhizhao Wang
Kelley C. Barsanti
MechGen is a software system designed to derive gas-phase reaction mechanisms for reactive organic compounds under atmospherically relevant conditions for use in chemical models and for data analysis and interpretation. It has been used to derive versions of the SAPRC mechanisms used in airshed models, with SAPRC-22 being the most recent. MechGen derives fully explicit mechanisms for many types of organic compounds and their oxidation products when they react in the atmosphere in the presence of oxides of nitrogen and other pollutants and then uses these explicit mechanisms to derive reduced or lumped mechanisms more suitable for use in airshed models. This paper gives an overview of the system, describes the procedures it uses to generate explicit and reduced mechanisms, and presents several types of applications. The assignments and estimates used to derive individual chemical reactions and assign rate constants are discussed in a separate companion paper. The system is publicly accessible for generating explicit mechanisms for single compounds and viewing associated documentation using a web-based interface. A separate terminal login is available for deriving mechanisms for multiple compounds, multi-generation mechanisms, and portions of lumped mechanisms for airshed models, as well as for system programming and management. MechGen is designed to accommodate updates to the chemical estimates and assignments that it uses to reflect our evolving knowledge of and ability to estimate atmospheric reactions of organic compounds.
1.1 Background
Many hundreds of volatile organic compounds (VOCs) are emitted into the lower atmosphere, from both anthropogenic and biogenic sources. Once emitted, they can undergo reactions to form oxidized organic products, including gas-phase toxics, criteria pollutants, and secondary organic aerosol (SOA). For example, in the presence of nitrogen oxides (NOx), these reactions can generate radicals that react to form ozone (O3) and oxidized nitrogen compounds that affect air quality. The atmospheric reaction mechanisms for most of these compounds are complex, particularly for larger molecules that involve a large number of reactive intermediates and form a large number of oxidized organic products that can continue to react in the atmosphere. In addition, in most cases, these mechanisms involve reactions with rate constants that have not been measured and must be estimated. Because of the complexity, for practical reasons, it is necessary either to greatly simplify the mechanisms for most VOCs, use extensive lumping or reduction in VOC representations, or use an automated chemical mechanism generation system to generate estimated mechanisms.
An overview of automated generation of reaction mechanisms was presented by Green (2019), which included references to systems that have been developed. A notable example is the RMG system of Green and co-workers (RMG, 2025), originally developed to model combustion systems (Gao et al., 2016) but later extended to apply to other areas such as liquid-phase systems and heterogeneous catalysis (Liu et al., 2021). RMG provides an extensive set of tools, links to kinetics and other databases, and extensive documentation (RMG, 2025). Kirchner (2005) discussed an atmospheric chemical mechanism generator program called CHEMATA, but that work focused on its application to condensed mechanisms, and literature searches revealed no subsequent work with that system. The most comprehensive systems specifically designed to predict detailed mechanisms for lower tropospheric air pollution models are the Generator for Explicit Chemistry and Kinetics of Organics in the Atmosphere (GECKO-A) system (Aumont et al., 2005) and the SAPRC mechanism generation system (MechGen) that is the subject of this paper.
MechGen was developed for deriving the portions of the SAPRC-90 through SAPRC-16 atmospheric chemical mechanisms (Carter, 1990, 2000, 2010a, b, 2016; Carter and Heo, 2012, 2013; Venecek et al., 2018) that concerned reactions of C2+ organic compounds. The first version, used for SAPRC-90 (Carter, 1990), generated mechanisms only for alkanes, using the procedures and estimates documented by Carter and Atkinson (1985). When SAPRC-99 was being developed, the mechanism generation system was re-written and extended to cover a much wider range of acyclic and monocyclic compounds, including monoalkenes, alcohols, ethers, esters, aldehydes, ketones, and organic nitrates in addition to alkanes (Carter, 2000). That version was used to determine the net effects of these compounds in the presence of NOx that were incorporated into SAPRC-99, though reactions in the absence of NOx were not generated. A number of updates to the system were made when SAPRC-99 was updated to SAPRC-07 (Carter, 2010a, b), including the ability to generate mechanisms for a wider variety of compounds; however, it remained limited to generating mechanisms in the presence of NOx, and aromatics were still not supported. The system was further updated for use in the development of the SAPRC-16 (Carter, 2016; Venecek et al., 2018) mechanism and unpublished updated versions, incorporating capabilities to generate reactions in the absence of NOx, reactions of aromatics, autoxidation reactions of peroxy radicals, and other enhancements. Further updates were made for the current version, which has been used in the most recent of the SAPRC mechanisms, SAPRC-22 (Carter, 2023).
The chemical basis of MechGen and the methods it uses to assign or estimate rate constants or mechanisms are documented by Carter et al. (2025a). The current, and first publicly documented, version of MechGen, v1.1, is available online, along with a comprehensive user manual and a quick start guide. These are made available at the MechGen website (Carter, 2025a) and GitHub site (Jiang et al., 2025), and archived versions can also be downloaded from these websites. This paper describes the software system and the procedures it uses for mechanism generation and processing.
1.2 Chemical systems represented
MechGen is capable of generating fully explicit mechanisms for the atmospheric reactions of most types of organic compounds emitted into or formed in the lower atmosphere and of the intermediate radicals they form. While temperature-dependent rate constants are provided for many reactions, for others, the rate constants or branching ratios are applicable only for conditions representing the troposphere, i.e., around 298 K and 1 atm. MechGen is not currently designed for estimating mechanisms for combustion modeling or for the low temperatures and pressures characteristic of the upper troposphere.
Table 1 lists the types of stable compounds whose reactions can be generated and shows the types of initial atmospheric reactions that can be generated for them. Table 2 lists the types of reactions that are generated, including reactions of intermediate radicals as well as reactions of stable compounds. The reactions of stable compounds include H-atom abstractions by OH, NO3, and Cl radicals, additions to double bonds by these radicals and by O3 and O3P, and reactions by photolysis. The types of radicals generated include carbon-centered radicals that in most cases react primarily with O2; peroxy radicals that in most cases react with NO, NO2, NO3, HO2, or other peroxy radicals and in many cases also have unimolecular reactions; alkoxy radicals that can react with O2 or by various types of unimolecular reactions; and excited and stabilized Criegee intermediates formed in reactions of alkenes with O3. Measured rate constants or branching ratios derived from measured product yields are used when data are available. In the absence of such data, which is typically the case, rate constants or branching ratios are estimated using various structure-activity relationships (SARs) or other estimation methods. The details of the types of reactions and how rate constants or branching ratios are assigned or estimated are given by Carter et al. (2025a).
Table 1. Types of stable compounds whose reactions are supported by MechGen.
* See Carter et al. (2025a) for more details and types of compounds not supported.
1.3 System software and access
The current version of MechGen is incorporated into an online multi-user object-oriented (MOO) system that was originally developed as a programmable text-based virtual reality system (Curtis, 1997; Fox, 2004; Wikipedia, 2025). Features of the MOO programming language, which is similar to Python, made it well suited for mechanism generation applications compared to the original Fortran version used for SAPRC-90 (Carter and Atkinson, 1985; Carter, 1990). The MOO system also has online access capabilities that make it relatively straightforward to permit multiple users to access it online simultaneously.
Because MechGen is written in an object-oriented language, the system employs various software “objects” that include properties defining their characteristics and programs or subroutines controlling their operations. The MechGen software objects referenced in this paper include reactant objects that represent molecules or radicals that can be reacted; group objects that represent portions of the molecule that determine how they react; reactor objects that are assigned to each user login that provide a user interface and give mechanism generation options; environment objects that contain concentrations of atmospheric species that can be used to determine product yields from mechanisms; multi-generation mechanism objects that control multi-generation mechanism derivations for a given reactant; and lumping objects that control the full mechanism generation process and whether and how lumped mechanisms are also produced. The MechGen system employs other types of objects, but a discussion of these details of the software system is beyond the scope of this paper.
MechGen is accessible online via a web interface available at the MechGen website (Carter, 2025a) or through a terminal interface such as Telnet (Postel and Reynolds, 1983), with instructions available at the MechGen and MechGen GitHub sites. Terminal access provides the highest degree of system capability, while web access is more user-friendly for basic operations and is better suited for providing information about the system and the reactions being generated. Due to the high computational demand, resource-intensive operations such as deriving multi-generation mechanisms or generating mechanisms for large molecules or for large numbers of molecules are not currently available using the online system. To access the full capabilities of MechGen, users need to install the system locally by following the instructions provided in the user manual.
1.4 Overview of documentation
The methods used by the current version of MechGen to derive or estimate rate constants and mechanisms for gas-phase atmospheric reactions of organic compounds are described by Carter et al. (2025a). While that paper describes the chemical basis for the mechanisms MechGen generates, it does not explain the system itself or its functionality. This paper fills that gap by discussing the following specific capabilities of MechGen:
- Deriving unimolecular or bimolecular reactions of individual compounds and radicals under atmospheric conditions, using assigned or estimated rate constants and products formed (Sect. 2);
- Deriving explicit single-generation mechanisms of a selected compound by reacting the compound and all the rapidly reacting intermediates formed, but not reacting the stable, non-radical products (Sect. 3.1);
- Deriving minimally reduced mechanisms that give the same predictions as the explicit mechanism at a selected temperature but with about the number of reactions and the number of species (Sect. 3.2);
- Estimating yields of products formed when reacting a compound under selected environmental conditions (Sect. 3.3);
- Deriving explicit multi-generation mechanisms of a selected compound by fully reacting the compound and all stable reactive products formed in non-negligible yields (Sect. 4); and
- Optionally deriving lumped mechanisms that use a limited number of lumped model species to represent chemically similar compounds, and using various approaches to reduce the numbers of reactive intermediates in the mechanisms (Sect. 5).
Additional details about the methods and algorithms are also given in the Supplement. Details concerning how to work with MechGen and a description of features not covered here are given in the user manual that is available at the MechGen website (Carter, 2025a). The MechGen GitHub site (Jiang et al., 2025) also contains the user manual and downloadable software and files used by MechGen (see Sect. 7.3). The examples shown here and in the Supplement are based on the current chemistry assignments and methods discussed by Carter et al. (2025a). The system is expected to operate similarly when the chemistry assignments and estimates are updated in the future.
2.1 Creation and specification of reactants
Reactants are created as objects within MechGen by specifying their structures. The structure of an organic reactant or radical is specified by giving the “groups” in the molecule or radical and indicating the groups to which each is bonded and the type of bond. As listed in Table 3, groups are parts of molecules that are treated as units in the system and contain no more than one carbon, nitrogen, or halogen atom but can also contain one or several hydrogen or oxygen atoms. The leading or trailing “–”, “=”, “#”, or “–a” indicates groups that can bond to neighboring groups by single, double, triple, or aromatic/allylic bonds, respectively. Any groups with such designations can bond to other groups with such designations to form chains. Mechanisms are generated and rate constants are estimated based on the groups present, the groups they are bonded to, and, in some cases, groups elsewhere in the molecule. Groups can be radical or non-radical, with reactants containing a radical group referred to as radicals or intermediates. Examples of reactant designations are given in Table 4, and additional discussion on reactant designation and creation by specifying structures is available in the user manual.
Table 3. List of groups and group designations used to specify C2+ organic reactants whose reactions can be generated using MechGen.
a Bond types are indicated by “-” (single or aromatic), “=” (double), or “#” (triple). The number of bonds shown indicates if it is bonded to one or two or more groups. The “()” notation is not part of the group designation itself but is used to indicate that the group is bonded to a third or fourth group. An “a” or “p” prefix in the group name indicates that it has alternating single and double bonds. The “p” prefix is used for phenyl, phenoxy, or phenyl peroxy radicals. Some groups can have different bond designations depending on their location in the molecule, e.g., “CH3-” or “-C[.]=” rather than “-CH3” and “=C[.]-”. See examples in Table 4. b Allylic groups are used in reactants with two or more resonance structures involving adjacent double bonds and radical centers. Allylic radical groups are used for the portions where the radical center is located in at least some of the resonance structures. Aromatic groups are used for the portions where the radical center is not located in any resonance structures. c The “{excited}” designation is given at the end of the structure designation and indicates how the excited intermediate was formed. This designation is also used following full reactant structure designations to indicate excited adducts formed when radicals add to double bonds, but in this case the excitation is associated with the entire molecule and not a single group.
Table 4. Examples of designations of selected representative compounds or intermediates.
a The first MechGen structure codes given are those generated by the system. If subsequent structures are given, they are alternatives that can be used to create the same compound. b The symbols “v” and “∧” are used to indicate “cis” or “trans” configurations, analogous to “\” and “/” in SMILES notation. c The symbol “*” is used to indicate ring closure. “*1”, “*2”, etc., are used for multiple ring structures, analogous to “1”, “2”, etc., in SMILES notation. d The symbol “aC” is used to indicate carbon centers where single and double bonds are in resonance, as in aromatic species or allylic radicals.
MechGen also includes objects representing elementary species that can react with reactants (e.g., OH or O3), react with radicals (e.g., O2 or NO), or be formed in reactions (e.g., H2O or CH4). These are listed in Table 5.
Table 5. List of elementary species used by MechGen.
a These species are treated as unreactive by MechGen. b Pseudo-species used to indicate photolysis or unimolecular reactions of stable compounds. c Generic species used as reactants (RO2., RCO3.) or products (RO., etc.) in peroxy + peroxy reactions. d “RO–alpha–H” refers to the carbonyl formed when an α-hydrogen on an alkoxy radical is abstracted.
2.2 Single-step reactions
Single-step mode consists of generating only the initial reaction(s) of a specified compound or intermediate, without reacting the resulting products or intermediates. When reacting stable compounds, it is necessary to specify the type of reaction (e.g., whether unimolecular or a bimolecular reaction with a specified oxidant), whereas all possible reactions are automatically generated for radicals. Generating reactions in single-step mode is useful to obtain information on how the system estimates a compound's reactions, as the results show not only the products formed and the rate constants or branching ratios derived but also documentation on the estimates or assignments used during reaction generation.
The algorithm for generating single-step reactions is shown in Fig. 1, and examples of output are shown in Fig. S1 in the Supplement. Note that Fig. S1 shows not only the generated reactions but also text documenting how the reactions were derived. Reaction types and derivation methods are discussed by Carter et al. (2025a), and the specific procedures shown in Fig. 1 depend on whether the reactant is a stable compound or an intermediate and also on the type of intermediate. Assigned reactions and rate constants are used for reactants for which assignments are made, but for most reactants, reactions and rate constants have to be estimated using methods discussed by Carter et al. (2025a). In some cases, assignments are made for rate constants but not for mechanisms or branching ratios; in those cases, the reactions are generated using the general estimation methods, but the rate constants output are the assigned rather than the estimated values.
Some radicals are estimated to undergo unimolecular or O2 reactions so rapidly that it is not necessary to estimate their total rate constants to predict their atmospheric fates. For example, most alkyl radicals are consumed only by reaction with O2, and many “explicit” mechanisms simply replace them with the peroxy radicals they form, resulting in these alkyl radicals not appearing in the generated mechanism. MechGen does include these radicals and their reactions, though it flags them as being “fast” and only outputs their branching ratios, when applicable.
For photolysis reactions, the system outputs a “photolysis set” name that gives absorption cross sections and optionally quantum yields as a function of wavelength, as well as a wavelength-independent quantum yield if the quantum yields are not given in the photolysis set. The photolysis sets referenced by MechGen are those used for the SAPRC-22 mechanism as listed and documented by Carter et al. (2025a), and these files can be downloaded with the other files needed to implement that mechanism (Carter, 2025a). The single-step reaction output also includes the calculated photolysis frequency for a representative solar light intensity and spectrum, but this should not be considered part of the mechanism because it depends on the light environment.
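The photolysis frequency reported in the single-step output can be illustrated with the standard relation J = ∫ σ(λ) φ(λ) F(λ) dλ, where σ is the absorption cross section, φ the quantum yield, and F the actinic flux. The Python sketch below is a minimal illustration and not MechGen code: the function name, the argument names, and the choice of trapezoidal integration are assumptions made here, and it mimics the fallback to a wavelength-independent quantum yield when the photolysis set supplies none.

```python
from typing import Optional, Sequence


def photolysis_frequency(
    wavelengths_nm: Sequence[float],
    cross_sections_cm2: Sequence[float],
    actinic_flux: Sequence[float],  # photons cm^-2 s^-1 nm^-1
    quantum_yields: Optional[Sequence[float]] = None,
    overall_quantum_yield: float = 1.0,
) -> float:
    """Integrate sigma * phi * F over wavelength (trapezoidal rule); returns s^-1."""
    n = len(wavelengths_nm)
    if len(cross_sections_cm2) != n or len(actinic_flux) != n:
        raise ValueError("all input arrays must have the same length")
    if quantum_yields is None:
        # No wavelength-dependent quantum yields in the set:
        # fall back to a single wavelength-independent value.
        quantum_yields = [overall_quantum_yield] * n
    elif len(quantum_yields) != n:
        raise ValueError("quantum_yields must match the wavelength grid")

    integrand = [
        s * q * f
        for s, q, f in zip(cross_sections_cm2, quantum_yields, actinic_flux)
    ]
    total = 0.0
    for i in range(n - 1):
        width = wavelengths_nm[i + 1] - wavelengths_nm[i]
        total += 0.5 * (integrand[i] + integrand[i + 1]) * width
    return total
```

As a purely illustrative check with arbitrary inputs, a flat cross section of 1e-20 cm2, a flat flux of 1e14 photons cm-2 s-1 nm-1, and a wavelength-independent quantum yield of 0.5 over 300 to 320 nm give about 1e-5 s-1. Because the flux is an input, the result changes with the light environment, which is why this value is not part of the mechanism itself.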
In the case of peroxy radicals, the system first derives unimolecular reactions if they are estimated to be non-negligible for this radical and then derives the bimolecular reactions with NO, NO2, NO3, HO2, generic alkyl peroxy radicals (RO2), and generic acyl (RCO3) peroxy radicals, in that order. Deriving all the possible organic peroxy + peroxy reactions is impractical due to the large number of peroxy radicals formed in realistic photooxidation systems and the fact that the other peroxy radicals present are unknown at the time of mechanism generation. Instead, MechGen treats peroxy radicals as reacting with the totals of all alkyl and all acyl peroxy radicals. The products from each reaction type include those formed from the subject radical, plus counter species used to represent the type of co-products formed from the sum of peroxy radicals it reacts with. Those counter species can be either generic alkoxy radicals, designated “RO.” or “RCO2” [representing RC(O)OO•], or carbonyl or alcohol H-transfer products, designated “RO-alpha-H” and “ROH” or “RCO–OH” (Carter et al., 2025a). They are included as products of these reactions during reaction generation for tracking if desired but are deleted as products when the mechanisms are reduced or lumped.
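The ordering of peroxy radical reactions and the later removal of counter species can be sketched as follows. This is a hedged illustration only: the function names, the string labels, and the product names in the test data are hypothetical, and the counter-species labels are simply transcribed from the text above rather than taken from MechGen's internal representation.

```python
from typing import Dict, List

# Bimolecular partners in the order given in the text.
BIMOLECULAR_PARTNERS: List[str] = ["NO", "NO2", "NO3", "HO2", "RO2", "RCO3"]

# Generic co-product labels used only for tracking (transcribed from the text).
COUNTER_SPECIES = {"RO.", "RCO2.", "RO-alpha-H", "ROH", "RCO-OH"}


def peroxy_reaction_order(has_unimolecular: bool) -> List[str]:
    """Return reaction types in the order they would be derived."""
    order = ["unimolecular"] if has_unimolecular else []
    return order + BIMOLECULAR_PARTNERS


def strip_counter_species(products: Dict[str, float]) -> Dict[str, float]:
    """Drop counter species from a product list, as done on reduction or lumping."""
    return {
        name: yield_
        for name, yield_ in products.items()
        if name not in COUNTER_SPECIES
    }
```

Treating "RO2" and "RCO3" as two aggregate partners keeps the number of generated peroxy + peroxy reactions per radical fixed at two, regardless of how many distinct peroxy radicals the full system contains.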
The user-modifiable parameters that affect the single-step reaction generation process are shown in Table 6. The temperature affects the thermal rate constants. The pressure affects rate constants assigned to falloff reactions and estimated nitrate yields in the reactions of peroxy radicals with NO and, in some cases, fractions of excited radicals that are stabilized. The presence of water affects estimated reactions of low-reactivity, stabilized Criegee intermediates, as indicated in Fig. 1 (Carter et al., 2025a). When users select single-step operations, all estimated reactions are displayed, allowing users to see which reactions are considered and how their rate constants are estimated. However, negligible reactions are removed when the outputs of single-step reaction operations are processed when called during full mechanism generation, as discussed in Sect. 3. In addition, reactions of non-acyl peroxy radicals with NO2 forming alkyl peroxynitrates (shown in Fig. S1c) are not included during full mechanism generation by default because these peroxynitrates are assumed to rapidly decompose at temperatures of interest (Carter et al., 2025a). These reactions can be included by either increasing the kFastUni parameter (see Table 8) or decreasing the reactor temperature such that the estimated rate constant is less than kFastUni.
Table 6. List of parameters that affect single-step mechanism generation.
a Affects output displays and estimates of product yields but not reactions in explicit generated mechanisms. However, this affects the full mechanism generation process discussed in Sect. 3. b The actinic fluxes are used to calculate rates of photolysis reactions given the absorption cross sections and quantum yields assigned to the photolysis reactions. The photolysis frequencies are not part of the mechanism but are included as outputs of single-step mechanism generation and are also used to derive overall product yields for various environments, as discussed in Sect. 3.3, and to derive multi-generation mechanisms, as discussed in Sect. 4. c Currently applicable only to low-reactivity Criegee intermediates (Carter et al., 2025a).
2.3 User modifiable mechanism assignments
As indicated above, MechGen uses assigned rate constants and branching ratios when deriving single-step mechanisms whenever possible and otherwise uses estimates and SARs when no assignments are available. While updates will be made as new data become available, the assignments will not always be up to date. In addition, some users may wish to employ MechGen to derive mechanisms using recently measured or newly derived rate constants and branching ratios or to explore how varying uncertain estimates impact mechanisms. To address this, MechGen allows users to add custom (or “user”) mechanism assignments, which can supplement or replace the default settings. The steps involved in making or managing user assignments are described in the user manual.
3.1 Description of process
Full mechanism derivation for a compound involves first reacting it through all, or a designated subset, of its possible initial atmospheric reactions and then reacting any reactive intermediates that are formed. Stable products formed are not reacted, so the mechanism output reflects only a single generation of reactions, as opposed to the multi-generation mechanism derivations discussed in Sect. 4. Reactive intermediates are defined as products that have radical or intermediate groups (see Table 3) or non-radical compounds that undergo unimolecular reactions with rate constants greater than a specified value, which is 0.0167 s−1 (1 min−1) by default. The result is a sequence of reactions that were generated and their rate parameters, lists of intermediates involved and products formed, and information about the products and intermediates that may be useful.
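The single-generation process described above amounts to a worklist traversal: react the starting compound, queue every reactive intermediate formed, and leave stable products unreacted. The Python sketch below illustrates this under stated assumptions; the `Species` class, the `single_step` callback, and all names are hypothetical stand-ins for MechGen's reactant objects and single-step operation, and only the default threshold of 1 min−1 comes from the text.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple

DEFAULT_K_UNI_LIMIT = 1.0 / 60.0  # s^-1 (1 min^-1), the default quoted in the text


@dataclass(frozen=True)
class Species:
    name: str
    is_radical: bool = False  # has a radical or intermediate group
    k_uni: float = 0.0        # total unimolecular rate constant, s^-1


def is_reactive_intermediate(sp: Species, k_limit: float = DEFAULT_K_UNI_LIMIT) -> bool:
    """Radicals, or non-radicals whose unimolecular loss is faster than k_limit."""
    return sp.is_radical or sp.k_uni > k_limit


Reaction = Tuple[str, List[str]]  # (reactant name, product names)


def generate_single_generation(
    start: Species,
    single_step: Callable[[Species], List[List[Species]]],
) -> Tuple[List[Reaction], Set[str]]:
    """React `start`, then every reactive intermediate; stable products are not reacted."""
    reactions: List[Reaction] = []
    stable_products: Set[str] = set()
    seen = {start.name}
    queue = deque([start])  # the starting compound is always reacted
    while queue:
        reactant = queue.popleft()
        for products in single_step(reactant):
            reactions.append((reactant.name, [p.name for p in products]))
            for product in products:
                if not is_reactive_intermediate(product):
                    stable_products.add(product.name)
                elif product.name not in seen:
                    seen.add(product.name)
                    queue.append(product)
    return reactions, stable_products
```

The `seen` set ensures each intermediate is reacted once even if several routes form it, which is what keeps the traversal finite when intermediates interconvert.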
In order to provide the capability to generate explicit mechanisms of appropriate sizes for specific modeling applications and to estimate relative yields of products formed under various conditions, MechGen provides standard environments that specify concentrations of atmospheric species that can be used for these purposes. The currently available standard environments are summarized in Table 7, with additional information given in Sect. S2.
Table 7. Standard environments available for determinations of which reactions are negligible during mechanism generation and for estimation of product yields.
a See Sect. S2 for more information on these scenarios. All these represent urban conditions similar to those used to derive the Carter (1994) reactivity scales, except that the total VOC levels were reduced to more closely represent modern ambient conditions. b Gen = used to determine negligible reactions during full mechanism generation; PY = used for determination of product yields; MGen = used for the multi-generation mechanism derivations discussed in Sect. 4.
The currently available standard environments listed in Table 7 all represent urban conditions, but users can define new environments as described in the user manual to include a wider range of conditions. The default use of these environments is indicated in Table 7, with three environments representing the range of urban conditions with regard to NOx availability and an additional environment representing nighttime conditions. By default, the “Standard” or mid-NOx environment is not used for mechanism generation, as reactions important under those conditions would also be important under either high or low NOx conditions, if not both. The mid-NOx environment is useful, however, for determination of relative product yields under urban conditions that are equally sensitive to both VOC and NOx. It is also the default environment used when deriving multi-generation mechanisms (see Sect. 4).
The user-modifiable options controlling full mechanism derivation are summarized in Table 8, and the algorithm employed is shown in Scheme S1 of the Supplement, which outlines the overall process, and Scheme S2, which details the portion that determines which reactions are deleted. A key option affecting the sizes of generated mechanisms is the “MinYld” parameter, which determines which competing reactions in the single-step reactions can be neglected. A related option concerns whether environments are used during mechanism generation, and which ones, to estimate upper limit yields for competing bimolecular reactions. The MinYld test is implemented by assigning an “upper limit yield” of unity to the initial reactant and estimating the upper limit yield of each intermediate whose reactions are generated as the upper limit yield of the reactant forming it multiplied by the upper limit relative yield of the intermediate in the reactions of that reactant. Reactions forming products with estimated upper limit yields less than MinYld are neglected, unless there is only one reaction of the reactant. When reactions are neglected, the rate constants and yields of the competing reactions are adjusted to ensure no mass or moles are lost (see Scheme S2). Upper limit yields are also derived for stable, non-reacting products formed in the generated mechanisms, but these are used for information purposes only, to indicate the approximate importance of each product once the mechanism is generated.
Table 8. List of parameters that affect full mechanism generation.
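The MinYld test can be sketched in a few lines of Python. This is an illustration under explicit assumptions, not the procedure of Scheme S2: the function name is hypothetical, each branching ratio is used directly as the upper limit relative yield, the adjustment of the surviving reactions is taken to be a simple renormalization, and the fallback of keeping the largest branch when every branch fails the test is an assumption added here.

```python
from typing import Dict


def apply_min_yld(
    reactant_upper_yield: float,
    branches: Dict[str, float],
    min_yld: float,
) -> Dict[str, float]:
    """Drop branches whose upper limit yield is below min_yld and renormalize.

    branches maps a reaction label to its branching ratio, which is used
    here as the upper limit relative yield (a simplification).
    """
    if len(branches) <= 1:
        # A reactant with only one reaction always keeps it.
        return dict(branches)

    kept = {
        label: ratio
        for label, ratio in branches.items()
        if reactant_upper_yield * ratio >= min_yld
    }
    if not kept:
        # Assumption: never leave a reactant with no fate; keep the largest branch.
        best = max(branches, key=lambda label: branches[label])
        kept = {best: branches[best]}

    # Assumption: conserve the total by rescaling the surviving branches.
    scale = sum(branches.values()) / sum(kept.values())
    return {label: ratio * scale for label, ratio in kept.items()}
```

For example, with a reactant upper limit yield of 0.1, branching ratios of 0.9, 0.09, and 0.01, and a MinYld of 0.005, the third branch has an upper limit yield of 0.001 and is dropped, while the other two are rescaled to about 0.909 and 0.091 so that they still sum to one.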
a These tests are not used when hydroperoxy-substituted peroxy radicals rapidly isomerize to form another such radical that can isomerize back to the initial radical, resulting in no net loss of the radical, making competing reactions potentially non-negligible.
The relative yields of products from reactions of the initial reactant or reacting intermediate can depend on the environment. MechGen assumes constant temperature, pressure, and O2 levels when doing full mechanism generations. As a result, relative yields of products from reactants that undergo only unimolecular reactions or reactions with O2 are treated as environment independent. However, environments significantly affect yields of products from reactants that undergo other bimolecular reactions. In such cases, either a single environment must be specified to calculate yields for the MinYld test or upper limit relative yields must be estimated. If no environment is specified, upper limit yields are derived by assuming that each type of bimolecular reaction is equally important. If more than one environment is used for mechanism generation, relative yields are determined for each environment, and the highest yield among these environments is then used to derive upper limit yields for the MinYld test. Note that the total of upper limit yields for competing reactions can exceed 100 % unless only one environment is used.
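The multi-environment case can be illustrated with a short Python sketch that takes the highest relative yield of each bimolecular route across the supplied environments. The function name and data layout are hypothetical, the rate constants and concentrations in the example are arbitrary numbers in arbitrary units, and the no-environment case is deliberately left out because its exact treatment is not specified here.

```python
from typing import Dict, List


def upper_limit_relative_yields(
    rate_constants: Dict[str, float],
    environments: List[Dict[str, float]],
) -> Dict[str, float]:
    """Highest relative yield of each bimolecular route over all environments.

    rate_constants maps a co-reactant (e.g. "NO") to its rate constant;
    each environment maps co-reactants to concentrations in matching units.
    """
    if not environments:
        raise ValueError("this sketch requires at least one environment")

    upper = {partner: 0.0 for partner in rate_constants}
    for env in environments:
        # Pseudo-first-order loss rate for each competing route.
        rates = {p: k * env.get(p, 0.0) for p, k in rate_constants.items()}
        total = sum(rates.values())
        if total == 0.0:
            continue  # none of the co-reactants is present in this environment
        for partner, rate in rates.items():
            upper[partner] = max(upper[partner], rate / total)
    return upper
```

With equal rate constants for NO and HO2, a high-NOx environment with a 9:1 NO:HO2 ratio and a low-NOx environment with a 1:9 ratio give upper limit relative yields of 0.9 for both routes. Their sum of 1.8 shows how the total can exceed 100 % when more than one environment is used, whereas a single environment always gives a total of 1.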
As shown in Scheme S2, the MinYld test is not used for reactions of peroxy radicals with NO and for photolysis reactions of initial reactants. The only reactions with NO that may be generated are reactions of peroxy radicals forming alkoxy radicals or organic nitrates. Because the formation of alkoxy radicals is always relatively more important, neglecting the nitrate-forming reaction using the MinYld test will result in underestimation of radical and NOx sink processes without significantly reducing mechanism size. Similarly, the MinYld test is not used for photolysis reactions because MechGen does not generate many such competing reactions, and deleting any of them may affect predictions of radical sources without significantly reducing the size of the mechanism. Generally, it is unimolecular reactions that tend to have many competing processes where some may be negligible according to the MinYld test.
For a given MinYld value, using no environments gives the largest mechanisms that are applicable to a full range of conditions, whereas using a single environment gives the smallest mechanisms that are suitable for that environment, though they may not be as accurate when applied to other conditions. Using an appropriate set of environments produces mechanisms of intermediate size that may be optimum if the environments used represent the range of conditions where the mechanism may be used. As indicated in Table 8, the default is to use three environments that represent the range of urban conditions. Use of additional environments may be appropriate if the mechanisms are to be used for more remote scenarios.
An example of a mechanism derived using a full mechanism generation process is shown in Fig. S2, with a representative subset displayed in Fig. 2. These figures show reactions generated for 1,3-butadiene under default conditions. Note that designations such as “{*OHadd}” or “{*O3ole}” indicate excited intermediates. This output includes the rate constants at the default temperature of 298 K (“k”), the branching ratio for various competing reactions (“Fac”), the estimated upper limit weighting factor to determine which reactions can be neglected (“Weight”), and the reactants and products of the reactions. If a rate constant is not listed, the reaction is assumed to be fast and the reactant to be in steady state, making the fate of the reactant independent of the rate constant. In such cases, the branching ratios for competing fast routes are given in the “Fac” column.
Figure 2. Portions of MechGen output showing representative results of a full mechanism generation operation for the reactions of 1,3-butadiene with default options. The full output is shown in Fig. S2.
The counter species such as “RCO2.” and “RCO–OH”, shown in Figs. S1 and S2, are used for reactions of peroxy radicals with other peroxy radicals, as indicated in Table 5. The output shown in Fig. S2 does not include the assigned or estimated Arrhenius parameters for calculating rate constants at different temperatures for some reactions, but this information can be obtained by selecting other output formats for the reactions as discussed in the user manual.
The numbers of reacting intermediates, stable products, and reactions for mechanisms derived for representative compounds using various mechanism generation options are discussed in Sect. 3.4, and ratios of these are shown in Fig. S7. In most cases, the numbers of reacting intermediates in the explicit mechanisms are about 30 %–50 % of the numbers of reactions, with the ratios tending to be smaller when no environments or lower MinYld values are used and showing no clear dependence on mechanism size. The ratios appear to be higher for the representative aromatics than for other types of compounds, at least for these examples. The ratios of numbers of stable products to numbers of reactions tend to be in the 30 %–40 % range and are less variable than the ratios for intermediates.
3.2 Derivation of minimally reduced mechanisms
The numbers of reactions and intermediates in generated mechanisms can be significantly reduced by combining parallel reactions of the same reactants and by eliminating intermediates that always rapidly form the same product(s) regardless of the environment. Combining parallel reactions gives product yields derived from the ratios of the rate constants for individual reactions to the total rate constant and reduces the numbers of reactions but does not affect the numbers of species in the mechanism. The numbers of predicted stable products cannot be reduced without further lumping, but many reacting intermediates whose fates do not depend on environmental conditions can be removed by assuming steady state so that they can be replaced by the products they form. This results in reducing the total number of intermediates by factors of 3 or higher (see below).
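As an illustration of combining parallel reactions, the following sketch merges reactions with the same reactants and weights each product by the ratio of its channel rate constant to the total. The data layout, species names, and rate constants are hypothetical and are not taken from MechGen output.

```python
from collections import defaultdict


def combine_parallel(reactions):
    """Merge reactions sharing the same reactants.

    reactions: list of (reactants, k, products) where products maps
    species name -> stoichiometric yield for that channel.
    Returns {reactants: (k_total, overall_yields)}.
    """
    grouped = defaultdict(list)
    for reactants, k, products in reactions:
        grouped[reactants].append((k, products))

    combined = {}
    for reactants, channels in grouped.items():
        k_total = sum(k for k, _ in channels)
        yields = defaultdict(float)
        for k, products in channels:
            for species, stoich in products.items():
                # Product yield = channel rate constant / total rate constant.
                yields[species] += stoich * k / k_total
        combined[reactants] = (k_total, dict(yields))
    return combined


# Hypothetical example: two parallel OH reactions of the same compound.
reactions = [
    (("VOC", "OH"), 3.0e-12, {"R1": 1.0}),
    (("VOC", "OH"), 1.0e-12, {"R2": 1.0}),
]
combined = combine_parallel(reactions)
k_total, yields = combined[("VOC", "OH")]
print(k_total, yields)
```

The two reactions collapse into one with the summed rate constant, while both product species remain in the mechanism, consistent with the statement that this step reduces the number of reactions but not the number of species.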
The intermediates removed by this process include species that undergo only unimolecular reactions or reactions with O2, with H2O, or by stabilization, meaning their products do not depend on the environment if O2, temperature, water content, and pressure are assumed constant. These include all carbon-centered radicals, all excited intermediates, almost all alkoxy and nitrogen-centered radicals, and many Criegee intermediates. These all react fast enough on the timescale involved with atmospheric modeling that they can be assumed not to build up in concentration, so replacing them by their products should not significantly affect model predictions.
However, this process does not eliminate peroxy or acyl peroxy radicals that do not undergo fast unimolecular reactions, as they are consumed primarily or at least significantly by bimolecular reactions involving NOx and other peroxy species, whose relative concentrations depend on the reacting environment. Nevertheless, as discussed below, this results in the removal of most reactive intermediates in the explicit generated mechanisms.
Vereecken and Nozière (2020) calculated that most hydroperoxy-substituted peroxy radicals should interconvert to other such radicals with relatively high rate constants, and these are incorporated into the estimated mechanisms produced by MechGen (Carter et al., 2025a). These interconversion reactions are not considered when estimating whether competing reactions may be negligible because they are not net sink processes for such radicals. However, including these rapid interconversion reactions in the processed or lumped mechanisms used in models can cause numerical “stiffness” problems in model simulations, and the interconversions also need to be taken into account when product yields are estimated, as discussed in Sect. 3.3.
Therefore, as part of this initial processing, rapidly interconverting hydroperoxy-peroxy radicals are identified and represented in the processed mechanism by a lumped radical species representing the sets of interconverting radicals that are assumed to be in equilibrium. Reactions forming these radicals are represented as forming this lumped radical species, and reactions of these radicals are replaced by reactions of this lumped species, forming products of the interconverting radicals with yields multiplied by the equilibrium fraction of the radical derived from the rate constants involved. This process is discussed in Sect. S3.1. This removes these rapid interconversion reactions from the mechanism while retaining their effects on overall product yields.
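For the simplest case of two interconverting radicals, the equilibrium treatment can be sketched as follows. The actual procedure is given in Sect. S3.1 and is not reproduced here; this sketch assumes first-order interconversion between hypothetical radicals A and B, and the rate constants, product names, and function names are illustrative only.

```python
def equilibrium_fractions(k_ab, k_ba):
    """Equilibrium fractions of radicals A and B for A -> B (k_ab) and B -> A (k_ba)."""
    total = k_ab + k_ba
    return {"A": k_ba / total, "B": k_ab / total}


def lumped_sinks(fractions, sinks):
    """Scale each radical's sink rate constant by its equilibrium fraction.

    sinks: {radical: (k_sink, product)}. Returns {product: effective k}
    for the corresponding reaction of the lumped radical.
    """
    return {product: fractions[radical] * k for radical, (k, product) in sinks.items()}


# Hypothetical rate constants (s^-1).
fractions = equilibrium_fractions(k_ab=3.0, k_ba=1.0)
effective = lumped_sinks(fractions, {"A": (0.02, "ProdA"), "B": (0.01, "ProdB")})
print(fractions, effective)
```

The interconversion rate constants no longer appear in the lumped reactions; they enter only through the fractions, which is how the fast reactions are removed while their effect on product yields is retained.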
Mechanisms processed this way are referred to as “minimally reduced” or “processed” mechanisms. They should give essentially the same predictions in models as the explicit generated mechanisms as long as the temperature and O2 levels are constant at the defaults set for the reactor when the mechanism was generated (see Table 6) and as long as the steady-state approximation is appropriate for the reacting intermediates removed during the minimal reduction process. This has been verified in test calculations using representative chemical systems.
MechGen automatically derives a minimally processed mechanism after any successful full mechanism generation process. These are used as the input when deriving product yields, deriving multi-generation mechanisms, or deriving lumped mechanisms using various approaches, as discussed in the following sections. The algorithm employed to derive minimally reduced mechanisms from the explicit generated mechanisms is outlined in Sect. S3.2.
The minimal reduction process tends to reduce the numbers of reactions by factors of 2–2.5 and the numbers of reacting intermediates by factors of 3–5. This is shown in Fig. S8, which shows plots of these ratios against a measure of the mechanism sizes for the mechanisms of the 27 representative compounds and mechanism generation options discussed in Sect. 3.4. The extent of reduction in numbers of reactions is variable and has no clear dependence on the mechanism size, but the reduction in numbers of intermediates is less variable and clearly decreases with mechanism size. Again, the two representative aromatics appear to be outliers in this regard, having much greater reductions in numbers of intermediates. This may be related to the fact that the aromatics tend to have larger numbers of intermediates than comparably sized mechanisms for other compounds (see Fig. S7).
Representative examples of MechGen output for a minimally processed mechanism are shown in Figs. 3 and S3, which use the explicit mechanism from Fig. 2 or S2 as the starting point. Figure S3 shows the complete output for the products of the reactions of 1,3-butadiene in the default standard environment, while Fig. 3 shows the first parts of the species and reaction listings. The format aligns with that used by the SAPRC modeling software (Carter, 2025b), which is similar to formats used by other modeling software systems. The top part of the figures shows the names used for reactants and products in the mechanisms and their corresponding structures. Intermediates are named using the name of the reactant followed by “–” and a sequence number, while stable species use various naming methods. Compounds that are important in emissions or the atmosphere are generally given two- to eight-character SAPRC names based on abbreviating their actual names; compounds that may not be as important but where permanent names are or have been needed when developing SAPRC mechanisms are given permanent “ORG-nnnn” names; and others are given temporary “VOC-nnnn” names that are applicable only to this processed mechanism. Additionally, processed mechanisms using actual structure strings to name the reactants and products can also be output, as discussed in the user manual.
3.3 Estimation of product yields
The relative yields of products formed in the generated mechanisms can be determined if the conditions of the environment where the compound reacts are specified and assumed to be constant. Derivations of product yields for specified environments are useful to provide a means for assessing which types of products are important and how their yields vary with environmental conditions. They are also useful for improving the efficiency and optimizing the size of multi-generation mechanisms, as discussed below in Sect. 4.
The algorithm MechGen uses to derive relative product yields from a processed mechanism is described in Sect. S3.3. It uses, in part, the fact that most radical intermediates can be ordered such that each intermediate is formed only by reactions of intermediates above it on the list. This assumption is generally true for mechanisms derived using MechGen, with the exceptions of interconversion of phenoxy and phenyl peroxy radicals due to their reactions with O3 and NO, respectively, and of interconversions of hydroperoxy-substituted peroxy radicals due to rapid H-shift reactions, as discussed by Carter et al. (2025a) and Vereecken and Nozière (2020). The most rapid of the hydroperoxy-peroxy interconversions are removed by employing an equilibrium approximation, but some such interconversions are too slow for this approximation to be appropriate, as discussed in Sect. S3.1. For such cases, a matrix inversion procedure derived from the steady-state equations for the pseudo-first-order reactions of the interconverting radicals is employed, as discussed in Sect. S3.3.
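The ordering property can be illustrated with a short sketch that propagates yields down an ordered list of intermediates. It covers only the acyclic case; the interconverting radicals that require the equilibrium or matrix inversion treatments are not handled. The species names and branching fractions are hypothetical.

```python
def product_yields(order, branching, start):
    """Propagate yields through intermediates listed in formation order.

    order: intermediates sorted so each is formed only by species earlier in the list.
    branching: {intermediate: {product: fractional yield in the environment}}.
    Returns the yields of the species remaining after all intermediates react.
    """
    amount = {start: 1.0}
    for species in order:
        formed = amount.pop(species, 0.0)  # the intermediate is fully consumed
        for product, frac in branching[species].items():
            amount[product] = amount.get(product, 0.0) + formed * frac
    return amount


# Hypothetical branching fractions for one environment.
branching = {
    "R1": {"R2": 0.6, "P1": 0.4},
    "R2": {"P1": 0.5, "P2": 0.5},
}
yields = product_yields(["R1", "R2"], branching, start="R1")
print(yields)
```

Because each intermediate is visited once, after everything that can form it, a single pass over the list is sufficient in this case.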
Figure 4. Portions of MechGen output showing estimated product yields for the reactions of 1,3-butadiene with OH, O3, and NO3 radicals for various environments, derived using default mechanism generation options and environments.
Figure S4 shows an example of product yield output corresponding to the processed mechanism shown in Figs. 3 and S3 and the explicit mechanism in Fig. S2. Selected portions of this output are shown in Fig. 4. This product yield output consists of three parts:
- The first part summarizes the environments used and, for the web system, gives links to obtain their oxidant concentrations.
- The second part shows the fractions of the compound that react with the various oxidants and is not shown if the compound reacts with only one oxidant. For this example, most of the reaction is with OH, except in the nighttime environment, where reaction with NO3 dominates.
- The third part shows the product yields for the various environments, sorted in descending order by average yield in the environments, with organic products hyperlinked in the web output so that they can be readily created and reacted if desired and with inorganic products such as NO2 (e.g., from peroxy + NO reactions), HO2, O2, and H2O included. The entry for “NO-loss” shows how many moles of NO are consumed by reaction with peroxy radicals, which is directly responsible for O3 formation when organics react in the presence of NOx.
If the reactant forms non-volatile or semi-volatile products, the product yield output will also include mass percent yields of various volatility bins. If a representative atmospheric organic PM level is specified as a reactor option (50 µg m−3 is the default), the product yield output will also include total SOA yields corresponding to that organic PM level. This is discussed in Sect. S4. The total mass yield of non-volatile products is also output. Note that these mass bin yields will sum up to more than 100 % because of the oxygen and nitrate groups added to the molecules during the oxidation process.
3.4 Effects of varying full mechanism derivation options
The mechanisms derived using the full mechanism operation depend on the values of the MinYld parameter and the use of environments, both of which affect which reactions and intermediates are neglected due to low relative yields. The effects of different choices in this regard are discussed in this section. This was examined by varying MinYld and environment options when deriving mechanisms for the representative compound α-pinene and by comparing results for a more limited set of choices for 27 representative compounds of various types (listed in Table S3). These include the series of n-alkanes from propane through hexadecane, various C8 alkanes, alkenes, oxygenates, aromatics, and two terpenes.
Figure 5 shows the effects of varying the MinYld parameter and environment options on the numbers of explicit reactions, stable products, and peroxy intermediates in mechanisms generated for α-pinene. As expected, the number of reactions increases as MinYld is decreased, with no tendency to level off at either the high or low end of the MinYld range. As also expected, using no environments gives the largest mechanisms, using the three default environments (shown in Table 7) gives about 3 times fewer reactions in the case of α-pinene, and using only one environment reduces the number of reactions by almost an additional factor of 2. This is because estimated upper limit yields are highest if no environments are used and increase with the numbers of environments if more than one is used. Note that the numbers of peroxy radicals are important because they determine the minimum size of the processed mechanisms, even if product lumping is employed.
Figure 5. Effects of varying the MinYld parameter and environment options on the numbers of explicit reactions in full mechanisms generated for α-pinene.
Although removing reactions through the MinYld test does not affect mass balance because they are replaced by competing reactions determined to be more important, these removals will change predicted product yields, reducing the potential accuracy of the mechanism. This can be assessed by assuming that a mechanism derived with very small MinYld values and no environments can approximate the mechanism derived without deletions. Comparing product yields predicted by a mechanism generated with a given set of options with those predicted by this reference mechanism gives an indication of the effects of these deletions on predictions. The total changes in product yields are quantified as
where “Set” refers to a mechanism derived for a compound using the mechanism generation options being considered, “Envt” is a representative environment, “Ref” refers to the reference mechanism for the compound derived with the lowest MinYld and no environments, and “Yield” is the molar yield of product per mole VOC reacted under the conditions of that environment. The summation includes all organic products with yields greater than 0.0005 moles per mole VOC reacted and amounts of NO consumed and the NO2, HO2, and OH radicals that are formed. Yields for products less than 0.0005 moles per mole VOC reacted are insignificant individually and thus excluded in the summation, though they are not necessarily insignificant collectively. The values for the total yield change can range from 0 to a maximum of the total moles of products formed per mole reactant, which is usually more than 1.
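The displayed equation is not reproduced in this text. As a hedged illustration of the definitions above, the sketch below assumes that the total yield change is the sum of absolute differences between the yields predicted by the “Set” and “Ref” mechanisms for one environment; the published normalization may differ (the stated upper bound suggests it might). The function name, the rule that a product counts if its yield exceeds the cutoff in either mechanism, and the example yields are all assumptions.

```python
def total_yield_change(yields_set, yields_ref, cutoff=0.0005,
                       always_include=("NO-loss", "NO2", "HO2", "OH")):
    """Sum of absolute yield differences between two mechanisms in one environment.

    Yields are in moles per mole VOC reacted. Organic products below the cutoff
    in both mechanisms are skipped; the listed inorganic entries are always summed.
    """
    total = 0.0
    for species in set(yields_set) | set(yields_ref):
        y_set = yields_set.get(species, 0.0)
        y_ref = yields_ref.get(species, 0.0)
        if species in always_include or max(y_set, y_ref) > cutoff:
            total += abs(y_set - y_ref)
    return total


# Hypothetical yields: the "Set" mechanism has dropped the minor product P3.
yields_ref = {"P1": 0.600, "P2": 0.390, "P3": 0.010, "P4": 0.0001}
yields_set = {"P1": 0.605, "P2": 0.395}
change = total_yield_change(yields_set, yields_ref)
print(change)
```

Here the deleted route to P3 is redistributed to P1 and P2, so mass balance is preserved while the metric registers the shift in predicted yields.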
Figure 6a shows the total yield changes predicted for the mid-NOx standard urban environment for all the α-pinene mechanisms shown in Fig. 5 as a function of the MinYld parameter. The yields predicted by the reference mechanism derived using a very low MinYld value of 0.01 % and no environments (with ∼26 000 explicit reactions) are used as the standard. As expected, yield changes increase as the value of MinYld increases, though the dependence on environment options is relatively small, except at higher MinYld values. The total changes exceed 0.05 moles of product per mole of α-pinene reacted when MinYld > 1 % but are around 0.01 moles or less when the default MinYld value of 0.5 % is used. Decreasing MinYld to 0.1 % resulted in changes being decreased to 0.003 moles or less.
Figure 6. Effects of varying the MinYld parameter and environment options on relative changes in yields of selected groups of products from α-pinene when reacted under mid-NOx urban environmental conditions.
Figure 6b shows the effects of MinYld on moles of NO consumed per mole of α-pinene reacted, and Fig. 6c shows effects on the yields of organic products in the condensed phase, assuming equilibrium and atmospheric aerosol levels of 50 µg m−3 (as discussed in Sect. S4). The effects on NO consumption reflect effects on O3 formation because reactions of intermediates with NO are the processes responsible for O3 formation in the lower atmosphere (e.g., see Finlayson-Pitts and Pitts, 1999). The amounts of organic products predicted to be in the condensed phase assuming equilibrium provide an indication of the effect of the VOC's reaction on SOA formation, though this should be considered highly approximate because effects on SOA depend on environmental conditions, estimates of vapor pressure are uncertain, and the equilibrium approximation is an oversimplification. Figure 6b and c indicate that, for α-pinene at least, the apparent effects on both O3 and SOA are relatively small, being only 0.3 % or less when the default MinYld parameter is used.
Figure 6 shows the effects of MinYld on product yield changes for α-pinene under mid-NOx urban conditions. The effects on yield changes for other standard environments incorporated in MechGen are shown in Fig. S6. The total yield changes for the daytime environments are similar, though the effect on NO consumption is lower under lower NOx conditions because there is less NO consumed. The total yield changes are much less under nighttime conditions, presumably because fewer reactions tend to be important. Note that no NO is present in the nighttime scenario, and MechGen predicts that α-pinene forms much lower yields of low-volatility products at night compared to daytime.
The effects of varying MinYld and environment options for other compounds are shown in Fig. 7. This shows effects on (a) numbers of explicit reactions and (b) total yield changes for the mid-NOx environment for 27 representative compounds of various types (listed in Table S3). Plots showing ratios of numbers of reacting intermediates or stable products to numbers of reactions for these compounds with different mechanism generation options are shown in Fig. S7. As expected, the mechanism size increases with the size of the molecules, but the type of molecule is also important, as indicated by the large variability for the representative C8 compounds. Highly branched compounds and aromatics tend to have the smallest generated mechanisms, whereas cyclic alkenes, such as terpenes, tend to have the largest. The effects on product yield tend to correlate with mechanism size, though the results are variable and the correlation is relatively weak (∼50 %), except at the lowest MinYld level.
4.1 Description of process and outputs
The full mechanism generation process discussed above produces only a single-generation mechanism because it does not react the stable products formed. MechGen can also derive multi-generation mechanisms where all reactive products formed in non-negligible yields, including those formed after reacting multiple generations, are reacted. It involves carrying out full mechanism generation for a selected starting compound, then determining the stable products formed, then generating mechanisms for products in non-negligible yields, and then repeating the procedure until no reactive, non-negligible products remain. This would be relatively straightforward, except that reacting all products regardless of their yields would result in unmanageably large mechanisms, dominated by reactions and species of negligible importance under conditions of interest. Therefore, much of the multi-generation derivation by MechGen focuses on determining which products are in fact non-negligible and which do not have to be reacted.
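The iterative procedure can be sketched as a worklist loop. This is a simplified illustration rather than the MechGen algorithm (which is summarized in Scheme S6): `generate` stands in for a full single-generation mechanism derivation plus product yield estimation, the estimated yield of a product is simply accumulated as parent yield times product yield, and the compounds, yields, and function names are hypothetical.

```python
from collections import deque


def multi_generation(start, generate, is_reactive, min_yield):
    """React the start compound, then every reactive product above min_yield.

    generate(compound) -> {product: yield per mole of compound reacted}.
    Returns (compounds reacted in order, sorted products that were not reacted).
    """
    est_yield = {start: 1.0}
    queued = {start}
    queue = deque([start])
    reacted = []
    while queue:
        compound = queue.popleft()
        reacted.append(compound)
        for product, y in generate(compound).items():
            est_yield[product] = est_yield.get(product, 0.0) + est_yield[compound] * y
            if (product not in queued and is_reactive(product)
                    and est_yield[product] >= min_yield):
                queued.add(product)
                queue.append(product)
    not_reacted = sorted(p for p in est_yield if p not in queued)
    return reacted, not_reacted


# Hypothetical single-generation product yields.
toy = {
    "VOC": {"ALD": 0.8, "KET": 0.2, "TRACE": 0.00001},
    "ALD": {"CO": 1.0},
    "KET": {"ALD": 0.5, "CO": 0.5},
}
reacted, not_reacted = multi_generation(
    "VOC", lambda c: toy.get(c, {}), lambda c: c != "CO", min_yield=1e-4)
print(reacted, not_reacted)
```

The loop terminates when no reactive product above the yield threshold remains, and the products left unreacted are either unreactive or formed in negligible yield.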
The procedures used for deriving multi-generation mechanisms are described in the user manual. They involve accessing the system using the terminal interface, creating a software object to control the process for a specified compound and to store the results, and then using it to carry out the operations and transmit the results. These objects have several parameters that control which products are formed in sufficiently high yields to react and which can be treated as reactive. One of these is the specification of an environment used to estimate product yields, and another is the “MGminYld” parameter that determines which yields are treated as negligible. In addition, to determine which products can be treated as reactive, a “MinVP” parameter gives the minimum vapor pressure for a product compound to react in the gas phase, and a “RxnHours” parameter determines the time for compounds to react and form products. These options can be changed by users, but by default, the environment used for the multi-step mechanism generation is the first one listed to derive product yields, as discussed in Sect. 3.3. The default values of MGminYld, MinVP, and RxnHours are 0.01 %, 10⁻¹³ atm, and 6 h, respectively (Table 8). These need to be specified to minimize the numbers of totally negligible reactions and processes in the multi-generation mechanisms that are derived.
The algorithm MechGen uses to derive multi-generation mechanisms is summarized below, and a flowchart showing the process is given in Scheme S6. The process starts by generating full single-generation mechanisms for the initial reactant and for all products formed in non-negligible yields. Each mechanism generation is followed by deriving minimally reduced mechanisms, estimating product yields for the selected environment, and determining what products are formed in sufficiently high yield to react. The estimated upper limit yields are derived based on the following:
where “kPUni” is the pseudo-unimolecular rate constant for the reactions of a reactant in the environment, “kUni” and “khν” are the unimolecular and photolysis rate constants for the reactant (if applicable), “OX” and “kOX” refer to all oxidants with which the compound reacts and their rate constants, “RxnTime” is derived from “RxnHours” to be consistent with the units used for kPUni, and “Est.” refers to upper limit estimates. The yields of products from reactants in the environment are derived as discussed in Sect. 3.3, and the kPUni values are derived by multiplying the rate constants for the initial reactions by the concentrations of the oxidants with which the compound reacts in the environment, if applicable. The reactivity correction factors are calculated as the fractions reacted in the environment and are used to account for the fact that slower reacting compounds form lower amounts of products during the reaction time in the environment. The formula for calculating the fractions reacted, derived from integrating the kinetic differential equation, is
This gives near 100 % fractions reacted for rapidly reacting compounds and fractions that are approximately proportional to kPUni for slowly reacting compounds. Note that the estimated upper limit yields tend to decrease as the number of generations of reactions increases.
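The displayed equations are not reproduced in this text, so the following is only a sketch consistent with the symbol definitions and limiting behavior described above. It assumes that kPUni is the sum of kUni, khν, and kOX times the oxidant concentration, that the fraction reacted has the simple first-order form 1 − exp(−kPUni × RxnTime), and that the estimated yield of a product is the estimated yield of its reactant times the fraction reacted times the product yield. The published expressions may differ, and the rate constant and OH concentration are hypothetical.

```python
import math


def k_pseudo_uni(k_uni, k_hv, k_ox, conc_ox):
    """Pseudo-unimolecular rate constant (s^-1) in one environment.

    k_ox: {oxidant: bimolecular rate constant, cm^3 molec^-1 s^-1}
    conc_ox: {oxidant: concentration, molec cm^-3}
    """
    return k_uni + k_hv + sum(k * conc_ox[ox] for ox, k in k_ox.items())


def fraction_reacted(k_puni, rxn_time):
    """Assumed first-order fraction reacted after rxn_time (s)."""
    return 1.0 - math.exp(-k_puni * rxn_time)


def est_product_yield(est_reactant_yield, k_puni, rxn_time, product_yield):
    """Assumed upper limit yield of a product relative to the starting compound."""
    return est_reactant_yield * fraction_reacted(k_puni, rxn_time) * product_yield


RXN_TIME = 6 * 3600.0  # RxnHours = 6 h converted to seconds
k_puni = k_pseudo_uni(0.0, 0.0, {"OH": 1.0e-11}, {"OH": 1.0e6})  # hypothetical values
frac = fraction_reacted(k_puni, RXN_TIME)
fast = fraction_reacted(1.0e-2, RXN_TIME)
print(k_puni, frac, fast)
```

With this form, a fast-reacting compound gives a fraction near 1 and a slow one gives a fraction close to kPUni × RxnTime, matching the limits described; because each generation multiplies by a factor of at most 1, the estimated yields can only decrease along a chain of products.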
The standard environments used (or not used) when carrying out the full, single-generation mechanism derivations for individual compounds, as discussed in Sect. 3.1, do not necessarily need to be the same as the one required to carry out the multi-generation process. If standard environments are used during the single-generation mechanism derivation process for individual compounds, they should also represent the environment used for the multi-generation process. This is assured if the environment used for the multi-generation process is one of those used for the single-generation process or if no environments are used for the single-generation process. By default, the three standard environments used for single-generation processes incorporate the conditions of the default standard environment for multi-generation derivations, as indicated in Table 7.
The results of this process consist of lists of reacting species, their reactions and rate constants, and lists of low-reactivity, low-volatility, or low-yield products that are not reacted. As discussed in the user manual, these results can be transmitted to users in several types of files. These include (1) files containing lists of reactants, non-negligible products, and counter species representing atom numbers and total masses of low-yield products and their approximate yields; (2) files giving information about the reactants and products in terms of groups in the molecules; (3) files with listings of the explicit and processed reactions of the reacting compounds; and (4) files that can be used as input to prepare the mechanisms for model simulations using SAPRC box modeling software (Carter, 2025b). The model preparation input files come in two forms: one containing the minimally reduced processed mechanisms for all the reactants and the other giving pseudo-unimolecular reactions of the reactants that are applicable only to the conditions of the environment used to generate the mechanism. The specific types of information that can be obtained are listed in Table S4.
4.2 Examples of results
To illustrate the results of multi-generation mechanism derivation operations and their dependence on selected options, mechanisms were derived for several example compounds ranging from propane to α-pinene using a standard set of options, and additional multi-generation mechanisms were derived for α-pinene using differing sets of options. The compounds and mechanism derivation options are summarized in Table 9, which gives the sets of options, the numbers of explicit reactions, information on lost or low-yield carbon, and estimated SOA formation for the mechanisms. In the cases where the compound or MGminYld parameter was varied, the single-generation mechanisms for individual reactants were derived using the three default environments listed in Table 7 and the default mechanism generation options given in Table 8. In the cases where the environments were varied, the single-generation mechanisms were derived using the stated environment only, with defaults used for the other parameters. Table 9 shows that, as expected, the results depend significantly on the MGminYld parameter and the multi-generation environment used.
Table 9. Summary of multi-generation mechanism derivation examples.
a All other options were held at the defaults shown in Table 8.
b Environment(s) used when deriving the multi-generation mechanism.
c Ratio of moles of carbon in compounds that are not reacted due to low yields divided by moles of carbon in the reacting starting compound after reacting for RxnTime = 6 h.
d SOA = total moles of all low-volatility products multiplied by their estimated fractions in the particle phase when total atmospheric SOA levels were set at 50 µg m−3 (see Sect. S4).
The pseudo-unimolecular reaction output was used to prepare input for the SAPRC box modeling software (Carter, 2025b) to carry out model simulations for the evolution of product concentrations and negligible atom counter species over time, under conditions of the environment used to derive the multi-generation mechanism. The use of this output is equivalent to assuming that the atmospheric oxidant levels are constant at the levels associated with the environment, as given in Table 7, and at constant light intensity with photolysis rates calculated for the light associated with the reactor. The model simulations were carried out for 6 h of simulated time, with the starting reactant being initially present at 1 ppm mixing ratio, with no dilution or variation in reaction conditions or oxidant species. The calculated concentrations were used in conjunction with separate MechGen output giving structural information about the compounds (Carter et al., 2025a) for the purpose of computing total yields of compounds with different structures (structural counter species) and total carbons in compounds that were not reacted because of estimated low yields or reactivities.
Figure 8. Moles of various types of products or functional groups formed as a function of time for 6 h simulations using multi-generation mechanisms for α-pinene. Products are ordered from high to low maximum yields.
Figure 8 presents concentration-time results for the highest yield structural counter species or C1 products for α-pinene, and these results for all representative compounds listed in Table 9 are given in Fig. S10. These mechanisms were all generated using a MGminYld value of 0.001 % and the mid-NOx urban standard environment, which are the defaults. These figures also show the O : C ratios for the reactant and products as a function of time. The structural counter species are listed and briefly described in Table S4. Concentrations relative to the amount of starting material reacted at the times are shown at 30 min intervals up to 2 h of simulation time and at 2 h intervals after that. The evolution of the ratio of total carbon in low-yield unreacted compounds (NegC) to total reacted carbon is also shown in Figs. 8 and S10, except for cases where it is near zero.
Figures 8 and S10 show that the O : C ratios increase with time, as expected due to ongoing oxidation. The yields of the C1 products also increase significantly and make relatively high contributions to the total products formed by the end of the simulations. Relative concentrations of other types of products tend to go down with time after the first half hour, particularly more reactive products such as aldehydes. As expected, the distribution of products significantly depends on the type of compound. Note that these simulations treat formaldehyde as a non-reacting product, while in fact a significant fraction of it would react to form CO and CO2 by the end of the simulations.
Figure 9. Effects of varying the environment on maximum molar concentrations of various types of products or functional groups for the simulations using multi-generation mechanisms for α-pinene.
Examples of effects of using different environments on multi-generation mechanisms derived for α-pinene are shown in Table 9 and Fig. 9. In those examples, the single-generation mechanisms were derived using only the subject environment rather than the three default environments, which is the reason that the number of reactions derived for α-pinene in these calculations is smaller than in those where the compound or MGminYld parameter was varied (see Fig. 5), though the overall product yields are essentially the same. As discussed earlier, the standard environment represents intermediate NOx conditions and thus includes both high and low NOx reactions. Therefore, as shown in Table 9, the mechanism derived for it has more reactions and reactants than those derived separately for high and low NOx conditions. On the other hand, fewer types of reactions are predicted to be important in the nighttime environment, so significantly smaller mechanisms are derived. As shown in Fig. 9, the concentrations of compounds with hydroperoxide groups are highest under low NOx conditions, and the yields of N-containing compounds are lower, though the yields of nitrates or peroxyacylnitrates (PANs) are not as strongly affected as one might expect based on the relative NOx levels in the environments. The yields of compounds with ester, alcohol, or peroxy acid groups are significantly lower under nighttime conditions, though the predicted levels of the other compounds are generally within the ranges predicted for the daytime environments. Further discussion of the mechanistic implications of these results is beyond the scope of this paper, but this is worthy of additional study.
The effects of varying the MGminYld parameter on products or counter species in the α-pinene mechanisms are shown in Fig. S11. Increasing MGminYld raises the yields of neglected compounds (reflected by NegC), resulting in a slight decrease in the yields of most products, with the yields decreasing by ∼10 % when MGminYld is increased from 0.25 % to 1 %. This means that yields in multi-generation mechanisms are necessarily lower limits unless very low values of MGminYld are used to minimize the amount of untreated carbon (NegC).
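The trade-off between the yield cutoff and the untreated carbon can be illustrated with a minimal Python sketch. It is not MechGen's actual procedure: the function name, the data layout, and the rule that every product below the cutoff is counted in full as neglected carbon are assumptions made only for illustration.

```python
def prune_by_min_yield(products, min_yield):
    """Split products into those kept and the carbon that is neglected.

    products:  dict mapping a product name to (molar yield, carbon number)
    min_yield: illustrative stand-in for MGminYld, as a fraction (0.01 = 1 %)

    Assumption: a product whose yield is below the cutoff is dropped and
    its carbon is added to a single counter, analogous to NegC.
    """
    kept = {}
    neglected_carbon = 0.0
    for name, (molar_yield, n_carbon) in products.items():
        if molar_yield >= min_yield:
            kept[name] = molar_yield
        else:
            neglected_carbon += molar_yield * n_carbon
    return kept, neglected_carbon


# Hypothetical product distribution (names and numbers are made up).
example = {
    "major_product": (0.50, 10),
    "minor_product": (0.006, 10),
    "trace_product": (0.002, 5),
}

kept_low, negc_low = prune_by_min_yield(example, 0.0025)   # 0.25 % cutoff
kept_high, negc_high = prune_by_min_yield(example, 0.01)   # 1 % cutoff
```

With the higher cutoff, fewer products are retained and the neglected-carbon counter is larger, which is why the retained yields should be read as lower limits.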
Though smaller than the explicit mechanisms, the minimally reduced mechanisms are still too large for use in practical modeling applications involving complex atmospheric mixtures. Further reduction is necessary, both by using a limited number of lumped model species to represent chemically similar compounds and by using various approaches to reduce the numbers of reactive intermediates in the mechanisms. MechGen can optionally be used to derive lumped mechanisms for organics that are consistent with the lumping approaches used in recent SAPRC mechanisms (Carter, 2010a, b, 2023; Carter and Heo, 2013). As indicated in Table 8, the lumping approach must be specified as one of the options used for full mechanism generation. The default lumping option is “explicit” or no lumping, which involves generating all reactions expected to be non-negligible and then preparing only minimally reduced processed mechanisms from the results, with no additional processing. This was used for all the examples discussed in this paper. Other lumping options currently available in MechGen are briefly summarized below, with additional details given in the user manual, as well as relevant SAPRC mechanism papers and reports. A more complete discussion of lumping algorithms is beyond the scope of the present paper and will be presented in more detail in a subsequent paper (see also Carter, 2023).
SAPRC-11 lumping is the approach used in the SAPRC-11 mechanism (Carter and Heo, 2013) and is essentially the same as that used for SAPRC-07 (Carter, 2010a, b) except as discussed by Carter and Heo (2013). All organic products are represented by one of ∼50 lumped or explicit model species, and all peroxy intermediates are represented by ∼30 chemical operators that represent products formed and how their yields may change with NOx conditions. (An “operator” in this discussion refers to a model species that does not correspond directly to a chemical species but is added to represent overall effects of various reactions [e.g., Carter, 2010a, b].) In addition, all acyl peroxy, phenoxy, and phenyl peroxy intermediates are represented using lumped radical model species in all current SAPRC mechanisms. Note that the peroxy operator method as originally implemented for SAPRC-07 and SAPRC-11 does not provide for peroxy radical isomerization reactions, so, strictly speaking, it is not compatible with mechanisms currently generated using MechGen, which include such reactions when predicted. This issue is addressed by assigning an “effective NO” concentration to take competitions between unimolecular and bimolecular reactions into account when determining yields of operators representing the products formed, as discussed by Carter (2023). The default effective NO for SAPRC-11 lumping is 0.5 ppb, reflecting intermediate NOx urban conditions.
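The role of an effective NO concentration can be illustrated with a short Python sketch of the competition between a unimolecular (isomerization) reaction and the bimolecular reaction with NO. This is a simplified illustration, not the procedure of Carter (2023): it considers only these two loss channels, the function name is hypothetical, and the rate constants used in the example are assumed, order-of-magnitude values rather than MechGen assignments.

```python
# Approximate number density of 1 ppb at ~298 K and 1 atm (molecules cm^-3).
PPB_TO_MOLEC_CM3 = 2.46e10


def bimolecular_fraction(k_no, k_uni, no_ppb=0.5):
    """Fraction of a peroxy radical that reacts with NO instead of isomerizing.

    k_no:   RO2 + NO rate constant, cm^3 molecule^-1 s^-1 (assumed value)
    k_uni:  unimolecular isomerization rate constant, s^-1 (assumed value)
    no_ppb: effective NO level in ppb; 0.5 ppb is the SAPRC-11 lumping default
    """
    rate_no = k_no * no_ppb * PPB_TO_MOLEC_CM3  # pseudo-first-order rate, s^-1
    return rate_no / (rate_no + k_uni)


# Illustrative inputs only: a typical-magnitude RO2 + NO rate constant and
# slow, comparable, and fast isomerization cases.
K_NO = 9.0e-12
slow = bimolecular_fraction(K_NO, k_uni=1.0e-3)
comparable = bimolecular_fraction(K_NO, k_uni=K_NO * 0.5 * PPB_TO_MOLEC_CM3)
fast = bimolecular_fraction(K_NO, k_uni=10.0)
```

When the isomerization is slow relative to the NO reaction at the effective NO level, the bimolecular products dominate the operator yields; when it is fast, the isomerization products dominate.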
SAPRC-22 lumping is the approach used in the “standard” version of the SAPRC-22 mechanism (Carter, 2023). This approach represents organic products using ∼110 lumped or explicit model species, with the numbers of lumped species increased, in part, to better represent NOx recycling and SOA formation potentials. Peroxy radicals are represented using a modified version of the operator approach used for SAPRC-07 and -11, in which operator yields are calculated at both low and high effective NO levels, and an extra operator species is added for each reaction where the yields are different at different NO levels, so the lumped mechanism is applicable for a wide range of NO conditions. This lumping approach was used to derive the SAPRC-22 mechanism (Carter, 2023), using procedures discussed in the user manual.
If SAPRC-22 lumping is selected, users have the option to generate SAPRC-22-compatible lumped mechanisms that represent selected specific organic compounds explicitly in models, rather than having them lumped as in the standard SAPRC-22 mechanism. The products formed in the reactions of these explicitly represented compounds will then be lumped to be compatible with SAPRC-22, and these lumped reactions can be output along with the rest of the SAPRC-22 mechanism to be used as input to the SAPRC model preparation programs (Carter, 2025b). These expanded versions of SAPRC-22 can then be used in airshed models to evaluate the effects of the selected compounds on O3 or other measures of air quality, or they can be used to allow comparisons of their predicted concentrations with observations.
If any type of SAPRC lumping is selected, the mechanism generation process will exclude the reactions of acyl peroxy, phenoxy, or phenyl peroxy radicals, as these are all represented by lumped model species, so their explicit reactions are not used when deriving the lumped mechanism. In addition, if a non-explicit lumping method is employed, the types of initial reactions the organic compounds undergo are determined by the types of initial reactions of the lumped model species used to represent them, which is part of the input used to define the lumping approach, as discussed in the user manual. In some cases, this is a subset of the types of initial reactions generated when the default “explicit” (no lumping) option is in effect.
Because of the complexity of the atmospheric reactions of most organic compounds, automated mechanism generation systems such as MechGen provide an important link from chemical kinetic and mechanistic data and theories to model predictions of chemical transformations in the atmosphere. As discussed by Kaduwela et al. (2015) and Ervens et al. (2024), the stages of mechanism development for models that provide this link consist of (1) compiling and evaluating available basic mechanistic data; (2) developing SARs and other estimation methods needed to develop complete mechanisms; (3) deriving detailed mechanisms that incorporate these data and estimation methods; (4) reducing the detailed mechanisms so they can be used in various modeling applications; and (5) evaluating the predictions of the mechanisms against observations. MechGen is a tool that can provide the link between mechanistic data and theory and detailed mechanisms and can also be used for mechanism reduction. The development of the more recent SAPRC mechanisms for airshed models (Carter, 2010a, b, 2023; Carter and Heo, 2013) was undertaken with this approach in mind, using MechGen to derive the SAPRC lumped mechanisms from the explicit mechanisms for the many compounds that are represented.
Detailed mechanisms derived by MechGen can also be used in analyses of laboratory (e.g., Li et al., 2022) or ambient data, providing an alternative or supplement to the Master Chemical Mechanism (MCM) (Bloss et al., 2005; Jenkin et al., 1997, 2003; Saunders et al., 2003), which has been widely used for this purpose. MechGen offers the flexibility of being applicable to any compound within the scope of its predictive capabilities, whereas MCM, although comprehensive, is a manually developed mechanism that covers a finite number of compounds. Another alternative without this limitation is the GECKO-A mechanism generation system (Aumont et al., 2005), which also has some online functionality (https://geckoa.lisa.u-pec.fr/, last access: 11 October 2025, https://www2.acom.ucar.edu/modeling/gecko, last access: 11 October 2025).
One advantage of MechGen is its relatively user-friendly online interface, which allows new users to examine predictions for single-step reactions or complete mechanisms for selected compounds without extensive training. Accounts created through the online system can also be used for terminal-based access to features not available to web users, and users can download and install their own copies of the software to use its full features, including resource-intensive operations such as multi-generation mechanism derivation. These options are discussed in the user manual, which is available at the MechGen web site (Carter, 2025a). MechGen output can serve as input to the SAPRC box modeling software (Carter, 2025b) to carry out model simulations with either minimally reduced detailed mechanisms or extended versions of the SAPRC-22 mechanism (Carter, 2023), with selected compounds represented explicitly. MechGen predictions can also be compared with those of GECKO-A or those incorporated in MCM to assess differences that indicate areas where additional experimental, theoretical, or SAR development work is needed.
One of the reasons that MechGen has not been as widely used as MCM or GECKO-A for applications other than SAPRC mechanism development is that it has not been described or documented in the peer-reviewed literature until now, including the continuous revisions that it has undergone. Most researchers have either not been aware of MechGen or have not had a stable version to cite if they do use it. An additional factor that may inhibit its use as an alternative to MCM is that MechGen outputs mechanisms in the format used by the SAPRC model simulation software, which is not as widely used as other modeling software systems. This paper, along with the recently published paper of Carter et al. (2025a), addresses its lack of adequate documentation. Future updates to MechGen will be documented and made available as separate versions to ensure reproducibility. We are developing software for converting MechGen mechanism output to other formats and have already released a converter for the F0AM modeling system (Wolfe et al., 2016), which is available from the MechGen GitHub repository.
MechGen, along with GECKO-A and MCM, represents the current state of the science in atmospheric reactions of organics in the lower troposphere, though they all require updates in some respects to be consistent with recent advances in the continuously evolving science in this area. They all incorporate different estimates or assumptions regarding some uncertain processes, though they also have many assumptions and estimates in common that may not necessarily be correct. The assignments and estimates incorporated in MechGen that need to be updated are discussed by Carter et al. (2025a). Although GECKO-A is also continually being updated, it does not yet incorporate H-shift isomerization (including autooxidation) reactions of peroxy radicals, which are important for many compounds such as monoterpenes (Li et al., 2022). On the other hand, GECKO-A has a more detailed representation of peroxy + peroxy and photolysis reactions. Comparing MechGen predictions with those of GECKO-A and MCM can help identify areas where further experimental, theoretical, or SAR development work is most needed, though this does not rule out the possibility that they all share assumptions or estimates for these highly complex and uncertain mechanisms that may, in fact, be incorrect.
The atmospheric chemistry of organics and the development of SARs or estimation methods continue to be active areas of research, and MechGen and other mechanism generation systems need to be periodically updated to continue to represent the state of the science and thus remain appropriate tools for developing mechanisms for models. Updating the estimates for the many types of chemical systems that need to be represented, testing the associated software changes, and evaluating the predictions against available data constitute a time-consuming process, making it essentially impossible for any system to be completely up to date at any point in time. This is one reason why comparisons of predictions of different mechanism generation systems or mechanisms are important. Carter et al. (2025a) discuss the many types of estimates and point out the areas where updates are needed and planned.
MechGen is designed to accommodate updates to the underlying chemistry assignments and SARs. Modifying chemical assignments or parameters used in existing SARs is straightforward and discussed in the user manual, but adding or deleting types of reactions or adding or changing the structures of SARs will require programming changes. Because implementing such updates would result in new versions of the system that will give different predictions, it is important that older versions be archived and made available so that previous work that used these versions can be reproduced or evaluated. This was not the case for previous versions of MechGen but will be for newer versions going forward.
The development and maintenance of MechGen have largely been the work of the primary author (Carter), who is nearing full retirement. Unique features and strengths of MechGen (including its integration with other SAPRC tools and models) have led to a growing base of MechGen users, and as this user base has grown, so has the need for a collaborative team to carry it into the future. Steps taken thus far to ensure the sustainability of this resource have included the publication of the chemical basis of the MechGen system (Carter et al., 2025a), this paper describing the software system, the development of a GitHub site (Jiang et al., 2025), publications describing the use of MechGen beyond the derivation of SAPRC mechanisms (Jiang et al., 2020; Li et al., 2022), and dissemination of knowledge through the growing user community. In addition, MechGen is written in a programming language (MOO) that is no longer widely used or maintained, so conversion to a more actively supported platform would enhance ongoing maintenance and collaborative development in the future. The development of collaborative teams, including those working on the chemical mechanism estimates and databases, as well as the software, will ensure that MechGen remains a valuable, adaptable, and robust mechanism generation system well into the future.
User manual
A complete user manual describing the operation of the MechGen system using both the web and terminal interfaces is available with the MechGen source code (see below) and at the MechGen web site (Carter, 2025a) and MechGen GitHub site (Jiang et al., 2025). It discusses how to obtain, install, and configure the system on the user's own computer. Updated versions of the user manual will be made available on the MechGen and GitHub sites when the source code or documentation has been updated. A quick start web user guide for new users is also available at the MechGen web site and the SAPRC MechGen GitHub page, where it will also be updated when appropriate.
The MechGen source code (v1.1, last updated 25 July 2025) and its documentation, including the full user manual and quick start guide, are available on Zenodo via Carter et al. (2025b) at https://doi.org/10.5281/zenodo.16622705. Detailed system information, including online access and further development, is provided on the MechGen website (https://mechgen.cert.ucr.edu/, last access: 25 July 2025), while the GitHub repository at https://github.com/SAPRC/MechGen (last access: 30 July 2025) offers the latest software releases and installation resources, with updates archived on Zenodo.
All the source code and data used by MechGen are contained within a single MOO code database file as described in available MOO documentation (Curtis, 1997; Fox, 2004; Wikipedia, 2025). The program source code and the data can be viewed and modified by downloading and installing the MechGen database and the MOO server software as described in the user manual. Once the system is installed and configured, the user can access the system using Telnet and log in as a systems programmer (see user manual) and then use MOO commands to view or modify the source code (Curtis, 1997; Fox, 2004).
The data used by MechGen to implement the chemical assignments and SARs are described by Carter et al. (2025a). The detailed parameters, mechanism assignments, and rate constants used are given in tables in the paper or its Supplement.
The Supplement published with this paper contains additional information and details on procedures discussed in this paper. These include the following: additional output examples (Sect. S1); derivation of the standard environments currently available in MechGen (Sect. S2); additional details concerning procedures used (Sect. S3); the method used to estimate fractions of semi-volatile products in the condensed phase (Sect. S4); additional information regarding multi-generation mechanisms (Sect. S5); and additional information on the results with the representative compounds (Sect. S6). The supplement related to this article is available online at https://doi.org/10.5194/gmd-18-8461-2025-supplement.
WPLC led conceptualization, data curation, and software; KCB led funding acquisition, project administration, and supervision; JJ and ZW contributed to data curation and resources; all authors contributed to original writing and editing.
The contact author has declared that none of the authors has any competing interests.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.
This work was supported in part by the University of California Retirement system and in part by a grant from the US Environmental Protection Agency's (EPA) Science to Achieve Results (STAR) program. It has not been formally reviewed by EPA. EPA does not endorse any products or commercial services mentioned in this publication.
The online version of MechGen is currently hosted on Jetstream2 at Indiana University through allocation EES250027 from the Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) program, which is supported by National Science Foundation grants #2138259, #2138286, #2138307, #2137603, and #2138296.
The authors wish to thank John Orlando of NCAR for helpful discussions. The opinions and conclusions in this paper are entirely those of the authors.
This research has been supported by the US Environmental Protection Agency (grant no. 84000701) and the California Air Resources Board (grant no. 11-761).
This paper was edited by Olaf Morgenstern and reviewed by Rolf Sander and one anonymous referee.
Aumont, B., Szopa, S., and Madronich, S.: Modelling the evolution of organic carbon during its gas-phase tropospheric oxidation: development of an explicit model based on a self generating approach, Atmos. Chem. Phys., 5, 2497–2517, https://doi.org/10.5194/acp-5-2497-2005, 2005.
Bloss, C., Wagner, V., Jenkin, M. E., Volkamer, R., Bloss, W. J., Lee, J. D., Heard, D. E., Wirtz, K., Martin-Reviejo, M., Rea, G., Wenger, J. C., and Pilling, M. J.: Development of a detailed chemical mechanism (MCMv3.1) for the atmospheric oxidation of aromatic hydrocarbons, Atmos. Chem. Phys., 5, 641–664, https://doi.org/10.5194/acp-5-641-2005, 2005.
Carter, W. P. L.: A detailed mechanism for the gas-phase atmospheric reactions of organic compounds, Atmos. Environ. A, 24, 481–518, https://doi.org/10.1016/0960-1686(90)90005-8, 1990.
Carter, W. P. L.: Development of Ozone Reactivity Scales for Volatile Organic Compounds, J. Air Waste Manage., 44, 881–899, https://doi.org/10.1080/1073161X.1994.10467290, 1994.
Carter, W. P. L.: Documentation of the SAPRC-99 Chemical Mechanism for VOC Reactivity Assessment, Zenodo, https://doi.org/10.5281/zenodo.12600705, 2000.
Carter, W. P. L.: Development of the SAPRC-07 Chemical Mechanism, Atmos. Environ., 44, 5324–5335, https://doi.org/10.1016/j.atmosenv.2010.01.026, 2010a.
Carter, W. P. L.: Development of the SAPRC-07 Chemical Mechanism and Updated Ozone Reactivity Scales, Zenodo, https://doi.org/10.5281/zenodo.12601346, 2010b.
Carter, W. P. L.: Preliminary Documentation of the SAPRC-16 Mechanism, https://intra.engr.ucr.edu/~carter/SAPRC/16/S16doc.pdf (last access: 29 October 2016), 2016.
Carter, W. P. L.: Documentation of the SAPRC-22 Mechanisms, Zenodo, https://doi.org/10.5281/zenodo.12601488, 2023.
Carter, W. P. L.: SAPRC Mechanism Generation System for the Atmospheric Reactions of Volatile Organic Compounds in the Presence of NOx, https://intra.engr.ucr.edu/~carter/MechGen/ (last access: 25 July 2025), 2025a.
Carter, W. P. L.: SAPRC-07 and SAPRC-11 Chemical Mechanisms, Test Simulations, and Environmental Chamber Simulation Files, https://www.cert.ucr.edu/~carter/SAPRC/SAPRCfiles.htm (last access: 25 July 2025), 2025b.
Carter, W. P. L. and Atkinson, R.: Atmospheric chemistry of alkanes, J. Atmos. Chem., 3, 377–405, https://doi.org/10.1007/BF00122525, 1985.
Carter, W. P. L. and Heo, G.: Development of Revised SAPRC Aromatics Mechanisms, https://intra.engr.ucr.edu/~carter/SAPRC/saprc11.pdf (last access: 12 April 2012), 2012.
Carter, W. P. L. and Heo, G.: Development of revised SAPRC aromatics mechanisms, Atmos. Environ., 77, 404–414, https://doi.org/10.1016/j.atmosenv.2013.05.021, 2013.
Carter, W. P. L., Jiang, J., Orlando, J. J., and Barsanti, K. C.: Derivation of atmospheric reaction mechanisms for volatile organic compounds by the SAPRC mechanism generation system (MechGen), Atmos. Chem. Phys., 25, 199–242, https://doi.org/10.5194/acp-25-199-2025, 2025a.
Carter, W. P. L., Wang, Z., and Jiang, J.: SAPRC/MechGen: MechGenv1.1, Zenodo [code], https://doi.org/10.5281/zenodo.16622705, 2025b.
Curtis, P.: LambdaMOO Programmer's Manual, https://lambda.moo.mud.org/pub/MOO/ProgrammersManual_toc.html (last access: 11 August 2025), 1997.
Ervens, B., Rickard, A., Aumont, B., Carter, W. P. L., McGillen, M., Mellouki, A., Orlando, J., Picquet-Varrault, B., Seakins, P., Stockwell, W. R., Vereecken, L., and Wallington, T. J.: Opinion: Challenges and needs of tropospheric chemical mechanism development, Atmos. Chem. Phys., 24, 13317–13339, https://doi.org/10.5194/acp-24-13317-2024, 2024.
Finlayson-Pitts, B. J. and Pitts Jr., J. N.: Chemistry of the Upper and Lower Atmosphere: Theory, Experiments, and Applications, in: 1st Edn., Elsevier, ISBN 978-0-12-257060-5, 1999.
Fox, K.: MOO-Cows FAQ, https://www.moo.mud.org/moo-faq/ (last access: 11 August 2025), 2004.
Gao, C. W., Allen, J. W., Green, W. H., and West, R. H.: Reaction Mechanism Generator: Automatic construction of chemical kinetic mechanisms, Comput. Phys. Commun., 203, 212–225, https://doi.org/10.1016/j.cpc.2016.02.013, 2016.
Green, W. H.: Chapter 5 – Automatic generation of reaction mechanisms, in: Computer Aided Chemical Engineering, vol. 45, edited by: Faravelli, T., Manenti, F., and Ranzi, E., Elsevier, 259–294, https://doi.org/10.1016/B978-0-444-64087-1.00005-X, 2019.
Jenkin, M. E., Saunders, S. M., and Pilling, M. J.: The tropospheric degradation of volatile organic compounds: a protocol for mechanism development, Atmos. Environ., 31, 81–104, https://doi.org/10.1016/S1352-2310(96)00105-7, 1997.
Jenkin, M. E., Saunders, S. M., Wagner, V., and Pilling, M. J.: Protocol for the development of the Master Chemical Mechanism, MCM v3 (Part B): tropospheric degradation of aromatic volatile organic compounds, Atmos. Chem. Phys., 3, 181–193, https://doi.org/10.5194/acp-3-181-2003, 2003.
Jiang, J., Carter, W. P. L., Cocker III, D. R., and Barsanti, K. C.: Development and Evaluation of a Detailed Mechanism for Gas-Phase Atmospheric Reactions of Furans, ACS Earth Space Chem., 4, 1254–1268, https://doi.org/10.1021/acsearthspacechem.0c00058, 2020.
Jiang, J., Wang, Z., Barsanti, K. C., and Carter, W. P. L.: SAPRC MechGen Github Website, https://github.com/SAPRC/MechGen (last access: 30 July 2025), 2025.
Kaduwela, A., Luecken, D., Carter, W., and Derwent, R.: New directions: Atmospheric chemical mechanisms for the future, Atmos. Environ., 122, 609–610, https://doi.org/10.1016/j.atmosenv.2015.10.031, 2015.
Kirchner, F.: The chemical mechanism generation programme CHEMATA – Part 1: The programme and first applications, Atmos. Environ., 39, 1143–1159, https://doi.org/10.1016/j.atmosenv.2004.09.086, 2005.
Li, Q., Jiang, J., Afreh, I. K., Barsanti, K. C., and Cocker III, D. R.: Secondary organic aerosol formation from camphene oxidation: measurements and modeling, Atmos. Chem. Phys., 22, 3131–3147, https://doi.org/10.5194/acp-22-3131-2022, 2022.
Liu, M., Grinberg Dana, A., Johnson, M. S., Goldman, M. J., Jocher, A., Payne, A. M., Grambow, C. A., Han, K., Yee, N. W., Mazeau, E. J., Blondal, K., West, R. H., Goldsmith, C. F., and Green, W. H.: Reaction Mechanism Generator v3.0: Advances in Automatic Mechanism Generation, J. Chem. Inf. Model., 61, 2686–2696, https://doi.org/10.1021/acs.jcim.0c01480, 2021.
Postel, J. and Reynolds, J.: Telnet Protocol Specification, https://datatracker.ietf.org/doc/html/rfc854 (last access: 11 August 2025), 1983.
RMG: RMG-Reaction Mechanism Generator, https://rmg.mit.edu (last access: 11 October 2025), 2025.
Saunders, S. M., Jenkin, M. E., Derwent, R. G., and Pilling, M. J.: Protocol for the development of the Master Chemical Mechanism, MCM v3 (Part A): tropospheric degradation of non-aromatic volatile organic compounds, Atmos. Chem. Phys., 3, 161–180, https://doi.org/10.5194/acp-3-161-2003, 2003.
Venecek, M. A., Cai, C., Kaduwela, A., Avise, J., Carter, W. P. L., and Kleeman, M. J.: Analysis of SAPRC16 chemical mechanism for ambient simulations, Atmos. Environ., 192, 136–150, https://doi.org/10.1016/j.atmosenv.2018.08.039, 2018.
Vereecken, L. and Nozière, B.: H migration in peroxy radicals under atmospheric conditions, Atmos. Chem. Phys., 20, 7429–7458, https://doi.org/10.5194/acp-20-7429-2020, 2020.
Wikipedia: LambdaMOO, https://en.wikipedia.org/wiki/LambdaMOO (last access: 11 August 2025), 2025.
Wolfe, G. M., Marvin, M. R., Roberts, S. J., Travis, K. R., and Liao, J.: The Framework for 0-D Atmospheric Modeling (F0AM) v3.1, Geosci. Model Dev., 9, 3309–3319, https://doi.org/10.5194/gmd-9-3309-2016, 2016.