Harmonized Emissions Component (HEMCO) 3.0 as a versatile emissions component for atmospheric models: application in the GEOS-Chem, NASA GEOS, WRF-GC, CESM2, NOAA GEFS-Aerosol, and NOAA UFS models

Abstract. Emissions are a central component of atmospheric
chemistry models. The Harmonized Emissions Component (HEMCO) is a software
component for computing emissions from a user-selected ensemble of emission
inventories and algorithms. It allows users to re-grid, combine, overwrite,
subset, and scale emissions from different inventories through a
configuration file and with no change to the model source code. The
configuration file also maps emissions to model species with appropriate
units. HEMCO can operate in offline stand-alone mode, but more importantly
it provides an online facility for models to compute emissions at runtime.
HEMCO complies with the Earth System Modeling Framework (ESMF) for
portability across models. We present a new version here, HEMCO 3.0, that
features an improved three-layer architecture to facilitate implementation
into any atmospheric model and improved capability for calculating
emissions at any model resolution including multiscale and unstructured
grids. The three-layer architecture of HEMCO 3.0 includes (1) the Data Input
Layer that reads the configuration file and accesses the HEMCO library of
emission inventories and other environmental data, (2) the HEMCO Core that
computes emissions on the user-selected HEMCO grid, and (3) the Model
Interface Layer that re-grids (if needed) and serves the data to the
atmospheric model and also serves model data to the HEMCO Core for
computing emissions dependent on model state (such as from dust or vegetation). The HEMCO Core is common to the implementation in all models, while
the Data Input Layer and the Model Interface Layer are adaptable to the
model environment. Default versions of the Data Input Layer and Model
Interface Layer enable straightforward implementation of HEMCO in any simple
model architecture, and options are available to disable features such as
re-gridding that may be done by independent couplers in more complex
architectures. The HEMCO library of emission inventories and algorithms is
continuously enriched through user contributions so that new inventories
can be immediately shared across models. HEMCO can also serve as a general
data broker for models to process input data not only for emissions but for
any gridded environmental datasets. We describe existing implementations of
HEMCO 3.0 in (1) the GEOS-Chem “Classic” chemical transport model with
shared-memory infrastructure, (2) the high-performance GEOS-Chem (GCHP)
model with distributed-memory architecture, (3) the NASA GEOS Earth System
Model (GEOS ESM), (4) the Weather Research and Forecasting model with
GEOS-Chem (WRF-GC), (5) the Community Earth System Model Version 2 (CESM2),
and (6) the NOAA Global Ensemble Forecast System – Aerosols
(GEFS-Aerosols), as well as the planned implementation in the NOAA Unified Forecast
System (UFS). Implementation of HEMCO in CESM2 contributes to the
Multi-Scale Infrastructure for Chemistry and Aerosols (MUSICA) by providing
a common emissions infrastructure to support different simulations of
atmospheric chemistry across scales.


The Harmonized Emissions Component (HEMCO), originally developed by Keller et al. (2014) and formerly called the Harvard-NASA Emissions Component, computes emissions customized to user needs through a configuration file and a database library. Atmospheric chemistry modelers can use HEMCO either off-line in a standalone mode to compute and archive emissions, or on-line to compute emissions at runtime and serve them to the model at each time step. Users can select, modify, re-grid, combine, and supersede emission inventories and algorithms without changing the source code of their model. Built-in algorithms called 'extensions' compute emissions dependent on environmental data and model state variables such as for vegetation, dust, lightning, and oceans. Subgrid processing of emissions to account for fast chemistry, as in ship plumes (Vinken et al., 2011), is also done in extensions. Selection of HEMCO extensions is done in the configuration file and the computations are done independently of the atmospheric model, allowing for immediate portability to other models.
HEMCO was originally developed for the GEOS-Chem atmospheric chemistry model (Bey et al., 2001; Eastham et al., 2018), where the current version HEMCO 2.0 has two different implementations. The 'Classic' version of GEOS-Chem with single-node shared-memory parallelization (OpenMP) and rectilinear latitude-longitude grids (Bey et al., 2001) uses HEMCO to its full extent including reading, re-gridding, and temporal interpolation of input files. In GEOS-Chem 'Classic', HEMCO is used not only for emissions but as a general data broker to read and process all model input data, including meteorological fields and initial conditions. This has enabled in particular the FlexGrid algorithm to run GEOS-Chem on any custom nest selected at runtime (Li et al., 2021). This implementation of HEMCO is also used in the recently developed coupling of GEOS-Chem with the Weather Research and Forecasting (WRF) model (WRF-GC; Lin et al., 2020; Feng et al., 2021).
The "High-Performance" version of GEOS-Chem (GCHP; Eastham et al., 2018) uses a different implementation of HEMCO 2.0. GCHP is designed for multi-node massively parallel computation using a distributed-memory parallelization (MPI) enabled by the NASA Model Analysis and Prediction Layer (MAPL; Suarez et al., 2007), which acts as the model's infrastructure and handles inter-node communication. MAPL is built upon the Earth System Modeling Framework (ESMF; Hill et al., 2004) and handles data read, re-gridding, and interpolation, so the corresponding HEMCO routines are disabled.
This implementation of HEMCO is also presently used by the GOCART aerosol model operating within the MAPL-based NASA Goddard Earth Observing System (GEOS) Earth System Model (Rienecker et al., 2008; Randles et al., 2017).

HEMCO 2.0 has several limitations that restrict its portability to other models. First, the re-gridding capability is limited to latitude-longitude grids. Second, its implementation in distributed-memory environments uses MAPL-specific features. Third, it has a multiplicity of model access points that introduce unnecessary dependency on model code. Fourth, it requires that emissions be computed on the model grid, which may introduce inaccuracies in masking of regional inventories and in non-linear computations, and further necessitates duplicate copies of HEMCO to handle different resolutions in multi-scale model applications such as WRF-GC. HEMCO 3.0 overcomes all these limitations of HEMCO 2.0.
Construction of HEMCO 3.0 was motivated by interest from the Community Earth System Model Version 2 (CESM2; Pfister et al., 2020) and the NOAA Unified Forecast System (UFS; Campbell et al., 2020) in using HEMCO as an emissions component. This led us to develop a more modularized and powerful structure to increase accuracy and portability to different atmospheric models, including those with multi-scale and unstructured grids. HEMCO 3.0 preserves a shared common core for calculating emissions by selecting, adding, superseding (masking), and scaling emission inventories as specified by the user. Other parts of HEMCO are modularized to facilitate the incorporation of HEMCO into the specific software environment of the target model. A three-layer architecture is created to separate (1) input and re-gridding of data, (2) emission calculations using the HEMCO Core, and (3) the interface with the atmospheric model.
HEMCO 3.0 is modularized into a three-layer architecture as shown in Fig. 1, consisting of a Data Input Layer, the HEMCO Core, and the Model Interface Layer. The Data Input Layer reads the configuration file and the database library of emission inventories and other environmental information, and re-grids the data to a user-defined HEMCO grid (finer than or identical to the model grid).
The HEMCO Core assembles the emissions on the basis of instructions in the configuration file including adding, scaling, and masking of individual inventories, and computing emissions dependent on model state variables and environmental data (through algorithms referred to as HEMCO extensions). The Model Interface Layer communicates HEMCO output (including computed fields such as emission fluxes and diagnostics) to the target atmospheric model (hereafter referred to as "the model"), with re-gridding to the model grid if needed, and takes in and re-grids model variables to the HEMCO grid for use in HEMCO extensions. The Data Input Layer and the Model Interface Layer have different implementations depending on the architecture of the model. The HEMCO Core, where emissions are computed, is the same in all cases.
HEMCO operates on a horizontal "HEMCO grid" that may be any user-desired grid configuration (e.g., finer than the model grid, or the finest element of a multi-scale model grid), with the other layers handling the re-gridding to/from the HEMCO grid as necessary. 2-D (horizontal) emissions can be released at the surface or allocated vertically on the model grid as specified by the user through the HEMCO configuration file. 3-D emission databases (such as for aircraft emissions) are re-gridded vertically to the model grid through the Data Input Layer.
Figure 2. Sample HEMCO configuration file. The HEMCO configuration file is organized in three sections: (1) switches for "collections" of data containers, (2) the data containers to be used in the model simulation, optionally organized into "collections", and (3) the scaling and masking rules to be used. Entries in both sections are organized in a similar format, including a number and/or name, the data source (netCDF file and variable name, numbers, or mathematical expressions), the temporal range options and spatial dimensions, and their category and hierarchy (in the same category, data entries with higher hierarchy take precedence). For data containers, scaling factors and masks are applied by referencing the numbered scaling factor and mask entries (colored text).
https://doi.org/10.5194/gmd-2021-130. Preprint. Discussion started: 3 May 2021. © Author(s) 2021. CC BY 4.0 License.
The HEMCO configuration file (example in Fig. 2) controls the operation of all HEMCO layers, fully describing the relationship between the input data read by the Data Input Layer, the processing by the HEMCO Core, and the data passed to the model by the Model Interface Layer. It is organized as individual entries for data, scaling factors, and masks. Each entry is numbered or named and includes information about the source of data (usually a netCDF file name but may be a number or mathematical expression in simple cases). For netCDF data files, each entry specifies the netCDF variable name to be read (allowing the mapping from netCDF input species to model species), the temporal range, refresh frequency, cycling option (whether to continuously cycle the data or require an exact date match), and spatial dimension (2-D or 3-D data). This information is used by the Data Input Layer. It also includes the model species name, the scaling factors to be applied, and the hierarchy (priority order used for masking). If a data entry does not include a species name, the entry is treated as generic data and is read into HEMCO, scaled, masked, and made available to the model upon request. This information is used by the HEMCO Core for processing and by the Model Interface Layer for export to the model. For organization and easy control of inventories, entries may be organized in the form of "collections", which can be enabled or disabled in bulk using switches.
HEMCO comes with a default database library of emission inventories and environmental data sets to which users can add their own. A detailed specification of the HEMCO configuration file is available on the GEOS-Chem Wiki (http://wiki.seas.harvard.edu/geos-chem/index.php/The_HEMCO_User%27s_Guide, retrieved 7 January 2021).
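The entry structure described above can be pictured with a minimal Python sketch. It is not the HEMCO configuration syntax or any HEMCO API: the class `DataEntry`, its field names, the file names, and the collection names are all hypothetical and chosen only to mirror the attributes listed in the text (source, variable, temporal range, cycling option, dimension, species, scaling factors, category, hierarchy, collection).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataEntry:
    """One data-container entry; field names are illustrative, not HEMCO syntax."""
    name: str
    source: str                  # netCDF file name, a number, or an expression
    variable: Optional[str]      # netCDF variable to read when source is a file
    time_range: str              # temporal range covered by the data
    cycle: bool                  # True: cycle the data; False: exact date match
    dims: str                    # "xy" for 2-D data, "xyz" for 3-D data
    species: Optional[str]       # model species; None marks generic data
    scale_ids: List[int] = field(default_factory=list)
    category: int = 1
    hierarchy: int = 1
    collection: Optional[str] = None


def enabled_entries(entries, switches):
    """Keep entries whose collection is switched on, or that belong to none."""
    return [e for e in entries
            if e.collection is None or switches.get(e.collection, False)]


# Hypothetical entries: two NO inventories in collections, one generic field.
entries = [
    DataEntry("GLOBAL_NO", "global.nc", "NO_emis", "2000-2020", True, "xy", "NO",
              category=1, hierarchy=1, collection="GLOBAL"),
    DataEntry("REGIONAL_NO", "regional.nc", "NO_emis", "2018", False, "xy", "NO",
              scale_ids=[10], category=1, hierarchy=2, collection="REGIONAL"),
    DataEntry("UV_ALBEDO", "albedo.nc", "UVALBD", "2005", True, "xy", None),
]

# Collection switches enable or disable groups of entries in bulk.
switches = {"GLOBAL": True, "REGIONAL": False}

active = enabled_entries(entries, switches)
active_names = [e.name for e in active]
generic_names = [e.name for e in active if e.species is None]
```

With the switches shown, the regional inventory is dropped in bulk, while the entry without a species name remains available as generic data.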

The HEMCO configuration file makes it possible for different models with different chemical species to share a single set of emissions input data (the "HEMCO database library"), without manually pre-processing the files for each mechanism. The variable name option in each entry allows for the mapping of the species name in the netCDF file to the model species name.
One can also partition a class of species from the inventory into individual model species. For example, total alcohols in the CEDS inventory (Hoesly et al., 2018) are to be emitted as 15% methanol and 85% ethanol in the CAM-chem model (Emmons et al., 2020). In that example, as illustrated in Fig. 2, HEMCO scaling factors are used to scale the same input variable by 15% and 85% into the CH3OH and C2H5OH model species. The scaling factor functionality in HEMCO can be used for temporal scaling (diurnal, day-of-week, seasonal, interannual) or to convert units from the emission inventory in the HEMCO database library to the target model. For example, emissions may be provided as kilograms of NOx on an NO2 mass basis in the inventory file, but emitted as NO and NO2 in the model.
The Data Input Layer processes each enabled entry in the HEMCO configuration file, reads the corresponding files from disk, and re-grids them to the HEMCO grid. The Data Input Layer then passes the data to the HEMCO Core in the form of data containers corresponding to each entry in the configuration file.
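The alcohol partitioning described above amounts to reading one input field twice and applying a different numbered scaling factor each time. The Python sketch below illustrates the arithmetic only; the grid values, the scaling factor IDs (101, 102), and the function `apply_scale_factors` are assumptions made for illustration and are not taken from HEMCO.

```python
def apply_scale_factors(field, factors):
    """Multiply every cell of a 2-D field by the product of the given factors."""
    total = 1.0
    for f in factors:
        total *= f
    return [[cell * total for cell in row] for row in field]


# Hypothetical inventory field of total alcohols (kg m-2 s-1) on a 2 x 2 grid.
alcohols = [[1.0e-10, 2.0e-10],
            [0.0, 4.0e-10]]

# Numbered scaling factors, referenced by ID from the data entries.
scale_factors = {101: 0.15, 102: 0.85}

# Two entries read the same input variable but map to different model species.
entries = [("CH3OH", [101]), ("C2H5OH", [102])]

emissions = {
    species: apply_scale_factors(alcohols, [scale_factors[i] for i in ids])
    for species, ids in entries
}
```

Because the two factors sum to one, the methanol and ethanol fields add back up to the original total in every grid cell.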

HEMCO Core
The HEMCO Core calculates emissions with summations, masks, and scaling factors specified in the HEMCO configuration file. Masking is done by specifying an inventory hierarchy, so that default inventories may be overwritten by different inventories available only for a particular region or period. For example, an inventory specific for China in 2018 may overwrite a global default inventory. A simulation for later years may retain the Chinese inventory for 2018, scale it up or down, or default to the global inventory, as specified in the configuration file. Emissions dependent on model state such as dust or lightning may be computed on-line in the HEMCO Core by using algorithms called HEMCO extensions supplied with the HEMCO Core. For example, the current HEMCO Core includes as default the DEAD dust emission extension implementing the algorithm from Zender et al. (2003), but users may select other available extension options (such as the Ginoux et al. (2001) algorithm) or they can add a new algorithm as an extension. Alternatively, users may pre-compute their emissions based on off-line input data and disable the HEMCO extension. Both approaches are routinely used in GEOS-Chem (Weng et al., 2020).
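The hierarchy rule described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the HEMCO Core code: masks are taken to be binary, all inventories belong to one category and one species, and the function `assemble` and the field values are hypothetical.

```python
def assemble(inventories, shape):
    """Combine inventories of one category on a common grid.

    Each inventory is a tuple (hierarchy, field, mask). Inventories are applied
    in order of increasing hierarchy, so a higher-hierarchy inventory overwrites
    lower ones wherever its mask equals 1.
    """
    ny, nx = shape
    out = [[0.0] * nx for _ in range(ny)]
    for _, field, mask in sorted(inventories, key=lambda inv: inv[0]):
        for j in range(ny):
            for i in range(nx):
                if mask[j][i]:
                    out[j][i] = field[j][i]
    return out


# Global default inventory (hierarchy 1), valid everywhere.
global_field = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
global_mask = [[1, 1, 1], [1, 1, 1]]

# Regional inventory (hierarchy 2), valid only inside its regional mask.
regional_field = [[5.0, 5.0, 5.0], [5.0, 5.0, 5.0]]
regional_mask = [[0, 0, 1], [0, 1, 1]]

result = assemble(
    [(2, regional_field, regional_mask), (1, global_field, global_mask)],
    shape=(2, 3),
)
```

Disabling the regional entry, or lowering its hierarchy below that of the global inventory, would make the result default to the global values everywhere.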
At the beginning of the run, the Model Interface Layer provides the model species list to the HEMCO Core along with any physical properties needed for computation of state-dependent emissions (for example, Henry's law constants for ocean fluxes). It also provides information to the HEMCO Core on the model environment, such as the model clock and time step size.
At every HEMCO time step, when HEMCO is called by the model, the HEMCO Core performs the requested calculations, loading the latest available input data using the Data Input Layer as necessary. Emission fluxes are summed by species, and non-emissions data are stored individually by their data container name (UV albedo example in Fig. 2). The Model Interface Layer then exports the computed fields to the model, interpolating the data to the model grid if it is different from the HEMCO grid. The Model Interface Layer also passes updated model state information to the HEMCO Core for use in extensions (for example, wind speeds to calculate dust emissions). That model state information is re-gridded to the HEMCO grid for the purpose of the extension computations.

HEMCO 3.0 computes vertically distributed (3-D) emissions in the same way as 2-D. HEMCO 2.0 assumed the 72-level or the 47-level GEOS grid when reading vertically distributed emissions. This required pre-processing of the emission inventory files from their original vertical grids to the supported GEOS grid. Such pre-processing is no longer required in HEMCO 3.0, which reads vertical emission data on any sigma-pressure grid described in the input netCDF inventory file. The input data are then vertically re-gridded on-line to the model vertical grid by the Data Input Layer using the MESSy NCREGRID package (Jöckel, 2006). This functionality is used in the WRF-GC and CESM2 models, which have user-configurable vertical grids.
HEMCO also includes as a diagnostic capability a netCDF output component to archive selected emissions at specified time steps to disk. This may also include custom diagnostic quantities, such as lightning flash rate from the Lightning NOx extension. The output component is used when HEMCO operates off-line in standalone mode, and also in GEOS-Chem Classic where HEMCO handles emission diagnostics. Other target models may have their own diagnostic packages on the model grid, in which case HEMCO diagnostics can be disabled.
Using the 4× finer (2° × 2.5°) HEMCO grid allows for better resolution of the US-Mexico border and hence for more of the US inventory to be used near the border. This means that the emissions in the model grid cell which straddles the border will be a mix of emissions from both sides, rather than being uniquely assigned.

HEMCO 3.0 provides the ability for computing emissions on a horizontal grid finer than the model grid. Previous versions of HEMCO assumed its operation on the same grid as the model. If the model operated multiple grids simultaneously at runtime, as is the case in WRF-GC, multiple instances of HEMCO were used, thus increasing computational and memory cost.

In HEMCO 3.0, a single instance of HEMCO reads and processes data on a user-specified HEMCO grid. When data are requested by the model, the Model Interface Layer re-grids emissions and other data from the HEMCO grid to the model grid. This allows HEMCO to (1) provide data to model components on different grids, (2) operate on a higher resolution for masking and scaling purposes, thus achieving greater accuracy at boundaries between different inventory domains, and (3) use high-resolution environmental data sets when computing emissions through extensions.
Figure 3 illustrates the benefit of using a finer HEMCO grid at the boundaries between inventories. When the HEMCO grid is disabled, HEMCO runs at model resolution, and all input data, including masks, are re-gridded by the Data Input Layer to the model resolution before emissions are computed by the HEMCO Core. In the example of Fig. 3, where a national inventory for the US is to overwrite a global default inventory, this overwriting can be done only for grid cells that are fully in the US. Grid cells straddling the border must retain the global default in order to avoid under- or over-accounting, but this then loses information from the US inventory. Using a finer, intermediate-resolution grid (the HEMCO grid) allows emissions at the model grid scale to more accurately blend the contributions from the two sides of the border in a single grid cell. This also ensures greater consistency when using the same model simulations at different resolutions. As long as the HEMCO grid is kept at a single resolution, the calculated emissions will be consistent between simulations, no matter what model grid resolution is selected.
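The border effect described above can be reproduced with a one-dimensional toy example in Python. All numbers are hypothetical: a row of 12 equal-area fine cells is grouped 4-to-1 into 3 model cells, a national inventory covers the first 6 fine cells, and simple averaging stands in for the actual re-gridding algorithm.

```python
def coarsen(fine, factor):
    """Average consecutive groups of `factor` equal-area fine cells."""
    return [sum(fine[k:k + factor]) / factor
            for k in range(0, len(fine), factor)]


def overwrite(default, regional, mask):
    """Use the regional value where the mask is 1, the default elsewhere."""
    return [r if m else d for d, r, m in zip(default, regional, mask)]


factor = 4
global_fine = [1.0] * 12        # global default inventory
national_fine = [3.0] * 12      # national inventory
mask_fine = [1] * 6 + [0] * 6   # the border cuts the second model cell in half

# Option A: HEMCO runs at model resolution. Only model cells lying entirely
# inside the country may be overwritten, so the straddling cell keeps the default.
mask_coarse = [1 if all(mask_fine[k:k + factor]) else 0
               for k in range(0, len(mask_fine), factor)]
on_model_grid = overwrite(coarsen(global_fine, factor),
                          coarsen(national_fine, factor),
                          mask_coarse)

# Option B: mask on the finer HEMCO grid, then re-grid to the model grid.
on_hemco_grid = coarsen(overwrite(global_fine, national_fine, mask_fine), factor)
```

In this toy case the straddling model cell keeps the global value (1.0) under option A but receives the blended value (2.0) under option B, while cells fully inside or outside the country are identical in both.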
Another advantage of using a finer HEMCO grid is for emissions computed with extensions and dependent on both the model variables (provided on the model grid) and environmental data (provided on the HEMCO grid). If there is non-linear dependence of emissions on the environmental data variables, then a finer HEMCO grid will produce more accurate emissions. This is the case for example in dust emission algorithms that use land type as a categorical variable.
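A small Python sketch can illustrate why a categorical input makes the result depend on the grid. The flux function below is a toy chosen for illustration (cubic in wind speed over bare soil, zero elsewhere); it is not the DEAD or Ginoux algorithm, and the land types and wind speed are hypothetical.

```python
from collections import Counter


def toy_dust_flux(wind, land_type):
    """Toy flux: cubic in wind speed over bare soil, zero over other land types."""
    return wind ** 3 if land_type == "bare" else 0.0


wind = 2.0  # model wind speed, uniform over one model grid cell
land_fine = ["bare", "forest", "forest", "forest"]  # four fine cells in that cell

# Model grid: the categorical land type collapses to the dominant class.
dominant = Counter(land_fine).most_common(1)[0][0]
flux_model_grid = toy_dust_flux(wind, dominant)

# Finer HEMCO grid: compute the flux per fine cell, then average to the model cell.
flux_hemco_grid = sum(toy_dust_flux(wind, lt) for lt in land_fine) / len(land_fine)
```

Here the model-grid calculation yields no dust at all because forest dominates the cell, whereas the fine-grid calculation retains the contribution from the bare-soil quarter of the cell.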
There is a limit to the resolution of the HEMCO grid because of the need for HEMCO to store the different inventories in memory and process the data at higher resolution, which may be computationally expensive, even for data which may not need scaling and masking. One can circumvent the problem by pre-processing the emissions on their native grids using HEMCO in off-line (standalone) mode, but this adds an additional step in running the model.
While any unstructured grid may be used as the HEMCO grid, it may be desirable to use a rectilinear latitude-longitude grid for prototyping HEMCO in new models. This is because the default Data Input Layer provided with HEMCO only supports rectilinear latitude-longitude grids, and most input data available in the HEMCO database library are also on rectilinear latitude-longitude grids. By choosing such a grid, the default Data Input Layer can be readily used for quick prototyping of a new HEMCO implementation, which may then be improved upon if another HEMCO grid is more desirable.

Data broker functionality
HEMCO 3.0 has the capability to process any model input data other than emissions, such as meteorological fields, land use maps, and boundary conditions. These data can be selected, subsetted (masked), added, and scaled in the same way as emissions. GEOS-Chem has long used this general data broker functionality in HEMCO, but this was previously done by interfacing directly with HEMCO's internal data containers. As this approach bypassed the HEMCO Core, processing of data by the HEMCO Core was also not supported. HEMCO 3.0 standardizes the code for models to retrieve arbitrary data from HEMCO through the Model Interface Layer, thus processing all data from the Data Input Layer through the HEMCO Core. In this manner, HEMCO 3.0 can serve as a general data broker for models if desired.
[...] with the same HEMCO Core. We describe below the particularities of implementation for each model as a guide for implementation in other models.
[...] broker. The main model driver routine main.F90 successively calls each GEOS-Chem module and HEMCO in a timestep loop. HEMCO 3.0 reads all input data through the default Data Input Layer, processes the data through the HEMCO Core, and exports the data to each GEOS-Chem module through the HEMCO default Model Interface Layer. The Model Interface Layer also receives data from GEOS-Chem modules that are used by HEMCO to compute state-dependent emissions through extensions.
GEOS-Chem 'Classic' (Bey et al., 2001) is an off-line chemical transport model (CTM) driven by NASA GEOS meteorological data. It operates on global or regional (nested) rectilinear latitude-longitude grids. It uses OpenMP shared-memory parallelization on a single node without a dedicated coupler.
Figure 4 shows the implementation of HEMCO 3.0 in GEOS-Chem 'Classic' as both an emissions component and a general input data broker. The default Data Input Layer is used to read and re-grid all input data, which are then processed through the HEMCO Core. When the HEMCO grid is different from the model grid (Sect. 2.4), the Model Interface Layer performs horizontal re-gridding between the two grids for the data flowing through it.
HEMCO 3.0 in GEOS-Chem 'Classic' is used for all gridded input data including not only emissions but also meteorological fields, chemical boundary conditions (for regional runs), initial conditions, and other environmental data sets such as land type, leaf area index, and sea surface salinity. It serves as a re-gridding and subsetting tool for these data. This has in particular enabled the FlexGrid capability in GEOS-Chem where regional nested domains are selected at runtime and all input data are processed for these domains (Li et al., 2021; Shen et al., 2021).

GCHP (Eastham et al., 2018; Zhuang et al., 2020) is a high-performance version of GEOS-Chem that takes advantage of the grid-independent structure of the model (Long et al., 2015) to apply a distributed-memory MPI parallelization enabling efficient simulations with thousands of cores. Implementation of MPI is through the GEOS MAPL environment on cubed-sphere grids. MAPL is a modeling toolkit built upon ESMF that provides additional tools for interfacing between ESMF and the model code (Suarez et al., 2007). It serves as a coupler for individual model components, referred to as "gridded components", and provides input and cubed-sphere re-gridding capabilities for all external data through the ExtData component. GCHP advection is computed by the FV3 dynamical core gridded component (Putman and Lin, 2007).
Figure 5 shows the implementation of HEMCO 3.0 in GCHP. In the MAPL environment, all data are read and re-gridded to the model grid by the ExtData component. Thus, HEMCO in MAPL uses the MAPL Data Input Layer, which simply retrieves data from ExtData. Unlike in GEOS-Chem 'Classic', meteorological data are not processed by HEMCO in GCHP, as these data are provided to GEOS-Chem through ExtData. There is also no option for HEMCO to operate on a HEMCO grid different from the model grid because ExtData re-grids all data to the model grid.

After emissions data are processed by the HEMCO Core, GEOS-Chem receives the emissions from HEMCO through the HEMCO default Model Interface Layer, in the same manner as GEOS-Chem 'Classic'. In this manner, the interface between the GEOS-Chem emissions module and HEMCO is the same for GEOS-Chem 'Classic' and GCHP, facilitating the maintenance of a single GEOS-Chem code.
The GEOS ESM (Rienecker et al., 2008) provides the platform for Earth system data analysis at NASA through the GEOS Data Assimilation System (GEOS-DAS). It has several modules for on-line representation of atmospheric chemistry including GEOS-Chem (Hu et al., 2018) and GOCART aerosols (Chin et al., 2002; Randles et al., 2017). GOCART is used in the operational GEOS-DAS as a fast module for aerosol data assimilation. The GEOS-Chem module is used in the GEOS chemical forecast product (GEOS-CF) and in research applications. It has exactly the same code as the off-line GEOS-Chem but with all transport routines disabled, since chemical transport is done as part of the GEOS ESM atmospheric dynamics. Figure 6 shows the implementation of HEMCO 3.0 in the GEOS ESM to serve both the GEOS-Chem and GOCART modules.

WRF-GC
WRF-GC couples WRF (Skamarock et al., 2008) with GEOS-Chem, in the same manner as the coupling of WRF with WRF-Chem (Grell et al., 2005; Fast et al., 2006). It uses a WRF-GC coupler separate from the WRF and GEOS-Chem parent models to interface between the two models, converting the state between WRF and GEOS-Chem as necessary to drive both models. This coupling structure enables independent updates of each model in WRF-GC. Chemical advection is done by WRF but convection and planetary boundary layer (PBL) mixing are done by GEOS-Chem using input data from WRF, following the practice in WRF-Chem. Aerosol effects on WRF radiation and cloud physics are treated by passing GEOS-Chem aerosol information to WRF through the WRF-GC coupler. Multi-scale WRF grids communicating by 2-way nesting are also supported by WRF-GC.
[...] emissions within HEMCO, and exports emissions and other input data to the CAM physics buffer. HEMCO operates on its own high-resolution grid, with re-gridding routines to pass data to/from CAM. Environmental data for computing emissions and dry deposition (e.g., land type) may either be read from HEMCO data libraries or provided to HEMCO by the CAM state.

The Community Earth System Model Version 2 (CESM2) is an open-source model enabling a wide range of Earth science simulations including atmospheric chemistry. The atmospheric component of CESM2 is the Community Atmosphere Model (CAM), including the CAM-chem module to simulate chemistry (Emmons et al., 2020). The new MUSICA initiative at NCAR (Pfister et al., 2020) seeks to expand the capabilities and versatility of the chemical simulation within CAM, including use of GEOS-Chem as an alternative module.
Figure 8 shows the implementation of HEMCO 3.0 in CESM2. HEMCO serves emissions to CAM-chem, GEOS-Chem, and potentially to any representation of atmospheric chemistry in CAM. The Model Interface Layer includes routines to export data processed by HEMCO to CAM's physics buffer, a temporary storage space for model components to share data at runtime. HEMCO within CESM2 can use the existing HEMCO emissions database library out-of-the-box and it can also import new emission inventories and extensions. The CAM-chem implementation uses a configuration file with the appropriate CAM-chem species mapping as described by Emmons et al. (2020). An example for alcohols was described in Sect. 2.1. For CESM with GEOS-Chem chemistry (CESM2-GC), HEMCO works out-of-the-box with configuration files from GEOS-Chem since CESM2-GC uses the same species.

Environmental data needed for computing emissions and dry deposition, such as land type and leaf area index, are normally provided by the CESM state to HEMCO through the Model Interface Layer. This is required for coupled chemistry-biosphereclimate simulations, where atmospheric chemistry affects ecosystem state (both directly and indirectly through the climate), which in turns affects atmospheric chemistry. Alternatively, one may want to use independent environmental data specified through the HEMCO configuration file in order to compare the CESM simulation to an independent simulation of atmospheric 425 chemistry such as with GEOS-Chem 'Classic'. Both capabilities are supported by HEMCO within CESM2. As described in NOAA's 2018 Strategic Implementation Plan for next-generation modeling systems (https://www.weather.gov/sti/stimodeling_nggps_implementation), "a unified emission system with the capability of providing model-ready, global anthropogenic and natural source emissions inputs for aerosol and gas phase atmospheric composition across scales is needed." The next generation modeling systems of NOAA are being realized as the Unified 440

NOAA GEFS-Aerosol and NOAA UFS
Forecast System (UFS; https://ufscommunity.org), which will replace the current suite of forecast models in the National Weather Service (NWS) over the next few years. The UFS does not refer to a single modeling system, but rather describes a unified software infrastructure that permits the exchange of model components between different application models. To respond to the requirement for a unified emissions system, HEMCO 3.0 is being tested to serve as the core of the NOAA Emissions and eXchange Unified System (NEXUS) component (Campbell et al., 2020). NEXUS will provide emissions and 445 broader surface exchange information for NOAA's UFS global and regional aerosol and atmospheric composition (AAC) models.  (Lin et al., 1994;Lin and Rood, 1996;Lin, 2004;Putman and Lin, 2007;Chen et al., 2013;Harris and Lin, 2013;Harris et al., 2016;Zhou et al., 2019). The operational version of GEFS-Aerosol is run by the NWS as a special unperturbed forecast of the Global Ensemble Forecast System version 12, which provides an ensemble forecast product four times per day. 455 Ongoing development of NEXUS includes adaptation to provide emissions for three future UFS applications: i) a global aerosol model, adapted from GEFS-Aerosol, that will be part of a sub-seasonal to seasonal forecast capability, and which uses the NASA GOCART aerosol model; ii) an on-line, regional-scale air quality model with fully coupled gas-aerosol chemistry derived from the Community Multiscale Air Quality (CMAQ) model (U. S. EPA, 2018); and, iii) an on-line, regional-scale 460 rapid refresh forecast system (RRFS) that includes fully coupled smoke and dust emissions and transport and aerosol-weather interactions. Further planned development of NEXUS for UFS AAC models includes the integration of HEMCO 3.0 as a dynamic, on-line emissions processor for both anthropogenic inventories as well as natural, process-based sources (e.g., windblown dust, wildfire smoke, sea-salt, etc.) 
As the UFS AAC models evolve over time, expanding global chemical simulation capabilities and including a broader suite of chemical/aerosol options in the regional models, NEXUS will be fully capable of providing emissions for both research and operational applications. A longer-term goal for NEXUS is the harmonization of emissions-related processes with the surface-atmosphere exchange and boundary-layer processes in the land surface modeling system. The current vision for the NEXUS architecture is evolving as the UFS AAC models are being developed, but it will rely on established coupling and integration infrastructures, such as the National Unified Operational Prediction Capability (NUOPC) Layer (Theurich et al., 2016; http://earthsystemmodeling.org/nuopc/).

Conclusions
We presented an updated version 3.0 of the Harmonized Emissions Component (HEMCO 3.0) for atmospheric models.
HEMCO is a versatile on-line emissions processor that was originally developed for the GEOS-Chem chemical transport model but is now portable to any atmospheric model. HEMCO allows users to select an ensemble of emission inventories and state-dependent emission algorithms (extensions) with capabilities for re-gridding, adding, masking, and scaling emission data and mapping them to model species. HEMCO 3.0 addresses limitations of the previous version, HEMCO 2.0, which used features specific to the GEOS-Chem 'Classic' or MAPL environments and was limited to operating on the model grid. HEMCO 3.0 has a modular structure to facilitate its implementation in models with different software engineering protocols. It features an optional high-resolution grid that may be finer than the model grid for more accurate masking, more accurate computation of emissions with nonlinear algorithms, and the serving of emission data to multi-grid models with greater computational and memory efficiency. HEMCO 3.0 can also serve as a general data broker to process all input data in the atmospheric model, not just emissions.
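The benefit of a finer HEMCO grid for nonlinear algorithms can be illustrated with a minimal sketch. The cubic wind dependence, the function names, and the wind values below are illustrative assumptions only and are not taken from any HEMCO extension; the point is simply that applying a nonlinear function to a grid-averaged input differs from averaging the function over the finer cells.

```python
# Illustrative sketch only: a hypothetical emission law, not a HEMCO algorithm.

def emission(wind_speed, k=1.0):
    """Hypothetical nonlinear emission flux, proportional to wind speed cubed."""
    return k * wind_speed ** 3


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Four fine-grid cells (assumed values) that together make up one coarse model cell.
fine_winds = [2.0, 4.0, 6.0, 8.0]

# Option A: average the wind to the coarse cell first, then compute the emission.
coarse_first = emission(mean(fine_winds))

# Option B: compute the emission in each fine cell, then average to the coarse cell.
fine_first = mean([emission(u) for u in fine_winds])

print(coarse_first, fine_first)
```

With these assumed values, the emission computed on the finer cells and then averaged is larger than the emission computed from the averaged wind, because the high-wind cells dominate the cubic response.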
HEMCO 3.0 modularizes the original HEMCO (Keller et al., 2014) into three layers: the Data Input Layer, the HEMCO Core, and the Model Interface Layer. The Data Input Layer reads a configuration file that defines the emission environment desired by the user, extracts the necessary inventory and other data files in netCDF format from the HEMCO database library, and re-grids the data to the model grid or to the higher-resolution HEMCO grid. The HEMCO Core subsets, adds, masks, and scales the different data sets as specified by the configuration file. The Model Interface Layer collects the emissions data from the HEMCO Core to pass on to the atmospheric model, and also passes model state variables to the HEMCO Core for computing emissions through extensions. The HEMCO Core and database library are common to HEMCO implementations across all models. The Data Input Layer and Model Interface Layer may be used out-of-the-box or modified to fit a model's specific architecture.
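The division of labor among the three layers can be sketched schematically as follows. This is not the HEMCO code or its configuration format: the function names, the dictionary-based configuration, and the four-cell one-dimensional grid are all hypothetical, file reading and re-gridding are replaced by inlined values, and subsetting and overwriting are omitted for brevity.

```python
# Schematic sketch of the three-layer flow; all names and data are hypothetical.

def data_input_layer(config):
    """Stand-in for reading the configuration and delivering fields on the HEMCO grid.

    A real layer would read netCDF files and re-grid them; here the values
    are inlined in the (hypothetical) configuration entries.
    """
    return [dict(entry) for entry in config]


def hemco_core(fields):
    """Scale and mask each field, then add the fields together per species."""
    totals = {}
    for field in fields:
        n = len(field["values"])
        mask = field.get("mask", [1] * n)    # 1 = inside region, 0 = outside
        scale = field.get("scale", 1.0)      # uniform scale factor
        total = totals.setdefault(field["species"], [0.0] * n)
        for i in range(n):
            total[i] += field["values"][i] * scale * mask[i]
    return totals


def model_interface_layer(totals, model_state):
    """Hand the computed emissions to the (hypothetical) model state."""
    model_state["emissions"] = totals
    return model_state


config = [
    {"species": "NO", "values": [1.0, 1.0, 1.0, 1.0], "scale": 2.0},
    {"species": "NO", "values": [0.5, 0.5, 0.5, 0.5], "mask": [0, 1, 1, 0]},
    {"species": "CO", "values": [3.0, 0.0, 1.0, 2.0]},
]

state = model_interface_layer(hemco_core(data_input_layer(config)), {})
print(state["emissions"])
```

In this sketch only `hemco_core` contains emission logic, so the other two functions could be swapped for model-specific versions without touching it, which mirrors the separation described above.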
HEMCO 3.0 has been implemented in several models: the GEOS-Chem CTM in both 'Classic' and High-Performance (GCHP) configurations, the NASA GEOS ESM, WRF with GEOS-Chem chemistry (WRF-GC), the CESM2 model with either CAM-chem or GEOS-Chem chemistry, and the NOAA GEFS-Aerosol model as an offline emissions pre-processor. GEOS-Chem 'Classic' relies on the default implementations of the Data Input Layer and Model Interface Layer, and these defaults may be used for quick implementation of HEMCO 3.0 in any model. GCHP and the GEOS ESM use the MAPL coupler built on ESMF to read and re-grid data; in that case the corresponding functionalities are removed from the HEMCO Data Input Layer with no editing of code in the HEMCO Core. HEMCO 3.0 is planned for inclusion in the NOAA UFS as the core of the NEXUS component that will serve emission and surface exchange information to the suite of NOAA aerosol and atmospheric composition forecast models. This will add a new dimension to HEMCO capabilities to include surface deposition and two-way exchange of chemical species.
Implementation of HEMCO 3.0 in the CESM2 model is an important step in the development of MUSICA, a flexible modeling framework for the next-generation CESM allowing for versatile use of different atmospheric chemistry simulation components on any grid and scale (Pfister et al., 2020). HEMCO within CESM can operate on multi-scale or unstructured grids, can serve data to any CESM atmospheric component, and can interface with any chemical mechanism by mapping emitted species to the mechanism species. Through HEMCO, CESM users can readily use and combine any ensemble of emission inventory data and algorithms that they choose, independently of their chemical mechanism or other aspects of the chemical simulation. HEMCO thus provides a general vessel for the treatment of emissions in MUSICA, and could also in the future provide a general input data broker facility for CESM.