Improved methodologies for Earth system modelling of atmospheric soluble iron and observation comparisons using the Mechanism of Intermediate complexity for Modelling Iron (MIMI v1.0)

Abstract. Herein, we present a description of the Mechanism of Intermediate complexity for Modelling Iron (MIMI v1.0). This iron processing module was developed for use within Earth system models and has been updated within a modal aerosol framework from the original implementation in a bulk aerosol model. MIMI simulates the emission and atmospheric processing of two main sources of iron in aerosol prior to deposition: mineral dust and combustion processes. Atmospheric dissolution of insoluble to soluble iron is parameterized by an acidic interstitial aerosol reaction and a separate in-cloud aerosol reaction scheme based on observations of enhanced aerosol iron solubility in the presence of oxalate. Updates include a more comprehensive treatment of combustion iron emissions, improvements to the iron dissolution scheme, and an improved physical dust mobilization scheme. An extensive dataset consisting predominantly of cruise-based observations was compiled to compare to the model. The annual mean modelled concentration of surface-level total iron compared well with observations, but the soluble fraction (iron solubility), for which observations are much more variable in space and time, compared less well. Comparing model and observational data is sensitive to the definition of the average as well as the temporal and spatial range over which it is calculated. Through statistical analysis and examples, we show that a median or log-normal distribution is preferred when comparing with soluble iron observations. The iron solubility calculated at each model time step versus that calculated based on a ratio of the monthly mean values, which is routinely presented in aerosol studies and used in ocean biogeochemistry models, is on average globally one-third (34 %) higher. We redefined ocean deposition regions based on dominant iron emission sources and found that the daily variability in soluble iron simulated by MIMI was larger than that of previous model simulations. MIMI simulated a general increase in soluble iron deposition to Southern Hemisphere oceans by a factor of 2 to 4 compared with the previous version, which has implications for our understanding of the ocean biogeochemistry of these predominantly iron-limited ocean regions.

D. S. Hamilton et al.: Mechanism of Intermediate complexity for Modelling Iron (MIMI v1.0)

1 Introduction
[…] nutrient for nitrogen fixation by diazotrophs (Capone et al., 1997; Moore et al., 2006, 2013). Atmospheric deposition of bioavailable iron (i.e. the fraction of the total iron deposited that is readily available for ocean biota uptake) contained in aerosol is an important source of new iron for the remote open ocean (Duce and Tindale, 1991; Fung et al., 2000); therefore, iron impacts the ability of oceans to act as a sink of atmospheric carbon dioxide (Jickells et al., 2014; Moore et al., 2013).
Several definitions for bioavailable iron have been proposed. The solubility of iron is considered to be a key factor modulating its bioavailability (Baker et al., 2006a, b); therefore, we consider bioavailable iron to be dissolved (labile) iron in either a (II) or (III) oxidation state, and we define this as the soluble iron concentration throughout the paper. However, since most aerosol iron is insoluble at emission, the processing of insoluble iron to a soluble form must occur during atmospheric transport. The acidic processing of iron contained in aerosol is one pathway through which soluble iron can be liberated from an insoluble form with decreasing pH (Duce and Tindale, 1991; Solmon et al., 2009; Zhu et al., 1997). Organic ligands, in particular oxalate, also increase iron solubility by weakening or cleaving the Fe-O bonds found in iron oxide minerals via complexation (Li et al., 2018; Panias et al., 1996), and in nature this reaction proceeds most rapidly in a slightly acidic aqueous medium, such as cloud droplets (Cornell and Schindler, 1987; Paris et al., 2011; Xu and Gao, 2008). Organic ligand processing has been estimated to increase soluble iron concentrations by up to 75 % more than is achievable with acid processing alone (Ito, 2015; Johnson and Meskhidze, 2013; Myriokefalitakis et al., 2015; Scanza et al., 2018). However, there is no single mechanism that describes the observed inverse relationship of higher iron solubilities with decreasing iron concentrations (Sholkovitz et al., 2012). Rather, Mahowald et al. (2018) used a 1-D plume model to demonstrate that the observed trend can be explained by either the differences in iron solubility at emission or the atmospheric dissolution of insoluble iron. Thus, there is no observational constraint to indicate which is more likely unless spatial distribution is also considered.
A recent multi-model evaluation of four global atmospheric iron cycle models (Myriokefalitakis et al., 2018) showed that, in all four models, total iron deposition is overrepresented close to major dust source regions and underrepresented in remote regions compared with observations. This is consistent with previous model intercomparison studies that demonstrated the difficulty of simultaneously simulating both atmospheric concentrations and deposition fluxes of desert dust (Huneeus et al., 2011). Importantly, none of the atmospheric iron processing models can capture the high (> 10 %) solubilities measured over the Southern Ocean; this is potentially because the model processes associated with transport and ageing of aerosol iron require further development (Ito et al., 2019). Conclusions from Myriokefalitakis et al. (2018) suggest that future model improvements should focus on a more realistic aerosol size distribution and the representation of mineral-to-combustion sources of iron. Most of the development of the Mechanism of Intermediate complexity for Modelling Iron (MIMI), as described herein, focused on these points. First, we transitioned from a bulk aerosol scheme to a two-moment modal aerosol scheme, and second, we re-evaluated pyrogenic iron emissions from anthropogenic combustion and fires. The modal aerosol scheme was used to calculate both aerosol mass and number at each time step within an updated global aerosol microphysics model, and both the fire and anthropogenic combustion emissions from Luo et al. (2008), which are likely to be underestimated (Conway et al., 2019; Ito et al., 2019; Matsui et al., 2018), were improved upon.
Ocean observations of iron and its soluble fraction are limited both spatially and temporally owing to the significant costs and logistical constraints associated with accumulating data from scientific cruises. Thus, there is an inherent disparity in attempting to compare climatological means calculated from temporally chronological model results with observational means calculated from temporally limited and sporadic observations (e.g. Mahowald et al., 2008, 2009). This is important because natural aerosol emissions are variable on seasonal, annual, and decadal timescales in terms of both primary natural iron emission sources (mineral dust and wildfires) and the source of aerosol acidity. For example, sulfuric acid from the oxidation of dimethyl sulfide and fire SO2 (Bates et al., 1992; Chin and Jacob, 1996) has been observed to aid iron dissolution when far from anthropogenic acid sources (Zhuang et al., 1992). The limitations associated with the collection of continuous annual or inter-annual ship-based data across multiple remote ocean regions are immutable at present, which hinders the required derivation of the basic statistical properties of such highly variable data (Smith et al., 2017). Attention could therefore be given to the methodologies with which such model-observation comparisons are undertaken instead.
The paper is presented in four parts. The first part (Sect. 2) introduces updates made to the Bulk Aerosol Module (BAM) iron scheme of Scanza et al. (2018) and its implementation within the Modal Aerosol Module (MAM), with four modes (MAM4), within the Community Earth System Model (CESM). In the second part (Sect. 3), we compare iron concentrations and the fractional solubility of iron with the observational data. Then the third part (Sect. 4) compares our updated version of the model with its predecessor. Finally, we suggest further developments for atmospheric iron modelling and for comparing model results with sporadic observations (Sect. 5).

Aerosol model
The present study improves upon the previous atmospheric iron cycle module developed for the Community Atmosphere Model (CAM) version 4 (CAM4) embedded in the CESM (Scanza et al., 2018); we refer to this version as BAM-Fe herein. We incorporated the iron module within the MAM framework (Liu et al., 2012, 2016), which is currently used in the Department of Energy's Energy Exascale Earth System Model (E3SM; Golaz et al., 2019) and in CAM versions 5 and 6 (CESM-CAM5-6; Neale et al., 2010); we refer to this new version of the iron model by its name (MIMI) herein. Table 1 serves as a reference and summarizes the modifications made for MIMI, which are discussed throughout the paper.
We use MAM4 with four simulated log-normal aerosol size modes: three modes (Aitken, accumulation, and coarse) containing iron and a fourth primary carbonaceous mode. Table 2 details the new pyrogenic iron (i.e. from fires and anthropogenic combustion) modal aerosol properties, while those of mineral dust iron follow existing dust aerosol properties. Generally, the modelled density of iron is similar to size-resolved ambient aerosol densities measured in eastern China (Hu et al., 2012), which has significant dust and combustion aerosol sources. MIMI was initially implemented and tested within a development branch of CAM 5.3, as per Wu et al. (2017), using Cheyenne (Computational and Information Systems Laboratory, 2017); this configuration closely resembles CESM version 1.2.2. We used a 2.5° × 1.9° horizontal (longitude by latitude) resolution and 56 vertical layers up to 2 hPa. Stratiform microphysics followed a two-moment cloud microphysics scheme (Morrison and Gettelman, 2008). The other major aerosol species, black carbon (BC), organic carbon, sea salt, and sulfate (SO4), were also simulated but are not explicitly examined here because we are focused on iron aerosol modelling. However, atmospheric iron processing in MIMI requires both sulfate and (secondary) organic aerosols to be simulated as they act as proxies for the reactant species of [H+] and oxalate, respectively. In CAM5 sulfate aerosol is present in all three hydrophilic aerosol modes, while secondary organic aerosol is only present in the fine Aitken and accumulation modes (Liu et al., 2012, 2016). Aerosol microphysics was applied in the same way to the new iron aerosol tracers as to the base aerosol species (Liu et al., 2012, 2016). Fire emissions were vertically distributed between six injection height ranges: 0-0.1, 0.1-0.5, 0.5-1.0, 1.0-2.0, 2.0-3.0, and 3.0-6.0 km, as per AeroCom recommendations (Dentener et al., 2006). Fire emissions were uniformly distributed in model levels between height limits.
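The vertical distribution of fire emissions described above can be illustrated with a minimal Python sketch. It assumes that "uniformly distributed" means the mass in each injection range is spread in proportion to the thickness of each model level overlapping that range; the function name, the per-range fractions, and the level edges are hypothetical inputs for illustration and are not taken from MIMI or AeroCom.

```python
# Six AeroCom-style injection height ranges quoted in the text (km).
INJECTION_RANGES_KM = [(0.0, 0.1), (0.1, 0.5), (0.5, 1.0),
                       (1.0, 2.0), (2.0, 3.0), (3.0, 6.0)]


def distribute_fire_emission(range_fractions, level_edges_km, total=1.0):
    """Spread a column fire emission over model levels.

    range_fractions: share of the emission assigned to each injection
        range (hypothetical input; must have six entries).
    level_edges_km: ascending model level interface heights in km.
    Assumption: within a range, mass is uniform in height, so a level
    receives mass in proportion to its overlap with the range.
    """
    n_levels = len(level_edges_km) - 1
    per_level = [0.0] * n_levels
    for (low, high), fraction in zip(INJECTION_RANGES_KM, range_fractions):
        for k in range(n_levels):
            bottom, top = level_edges_km[k], level_edges_km[k + 1]
            overlap = max(0.0, min(high, top) - max(low, bottom))
            per_level[k] += total * fraction * overlap / (high - low)
    return per_level


# Example with made-up fractions and five coarse model levels.
example = distribute_fire_emission(
    [0.2, 0.2, 0.2, 0.2, 0.1, 0.1], [0.0, 0.5, 1.0, 2.0, 3.0, 6.0])
```

If the model levels span the full 0-6 km range, the per-level values sum to the column total, so mass is conserved by construction.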
Unless otherwise stated, aerosol and precursor gas mass emissions were from the Coupled Model Intercomparison Project Phase 5 (CMIP5) inventory (Lamarque et al., 2010). Major gas-phase oxidants (O3, OH, NO3, and HO2) were supplied offline and were also from Lamarque et al. (2010). Meteorology (U, V, and T) was nudged to Modern-Era Retrospective analysis for Research and Applications (MERRA) data for 2006-2011. Unless otherwise stated, the last 5 years were used for analysis.
The model used in this study performed well when compared to observations from a variety of different environments and produced aerosol concentrations that were close to those of the multi-model mean of similarly complex aerosol models (Fanourgakis et al., 2019).

Dust aerosol modelling
Mineral dust aerosol was modelled via the Dust Entrainment and Deposition model (DEAD; Zender et al., 2003), which was previously updated to include the brittle fragmentation theory of vertical dust flux (Kok, 2011) on mineral size fractions (Albani et al., 2014;Scanza et al., 2015). We further improved the emissions of dust in MAM to follow a physically based vertical flux theory (Kok et al., 2014a), which has been shown to significantly improve dust emissions (Kok et al., 2014b). Note that this method allowed for the removal of the soil erodibility map approach previously employed by the DEAD scheme (Table 1) and still provided more accurate simulations of regional dust emissions and concentrations (Kok et al., 2014b). Dust aerosol optical depth (AOD) was calculated using mineralogy-based radiation interactions as described by Scanza et al. (2015). Dust emissions were tuned such that a global annual mean dust AOD of ∼ 0.03 was attained, as recommended by Ridley et al. (2016) and matching values in Scanza et al. (2015) for a similar model configuration.

Iron aerosol modelling
The simulated life cycle of iron can be grouped into three main stages: (1) iron emission to the atmosphere, (2) physical-chemical iron processing during transport, and (3) final iron deposition and thus loss from the atmosphere. In the following sections, we describe the emissions and subsequent atmospheric dissolution of iron (stages 1 and 2), while the effects of this on the magnitude of oceanic soluble iron deposition (stage 3) in MIMI are examined and compared to BAM-Fe in Sect. 4. Iron optical properties are currently considered to reflect those of hematite because this mineral contains 97 % of the iron aerosol mass fraction (see Sect. 2.3.1).

Iron aerosol emissions
MIMI contains three major iron emission sources: mineral dust, fires (defined here as the sum of wildfires and human-mediated biomass burning), and anthropogenic combustion (defined here as the sum of industrial and domestic biofuel burning). In the BAM-Fe version of the model, fire and anthropogenic combustion emissions were combined into a single static monthly mean value. In MIMI, fire emissions of iron were updated to be distinct from other pyrogenic iron sources and were parameterized to track the BC emissions from fires using an Fe : BC ratio. Fire BC emissions were simulated to be time varying on a monthly scale, resulting in a much more pronounced seasonality to fire iron emissions (e.g. Giglio et al., 2013) compared to BAM-Fe, wherein seasonality was not imposed.
For all iron species in each mode, the aerosol number emissions (Fe_emit,num) were calculated from the mass emissions within the same mode (Fe_emit,mass) using the properties in Table 2 and following Liu et al. (2012): (1)
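As an illustration of this kind of mass-to-number conversion, the following Python sketch uses the standard relation for spherical particles in a log-normal mode. It is an assumption that Eq. (1) takes this form; the function names, argument names, and units are chosen for illustration, and the Table 2 values are not reproduced here.

```python
import math


def volume_mean_diameter(median_diameter, sigma_g):
    """Volume-mean diameter of a log-normal number distribution.

    Uses the standard log-normal moment relation
    D_vol^3 = D_median^3 * exp(4.5 * ln(sigma_g)^2).
    """
    return median_diameter * math.exp(1.5 * math.log(sigma_g) ** 2)


def number_emission(mass_emission, density, d_vol):
    """Convert a mass emission flux to a number emission flux.

    mass_emission: kg m-2 s-1, density: kg m-3, d_vol: m.
    Assumes spherical particles of volume (pi/6) * d_vol^3; this is a
    generic conversion, not necessarily the exact form of Eq. (1).
    """
    particle_mass = density * math.pi / 6.0 * d_vol ** 3
    return mass_emission / particle_mass
```

For a given mass flux, a smaller emission diameter yields many more particles, which is why the Aitken mode can dominate number emissions while contributing little mass.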

Iron emissions within mineral dust aerosol
Based on previous research by Journet et al. (2008) and Ito and Xu (2014), the iron fraction in each mineral species was prescribed at emission as follows: 57.5 % in hematite, 11 % in smectite, 4 % in illite, 0.24 % in kaolinite, 0.34 % in feldspar, and 0 % in the remaining three mineral species (Table 3), which has been shown to improve the accuracy of the modelled total iron fraction estimated from mineral dust (Zhang et al., 2015). The mass of each of the eight mineral dust species advected at each model time step was the residual mineral mass (i.e. after the removal of the iron mass) such that the sum of all eight minerals and the total iron from mineral dust equalled unity and hence the original total singular dust mass emitted from the land surface. Iron emissions from the five iron-bearing mineral dust species (three dust minerals contain no iron) were then partitioned into the four advected mineral-dust-bearing iron aerosol tracers (Table 3); iron tracers were defined as being (in)soluble and by the speed of the atmospheric reaction rate acting on them: slow or medium. Note that slow- and medium-soluble iron is only produced by non-reversible atmospheric processing within the model; therefore, computational costs can be reduced by not creating a separate iron tracer representing the fraction which is already soluble at emission (i.e. "fast" reacting) but instead adding an initial medium-soluble iron processed emission burden which is equivalent to the assumed fast-reacting iron fraction.
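The separation of emitted mineral mass into iron and residual mineral mass can be sketched in Python as follows. The iron fractions are those quoted above; the function name and the example masses are hypothetical, and the further allocation of iron to the four tracers (Table 3) is not reproduced because those values are not given in the text.

```python
# Iron mass fractions at emission quoted in the text; minerals not
# listed here are treated as containing no iron.
IRON_FRACTION = {
    "hematite": 0.575,
    "smectite": 0.11,
    "illite": 0.04,
    "kaolinite": 0.0024,
    "feldspar": 0.0034,
}


def split_iron_from_minerals(mineral_mass):
    """Return (iron mass, residual mineral mass) per mineral species.

    mineral_mass: dict of emitted mass per mineral (any mass unit).
    The residual is what would be advected as the mineral tracer, so
    iron plus residual equals the originally emitted dust mass.
    """
    iron, residual = {}, {}
    for name, mass in mineral_mass.items():
        iron[name] = mass * IRON_FRACTION.get(name, 0.0)
        residual[name] = mass - iron[name]
    return iron, residual


# Made-up emitted masses; "other_mineral" stands in for an iron-free species.
iron, residual = split_iron_from_minerals(
    {"hematite": 1.0, "illite": 10.0, "other_mineral": 5.0})
```

Summing both returned dictionaries recovers the total emitted dust mass, which mirrors the mass-closure statement in the paragraph above.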

Iron aerosol emissions from fires
Following Luo et al. (2008), we used observed Fe : BC mass ratios to estimate fine- and coarse-mode iron emissions from fires. An additional difference between BAM (CAM4) and MAM (CAM5) is the emission dataset used to estimate global fire emissions of aerosol and trace gases. The BAM model uses adjusted AeroCom fire emissions (Dentener et al., 2006; Scanza et al., 2018), while MAM uses CMIP5 fire emissions. Base fire BC emissions within the CMIP5 database are 2.55 Tg a⁻¹ BC; however, the scaling of emissions from fires has been shown to be necessary to improve model-to-observed (aerosol optical depth and particulate matter) BC ratios (Reddington et al., 2016; Ward et al., 2012). Therefore, we globally scaled the fire iron emissions by a uniform factor of 2, which is comparable with the overall lower scaling factor from a review of the literature by Reddington et al. (2016; Table 2). Fine-mode iron emissions from fires were then segregated to assign 10 % of the fine-sized mass to the Aitken mode, with the remaining 90 % assigned to the accumulation mode. Luo et al. (2008) used a single Amazonian observational dataset in their study to determine the flux of iron aerosol from fires (Fe : BC). We extended this to incorporate other Amazonian fire (Fe : BC) data and, importantly, non-Amazonian biome fire (Fe : BC) data, which are likely to have different combustion properties and hence iron emissions (e.g. Akagi et al., 2011). From Table 4, we suggest that after adding 11 more data inventory values, Luo et al. likely underrepresented the global fine-mode Fe : BC ratio at 0.02.
We instead used the global mean Fe : BC ratio from the additional data of 0.06. Conversely, Luo et al. likely overrepresented the coarse-mode Fe : BC ratio at 1.4. By including additional observational information from Artaxo et al. (2013) we reduced this to 1.0. Using size-segregated wet season (i.e. representing a locally transported emission source) observation data from Artaxo et al. (2013), we estimated that the amount of BC mass in the coarse mode was 37 % of fine-mode mass. Overall this doubles the fractional contribution of fine-mode (BAM: 0.1-1 µm size bin, MAM: sum of Aitken and accumulation modes) iron emissions from fires (BAM-Fe: fine is 7 % of total mass, MIMI: fine is 14 % of total mass).
Using the previously reported soluble Fe : BC ratio of 0.02 resulted in 33 % solubility of fine-mode iron from fires at emission, which is lower than the 46 % reported in Oakes et al. (2012) and higher than the 12 % reported in Ito (2013). As few data exist in the literature pertaining to coarse-mode BC, or more importantly its ratio to iron, we retained the 4 % solubility of iron in the coarse mode at emission, as suggested by Luo et al. Total iron emissions from fires in MIMI were 2.2 Tg Fe a⁻¹ (Aitken: 0.02 Tg a⁻¹, accumulation: 0.28 Tg a⁻¹, coarse: 1.9 Tg a⁻¹), representing an approximate increase in iron emissions from fires of around 25 % compared with those from BAM-Fe, with most of the mass (86 %) still in the coarse mode. The lower 25 % increase between BAM-Fe and MIMI iron emissions, compared to the doubling of the fire iron emissions themselves within MIMI, is due to different underlying fire emission inventories used in each model. Aerosol number concentrations were then calculated using Eq. (1) and the physical properties listed in Table 2. We adopted the methodology of Wang et al. (2015) by assuming that the density of iron aerosol from fires (and anthropogenic combustion) in the Aitken and accumulation modes matches that of BC, while in the coarse mode it matches that of mineral dust. The vertical distribution of iron emissions from fires was also updated in MIMI (BAM-Fe emitted all iron from fires at the surface) to account for pyro-convection, which lofts aerosol to higher altitudes at the point of emission within the model (Rémy et al., 2017; Sofiev et al., 2012; Wagner et al., 2018).
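The arithmetic behind the fire iron emissions can be sketched in Python under stated assumptions. The sketch treats the 2.55 Tg a⁻¹ base BC emission as fine-mode BC and applies the factor of 2 before the Fe : BC ratios; this is one reading of how the quoted numbers combine, not a statement of how MIMI is coded, and the function and argument names are hypothetical.

```python
def fire_iron_emissions(bc_fine_tg, scale=2.0, fe_bc_fine=0.06,
                        fe_bc_coarse=1.0, coarse_bc_fraction=0.37,
                        aitken_share=0.10):
    """Estimate fire iron emissions per mode (Tg Fe per year).

    bc_fine_tg: base fine-mode fire BC emission (assumed fine mode).
    scale: uniform global scaling factor applied to fire emissions.
    fe_bc_fine, fe_bc_coarse: Fe : BC mass ratios quoted in the text.
    coarse_bc_fraction: coarse BC mass as a fraction of fine BC mass.
    aitken_share: share of fine-mode iron assigned to the Aitken mode.
    """
    bc_scaled = bc_fine_tg * scale
    fe_fine = bc_scaled * fe_bc_fine
    fe_coarse = bc_scaled * coarse_bc_fraction * fe_bc_coarse
    return {
        "aitken": aitken_share * fe_fine,
        "accumulation": (1.0 - aitken_share) * fe_fine,
        "coarse": fe_coarse,
    }


emissions = fire_iron_emissions(2.55)
total = sum(emissions.values())
fine_fraction = (emissions["aitken"] + emissions["accumulation"]) / total
```

With these inputs the sketch gives a total close to the 2.2 Tg Fe a⁻¹ and a fine-mode share close to the 14 % reported above; small differences in individual modes would be expected from rounding in the quoted values.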

Iron emissions from anthropogenic combustion sources
Separate lines of evidence (Conway et al., 2019; Ito et al., 2019; Matsui et al., 2018) have shown that anthropogenic industrial iron emissions are highly likely to be larger than previously estimated (e.g. Ito, 2015; Luo et al., 2008; Myriokefalitakis et al., 2018). Therefore, anthropogenic combustion emissions of iron in MIMI were those of BAM-Fe, as first reported by Luo et al. (2008), uniformly multiplied by a factor of 5 to bring them into closer agreement with observations of industrial magnetite emissions, in line with Matsui et al. (2018). Resulting fine-mode anthropogenic combustion emissions were 0.50 Tg Fe a⁻¹ and coarse-mode emissions were 2.8 Tg Fe a⁻¹. Similar to fire emissions, 10 % of fine-sized emissions were partitioned into the Aitken mode at emission; the remaining 90 % of fine-sized emissions were emitted into the accumulation mode, and 100 % of coarse-sized emissions were emitted to the coarse mode. We retained the Luo et al. (2008) estimate of 4 % combustion iron solubility at emission (Chuang et al., 2005). Calculations of aerosol number concentrations of combustion iron followed the same procedure as described for fire emissions in Sect. 2.3.2.

Table 3. Mass fraction of iron in each simulated iron-bearing dust mineral species and allocation to each mineral iron tracer at emission. At emission medium-soluble iron is equivalent to the fast-soluble iron fraction (i.e. the fraction which is already assumed to be soluble at emission). Residual mineral dust mass is then advected as its respective tracer.

Table 4 (summary row). Global mean Fe : BC ratio = 0.06.

Acid and organic ligand processing
Once airborne, iron undergoes a series of physical and chemical processing steps within the atmosphere, each working to alter the soluble iron fraction (i.e. its solubility). The MIMI atmospheric iron dissolution scheme is presented in Table 5, with a full description reported previously by Scanza et al. (2018). Within each of the three iron-bearing aerosol size modes, six tracers of iron were advected within the model: medium-insoluble and medium-soluble mineral dust iron (containing both readily released and medium-reactive mineral dust iron; Scanza et al., 2018), slow-insoluble and slow-soluble mineral dust iron, and insoluble and soluble pyrogenic (sum of fires and anthropogenic combustion) iron, which was assumed to be medium-reactive. Both proton- and organic-ligand-promoted iron dissolution mechanisms were modelled. The proton-promoted dissolution scheme was dependent upon an estimated [H+], calculated from the ratio of sulfate to calcite, and the simulated temperature. Organic ligand dissolution was dependent upon the simulated secondary organic carbon concentration as oxalate (the main reactant) itself was not modelled. Both the sulfate and secondary organic carbon aerosol (Fig. S1 in the Supplement), which the iron processing requires, are fundamental components of aerosol models (e.g. Kanakidou et al., 2005; Mann et al., 2014). In CAM sulfate is mainly formed via the oxidation of SO2(aq) with a smaller contribution from H2SO4 condensation on aerosol, while secondary organic aerosol is formed via the partitioning of semi-volatile organic gases. Neither gas-to-particle production process is structurally modified from the description of CAM5 by Liu et al. (2012, 2016) by the incorporation of MIMI. A structural model improvement was that MAM (CAM5) advected separate tracers for the interstitial and cloud-borne aerosol phases, so the proton- and organic-ligand-promoted dissolution reactions were applied to each aerosol phase, respectively.
The solubility of the iron contained in dust aerosol increases as the aerosol moves through areas containing acidic gases, with a pH of 1-2 (Ingall et al., 2018; Longo et al., 2016; Meskhidze et al., 2003; Solmon et al., 2009), and mineralogy is a key factor determining the rate of dissolution at a given pH (Journet et al., 2008; Scanza et al., 2018). Modelled aerosol pH in MIMI was parameterized to depend only on the ratio of the calcium to sulfate aerosol concentration (Table 5): if the sulfate concentration did not exceed that of calcite, then aerosol was assumed to be well buffered (Böke et al., 1999) and the pH = 7.5. In MIMI, we updated the pH calculation from BAM-Fe in two ways. (1) In BAM-Fe, pH was calculated as the mean across all four size bins (0.1-10 µm), while in MIMI, pH was calculated separately for each interstitial aerosol size mode. (2) Aerosol measurements of pH have shown that interstitial aerosol is likely to be more acidic than was assumed in BAM-Fe (Longo et al., 2016; Weber et al., 2016), even when taking into account declining sulfate levels (Weber et al., 2016); therefore, we have lowered the aerosol pH to 1 (from 2) in both the Aitken and accumulation modes wherein sulfate aerosol dominates. However, in the coarse mode, wherein dust dominates, we retained the lower pH boundary of 2. Furthermore, MAM aerosol was simulated as an internally mixed aerosol; therefore, the SO4 : Ca ratio included the mixing of these aerosol components within each mode. See Sect. 4.2 for a comparison of acid processing in MIMI with the literature and the previous model (BAM-Fe).

All aerosol species in the host CAM5 framework are carried in either an interstitial (i.e. not associated with water) or cloud-borne (i.e. associated with water) phase. The organic ligand reaction only proceeds within MIMI if the condition that cloud is present in the grid cell is first met. If cloud is present then only the iron aerosol which is associated with water undergoes organic ligand processing (i.e. the interstitial aerosol component remains unchanged). Any future development of MIMI within an aerosol model which does not advect a separate tracer for the cloud-borne phase of aerosol would therefore need to adjust the reaction to take account of this. An assumed oxalate concentration in MIMI was estimated based on the modelled organic carbon concentration and could not exceed a maximum concentration threshold of 15 µmol L⁻¹. In BAM-Fe, oxalate was derived from the sum of both the primary and secondary organic carbon aerosol concentrations, while in MIMI this was updated to be dependent only upon the secondary organic carbon source because oxalate is itself a product of the oxidation of volatile organic carbon gases (Myriokefalitakis et al., 2011). An additional term was added to the reaction mechanism to account for the small amount of organic ligand processing proceeding by species other than oxalate. See Sect. 4.2 for a comparison of in-cloud organic dissolution in MIMI with the literature and the previous model (BAM-Fe).

Table 5. Summary of atmospheric processing reaction equations from Scanza et al. (2018). Here l represents either medium- or slow-reacting iron aerosol (combustion iron is modelled as medium). The pH calculation is updated to be calculated within each mode, and oxalate (C2O4^2−) concentrations are calculated based only on the secondary organic aerosol (SOA) concentrations.

Reaction rate constituents:
a(H+) is the proton concentration, with an empirical reaction order m_l (m_med = 0.39; m_slow = 0.50).
If [SO4] > [calcite] then pH = 1 in Aitken and accumulation modes or 2 in coarse; otherwise, pH = 7.5.
f(ΔG_r) accounts for dissolution rate change with variation from equilibrium (equals 1 for simplicity; Luo et al., 2008).
A_l is the specific surface area (m² g⁻¹).
MW_l is the molecular weight (g mol⁻¹).
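The pH rule and the oxalate cap described in this section can be sketched in Python as follows. The pH values and the 15 µmol L⁻¹ cap come from the text; the function names, the mode labels, and in particular the `scale` factor linking secondary organic carbon to oxalate are hypothetical, since the conversion used in MIMI is not given here.

```python
def interstitial_ph(mode, sulfate, calcite):
    """Return the assumed interstitial aerosol pH for one size mode.

    mode: "aitken", "accumulation", or "coarse".
    sulfate, calcite: concentrations in the same (unspecified) units.
    If sulfate exceeds calcite the aerosol is acidic (pH 1 in the fine
    modes, 2 in the coarse mode); otherwise it is buffered at pH 7.5.
    """
    if sulfate > calcite:
        return 2.0 if mode == "coarse" else 1.0
    return 7.5


def oxalate_proxy(soa_concentration, scale, cap_umol_per_l=15.0):
    """Estimate oxalate (umol L-1) from secondary organic aerosol.

    scale is a hypothetical conversion factor; only the cap of
    15 umol L-1 is taken from the text.
    """
    return min(scale * soa_concentration, cap_umol_per_l)


def ligand_processing_active(cloud_present, is_cloud_borne):
    """Organic ligand dissolution acts only on cloud-borne aerosol
    in grid cells where cloud is present."""
    return cloud_present and is_cloud_borne
```

Keeping the pH rule per mode, rather than averaging across sizes, reflects the change from BAM-Fe described above, where a single mean pH was used across all size bins.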

Computational costs
Earth system models are generally characterized by having a heavy computational burden in simulating atmospheric processes. The inclusion of MIMI requires eight dust mineral tracers (a net addition of seven) and six iron tracers. The total number of new aerosol tracers is 39 (13 in each of the three aerosol modes) if dust mineralogy is not already present or 18 new aerosol tracers if it is (e.g. NASA GISS model; Perlwitz et al., 2015a, b). The additional computational cost of MIMI within CESM-CAM5 is approximately a doubling of the required core hours; around half of that is associated with dust mineralogy speciation and the other half with iron speciation and processing (Table 6). Note that additional computational tuning, or changes in configuration, could modify these computational change estimates. For example, with dust mineralogy (MAM4DU8) there is an approximate tenfold increase in required core hours due to model structural differences when transitioning from CAM5 to CAM6.

Spatially aggregating limited observations
The observations of total iron concentrations and the fractional solubility of iron used in this study are joint totals (1524 records) of those reported in Mahowald et al. (2009) and Myriokefalitakis et al. (2018). However, many of these observations represent averages of only one or a few days of iron and soluble iron measurements and can thus be difficult to compare against annual, or longer, mean time periods calculated within the model. Furthermore, building empirical distributions of iron properties from observations requires a larger sample size than currently available in many regions. We therefore tested how aggregating the observations spatially, sometimes termed "super-obbing", altered our model evaluation. Our objective was to capture the small regional-scale properties of iron and not those at a point source; therefore, we assume that the benefits gained by aggregating in this way, which include helping to produce a statistically useful number of observations, outweigh any potential biases.
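A minimal Python sketch of spatial aggregation ("super-obbing") is given below. The box size, the choice of the median as the aggregate, and the function name are assumptions made for illustration; the aggregation regions actually used in this study are not specified in this section.

```python
import math
from collections import defaultdict
from statistics import median


def super_ob(records, box_deg=5.0):
    """Aggregate point observations into latitude-longitude boxes.

    records: iterable of (latitude, longitude, value) tuples.
    box_deg: hypothetical box width in degrees.
    Returns {(lat_index, lon_index): (median value, sample count)}.
    """
    boxes = defaultdict(list)
    for lat, lon, value in records:
        key = (math.floor(lat / box_deg), math.floor(lon / box_deg))
        boxes[key].append(value)
    return {key: (median(vals), len(vals)) for key, vals in boxes.items()}


# Made-up observations: two fall in the same box, one in a neighbour.
aggregated = super_ob([(1.0, 1.0, 2.0), (2.0, 3.0, 4.0), (-1.0, 1.0, 10.0)])
```

Returning the sample count alongside the aggregate makes it easy to discard boxes with too few observations to be statistically useful.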

Variations in model temporal averaging
The model was run at a 30 min time resolution. At each 30 min time step, soluble iron, total iron, and the ratio of soluble to total iron (iron solubility) were computed. The model output was $S_i$ (daily mean soluble iron concentration on day $i$), $T_i$ (daily mean total iron concentration on day $i$), and $R_i$ (daily mean iron solubility on day $i$). Note that $R_i$ is the daily mean of the calculated 30 min solubilities and hence is not equal to $S_i/T_i$. We define online solubility as the average of ratios calculated as follows:
$$\text{online solubility} = \frac{1}{n}\sum_{i=1}^{n} R_i, \tag{6}$$
where $n$ represents the total number of records over which the average was calculated. Online solubility is reported throughout this study. In Sect. 3.4, we then compare the average of ratios to the ratio of averages (defined as offline solubility), calculated as follows:
$$\text{offline solubility} = \frac{\overline{S}}{\overline{T}}, \tag{7}$$
where $\overline{S}$ and $\overline{T}$ are the grid cell averages of soluble and total iron concentrations, respectively, over the total time period considered in this study (2007 to 2011). While Eq. (7) is common within the literature, this methodology can produce larger variability in iron solubility across grid cells because it is based on both soluble and total iron annual mean concentrations. In the online method, variability is reduced as extreme values in soluble and total iron concentrations generally do not occur at the same time. We can define the occurrence of extreme values, with respect to the time frame considered, by analysing a relative Z-score metric calculated as follows:
$$Z_{\mathrm{Fe},i} = \frac{\mathrm{Fe}_i - \overline{\mathrm{Fe}}}{\sigma_{\mathrm{Fe}}}, \tag{8}$$
where Fe is either total (Fe$_t$) or soluble (Fe$_s$) iron, and $\overline{\mathrm{Fe}}$ and $\sigma_{\mathrm{Fe}}$ are its mean and standard deviation over the time frame considered. The relative normalized Z score can then be calculated as follows:
$$Z_{\mathrm{rel},i} = \frac{Z_{t,i} - Z_{s,i}}{Z_{t,i}}, \tag{9}$$
where $Z_{t,i}$ and $Z_{s,i}$ are the Z scores of total and soluble iron concentrations, respectively, at each grid cell for each time step $i$. The Z-score metric provides a relative direction and distance of an instantaneous value with respect to its mean. The Z score is reported in multiples of the standard deviation (Eq. 8); therefore, a Z score of zero indicates that the data point value is identical to the mean value. To assess the relative difference in the variability at a given time between the modelled total and soluble iron concentration and its mean, we calculated the difference in Z scores between total and soluble iron concentrations and normalized it using the Z score of total iron concentration (Eq. 9). Note that the Z score of the soluble iron concentration could also be used to normalize the difference. This method allows for the examination of how the occurrence of extreme concentration values in total and soluble iron influences the method of solubility calculation (Eq. 6 vs. Eq. 7).
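The averaging and Z-score calculations referred to as Eqs. (6) to (9) can be illustrated with a short Python sketch. The function names and the four-day series are hypothetical and are not model output; for simplicity the daily solubility is set to the ratio of the daily means, whereas in the model it is the daily mean of the 30 min ratios, and the population standard deviation is assumed for the Z score.

```python
from statistics import fmean, pstdev

def online_solubility(ratios):
    """Average of ratios: mean of the per-record solubilities."""
    return fmean(ratios)

def offline_solubility(soluble, total):
    """Ratio of averages: mean soluble iron divided by mean total iron."""
    return fmean(soluble) / fmean(total)

def z_scores(values):
    """Z score of each value relative to the series mean (population std assumed)."""
    mean, sd = fmean(values), pstdev(values)
    return [(v - mean) / sd for v in values]

def relative_z(total, soluble):
    """Difference of Z scores normalized by the total-iron Z score."""
    zt, zs = z_scores(total), z_scores(soluble)
    return [(t - s) / t if t != 0 else float("nan") for t, s in zip(zt, zs)]

# Hypothetical daily means: day 3 is a dust event with high total iron
# and low solubility (arbitrary concentration units).
total = [1.0, 1.0, 10.0, 1.0]
soluble = [0.05, 0.05, 0.10, 0.05]
ratios = [s / t for s, t in zip(soluble, total)]

online = online_solubility(ratios)            # mean of 0.05, 0.05, 0.01, 0.05
offline = offline_solubility(soluble, total)  # 0.0625 / 3.25
rel_z = relative_z(total, soluble)
```

Here the online value (0.04) is roughly twice the offline value (about 0.019) even though the relative Z scores are essentially zero, because the dust event dominates the offline means while counting as only one record in the online average.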

Iron ocean deposition source apportionment
An ocean deposition source apportionment sub-study was designed to classify ocean deposition regions according to the dominant atmospheric soluble iron source, rather than ocean basins defined from a more traditional physical oceanographic viewpoint (e.g. Gregg et al., 2003). By incorporating recent model estimates for dust and the importance of pyrogenic iron emissions (Matsui et al., 2018), the seven large-scale source regions defined in Mahowald et al. (2008) were modified slightly to separate the major dust iron source regions from fire and anthropogenic combustion iron source regions. This resulted in a total of 10 iron emission source regions (Fig. 1; see also Table S1 for details).
Simulations in the source apportionment study used BAM-Fe, as described in Scanza et al. (2018), with slight modification. Briefly, anthropogenic combustion iron emissions were increased by a uniform factor of 5, and iron from fires followed the updated Fe : BC ratio (Table 4) and seasonal variability in the fire BC emissions, all as per MIMI. Aerosols were externally mixed in BAM, and therefore altering the regional aerosol loading did not affect aerosol transport or deposition in the more significant way it could in MAM, in which aerosols are internally mixed. This information was then used in Sect. 4.3 to compare differences in the daily mean deposition of soluble iron between the BAM-Fe and MIMI models within each defined ocean region.

Modelled dust and iron aerosol concentrations compared to observations
In terms of Earth system modelling and the biogeochemistry that connects the land-atmosphere-ocean components, we are ultimately motivated to improve the magnitude of the atmosphere-to-ocean iron deposition flux and its fractional solubility (from which the soluble iron flux can be derived). We compare the model results with a series of observations and herein highlight some of the problems discovered when directly comparing with a sporadic (in both space and time) observation dataset, as is currently common practice (Myriokefalitakis et al., 2018).

Global dust comparisons
Comparison of dust AOD with regional dust AOD observations (Fig. 2) from the AERONET observational datasets (Holben et al., 1998), as subsampled in Albani et al. (2014), shows good agreement globally (correlation: r² = 0.64). This results in MAM annual global mean emissions of 3250 ± 77 Tg dust a⁻¹ (Aitken 16 Tg a⁻¹, accumulation 36 Tg a⁻¹, coarse 3198 Tg a⁻¹), which is at the higher end of literature estimates of ∼ 500–4000 Tg dust a⁻¹ (Bullard et al., 2016; Huneeus et al., 2011; Kok et al., 2017). Dust emissions in MAM are 84 ± 4 % higher than our previous mean (Table S2), which affects coarse-mode dust aerosol (wherein 98 %-99 % of total dust mass is emitted) more than fine-mode dust aerosol. Globally, both dust concentrations (correlation: r² = 0.89) and deposition (correlation: r² = 0.83) are simulated well compared to observations within MIMI. A higher correlation of modelled dust concentrations with observations is calculated in the Northern Hemisphere (NH; r² = 0.89) compared to the Southern Hemisphere (SH; r² = 0.67), but the gradient of the line of best fit is further from 1 : 1 (NH: 1.22 vs. SH: 1.07). Conversely, for dust deposition a lower correlation with observations is simulated in the NH (r² = 0.75) compared to the SH (r² = 0.60) but with a gradient of the line of best fit closer to 1 : 1 (NH: 1.07 vs. SH: 0.72). Overall, the results presented in this study suggest an improvement on previous dust modelling complications related to underestimating dust deposition when tuned to dust concentration (Huneeus et al., 2011).

High-latitude dust and iron aerosol
Including the parametrization of Kok et al. (2014a) removes the requirement of a soil erodibility map (Table 1). In addition, in previous versions of the model, high-latitude dust sources were zeroed because there were no observations at that time to support high-latitude sources of dust (Albani et al., 2014). However, more recent observations have suggested that high-latitude dust sources do exist (Bullard et al., 2016; Crusius et al., 2011; Tobo et al., 2019), often related to glacial processes (Bullard, 2017) with a higher fraction of bioavailable iron relative to lower-latitude dust sources (Shoenfelt et al., 2017). Thus, for the new version of the model we have allowed for the inclusion of high-latitude dust sources (Fig. 3). In general, aerosol dust and iron concentrations peak closest to the coastlines and during summer. Emissions of dust from > 50° N are ∼ 1.3 ± 0.2 % of the global dust total, which is half of the estimates derived from field and satellite data at 2 %-3 % of the global total (Bullard, 2017; Bullard et al., 2016). However, the resulting magnitude and seasonality of dust concentrations have been shown in a recent study to be consistent with observed measurements from Svalbard (Tobo et al., 2019).

Global iron aerosol concentration and fractional solubility
There are several propositions explaining sources of soluble iron and the inverse relationship between total iron amount and iron solubility (Sholkovitz et al., 2012). While total iron mass concentrations are dominated by desert dust sources, soluble iron can be a product of mineral dust processed in the atmosphere or emitted from pyrogenic sources (Chuang et al., 2005; Guieu et al., 2005; Ito et al., 2019; Luo et al., 2008; Meskhidze et al., 2003; Schroth et al., 2009). Previous studies have shown that either of these can explain the inverse relationship and that the spatial distribution of data is required to provide more information. Therefore, we explored how to best use the spatial data to compare with the model results. The 5-year (2007 to 2011) mean iron concentration from MIMI is compared to an extensive dataset of observations of total iron and its fractional solubility (Fig. 4). The model captures the global mean observational total iron concentration well; however, relatively low regional correlations (r² < 0.4) occur in the south Indian (r² = 0.0), South Atlantic (r² = 0.34), North American (r² = 0.35), and high-latitude (r² = 0.06) ocean regions, suggesting that future model improvements can be focused here.
In the absence of iron atmospheric process modelling, ocean biogeochemistry models with an iron component (e.g. Aumont et al., 2015; Moore et al., 2004) have estimated iron solubility from offline dust modelling by means of an assumption that it contains 3.5 % iron by weight, of which 2 % is soluble. Iron solubility is highly temporally and spatially variable, however, and in the absence of spatial atmospheric emission information, pyrogenic iron sources, and atmospheric processing of iron, an estimate of 2 % solubility leads to underestimates of observed iron solubility in nearly all HNLC ocean regions (Fig. 4).
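As a worked illustration of the fixed-fraction assumption described above, the soluble iron flux is simply the dust flux scaled by two constants, i.e. 0.07 % of the dust flux everywhere. The function name and the example dust flux are hypothetical.

```python
IRON_FRACTION_OF_DUST = 0.035     # 3.5 % iron by weight
SOLUBLE_FRACTION_OF_IRON = 0.02   # 2 % of that iron is soluble

def soluble_iron_from_dust(dust_flux):
    """Soluble iron flux implied by a dust flux under the fixed-fraction assumption."""
    return dust_flux * IRON_FRACTION_OF_DUST * SOLUBLE_FRACTION_OF_IRON

# Example: 1000 units of dust imply about 0.7 units of soluble iron (same units).
flux = soluble_iron_from_dust(1000.0)
```

Because both factors are constants, this approach cannot reproduce any spatial or temporal variability in solubility.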
Aggregating observations onto a lower-resolution grid (sometimes termed super-obbing) compared with the model can help reduce the representation error when comparing with such limited observations (Schutgens et al., 2017). Fig. 5 uses an observational resolution one-third that of the model, and the model-to-observation comparison of the mean state is thus improved. Persistent observation-based features of the local environment become more obvious, while less frequent ones conversely diminish. At this observational resolution, the low total iron concentrations in the North Atlantic ∼ 30° N, as seen in Fig. 4, are perhaps not a common feature, and the model much more precisely represents the climatological state here than Fig. 4 might suggest. However, examining the North Pacific reveals that the model imprecisely represents the mean state here. Potential missing iron sources in remote regions, such as the North Pacific, include the following: (1) shipping emissions (Ito, 2013), which have a high soluble iron content from oil combustion (Schroth et al., 2009); (2) volcanic emissions, which provide a localized "fertilizer" to the surface ocean owing to the macronutrients and trace metal nutrients contained within them (Achterberg et al., 2013; Langmann et al., 2010; Rogan et al., 2016); and (3) low Asian and South American aerosol concentrations, either through underrepresenting combustion emission sources or in the transport and deposition of aerosol within these regions (Wu et al., 2018). These are discussed in more detail in Sect. 5.1 and 5.2.
In terms of iron solubility (soluble iron concentration / total iron concentration), the model is not capturing the observational mean state in many regions (Fig. 5). A detailed examination of the observation point at 18° N and 330° E (anomalous green point surrounded by blue points in the North African outflow plume in Fig. 4) and the nine model grid cells co-located with it in Fig. 6 shows how a single high observation (155 % solubility) is causing a representation issue (see also Sect. 4.3 regarding soluble iron deposition). Both model and observation histogram distributions are similar, as are the median (model: 1.8, observation: 0.9) and geometric mean (model: 2.1, observation: 1.3) values.
However, the arithmetic means are not similar (model: 2.5, observation: 9.6) and while a high observation value of 155 % is likely to be an outlier and should be at most 100 %, it still informs us about what is possible, and simply discounting it (even at an adjusted 100 %) would require strong justification. It is therefore advisable to instead alter the estimator of the average. Comparing model to observation differences calculated using the median or geometric mean reveals that they are similar in magnitude, as one would expect for log-normally distributed data (Fig. 6 insert). Although the median is robust with respect to outliers, the model results may not exhibit a uniform Gaussian distribution (Fig. 6 insert; solid compared to dashed lines), and often the number of available observations is also low (Fig. 7), suggesting that its use also requires careful consideration. An equivalent methodology to the geometric mean in Fig. 7 would be to first log-transform the data, calculate the arithmetic mean, and then back-transform the result. Arguments pertaining to the appropriate methodology for comparing model results to temporally limited observations extend beyond the iron aerosol examination in this study to all aerosol comparisons with limited observations.
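The sensitivity of the average to the choice of estimator can be reproduced with a few lines of Python. The values below are hypothetical solubilities chosen only to mimic a sample with one extreme record; they are not the observations at this site.

```python
from math import exp, log
from statistics import fmean, geometric_mean, median

# Hypothetical solubilities (%) with one extreme value.
obs = [0.5, 0.8, 0.9, 1.0, 1.5, 2.0, 155.0]

arithmetic = fmean(obs)          # pulled strongly upwards by the single extreme value
med = median(obs)                # unaffected by the magnitude of the extreme value
geometric = geometric_mean(obs)  # damped response to the extreme value

# Equivalent route to the geometric mean: arithmetic mean in log space,
# followed by a back-transform.
geometric_via_logs = exp(fmean([log(x) for x in obs]))
```

For this sample the arithmetic mean is 23.1 while the median is 1.0, and the geometric mean lies between the two, much closer to the median.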

Calculating iron solubility
It is interesting to note the effect that the order of operations (taking the average of ratios compared to the ratio of averages) has when calculating iron solubility (Fig. 8). Throughout this study, the percent of iron solubility was calculated at each model time step (30 min) and then the daily mean output was analysed (online; Eq. 6) at an annual or 5-year mean time resolution. It is also acceptable to use the simulated soluble and total iron concentrations to generate the annual or 5-year mean iron solubility in a post-processing step (offline; Eq. 7). The resulting differences between methods are not insignificant (Fig. 8); however, the offline method creates a distribution in which low iron solubility is generally lower and the highest (> 18 %) iron solubilities are generally higher. Overall, the global annual mean iron solubility calculated online is one-third (34 %; NH = 40 %, SH = 29 %) higher than when calculated offline.
The average relative Z score (Eqs. 8 and 9) is around zero for most model grid cells (Fig. 9), indicating that they mostly followed similar temporal and relative magnitude trends. However, even if the average relative Z scores are around zero and the ratio of relative standard deviations is around one, the ratio of online- to offline-calculated iron solubility is most likely > 1. Temporal differences in the soluble and total iron concentration might therefore be controlling the overall solubility at each model grid cell. We also find that the ratio of online and offline solubility is > 1 for most of the cases when the ratio of the relative standard deviations of soluble and total iron is < 1 (Fig. S2), indicating that the differences in both methods of iron solubility calculation are sensitive to the differences in the relative size of the tails of the distribution. That is, if soluble iron has narrower tails compared to total iron at any grid cell, it is highly likely that a higher solubility will be obtained in the online method compared to offline. The extreme ratio of the tails of soluble and total iron is only found in specific regions with the highest temporal variability in emissions and modelled solubilization of insoluble iron (Fig. S2).
Field measurements have generally suggested an inverse relationship between total and soluble iron concentrations (Myriokefalitakis et al., 2018). This means that high total iron concentrations are generally accompanied by low soluble iron concentrations and vice versa. By assuming that the field measurements faithfully represented the actual average values of soluble and total iron concentration at those locations, we implicitly assume that all the measurements have a Z score of zero. In Fig. 9 we show that this is not the case with the modelled results, and the two variables can be relatively farther from their respective means even when averaged over the modelled time period.
The sensitivity of a result to the order of operations extends beyond iron solubility to any variable that is calculated in a similar manner, and current multi-model intercomparison project (MIP) protocols do not explicitly account for this. However, the effects of outliers, in both online and offline methods, can be reduced by employing the geometric mean, which has been used in some MIPs (e.g. Mann et al., 2014). It will also be important to consider differences in the solubility of iron induced by the choice of the order of operations as ocean biogeochemical models move away from using offline results from global climate or chemistry transport models to online results within Earth system models, which are designed to couple the two components at each time step. For short-term interactions between deposited iron and ocean biota, shorter-term averaging may be more important (e.g. Guieu et al., 2014), but for the long-term accumulation of iron that is (re)cycling in the oceans, the longer-term average may be more appropriate. One should be aware, however, that iron is readily removed from the ocean mixed layer, and thus the lifetime of iron may well be short enough for the online calculation to be more appropriate much of the time (Guieu et al., 2014).

MIMI vs. BAM-Fe
In this section, we discuss how the new modal aerosol-mode version of MIMI compares to its predecessor bulk aerosol model version (BAM-Fe) throughout all three stages of the atmospheric iron life cycle.

Iron emission comparison
Globally averaged emissions of dust (3200 Tg a⁻¹) and its iron component (126 Tg a⁻¹) are within the current multimodel range (Table 7). The simulated annual mean iron in dust percentage is 4.1 %, with the highest percent occurring in the coarse mode at 6.5 % and the lowest percent occurring in the Aitken mode at 1.1 %. Accounting for dust mineralogy therefore increases the global mean iron percent by weight above the currently well-used global mean estimate of 3.5 % (e.g. Jickells et al., 2005; Shi et al., 2012).
Compared to BAM-Fe, MIMI dust emissions are ∼ 80 % higher and the iron it contains is ∼ 120 % higher (Table 7). Although both the BAM-Fe and MIMI models are globally tuned to a similar dust AOD (∼ 0.03) and based within the same host model (CESM), changing from a bulk aerosol scheme (e.g. Albani et al., 2014; Scanza et al., 2015) to a modal aerosol scheme reduces the aerosol lifetime significantly (Table S2). The spatial distribution of dust emissions is also different following the move to the Kok et al. (2014a, b) parameterization (Table 1), resulting in the spatial distribution of dust AOD also being altered (Fig. S3). Total pyrogenic iron emissions (sum of fires and anthropogenic combustion activity) in MIMI are higher than previous estimates by a factor of between 2 and 3 (Table 7), reflecting the growing evidence that they have been previously underestimated (Conway et al., 2019; Ito et al., 2019; Matsui et al., 2018).

Iron atmospheric processing comparison
There is a much lower aerosol pH in the fine aerosol modes (Aitken and accumulation) in MIMI compared to that in BAM-Fe (Fig. 10). This is due to a combination of resolving pH in each aerosol size mode in MIMI and the subsequent lowering of the pH value (1) being applied in the two fine aerosol modes (Aitken and accumulation). Conversely, dust dominating the coarse aerosol mode provides more of an opportunity for [calcite] > [SO₄] in this aerosol size fraction, resulting in most continental areas having a high coarse-mode aerosol pH in MIMI compared with the higher pH being much more localized to the major desert regions in BAM-Fe. Acidic processing of iron in MIMI therefore proceeds faster globally in the fine-sized aerosol modes (Aitken and accumulation) compared to the BAM-Fe fine-sized bin (0.1-1 µm), but it is generally slower over continental regions in the coarse mode than in BAM-Fe coarse-sized bins (1-10 µm). Comparison of Fig. 10 to modelled pH estimates by Myriokefalitakis et al. (2015) shows generally good agreement in the NH, but in the SH MIMI simulates less acidic coarse-mode aerosol over continental regions and more acidic aerosol over marine regions.
Table 7. Dust, fire, and anthropogenic combustion emissions of iron and relevant co-emitted aerosol emissions (to two significant figures). Multi-model emission range from the four global atmospheric iron models (including BAM-Fe) reported in Myriokefalitakis et al. (2018). Fine (sum of Aitken and accumulation modes) and coarse (coarse mode) mass emissions also given for dust, fire iron, and combustion iron.
As iron models are unable to capture the high observed iron solubility (> 10 %) over SH marine regions (Myriokefalitakis et al., 2018) and in the absence of remote pH aerosol observations, we suggest that our basic parameterization captures an aerosol pH which is suitable for use in Earth system models.
Model physics, and hence simulated cloud cover, are significantly different between CAM4 and CAM5. Figure 11a shows the relative model difference in the oxalate distribution between MIMI, which also includes an increase in the tuning factor by an order of magnitude (from 15 to 150; Table 5), and BAM-Fe by normalizing by the simulated cloud fraction in each model. The effect of oxalate on iron dissolution is therefore larger in MIMI over extratropical ocean regions, where iron models underrepresent solubility (Myriokefalitakis et al., 2018), and land regions which are dense in tropical vegetation or industry (both centres of large aerosol precursor gas emissions). Compared to observations (Myriokefalitakis et al., 2011; Table S3) modelled oxalate concentrations are well represented at high observed concentrations but are biased low when observed concentrations are low (Fig. 11b). The low model bias is stronger within remote observational regions (marine vs. urban observation sites), suggesting that the removal of secondary organic aerosol may be too strong within the model and/or that there is a missing marine aerosol precursor gas emission source (Facchini et al., 2008; O'Dowd and de Leeuw, 2007) in this model which significantly lowers simulated secondary organic aerosol, and thus oxalate, concentrations.
Comparison of mineral dust and pyrogenic sources of modelled soluble iron (sum of emissions and atmospheric dissolution; Fig. 12) with the four iron models (including BAM-Fe) reported by Myriokefalitakis et al. (2018) shows that the spatial distribution in MIMI is broadly similar for most regions of the world. A notable difference exists in the North Pacific region, where the soluble iron source in MIMI is lower than all other iron models, similar to total iron concentrations when compared to observations (Figs. 4 and 5). Future development of MIMI should thus be focused on the North Pacific, including the addition of shipping soluble iron emissions, which are relatively concentrated in this region (Ito, 2013). An improvement for MIMI can be seen over the Atlantic region directly downwind of Saharan soluble iron sources. In general, iron models are overrepresenting iron solubility close to dust sources compared to observations (Myriokefalitakis et al., 2018), and in order for BAM-Fe to reach better agreement with observed iron solubility in this region dust emissions of soluble iron had to be scaled downwards (Conway et al., 2019). We suggest that this improvement is linked to the improved modal representation of aerosol pH in MIMI (Fig. 10).
Figure 9. Relationship of online- to offline-derived iron solubility to the relative Z score for total (ZFe_t) and soluble (ZFe_s) iron and the relative standard deviation (σ_Fe/Fe) at each grid cell for the year 2007.
Figure 10. Surface-level annual mean interstitial aerosol pH. If [SO₄] > [calcite] then pH = 1 in Aitken and accumulation modes or 2 in coarse; otherwise, pH = 7.5 (Table 5).
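The interstitial aerosol pH rule stated in the Fig. 10 caption can be written as a small function. This is a sketch of that stated rule only; the function name, the mode labels, and the assumption that both concentrations are supplied in the same units are illustrative choices, not part of the model code.

```python
def interstitial_ph(so4, calcite, mode):
    """Return the prescribed aerosol pH for one aerosol mode.

    so4 and calcite are concentrations in the same (unspecified) units;
    mode is one of 'aitken', 'accumulation', or 'coarse'.
    """
    if mode not in ("aitken", "accumulation", "coarse"):
        raise ValueError(f"unknown aerosol mode: {mode}")
    if so4 > calcite:
        # Acidic case: sulfate exceeds the calcite buffering capacity.
        return 2.0 if mode == "coarse" else 1.0
    # Buffered case: calcite equals or exceeds sulfate.
    return 7.5

# Hypothetical concentrations in arbitrary units.
fine_ph = interstitial_ph(so4=2.0, calcite=0.5, mode="accumulation")
coarse_ph = interstitial_ph(so4=2.0, calcite=0.5, mode="coarse")
dusty_ph = interstitial_ph(so4=0.5, calcite=2.0, mode="coarse")
```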

Iron ocean deposition flux comparison
Similar to the previous study by Scanza et al. (2018), we report the amount of total and soluble iron deposited in each of the major ocean basins (Table 8) as defined by Gregg et al. (2003). We find that in MIMI the amount of total iron deposited to all ocean basins is approximately double that estimated in BAM-Fe (26 vs. 12 Tg Fe a⁻¹, respectively), while soluble iron deposition is similar (∼ 0.5 Tg Fe a⁻¹ in both models). The larger mineral dust emission flux in MIMI (3200 Tg dust a⁻¹ compared to BAM-Fe dust emissions of 1800 Tg dust a⁻¹) is driving most of the increases in total iron deposition because it is the primary iron source (Table 7). In general, the magnitude of soluble iron deposition to the oceans is more evenly distributed across hemispheres in MIMI owing to a major reduction (approximately one-half) in the equatorial north-central Atlantic basin deposition flux and increases to SH ocean deposition fluxes of a factor of 2 to 4. In MAM4 dust is treated as internally mixed aerosol with sea salt, leading to higher rates of wet deposition than when dust is externally mixed aerosol as it is in CAM4. The internally mixed treatment of dust aerosol in MAM4 is thus an important factor leading to the lower simulated dust lifetime when compared to BAM-Fe (Table S2). Over the north-central Atlantic region, the combination of a lower soluble iron source (Fig. 12 compared to Fig. S4b by Myriokefalitakis et al., 2018), dust atmospheric lifetime (Table S2), lower aerosol pH (Fig. 10), and lower relative organic ligand processing (Fig. 11) will all work towards reducing the magnitude of atmospheric soluble iron deposition flux in MAM4 compared to BAM-Fe. There are significant increases in anthropogenic combustion iron deposition in all equatorial and NH ocean basins driven by the fivefold increase in combustion emissions implemented in MIMI.
The percent contribution from pyrogenic iron to total iron deposition between MIMI and BAM-Fe is, however, more similar for all northern and equatorial oceanic regions than southern oceanic regions. Beyond the correction to anthropogenic combustion emissions, which are NH dominated, this could be due to differences in the emissions of both dust and fire aerosol, structural differences between models relating to the aerosol size and composition, which alters aerosol deposition rates, or a lower soluble iron source (Fig. 12); it is most likely to be a combination of all three.
The fraction of fire aerosol which is injected above the boundary layer is crucial for determining its capacity for long-range transport (e.g. Turquety et al., 2007). Vertically distributing fire iron emissions in MIMI, as compared to emitting all iron from fires at the surface as in BAM-Fe, increases the long-range transport of iron aerosol to remote ocean regions (Fig. 13). In general, vertically distributing fire emissions results in small increases in soluble iron deposition (between 0 % and 20 %) in SH ocean regions and a larger increase (between 20 % and 40 %) in NH oceans, with converse lower land deposition close to the major regions of fire activity. The exception is in the sub-Arctic North Pacific, an HNLC region, where iron deposition from fires significantly increased until it was more than double that when surface fire emissions are used. The dry deposition flux is sensitive to aerosol properties, surface roughness, and modelled turbulence. Although increasing the vertical resolution has been shown to increase surface PM₁₀ concentration (Menut et al., 2013) and better simulate the dust vertical profile (Teixeira et al., 2016), it is not yet clear if this would correspondingly increase the dry deposition flux.
Table 8. Global and regional ocean basin deposition (Gg a⁻¹) of total and soluble iron in BAM-Fe and MIMI (this study). Deposition was multiplied by the ocean fraction of model grid cells and is reported at two significant figures. The percent contribution from pyrogenic (sum of fires and anthropogenic combustion) iron sources to deposition is also given. Ocean basins are those defined by Gregg et al. (2003) and previously used by Scanza et al. (2018). Columns: dust and combustion deposition (Gg a⁻¹); percent iron from pyrogenic sources (%).

Source region comparison
The eight regions in Fig. 14 are chosen based on 10 (one for each region in Fig. 1) simulations undertaken using the modified version of BAM-Fe described in Sect. 2.6. The emission region (Fig. 1) with the highest fractional contribution to the total soluble deposition flux in each grid cell was examined, and from this the boundaries of each region in Fig. 14 were delineated. The resulting eight ocean iron deposition regions are split equally into four in the NH and four in the SH. Note, however, that the NH-SH divide sits at 15° S and not the Equator, which is due to transport differences in each hemisphere and the position of the Intertropical Convergence Zone. Of the four regions that can be defined as major dust deposition receptors (Fig. 14; bottom panel bar chart), the north Indian Ocean (no. 1), North Atlantic and central Pacific (no. 4), and South American dust (no. 7) regions have a single dominant source each, while the North Pacific (no. 3) region is more variable. These dust-dominated iron deposition regions are similarly reproduced by other global iron models (Ito et al., 2019; Myriokefalitakis et al., 2018). The regions of the mixed Southern Hemisphere oceans (no. 5) and Australian and South Pacific (no. 6) receive similar amounts of mineral dust and pyrogenic iron, suggesting that the iron sources are spatially closer and thus share much more similar transport pathways than the South East Asian ocean (no. 2) and South American pyrogenic (no. 8) regions, which have a much more distinct pyrogenic iron source signal. Deposition regions are more clearly defined when using this methodology compared to those from a more traditional classification of ocean basins based on physio-geographical oceanography (Fig. S4).
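The step of identifying the emission region with the highest fractional contribution in each grid cell can be sketched as follows. The source labels, grid cell names, and deposition values are hypothetical placeholders, and the sketch covers only the per-cell selection, not the subsequent delineation of region boundaries.

```python
def dominant_source(contributions):
    """Return the source label with the largest deposition contribution."""
    return max(contributions, key=contributions.get)

def dominant_fraction(contributions):
    """Fraction of the total deposition supplied by the dominant source."""
    return max(contributions.values()) / sum(contributions.values())

# Hypothetical soluble iron deposition (arbitrary units) per tagged source
# region, one dictionary per grid cell.
grid_cells = {
    "cell_A": {"dust_source_1": 8.0, "fire_source_1": 0.5, "combustion_source_1": 1.5},
    "cell_B": {"dust_source_1": 0.2, "fire_source_1": 2.1, "combustion_source_1": 0.4},
}

dominant = {cell: dominant_source(c) for cell, c in grid_cells.items()}
fraction = {cell: dominant_fraction(c) for cell, c in grid_cells.items()}
```

Cells with a dominant fraction close to one have a single clear source, whereas values nearer to an even split indicate the mixed-source regions discussed above.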
This information can be used to assess which ocean regions are most likely to be affected by anthropogenic perturbations to the magnitude of iron sources within different regions, whether through land use, land cover change, or industrialization.
The variability in the daily soluble iron deposition flux to each of the eight ocean regions, as seen in Fig. 14, is much larger in MIMI than it is in BAM-Fe (Fig. 15), reaching over 10 orders of magnitude between the minimum and maximum flux in many regions. This is due in part to the increased variability in fire emissions, which was improved in MIMI to track the BC emitted from fires, and switching from the offline soil erodibility map used in BAM-Fe to the Kok et al. (2014a) physically based emission parametrization used in MIMI. Anthropogenic combustion emissions are temporally static in both model frameworks and therefore do not affect the variability in this study as much as fires and mineral dust but will in the future if this is changed to represent a seasonal emission cycle. We can see that each of the dust and fire updates in MIMI has a large impact by comparing the Patagonian dust-dominated South American dust (SADU) region and the fire-dominated South American pyrogenic (SAPY) region. Most of the dust deposited (30 % to 90 %) in the ocean occurs during large dust events that are on just 5 % of the days (Mahowald et al., 2009), resulting in large differences between median and mean deposition amounts in all regions, as seen in Fig. 15. It is important to note that the mean is always above the interquartile range, further supporting our previous arguments pertaining to the modelled mean not being an ideal estimate of the average as it does not represent the log-normal distribution of aerosol. Comparing the mean-to-median ratio suggests that extreme dust events are also more pronounced in MIMI (CAM5) than in BAM-Fe (CAM4).
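The effect of episodic deposition on the mean-to-median ratio can be illustrated with a toy series. The 20-day record below is hypothetical: 19 background days and one event day (5 % of the days) that supplies most of the deposition.

```python
from statistics import fmean, median

def mean_to_median(daily_flux):
    """Ratio of the arithmetic mean to the median of a daily flux series."""
    return fmean(daily_flux) / median(daily_flux)

# Hypothetical daily deposition fluxes (arbitrary units).
daily_flux = [1.0] * 19 + [81.0]

event_share = max(daily_flux) / sum(daily_flux)  # share deposited on the event day
ratio = mean_to_median(daily_flux)
```

In this example one day accounts for 81 % of the deposition, and the mean is five times the median, so the mean lies above every background value and therefore above the interquartile range.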

Future directions
The purpose of model-to-observation comparisons is to identify situations (regions, times, model settings, or combinations thereof) in which the model output is inconsistent with observed realities, with the goal being to further refine the model in the future. Each individual observation represents a snapshot of the atmospheric state at a specific point in space and time, and when an observation falls outside the distribution of model output values from the same location and time, we can view this as evidence of a model misspecification. For the example of iron modelling, constraining current model-observation discrepancies would benefit from further exploring the model sensitivity of simulated iron and its solubility to uncertainties in five major parameter sets: dust iron emissions, pyrogenic iron emissions, atmospheric iron dissolution chemistry, dry deposition rates, and wet deposition rates. In general, improving the modelled representation of secondary organic aerosol (including oxalate) and aerosol pH, particularly for remote regions, is an important task for aerosol modelling and one which would have co-benefits for iron aerosol modelling. Comparisons of the soluble fraction of other aerosol species with observations could also be used to guide model development.
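As a minimal sketch of the comparison described above, an observation can be tested against the distribution of model values from the same location and period. The function name `outside_model_range` and the 2.5th to 97.5th percentile bounds are assumptions chosen for illustration; they are not thresholds used in this study.

```python
import numpy as np

def outside_model_range(obs, model_values, lower=2.5, upper=97.5):
    """Return True if obs falls outside the chosen percentile range of model values.

    model_values: model output at the observation location, e.g. all daily
    values in the month of the observation (hypothetical sampling choice).
    """
    lo, hi = np.percentile(model_values, [lower, upper])
    return bool(obs < lo or obs > hi)

# Toy example with ten model values (arbitrary units).
model_daily = np.arange(1.0, 11.0)
flag_extreme = outside_model_range(100.0, model_daily)
flag_typical = outside_model_range(5.0, model_daily)
```

A flagged observation is only a prompt for closer inspection; with very few observations per region, a single flag cannot by itself separate model error from an unrepresentative sample.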
Here we discuss some of the model parameters which are likely important for improving modelled iron emissions and deposition in MIMI, and thus iron process models in general, in the future.

Improving iron aerosol emissions
Downwind of significant mineral dust sources iron models generally overestimate the observed amount of total iron (Myriokefalitakis et al., 2018), and soluble iron comparisons are highly sensitive to the assumed initial solubility of mineral dust iron at emission (Conway et al., 2019). Conversely, in remote ocean regions, improving the representation of combustion emissions has been shown to be a necessary step towards more accurate representations of observed high iron solubilities at low iron concentrations (Ito et al., 2019).

Mineral dust iron aerosol emissions
In Fig. 4 the high model estimates of total iron, compared to observations, downwind of North African mineral dust sources could be due to uncertainties in the magnitude of hematite emissions within the model. Hematite contains by far the largest fraction of iron of any mineral in MIMI (Table 3), with a major source in the Sahel (Fig. S5). The Sahel is a borderline dust source, and emissions from this region have been shown to be sensitive to different model dynamics, even when forced with reanalysis winds, for example between CAM4 and CAM5. Other studies have shown a large sensitivity of dust generation to the details of the soil erodibility map (e.g. Cakmur et al., 2006). For CAM5 with the DEAD emissions scheme, Scanza et al. (2015) showed that improvements in estimating the direct radiative forcing of mineral dust could be achieved by assuming that hematite is only emitted from clay minerals and not silt, with an effective reduction of ∼ 30 % from the coarse-mode emission of hematite. Although MIMI has employed an updated dust emission scheme (Table 1; Kok et al., 2014a), the model is still sensitive to assumptions within the offline mineralogy maps and applications of the brittle fragmentation theory therein. For instance, the single-scattering albedo, which is a critical parameter in estimating the direct radiative forcing (e.g. Di Biagio et al., 2009), becomes more comparable to observations (Kim et al., 2011) if the same assumption as in Scanza et al. (2015) is applied (Fig. S6). Quantifying the uncertainty in the climate response to different assumptions in mineralogy and dust emissions, and any reanalysis meteorology driving them, is therefore an important task.

Pyrogenic iron aerosol emissions
Matsui et al. (2018) recently showed that combustion iron emissions have been underestimated in current models. One possible reason for this underestimate is that the anthropogenic combustion iron emissions from Luo et al. (2008) are for 1996. Taking steel-making and coal consumption (which are also linked to iron emissions) as a proxy for economic development shows that growth in these sectors boomed exponentially post-2000, particularly in Asia and India (Ghosh, 2006; Lee and Chang, 2008). Therefore, 1996 emissions do not capture recent industrial developments, and updating the anthropogenic combustion iron emission inventory for use in the 21st century is a critical next step.
During a fire, the iron contained in leaves and wood (Price, 1968) will be released to the atmosphere along with the iron contained in the surrounding soil, whether entrained from the ground due to pyro-convective updrafts (Wagner et al., 2018) or through a remobilization of terrigenous particles which have previously been deposited onto vegetation (Gaudichet et al., 1995; Paris et al., 2010). All sources are subsequently internally mixed within the smoke plume before any downwind observation occurs. Differentiating the iron contribution of burnt biomass from that of entrained dust was not considered in any of the studies in Table 4 but would be required to define the correct mineralogy and solubility of iron from fires. If we assume that biomass contains low concentrations of iron relative to the surrounding soils, then we could expect a difference in observed Fe : BC ratios between a cerrado (savannah) environment, where surrounding soils are dry and dust is easily mobilized, and a tropical environment, where soils are wet and dust is not as easily mobilized. We do not see this in Table 4, however: both regions have a similar range, which spans around 2 orders of magnitude from low to high. No concrete conclusions can be drawn from such a limited dataset, so more observations are needed to distinguish which source (biomass or dust) is contributing most to the iron measured downwind of fires.
The physical, chemical, and biological properties of the underlying soil are also impacted by fires (Certini, 2005), and it can take years after a fire for the soil to return to its pre-fire state. For example, the removal of vegetation and the surface crust by fires from dune regions will create a new opportunity for dust mobilization (Strong et al., 2010), and higher-intensity fires can also increase the erodibility of soils and the availability of fine particles through breaking down the soil structure (Levin et al., 2012). Furthermore, under high temperatures the fire can transform the underlying soil mineralogy, with iron decreases in clay minerals and increases in magnetic iron oxide minerals (Crockford and Willett, 2001; Ketterings et al., 2000; Ulery and Graham, 1993). The amount of dust emitted from post-fire landscapes is potentially very significant, with Wagenbrenner et al. (2017) estimating that an extra 12-352 Tg of dust as PM10 (40 % of which was estimated to be PM2.5) was emitted to the atmosphere in 2012 from post-fire landscapes in the western US alone. Total and soluble iron emissions in dust from post-burn regions are therefore also likely to differ from those of unburnt soils; this requires further study and likely depends on the fire regime and the time since the fire occurred.
The most advanced iron processing models currently consider industrial, domestic, wildfire, and shipping pyrogenic emissions (Myriokefalitakis et al., 2018). An emerging discussion is on the importance of volcanic ash and the iron it contains for ocean biogeochemistry (Langmann, 2013). Figures 4 through 7 show that MIMI underrepresents both total iron and its solubility in the remote extratropical Pacific where volcanic emissions may be an important missing iron source. Future understanding of volcanic iron sources is potentially important as once deposited to the ocean, particularly in regions that are iron limited or seasonally iron limited, volcanic inputs have been shown to alter satellite chlorophyll (Hamme et al., 2010; Rogan et al., 2016) and the drawdown of macronutrients (Lindenthal et al., 2013). The volume of metals released by a volcano is subject to many uncertainties, including both the nature of the volcano and its eruption type and strength, leading to estimates which can vary by many orders of magnitude (Mather et al., 2006, 2012). To date most studies have focused on ocean inputs from shorter-term explosive eruptions rather than continuous inputs from quiescent passive degassing volcanoes which are likely to be most important only for the central Pacific region downwind of volcanoes located within the "ring of fire" (Olgun et al., 2011).

Aerosol deposition
An examination of aerosol dry deposition in CAM5 by Wu et al. (2018) showed that the deposition velocity for Aitken- and accumulation-sized BC particles is potentially an order of magnitude too high. It is highly likely that this will also be the case for dust. As the largest discrepancies between models and observations are in remote ocean regions, improving the model long-range transport of iron by investigating deposition rates is an important constraint to be applied to the model.

Conclusion
It is important to accurately model the atmospheric iron cycle because of the impacts of iron on human health, ocean biogeochemistry, and climate. Atmospheric iron process modelling suitable for use in global climate and Earth system modelling is a new model development area, and as such it is currently undergoing rapid development. Here we have detailed the development of the Mechanism of Intermediate complexity for Modelling Iron (MIMI v1.0) such that it now represents iron emissions, atmospheric processing, and deposition within a global modal aerosol microphysics framework.
The solubility of iron depends on the underlying aerosol iron properties, such as dust mineralogy and combustion fuel type, and the degree to which dissolution from an insoluble to soluble iron form has occurred in the atmosphere. Which of these is the dominant factor in the observed inverse relationship between iron solubility and total iron mass is currently unknown. Updating the mineral dust emission scheme to a physically based parameterization, however, has improved model performance by increasing total iron close to mineral dust sources, where solubility is observed to be low (Figs. 4 through 7). Updating pyrogenic iron emissions from fires increases the long-range transport of soluble iron to remote ocean regions, where observed solubility is higher (Figs. 4 through 7), while increasing anthropogenic combustion iron emissions by a factor of 5 brings the total in line with more recent evaluations of their magnitude (Conway et al., 2019; Matsui et al., 2018). Emission updates have also increased the variability in soluble iron deposition (Fig. 15). Improvements to the atmospheric iron processing scheme in MIMI also increase iron dissolution in more remote regions relative to mineral dust sources, again in line with observations.
Comparisons with observations (Figs. 4 through 7) show that in general MIMI simulates total iron concentrations well. However, comparison of modelled iron solubility to observations reveals that while the model captures many regional features, some are missed. It is unclear, however, whether this problem arises from the model or observational representation of the system owing to the insufficient number of observations available to build a robust observational result for such a highly variable quantity in the Earth system, even when aggregating over small regional scales. There are significant differences in calculating iron solubility based on the order of the averaging operation. When calculating at each model time step, global annual mean iron solubility is one-third (34 %; NH = 40 %, SH = 29 %) higher than when calculated from monthly mean values. Earth system models are designed to integrate land-atmosphere-ocean-ice components at each time step and could thus yield different results based on the coupling time step length employed. Furthermore, the mean is shown not to be an accurate representation of the average atmospheric state due to the non-Gaussian distribution of aerosol concentrations. In many regions, however, there are just a few (fewer than five) observations, often only one, so while the use of the median is robust with respect to extreme values, a limited observational dataset cannot truly discriminate if extreme values are outliers or the norm. Use of the mean also significantly overestimates the average atmospheric soluble iron deposition to the ocean and is always larger than the upper quartile of the distribution in daily deposition. However, this bias may be tempered due to ocean biogeochemistry processes likely being relevant over timescales which are longer than those in the atmosphere. 
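The dependence of iron solubility on the order of the averaging operation can be shown with a small worked example. The values below are invented for illustration (three low-iron, higher-solubility time steps and one dust event with high total iron and low solubility) and do not come from MIMI output.

```python
import numpy as np

# Hypothetical time series within one averaging period (arbitrary units).
total_fe = np.array([1.0, 1.0, 1.0, 97.0])       # total iron concentration
solubility = np.array([0.10, 0.10, 0.10, 0.01])  # fractional solubility per step
soluble_fe = total_fe * solubility

# (a) Solubility calculated at each time step, then averaged.
mean_of_ratios = np.mean(soluble_fe / total_fe)

# (b) Solubility calculated from the period-mean soluble and total iron.
ratio_of_means = soluble_fe.mean() / total_fe.mean()

print(f"time-step average: {100 * mean_of_ratios:.2f} %")
print(f"ratio of means:    {100 * ratio_of_means:.2f} %")
```

Because the ratio of means is weighted by total iron mass, the high-mass, low-solubility event dominates it, whereas the time-step average weights every step equally. When solubility and total iron are inversely related, as in the observations discussed here, the time-step average is therefore the higher of the two.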
Future work will need to consider how best to compare models to sporadic observations, potentially making use of distributions rather than a more limited absolute average.
The main sources of soluble iron deposition vary both between and within ocean basins. The redefinition of ocean basins based on the dominant iron deposition source, rather than a traditional physio-geographical ocean basis, can therefore aid in determining where continental anthropogenic activity will have the greatest impact on ocean biogeochemistry and which source region is linked to where model-observation comparisons are poor. For example, modelling of total iron and its solubility in the South Atlantic could be improved by further improving our understanding of industrial combustion and fires within South America. Furthermore, soluble iron deposition to Southern Hemisphere oceans in MIMI, where combustion and fire emissions have a significant impact, is a factor of 2 to 4 higher than in BAM-Fe, which is the model simulating the largest atmospheric fluxes to the ocean of the comparable models studied in Myriokefalitakis et al. (2018). As integrated Earth system models develop in the future, taking a holistic view to understanding how dust and fires are coupled in terms of feedbacks on iron emissions is an important step for predicting how future changes in climate will alter the Earth system response to human perturbations of the natural system.
Code and data availability. Model code (emissions and atmospheric processing for MIMI v1.0) and data are available at http://www.geo.cornell.edu/eas/PeoplePlaces/Faculty/mahowald/dust/Hamiltonetal2019/. Observational iron data are available from Mahowald et al. (2009) and Myriokefalitakis et al. (2018). Observational oxalate data are available from Myriokefalitakis et al. (2011).
Author contributions. DSH developed MIMI, which incorporates model code previously developed by RAS, YF, JFK, XL, and MW. DSH undertook all model simulations and wrote the paper with support from NMM, JG, and SDR. DSH prepared all figures and tables apart from Fig. 1 and Table S1 (JSW), Figs. S3 and S6 (LL), and Figs. 9 and S2 (SDR). All authors edited the paper text.
Competing interests. The authors declare that they have no conflict of interest.