First description of the Minnesota Earth System Model for Ocean biogeochemistry (MESMO 1.0)

Here we describe the first version of the Minnesota Earth System Model for Ocean biogeochemistry (MESMO 1.0), an intermediate complexity model based on the Grid ENabled Integrated Earth system model (GENIE1). As with GENIE-1, MESMO has a 3D dynamical ocean, energy-moisture balance atmosphere, dynamic and thermodynamic sea ice, and marine biogeochemistry. Main development goals of MESMO were to: (1) bring oceanic uptake of anthropogenic transient tracers within data constraints; (2) increase vertical resolution in the upper ocean to better represent near-surface biogeochemical processes; (3) calibrate the deep ocean ventilation with observed abundance of radiocarbon. We achieved all these goals through a combination of objective model optimization and subjective targeted tuning. An important new feature in MESMO that dramatically improved the uptake of CFC-11 and anthropogenic carbon is the depth dependent vertical diffusivity in the ocean, which is spatially uniform in GENIE-1. In MESMO, biological production occurs in the top two layers above the compensation depth of 100 m and is modified by additional parameters, for example, diagnosed mixed layer depth. In contrast, production in GENIE-1 occurs in a single layer with thickness of 175 m. These improvements make MESMO a wellcalibrated model of intermediate complexity suitable for investigations of the global marine carbon cycle requiring long integration time.


Introduction
Earth system Models of Intermediate Complexity (EMICs) occupy a unique and important position within the hierarchy of climate models (Claussen et al., 2002).In many Correspondence to: K. Matsumoto (katsumi@umn.edu)ways, EMICs represent a compromise between high resolution, comprehensive coupled models of atmospheric and oceanic circulation, which require significant computational resources, and conceptual (box) models, which are computationally very efficient but represent the climate system in a highly idealized manner.A critical difference between comprehensive coupled models and box models is the absence of dynamical feedbacks in the latter.In box models, large scale circulation is typically prescribed and not allowed to change over the course of a simulation.The lack of dynamical feedbacks makes box models unsuitable for realistic simulations of transient climate change.On the other hand, comprehensive coupled models are so computationally intensive that their behavior within a given parameter space is difficult to fully explore.EMICs nicely fill this gap by retaining important dynamics while remaining computationally efficient, which is typically achieved by reducing spatial resolution and/or number of processes compared to high resolution coupled models.
The effectiveness of EMICs is evident in the numerous publications that have successfully employed them in studying past, present, and future climates (Ganopolski and Rahmstorf, 2001;Ganopolski et al., 1998;Joos et al., 1999;Knutti et al., 2002;Nusbaumer and Matsumoto, 2008;Plattner et al., 2001).Also, the important role that EMICs played in understanding the postindustrial carbon cycle changes is highlighted in the two recent IPCC science reports TAR (Houghton et al., 2001) and AR4 (IPCC, 2007).
Here we document development of the first version of the Minnesota Earth System Model for Ocean biogeochemistry (MESMO 1.0) based on an existing and successful EMIC called GENIE-1.Our immediate motivation for this work is to possess a tool to investigate postindustrial changes in the natural ocean carbon cycle.Our efforts were thus geared toward improving representation of marine biogeochemistry and distributions of natural and anthropogenic transient tracers in the oceans.These improvements, combined with a Published by Copernicus Publications on behalf of the European Geosciences Union.Edwards and Marsh (2005).Control parameters from NGSA-II tuning exercise.In target tuning, vertically constant vertical diffusivity is modified according to Eq. (1).more highly resolved upper ocean and reasonable seasonal sea ice formation, represent significant steps toward reaching our immediate objective and making MESMO useful for future investigations of the global ocean carbon cycle.The entire MESMO code is available in Supplemental Materials.

A brief description of GENIE-1 relevant for MESMO
GENIE is a new EMIC developed primarily in the UK (http: //www.genie.ac.uk/) with the goal of making it as modular as possible so that in theory one can choose to construct a model with any permutation of the existing modules (e.g., slab or 3-D dynamical ocean module coupled to energy balance or 3D dynamical atmospheric module).Following Ridgwell et al. (2007), we will refer to GENIE-1 as a model configuration that consists of the physical climate module C-GOLDSTEIN, a simple atmospheric chemistry module ATCHEM, and a marine biogeochemistry module BIOGEM.As described by Edwards and Marsh (2005), C-GOLDSTEIN is itself a standalone, coarse gridded, efficient climate model that is comprised of a 3-dimensional circulation model of the world ocean, an energy and moisture balance model of the atmosphere, and a dynamic and thermodynamic model of sea ice.The ocean model is on a 36×36 equal-area horizontal grid with 10 • increments in longitude and uniform in sine of latitude; latitude spacing increases from about 3 • at the equator to about 20 • at the poles.There are 8 levels in the vertical with the top layer being 175 m thick.Ocean dynamics is based on the thermocline or planetary geostrophic equations with the addition of a linear drag term in the horizontal equations.The resulting "frictional geostrophic" equations (Ed-wards et al., 1998) are therefore similar to classical general circulation models with momentum acceleration and advection neglected.The model includes the Gent-McWilliams (GM) eddy mixing parameterization according to Griffies (1998) that reduces excessive vertical mixing in coarse gridded models (Duffy et al., 1997;England and Rahmstorf, 1999).The momentum flux that drives the surface ocean circulation is based on the annual NCEP reanalysis wind stress and therefore has no seasonality.
The atmospheric component of C-GOLDSTEIN is an energy and moisture balance model following Weaver et al. (2001).External forcing by shortwave solar radiation is temporally constant, annually averaged and thus has no seasonality.Without explicit atmospheric dynamics, eddy diffusion coefficients of heat and moisture become important in determining the atmospheric distributions of temperature and humidity, both prognostic variables in the model.Edwards and Marsh (2005) identify twelve underconstrained yet critical parameters, including the two eddy diffusion coefficients in the atmosphere, that largely determine the climate state of C-GOLDSTEIN (Table 1).Through a large ensemble of 2000-year C-GOLDSTEIN simulations where the values of the twelve parameters are randomly changed within specified ranges, they determined a set of parameter values that minimizes the misfit between observations and simulations of surface air temperature and humidity and ocean temperatures and salinities.This exercise gives a degree of objectivity to tuning and credibility to the model.
To this physical model C-GOLDSTEIN, Ridgwell et al. (2007) coupled ATCHEM and BIOGEM, making GENIE-1 a global model of carbon and climate.The production scheme of BIOGEM is based on Michaelis-Menton type phosphate (PO 4 ) uptake kinetics, modified by the availability of light and sea ice.The partitioning of uptake into two thirds dissolved phase and one third particulate organic phase follows the Ocean Carbon cycle Model Intercomparison Project Phase 2 (OCMIP-2) BIOTIC protocol (Najjar et al., 2007;Najjar and Orr, 1999).An important advance made by Ridgwell et al. (2007) is the objective calibration of BIOGEM through data assimilation of PO 4 and alkalinity (ALK).GENIE-1 thus benefited from having C-GOLDSTEIN and BIOGEM both calibrated objectively.

Rationale for improving GENIE-1
In recent years, GENIE-1 has been used increasingly in climate and carbon cycle studies of the present, future, and past (Lenton et al., 2007;Lenton et al., 2006;Matsumoto, 2007;Ridgwell, 2007;Ridgwell et al., 2007).Also, the computational efficiency of GENIE-1 has allowed it to sweep entire parameter spaces with respect to the Atlantic meridional overturning circulation (MOC) so that the model's MOC behavior is now fairly well characterized (Marsh et al., 2004).
We have been motivated to build on the success of GENIE-1 for a number of reasons, especially when we begin to consider the response of marine biology to climate change.First, the 175 m top layer of the ocean model is too thick compared to typical depths of the euphotic zone and mixed layer in the ocean.This prevents a more realistic representation of marine production for example, because nutrient uptake cannot be given a dependence on the mixed layer depth.The importance of this dependence was shown in a seminal work by Sverdrup (1953).Also, the large vertical gradients of nutrients observed today in the upper ocean are completely lost in the 175 m thick layer such that it is not clear what "surface" nutrient concentration in GENIE-1 really means.
Second, there is still significant room to improve the transient tracer uptake in GENIE-1.The model-predicted 1994 global ocean inventories of CFC-11 is 0.88×10 9 mol and anthropogenic carbon is 171 Pg C (Ridgwell et al., 2007).The corresponding observational estimates based on WOCE and JGOFS surveys are much lower at 0.55±0.08×10 9mol CFC-11 (Willey et al., 2004) and 118±19 Pg C (Sabine et al., 2004).If we consider the roughly +7% systematic bias in the anthropogenic carbon inventory (Matsumoto and Gruber, 2005), the overestimation of this transient tracer uptake by GENIE-1 is even more significant.While these uptakes reflect excessive intermediate water ventilation in GENIE-1, there also needs to be a good match in the deep ocean ventilation.
This brings us to the third reason, which is that the deep ocean ventilation of GENIE-1 ought to be validated with the observed abundance of natural radiocarbon ( 14 C).A community-wide OCMIP-2 study that compared 19 ocean carbon cycle models used deep ocean radiocarbon as a met-ric to evaluate the models (Matsumoto et al., 2004).This metric has since been used in subsequent descriptions of new carbon cycle models (Muller et al., 2006;Schmittner et al., 2005) and is arguably the most effective means to evaluate ocean models with respect to deep ocean circulation.
The final motivation for improving GENIE-1 is to incorporate as much seasonality as possible.The simplicity of GENIE-1 with the momentum and radiation forcings being annual averages makes the model more appropriate for seeking mean climate states in long term integrations.Examples include studies of the geologic past to document changes in marine sedimentation (Chikamoto et al., 2008;Ridgwell, 2007).For postindustrial changes, the presence of seasonality adds more credibility in general and helps achieve more realistic processes such as seasonal production and formation of polar sea ice.

Description of MESMO
The starting point of our model development is Version 6 of CB-GOLDSTEIN (Ridgwell et al., 2007), the non-modular version of GENIE-1.MESMO is identical to Version 6 unless noted otherwise.We decided not to use the word "GENIE" in our model name so as to avoid confusion with the ongoing efforts of the GENIEfy project to develop various flavors of GENIE.The GENIEfy project uses its SVN-controlled code and aims to modularize the different model modules, neither of which applies to our efforts with MESMO.The Bern 3D ocean model is also derived from C-GOLDSTEIN and also does not have the descriptor "GE-NIE" (Muller et al., 2006).
We describe MESMO's physical climate model (Sect.4.1) first, followed by its biogeochemistry model (Sect.4.2).In addition to describing the new features and modifications we adopted in MESMO, we will also briefly note two features that we evaluated but ultimately discarded (Sect.4.3).Our dead-end efforts may be of some interest in future development efforts by other groups.

New features in MESMO physical model
First, the vertical resolution in the ocean is increased from 8 layers to 16.To allow biological production to depend on changes in stratification, it is preferable to have at least two layers in the euphotic zone above the critical depth where net production is positive.Therefore, we chose a vertical resolution that contains two complete layers in the top 100 m, which we took as the compensation depth (see Sect. 4.2 below).The midpoints of the 16 layers are: 23,72,133,208,300,412,550,720,927,1182,1494,1877,2347,2923,3630, and 4497 m.The increased vertical resolution is concentrated in the upper ocean such that the bottom topography in MESMO is very similar to GENIE-1, as shown in Fig. 1 of Ridgwell et al. (2007).
www.geosci-model-dev.net/1/1/2008/Geosci.Model Dev., 1, 1-15, 2008 Second, our vertical diffusivity K v in the ocean is given a depth dependence following Bryan and Lewis (1979).Their K v profile has much higher values in the deep ocean compared to the upper ocean and is used to this day in GFDL MOM Versions 3 (Pacanowski and Griffies, 1999) and 4 (Griffies et al., 2004).Indeed a number of studies, such as an in situ SF 6 tracer mixing experiment by Ledwell et al. (2000) and analysis of tidal dissipation by Sjoberg and Stigerbrandt (1992), indicate high mixing rates in the deep ocean.In GENIE-1, K v is constant with depth, which may contribute to excessive transient tracer uptake (Ridgwell et al., 2007).Following Bryan and Lewis (1979), our depth dependent K v is much greater in the deep ocean than in the upper ocean, with an arctangent transition at 2000 m: where K v0 is the value of K v at the bottom.Our choice of K v0 yeilds a depth-averaged K v equal to the NGSA-II value.
The upper ocean K v is set somewhat lower (Table 1) than Bryan and Lewis' value to be consistent with recent observations (Ledwell et al., 1993;Ledwell et al., 1998).Third, we have activated seasonal variation in the incoming short wave radiation in the existing code.The incoming radiation was held constant to annual mean values in studies using the 8-level ocean in GENIE-1.As shown below, this helps achieve seasonal polar sea ice formation.
In addition, we have made a modification to the GM eddy mixing parameterization.The Griffies (1998) parameterization, which greatly reduces diapycnal leakage by replacing horizontal mixing with mixing along isopycnal surfaces, can lead to negative nutrient concentrations in the top layer of MESMO where isopycnal slopes are large.This causes the BIOGEM CO 2 chemistry code to crash.We found that it was necessary to reduce the ssmax parameter, which defines the maximum square of the isopycnal slope above which horizontal mixing rather than isoneutral mixing is applied, from 10 to 1 (ssmax=1 is equivalent to ∼300 m/degree of latitude).

New features in MESMO biogeochemistry model
First, the dependence of export production, J prod , on light and PO 4 in GENIE-1 was expanded in MESMO to include dependence on temperature, nutrient limitation by nitrate (NO 3 ) and CO 2 (aq), biomass turnover, and mixed layer depth following Doney et al. (2006): where the optimal nutrient uptake timescale τ is 15 days.Nutrient uptake is set to occur only above a fixed compensation depth z c of 100 m where photosynthesis is assumed to exceed respiration.For reference, OCMIP-2 protocol uses z c of 75 m.The ratio of z c to the mixed layer depth z ml would allow a bloom-like increase in production as the mixed layer shoals during the spring season, for example.We diagnose z ml in MESMO using the σ t density gradient (0.125) criterion (Levitus, 1982).
The temperature dependence term is given by: where T is temperature ( • C) as in HAMOCC (Maier-Reimer, 1993).The temperature dependence allows higher rates of nutrient uptake in warmer waters to account for universally observed temperature dependent metabolic rates.This dependence, analogous to the heuristic Q 10 =2 relationship, where the rate doubles for every 10 • C increase, was evaluated in GENIE-1 before (Matsumoto, 2007) but is a permanent feature of MESMO.
In addition to PO 4 we included NO 3 and CO 2 (aq) as possible limiting nutrients: where K PO 4 , K NO 3 , and K CO 2 (aq) are half-saturation constants to be determined.Nitrate dependency is needed to examine the impact of increased river runoff of nitrogen over the industrial period in a future study.Another reason to incorporate NO 3 is that it plays a larger role as the limiting nutrient than PO 4 in the modern ocean.Since the ratio of total oceanic inventories of NO 3 to PO 4 is less than the typical elementary stoichiometry of N to P of 16:1 in phytoplankton, NO 3 becomes more limiting on a global scale.Therefore, global production in GENIE-1, which is based only on PO 4 , leads to negative NO 3 concentrations at the surface when PO 4 uptake is directly related to NO 3 by the stoichiometry of 16.Export production in GENIE-1 based on PO 4 compared to NO 3 is larger by about 4%.As noted in Sect.5.2 below, the nutrient limitation in MESMO at this time is effectively entirely due to NO 3 .In the future we will add iron as a limiting nutrient, which will make Equation 4 more meaningful.
In the present form, MESMO has constant NO 3 inventory (i.e., without denitrification and nitrogen fixation).However, A. Ridgwell has already coded denitrification in BIOGEM and we in Minnesota have coded a simple nitrogen fixation scheme according to the distribution of N * (Gruber and Sarmiento, 1997), a quasi-conservative tracer used to infer the regions of nitrogen fixation.Our diagnostic scheme adds NO 3 lost by denitrification back to the system to preserve the initial oceanic inventory.We have also developed the 15 N isotope to accompany these processes.However, these Ncycle processes are not activated at this time in MESMO, as they would require significant calibration effort, which we will expend in the future when they become necessary to address the question at hand.Equation ( 4) also includes aqueous CO 2 as a nutrient, so that we may examine possible fertilization effects due to increasing atmospheric CO 2 content in the future.Under optimal light and nutrient conditions, CO 2 (aq) can limit the photosynthetic rate even with abundant DIC, if the diffusive CO 2 transport is sufficiently slow (Riebesell et al., 1993).
The light or solar irradiance limitation is given by: where I is the seasonally variable solar short-wave irradiance, which decays exponentially from the ocean surface with a 20 m depth scale, and K I (20 W/m 2 ) is the light limitation term (Doney et al., 2006).
The proxy for biomass B (µmol/L) is the concentration of the limiting nutrient, determined in Eq. ( 4).The basic idea, following Doney et al. (2006), is that the higher the concentration of the limiting nutrient, the larger the phytoplankton biomass or population that can be supported.
Second, remineralization of labile particulate organic matter (POM) in MESMO is now expressed by its sinking rate and temperature dependent remineralization rate (Matsumoto, 2007).This is a more process oriented representation over the original expression in GENIE-1 that uses a predefined remineralization profile like the so-called 'Martin curve' (Martin et al., 1987).The sinking rate in our formulation is a parameter to be determined by calibration.The temperature dependent remineralization rate follows the Q 10 =2 relationship.It allows for a climate-carbon cycle feedback, which in the context of glacial-interglacial cycles can be important (Matsumoto, 2007).
Third, the artificial limitation on air-sea gas exchange that Ridgwell et al. (2007) employed to achieve numerical stability (existed in CB-GOLDSTEIN Version 6 but not described in the literature) has been removed.The gas exchange coefficient in GENIE-1 was based on the work of Wanninkhof (1992).We retained the Wanninkhof coefficient, because it remains the de facto standard in existing ocean carbon cycle models.However, a recent reanalysis of bomb 14 C inventory in the ocean, which constrains the global gas exchange coefficient, suggests that the Wanninkhof rate may be too high (Sweeney et al., 2007).As shown by Sarmiento et al. (1992), gas exchange would have a more significant impact on "slow" equilibration gases such as 14 CO 2 than on "fast" gases such as CO 2 and CFCs, a result we confirmed with MESMO.Therefore, if a reduction in the global gas exchange is truly in order, our choice of the Wanninkhof coefficient would tend to overestimate the deep ocean 14 C abundance with minimal effect on anthropogenic CO 2 and CFCs.

Attempted but ultimately discarded improvements to MESMO
Briefly we note two attempts to improve the physical climate model that in the end we did not adopt: deepening the Drake Passage and reducing the diffusivity of heat in the atmosphere at 60   to homogenize the Deep South Atlantic and South Pacific by allowing a larger transport of Antarctic Circumpolar Current (ACC) through the passage.Reduced atmospheric diffusivities have the effect of thermally isolating Antarctica from the rest of the globe and thereby cooling and forming Antarctic Bottom Water (AABW).As shown in Table 2, we found that these did not significantly improve MESMO beyond the large improvements gained by our depth dependent K v profile.This combined with the rather ad hoc nature of the improvements lead us to not implement these into MESMO.

Calibration and control run of MESMO
The equilibrium runs for all our model configurations are obtained by running the model many thousand years until steady state is reached.Metrics used to evaluate the equilibrium runs are the distributions of temperature and salinity, Atlantic MOC, and Southern Ocean MOC (  OCMIP-2 HISTORICAL protocol.In the protocol, the physical model state remains unchanged (i.e., without radiative feedback), while the atmospheric CFC-11 and pCO 2 concentrations are prescribed to follow observations, so that the total oceanic uptake of CFC-11 and anthropogenic CO 2 can be modeled and compared to data-derived estimates for the year 1994 (Table 2).These are also diagnostic of the physical model, because their oceanic uptake is determined to first order by abiotic processes (solubility and vertical mixing).
First, we consider what would happen if we were to just take the GENIE-1's 8-level model parameter values and use them in the new 16-level configuration (Experiment G1-16lev, Table 2).Compared to GENIE-1, G1-16lev equilibrium run shows some improvement with older 14 C everywhere in the deep ocean as a result of reduced Southern Ocean MOC.However, the mismatch relative to observations is still significant (deep Pacific still about 50% too young).There is also minor improvement in matching the observed temperatures and salinities as well as CFC-11 and anthropogenic carbon inventories.
The improvements realized in G1-16lev over GENIE-1 are insufficient (Table 2) so that a more thorough tuning process for MESMO is needed.Our strategy is to do it in three steps (Fig. 1).First, we calibrate the physical model by an objective tuning procedure NGSA-II that only uses the physical climatology fields as targets (Sect.5.1.1).Second, we employ a subjective, target tuning to further improve the physical model (Sect.5.1.2).Third, we take the best physical model from the first two steps and tune the biogeochemistry model by seeking to minimize errors in two model outputs, export production and interior oxygen distribution (Sect.5.2).
A number of factors lead to the adoption of this three-step tuning procedure.A lesson from OCMIP-2 was that the uptake of anthropogenic transient tracers is quite variable in different ocean carbon cycle models because their physics, defined in the broadest sense (e.g., resolution, forcings, numerics, sea ice, seasonality, GM), was so diverse (Doney et al., 2004;Dutay et al., 2002;Matsumoto et al., 2004).OCMIP-2 showed that biogeochemical tracers are very effective in evaluating the ventilation rate of the ocean interior, because the tracers have built-in clocks.So they must be part of our metrics.While it is more preferable to include biogeochemical tracers as targets in the objective tuning (i.e., combine steps one and two), we were guided on practical grounds to keep the first two steps separated because objective tuning using only physical climatology fields exists already in the form of NGSA-II.Earlier tuning studies of GENIE-1 also used only physical fields (Edwards and Marsh, 2005;Lenton et al., 2006).We are also guided by the expectation that objective tuning will likely not give us the most desirable model configuration outright and that some form of subjective tuning based on expert judgment is necessary.The limitations of objective tuning in GENIE-1 is evident, for example, in Lenten et al. (2006) who demonstrated that hand-tuning can do better than objective tuning.Also, in their first objective optimization of C-GOLDSTEIN, Edwards and Marsh (2005) found that the acceptable range of parameter values after optimization remained as large as the range across the initial ensemble.

First step: multi-objective NSGA-II tuning
Our goal is to find a combination of physical model parameters that minimize the mismatch between model-simulated fields and equivalent observed fields.The observed fields are air temperature, air humidity, ocean temperature, and ocean salinity.The non-linear response of the model to its parameters and the possible conflicts between the objectives make this task a challenge.We apply a multi-objective optimization method, which uses a population based algorithm to seek pareto-optimal solutions in the parameter space.For each field i an objective measure of the model's mismatch to the observational data is evaluated at the end of a simulation as where the squared difference between the model field s i and equivalent observational data S i is weighted by the variance in the observational data σ 2 i .The optimization process seeks to minimize the value of this function over each of the four physical fields, producing a pareto-optimal set of parameter sets, the solutions that are better than all the rest in at least one of the objectives.A post-processing of the result set then yields an optimal version of the model.Previous parameter estimation techniques applied to the C-GOLDSTEIN class of problem include a Latin Hypercube sampling (Edwards and Marsh, 2005), the Ensemble Kalman Filter (Hargreaves et al., 2004), the proximal analytic centre cutting plane method (Beltran et al., 2005) and kriging (Price et al., 2007) which all seek to minimize a composite error function.The multiobjective method has the advantage that it avoids the need to select a priori the weighing factors used to evaluate the single error value.
For the purposes of the MESMO calibration, individual model runs were integrated over 5000 years to ensure the system reached quasi-equilibrium.With 16 vertical levels in the ocean, the typical execution time for a simulation was between 150 and 250 min CPU time on the range of compute resource available.This represents an increase in the CPU wall-time of a simulation by a factor of ∼4 over the C-GOLDSTEIN model studied in previous exercises.A standard application of the NSGA-II algorithm (Deb et al., 2002) would therefore have required several weeks of compute time to achieve a high quality result.In order to reduce the required time, we employed the NSGA-II method with surrogate modeling.Previous work (Price et al., 2006) has shown that the use of surrogate models with the NSGA-II algorithm can reduce, by an order of magnitude, the total number of simulation years required for a high quality result in the calibration of the C-GOLDSTEIN composition.
The optimization process is illustrated in Fig. 2. The method consists of an initial sampling of the parameter space from which surrogate models of the underlying objective functions are built.These computationally cheap surrogate models are then searched extensively using the NSGA-II algorithm to generate a set of update points that are evaluated on the true objective functions.The algorithm iterates by refining the surrogate models and performing the NSGA-II search over the new models until convergence criteria are satisfied or available computational budget is exhausted.At the end of the process the pareto-optimal points in the objective space of the problem are returned for further analysis.The technical details of the optimization process are described as a supplemental material and in Price et al. (2006).
Figure 3 shows the results of the optimization process.A total of 137 members of the data cache comprise the Pareto front, indicated by red dots in the figure.Following Price et al. (2006) 6) for the fields ocean temperature, ocean salinity, surface temperature and surface humidity.The red points indicate model evaluations that lie on the Pareto front and dominate the other evaluations marked with blue points.The progress of the algorithm is displayed in plot (g) as measured by the minimum of the weighted root mean square of the four objective values across all members in the data cache after each update.over C-GOLDSTEIN comes from the increase in the number of grid cells upon doubling the vertical layers.It is therefore noted that the doubling of the ocean levels in this version of the model has increased the bias of this measure further towards the ocean fields.The optimal point we have selected therefore sits closest to the origin of the T vs. S plot and achieves the best ocean interior at the expense of a less optimal surface humidity representation.
The version of C-GOLDSTEIN that was optimized using NSGA-II was the original version of Edwards and Marsh (2005) modified to include our new 16 vertical levels and GM parameter ssmax.The twelve "best" model parameter values from NSGA-II are listed in Table 1.The model runs using the NSGA-II values constitute our Control experiment (Table 2).Compared to GENIE-1, the equilibrium Control run shows modest improvements in simulating the observed deep ocean 14 C as well as ocean temperature and salinity distributions.The deep 14 C is still too young though, especially for the Circumpolar Deep Water (CDW, −99%) and the North Pacific Deep Water (NPDW, −142%) compared to observations (−155% and −226%, respectively).Improvement in matching the observed temperatures is quite modest, because SST and global ocean temperatures in GENIE-1 were already reasonably good (r 2 of 0.9 and higher).The Control run shows more improvement over GENIE-1 in ocean salinities, although r 2 for salinity is in general much lower than for temperature.However, we are dissatisfied with our transient Control run in that it overestimates the transient tracer uptake (1.10×10 9 mol CFC-11 and 176 Pg of anthropogenic carbon) even more than GENIE-1 (0.92×10 9 mol CFC-11 and 171 Pg-C) .
5.1.2Second step: targeted tuning Given our original motivation, MESMO must show improvements in transient tracer metrics in particular.We therefore build on NSGA-II tuning with three targeted tunings (K v prof, FWflux, and Season) aided by some knowledge from experience about how they will affect the results (Table 2).
The single most important change that improved the transient tracer uptake was K v prof.This alone brought the model-predicted transient tracer uptake very close to databased estimates: CFC-11 inventory is 0.68×10 9 mol in the model compared to 0.55±0.12×10 9mol based on data (Willey et al., 2004); anthropogenic carbon inventory is 120 Pg-C in the model compared to 118±19 Pg-C based on data (Sabine et al., 2004).The reduced transient uptake is a direct consequence of the low K v in the upper ocean in K v prof.It is an expected consequence, given that CFCs and CO 2 have relatively short timescale of air-sea gas equilibration, so their uptake is controlled predominantly by vertical exchange in the ocean rather than air-sea gas exchange kinetics.The reduced upper ocean K v has the added and significant benefit of making the deep ocean older with respect to 14 C.The ventilation of the deep Southern Ocean ( 14 C=−164‰) and Pacific Ocean ( 14 C=−223‰) are now consistent with observations (Table 2).One drawback in K v prof is reduced Atlantic MOC, which is now 9 Sv compared to 16 Sv in both GENIE-1 and the Control run (Table 2).
Equation (1) represents our choice of the K v profile that is adopted in MESMO, but we have explored the effects of changing the shape of the profile (Fig. 4).We find that shoaling the inflection point (say, by 1000 m; compare profiles A and B which are otherwise identical) increases ventilation of CDW ( 14 C=−148‰ for A and −139‰ for B) and NPDW ( 14 C=−200‰ and −181‰), without little effect on the NADW (both 14 C=−109‰).Likewise, the sharper transition from low to high values in profile C as compared to profile D yields older CDW 14 C (−173‰ for C and −167‰ for D) and NPDW (−233‰ for C and −224‰for D); the sharper transition keeps K v above the inflection point to a lower value, to which the deep ocean ventilation is sensitive.As noted above, our choice of the K v profile was guided by observations (Ledwell et al., 1993;Ledwell et al., 1998), history (Bryan and Lewis, 1979), and NGSA-II results (Table 1).
In addition to K v prof, we made two modifications to arrive at our best model, which constitutes the physical model of MESMO.First, we implemented FWflux in order to restore the Atlantic MOC that was reduced to 9 Sv in K v prof.With FWflux, the Atlantic-to-Pacific freshwater flux is increased by 25% from the value given by NGSA-II (Table 1).As shown earlier for GENIE-1 (Marsh et al., 2004(Marsh et al., , 2007)), this has the effect of increasing the Atlantic MOC by increasing salinity and thus density of the Atlantic surface waters.The interbasin water transport is an needed in GENIE-1 to compensate for the insufficient transport in the energy moisture balance model of the atmosphere (Edwards and Marsh, 2005).This is an artificial flux adjustment, which should ideally be gotten rid of.Our decision to increase it therefore represents a drawback in MESMO, as a larger adjustment helps achieve a better match with observations in the equilibrium at the likely expense of (weak) transient response.
Second, we implemented Season in order to simulate the seasonality and spatial coverage of sea ice (Fig. 5, see animation of seasonality in Supplemental Materials).In the Control and K v prof-only models, which lack seasonal variability in solar radiation, sea ice is largely permanent where it exists (i.e., 100% in Fig. 5a, c).With Season, polar sea ice extent is seasonal (Fig. 5b, d).In our best model (Fig. 5d), sea ice coverage around Antarctica is approximately 4.5×10 6 km 2 during summer and 36×10 6 km 2 during winter.These overestimate somewhat the satellitederived sea ice climatology for the 1979-2000 period that shows summer coverage of 3-4×10 6 km 2 and winter coverage of 17-20×10 6 km 2 (National Snow and Ice Data Center, http://www.nsidc.colorado.edu).However, the percent change in sea ice coverage in the two seasons is comparable in MESMO and observations.
In our best physical model of MESMO (Table 2; salient features shown in Fig. 6), the benefits of K v prof in terms of deep ventilation (consistent in 14 C) and transient tracer uptake (0.67×10 9 mol-CFC-11 and 118 Pg-C) are still preserved.The Atlantic MOC is reasonable, again at the expense of a larger flux adjustment, and a more realistic seasonality of polar sea ice is achieved.

Biogeochemistry model calibration
We now calibrate our best physical model in terms of biogeochemistry using two targets, global particulate organic carbon (POC) export and the global ocean distribution of dissolved oxygen (O 2 ).Both are representative of marine biogeochemistry and important diagnostics of vertical nutrient supply and remineralization (Gnanadesikan et al., 2004).We also used NO 3 and ALK as targets but these are not as sensitive to change and therefore of secondary importance to POC export and O 2 .Model parameters to be determined are the half saturation constants in Eq. ( 4) and the sinking rate of POC.This is done by sweeping a parameter space defined by these parameters and seeking a combination of parameters that best reproduces the observed POC export and O 2 .We start by assuming that K NO 3 and K PO 4 are simply related by a stoichiometry of 16.This makes NO 3 rather than PO 4 the more important limiting nutrient, but otherwise, the two macronutrients are qualitatively very similar in behavior at this time.In the future, when iron, nitrogen fixation, and denitrification become operational, NO 3 and PO 4 will become more meaningful as independent nutrient tracers.As for K CO 2 (aq) , we use the value of 0.5×10 −6 mol kg −1 (Riebesell et al., 1993) and keep it constant, since little is actually known about CO 2 limitation and much of what is known comes from the work of Riebesell.
Our parameter space is therefore defined by K NO 3 and the POC sink rate (Fig. 7).For the chosen ranges of these parameters, our model gives global POC export production between 6 and 11 Pg-C yr −1 (Fig. 7a).The range in data-based estimates using satellite products and inverse modeling is roughly 9 to 13 Pg-C yr −1 (Gnanadesikan et al., 2004;Laws et al., 2000;Schlitzer, 2002) with possibly significant uncertainties (Najjar et al., 2007).In our model, export production is sensitive to POC sinking rate within the parameter space, and it indicates that our sinking rate should be in the lower range to be consistent with data-based estimates.
Comparison of the global O 2 distribution between our model and observations (Levitus and Boyer, 1994a) indicates that spatial correlation is also higher for lower POC sinking rates (Fig. 7b).The r 2 value for the global ocean reaches as high as 0.7 for sinking rate of 50 m day −1 .
A closer look at the absolute value of dissolved O 2 in the upper ocean and deep ocean indicates that the choice of K NO 3 is also important (Fig. 8).Above water depths of 1500 m (Fig. 8a), deviation from observations becomes as small as 5 µmol kg −1 when K NO 3 is about 0.3×10 −5 mol kg −1 .Larger K NO 3 gives larger errors, as large as 30 µmol kg −1 .Below 1500 m (Fig. 8b), O 2 becomes significantly depleted with higher POC sinking rate and consequently more POC being remineralized in the deeper waters.
In MESMO, the total inventory of DIC is 36.737Pg-C and of ALK is 3.2×10 18 mole-eq.

Discussion and summary
Significant improvements are achieved in MESMO over its predecessor GENIE-1.Whereas GENIE-1 had excessive ventilation ( 14 C was too high in all basins, global CFC inventory was almost twice the observation, and anthropogenic carbon inventory was about 50 Pg-C too large), MESMO is consistent with all these data-based metrics (Table 2).Export POC production in MESMO is 10.6 Pg-C yr −1 , which is in good agreement with the data-based estimates.Dissolved O 2 in the ocean interior is modestly improved in MESMO over GENIE-1 in terms of model-data correlation.The improvements in interior ventilation rates as indicated by 14 C  and dissolved O 2 are not unrelated, because the latter reflects a balance between the rate of ventilation that supplies O 2 to the interior and the production of POC that ultimately depletes O 2 .These improvements were achieved while increasing the vertical resolution of the upper ocean, which has important implications.Again, the thickness of the topmost ocean layer is 45 m in MESMO compared to 175 m in GENIE-1.In MESMO, production occurs in the top two layers in 100 m and is modified by diagnosed mixed layer depth (Fig. 9a, see animation of seasonality in Supplemental Materials).The physical setting that defines production is thus better represented in MESMO.The reduced thickness of the topmost layer is also important in setting the right physical conditions for air-sea gas exchange.Tracers such as CFC-11 have relatively short gas equilibration time scale and surface waters are fairly close to being saturated, especially away from the polar sea ice.It is desirable for the CFC-saturated, topmost layer, to be appropriately thin in order to get the tracer inventory correct for the right reason.
In calibrating MESMO, we used the results of NGSA-II as starting point, from which significant improvements were made by targeted tuning based on experience.The most important targeted, physical tuning was the depth dependent vertical mixing K v prof, followed by Season to achieve a better seasonal variability of polar sea ice, and FWflux, to  maintain an adequate vigor in the Atlantic MOC.In biogeochemistry calibration, we defined a parameter space defined by two key parameters, and chose a combination of these parameters that minimized the error in reproducing the dissolved O 2 distribution and global POC production.
We believe that MESMO, with significant improvements over its predecessor, makes it a well calibrated EMIC for studies of the global carbon cycle of both today and the recent past.The improved physical setting for surface production makes it suitable to examine the feedbacks of marine biology to climate change, although the larger freshwater flux adjustment in MESMO may reduce the sensitivity of the Atlantic MOC to certain perturbations.
We expect to further develop MESMO in the future.There are two specific improvements on the horizon.First, incorporation of iron would likely help improve the spatial distribution of export production and thus interior O 2 distribution.For example, Fig. 9b shows that the upper ocean of the eastern equatorial Pacific is low in O 2 as we expect but not low enough to initiate denitrification as observed.An accurate representation of the dissolved O 2 is critical for a reasonable marine nitrogen cycle, which in turn allows nitrogen and phosphate cycles to become decoupled.This would allow NO 3 and PO 4 along with iron to be meaningful limiting nutrients in MESMO.Second, forcing MESMO with seasonal wind stresses will make it a more truly dynamically seasonal model.We have in fact begun to do this with the ECMWF seasonal wind stress fields , although preliminary efforts have produced too much interior ventilation, indicated by deep radiocarbon and transient tracer uptake, and too much export production.

Fig. 2 .
Fig. 2. Flow chart for the NSGA-II with surrogate modeling optimization process.

Fig. 5 .
Fig. 5. Sea ice coverage from equilibrium runs of (a) Control, (b) Season, (c) K v prof, and (d) MESMO.Percentage of sea ice coverage at each grid point over a course of one year is shown.So a 50% coverage may be due to: 100% coverage for half a year, 50% coverage for all year, or some intermediate combination between the two extreme cases.See Supplemental Materials for animation of the seasonal cycle.

Fig. 8 .
Fig. 8. Model-data difference in dissolved O 2 (a) above 1500 m water depth and (b) below 1500 m in µmol kg −1 .We seek the smallest error from observations in both.

Fig. 9 .
Fig. 9. MESMO results: (a) annually averaged, diagnosed mixed layer depth in meters and (b) dissolved O 2 in the upper 1000 m.See Supplemental Materials for animation of the mixed layer depth.

Table 2 .
Comparisons of key physical model diagnostics.

Table 2 )
. The postindustrial transient run is obtained by running the model from the preindustrial state, taken to be year 1765 and represented by the equilibrium run, to the year 1994 following the