Geoscientific Model Development GENIE-M : a new and improved GENIE-1 developed in Minnesota

Here we describe GENIE-M, a new and improved version of the Grid ENabled Integrated Earth system model (GENIE), which is a 3-D earth system model of intermediate complexity. Main development goals of GENIE-M were to: (1) bring oceanic uptake of anthropogenic transient tracers within data constraints; (2) increase vertical reso5 lution in the upper ocean to better represent near-surface biogeochemical processes; (3) calibrate the deep ocean ventilation with observed abundance of radiocarbon. We achieved all these goals through a transparent process of calibration that mostly consisted of objective model optimization. An important new feature in GENIE-M that dramatically improved the uptake of CFC-11 and anthropogenic carbon is the depth 10 dependent vertical diffusivity in the ocean, which is spatially uniform in GENIE-1. In GENIE-M, biological production occurs in the top two layers above the compensation depth of 100 m and is modified, for example, by diagnosed mixed layer depth. In contrast, production in GENIE-1 occurs in a single layer with thickness of 175 m. These improvements make GENIE-M a well-calibrated model of intermediate complexity suit15 able for investigations of the global marine carbon cycle requiring long integration time.


Introduction
Earth system Models of Intermediate Complexity (EMICs) occupy a unique and important position within the hierarchy of climate models (Claussen et al., 2002).In many ways, EMICs represent a compromise between high resolution, comprehensive coupled models of atmospheric and oceanic circulation, which require significant computational resources, and conceptual (box) models, which are computationally very efficient but represent the climate system in a highly idealized manner.A critical difference between comprehensive coupled models and box models is the absence of dynamical feedbacks in the latter.In box models, large scale circulation is typically prescribed and not allowed to change over the course of a simulation.The lack of dynamical feedbacks makes box models unsuitable for realistic simulations of transient climate change.On the other hand, comprehensive coupled models are so computationally intensive that their behavior within a given parameter space is difficult to fully explore.EMICs nicely fill this gap by retaining important dynamics while remaining computationally efficient, which is typically achieved by reducing spatial resolution and/or number of processes compared to high resolution coupled models.
The effectiveness of EMICs is evident in the numerous publications that have successfully employed them in studying past, present, and future climates (Ganopolski and Rahmstorf, 2001;Ganopolski et al., 1998;Joos et al., 1999;Knutti et al., 2002;Nusbaumer and Matsumoto, 2008;Plattner et al., 2001).Also, the important role that EMICs played in understanding the postindustrial carbon cycle changes is highlighted in the two recent IPCC science reports TAR (Houghton et al., 2001) and AR4 (IPCC, 2007).
Here we document development of GENIE-M, a new version of an existing and successful EMIC called GENIE-1.Our immediate motivation for this work is to possess a tool to investigate postindustrial changes in the natural ocean carbon cycle.Our efforts were thus geared toward improving representation of marine biogeochemistry and distributions of natural and anthropogenic transient tracers in the oceans.These improvements, combined with more highly resolved upper ocean and reasonable seasonal sea ice formation, represent significant steps toward reaching our immediate objective and making GENIE-M useful for future investigations of the global ocean carbon cycle.

A brief description of GENIE-1 relevant for GENIE-M
GENIE is a new EMIC developed primarily in the UK (http://www.genie.ac.uk/) with the goal of making it as modular as possible so that in theory one can choose to construct a model with any permutation of the existing modules (e.g., slab or 3-D dynamical ocean module coupled to energy balance or 3-D dynamical atmospheric module).Following Ridgwell et al. (2007), we will refer to GENIE-1 as a model configuration that consists of 3 the physical climate module C-GOLDSTEIN, a simple atmospheric chemistry module ATCHEM, and a marine biogeochemistry module BIOGEM.The starting point of our model development is this GENIE-1, which in terms of the actual code is Version 6 of CBS-GOLDSTEIN as distributed by A. Ridgwell in 2006.Unless noted otherwise, GENIE-M is identical to Version 6.
As described Edwards and Marsh (2005), C-GOLDSTEIN is itself a stand-alone, coarse gridded, efficient climate model that is comprised of a 3-dimensional circulation model of the world ocean, an energy and moisture balance model of the atmosphere, and a dynamic and thermodynamic model of sea ice.The ocean model is on a 36×36 equal-area horizontal grid with 10 • increments in longitude and uniform in sine of latitude.There are 8 levels in the vertical with the top layer being 175 m thick.Ocean dynamics is based on the frictional geostrophic equations (Edwards et al., 1998) and includes the Gent-McWilliams (GM) eddy mixing parameterization according to Griffies (1998) that reduces excessive vertical mixing in coarse gridded models (Duffy et al., 1997;England and Rahmstorf, 1999).The momentum flux that drives the surface ocean circulation is based on the annual NCEP reanalysis wind stress and therefore has no seasonality.The atmospheric component of C-GOLDSTEIN is an energy and moisture balance model following Weaver et al. (2001).External forcing by shortwave solar radiation is temporally constant, annually averaged and thus has no seasonality.Without explicit atmospheric dynamics, eddy diffusion coefficients of heat and moisture become important in determining the atmospheric distributions of temperature and humidity, both prognostic variables in the model.Edwards and Marsh (2005) identify twelve underconstrained yet critical parameters, including the two eddy diffusion coefficients in the atmosphere, that largely determine the climate state of C-GOLDSTEIN (Table 1).Through a large ensemble of 2000-year C-GOLDSTEIN simulations where the values of the twelve parameters are randomly changed within specified ranges, they determined a set of parameter values that minimizes the misfit between observations and simulations of surface air temperature and humidity and ocean temperatures and salinities.This exercise gives a degree of objectivity to tuning and credibility to the model.
To this physical model C-GOLDSTEIN, Ridgwell et al. (2007) coupled ATCHEM and BIOGEM, making GENIE-1 a global model of carbon and climate.The production scheme of BIOGEM is based on Michaelis-Menton type phosphate (PO 4 ) uptake kinetics, modified by the availability of light and sea ice.The partitioning of uptake into dissolved and particulate organic phases follows the Ocean Carbon cycle Model Intercomparison Project Phase 2 (OCMIP-2) BIOTIC protocol (Najjar et al., 2007;Najjar and Orr, 1999).An important advance made by Ridgwell et al. (2007) is the objective calibration of BIOGEM through data assimilation of PO 4 and alkalinity (ALK).GENIE-1 thus benefited from having C-GOLDSTEIN and BIOGEM both calibrated objectively.

Rationale for improving GENIE-1
In recent years, GENIE-1 is increasingly used in climate and carbon cycle studies of the present, future, and past (Lenton et al., 2007(Lenton et al., , 2006;;Matsumoto, 2007;Ridgwell, 2007;Ridgwell et al., 2007).Also, the computational efficiency of GENIE-1 has allowed it to sweep entire parameter spaces with respect to the Atlantic meridional overturning circulation (MOC) so that the model's MOC behavior is now fairly well characterized (Marsh et al., 2004).
We have been motivated to build on the success of GENIE-1 for a number of reasons, especially when we begin to consider the transient response of marine biology to climate change.First, the 175 m top layer of the ocean model is too thick compared to typical depths of the euphotic zone and mixed layer in the ocean.This prevents a more realistic representation of marine production for example, because nutrient uptake cannot be given a dependence on the mixed layer depth.The importance of this dependence was shown in a seminal work by Sverdrup (1953).Also, the large vertical gradients of nutrients observed today in the upper ocean are completely lost in the 175 m thick layer such that it is not clear what "surface" nutrient concentration in 5 GENIE-1 really means.
Second, there is still significant room to improve the transient tracer uptake in GENIE-1.The model-predicted 1994 global ocean inventories of CFC-11 is 0.88×10 9 mol and anthropogenic carbon is 171 Pg C (Ridgwell et al., 2007).The corresponding observational estimates based on WOCE and JGOFS surveys are much lower at 0.55±0.08×109 mol CFC-11 (Willey et al., 2004) and 118±19 Pg C (Sabine et al., 2004).If we consider the roughly +7% systematic bias in the anthropogenic carbon inventory (Matsumoto and Gruber, 2005), the overestimation of this transient tracer uptake by GENIE-1 is even more significant.While these uptakes reflect excessive intermediate water ventilation in GENIE-1, there also needs to be a good match in the deep ocean ventilation.
This brings us to the third reason, which is that the deep ocean ventilation of GENIE-1 ought to be validated with the observed abundance of natural radiocarbon ( 14C).A community-wide OCMIP-2 study that compared 19 ocean carbon cycle models used deep ocean radiocarbon as a metric to evaluate the models (Matsumoto et al., 2004).This metric has since been used in subsequent descriptions of new carbon cycle models e.g., (Muller et al., 2006;Schmittner et al., 2005) and is arguably the most effective means to evaluate ocean models with respect to deep ocean circulation.
The final motivation for improving GENIE-1 is to incorporate as much seasonality as possible, although as we note below not all of our efforts bore fruit.The simplicity of GENIE-1 with the momentum and radiation forcings being annual averages makes the model more appropriate for seeking mean climate states in long term integrations.Examples include studies of the geologic past to document changes in marine sedimentation (Chikamoto et al., 2008 1 ;Ridgwell, 2007).For postindustrial changes, the presence of seasonality adds more credibility in general and helps achieve more realistic processes such as seasonal formation of polar sea ice.

Description of GENIE-M
Much of the effort to develop this new version of GENIE-1 took place in Minnesota, so we will name our version GENIE-M to distinguish it from its predecessor, GENIE-1 as well as from other ongoing efforts to develop a different 16 level model.Although we started development from CBS-GOLDSTEIN Version 6 and not from the modularized, SVN code of the GENIEfy project, we retain the name "GENIE", which immediately conveys the origin of our model.This is important and useful information that could be obscured with a completely different name.For example, it is not immediately clear that the Bern 3-D model (Muller et al., 2006) is actually derived from the ocean model of C-GOLDSTEIN.
We split our description of GENIE-M as it relates to the physical climate model (Sect.4.1) and biogeochemistry (Sect.4.2).This reflects our sequential efforts to calibrate and tune the physical model first before the biogeochemistry model.In addition to describing the new features and modifications we adopted in GENIE-M, we will also briefly note features that we incorporated initially and evaluated but ultimately discarded (Sect.4.3).We believe documentation of our dead-end efforts is of some interest in future development efforts by other groups.

New features in GENIE-M physical model
First, the vertical resolution in the ocean is increased from 8 layers to 16.To allow biological production to depend on changes in stratification, it is preferable to have at least two layers in the euphotic zone above the critical depth where net production is positive.Therefore, we chose a vertical resolution that contains two complete layers in the top 100 m, which we took as the compensation depth (see Sect. 4.2 below).The midpoints of the 16 layers are: 23, 72, 133, 208, 300, 412, 550, 720, 927, 1182, 1494, 1877, 2347, 2923, 3630, and 4497 m.The increased vertical resolution is concentrated in the upper ocean such that the bottom topography in GENIE-M is very similar to GENIE-1, as shown in Fig. 1 of Ridgewell et al. (2007).7 Second, our vertical diffusivity K v in the ocean is given a depth dependence following Bryan and Lewis (1979).Their K v profile has about 10 times higher values in the deep ocean compared to the upper ocean and is used to this day in GFDL MOM Versions 3 (Pacanowski and Griffies, 1999) and 4 (Griffies et al., 2004).Indeed a number of studies, such as an in situ SF 6 tracer mixing experiment by Ledwell et al. (2000) and analysis of tidal dissipation by Sjoberg and Stigerbrandt (1992), indicate high mixing rates in the deep ocean.In GENIE-1, K v is constant with depth, which may contribute to excessive transient tracer uptake (Ridgwell et al., 2007).Following Bryan and Lewis (1979), our depth dependent K v is ten times greater in the deep ocean than in the upper ocean, with an arctangent transition at 2000 m: where K v0 is the depth averaged K v and user-supplied parameter.Third, we have modified the incoming short wave radiation forcing from temporally constant, annually averaged to seasonally varying (Season, Table 2).As shown below, this helps achieve seasonal polar sea ice formation.
In addition, we have altered the value of a GM isoneutral diffusion flux limitation parameter "ssmax" to achieve numerical stability and increased the Atlantic-to-Pacific freshwater flux to maintain healthy Atlantic MOC.With regard to GM in GENIE-1, we discovered that ssmax can cause negative divergence of nutrients in the top layer as layer thickness is reduced while maintaining the same time step.The resulting negative nutrient concentration will cause GENIE-1 to crash within the BIOGEM CO 2 chemistry code.Negative divergence becomes possible when the vertical concentration gradient between the top layer and the second layer becomes large.The calculation of isoneutral diffusion flux requires horizontal and vertical averaging of tracer concentrations; the flux based on averaging can result in too large a flux out of the grid box with the lower tracer concentration.This negative divergence only occurs for nutrient tracers with concentrations near zero at the surface and appears to not affect temperature and salinity and thus the stability of C-GOLDSTEIN, which appears to be very forgiving (i.e., can continue to run under unrealistic conditions).We decided to practically address the problem and achieve numerical stability by reducing ssmax from 10 to 1.Our decision was aided by the fact that: (1) it is not clear what the appropriate value ought to be; (2) the original value of 10 was a value that just happened to work for GENIE-1; and (3) we found no obvious odd behavior by making the change.
The change made to the Atlantic-to-Pacific freshwater flux (FWflux, Table 2) is described below in Sect.5.1.It is needed to restore the Atlantic MOC to a healthy value after our NSGA-II objective tuning reduced the MOC to an unrealistically small value even while minimizing the error with respect to observations of air temperature and humidity and ocean temperature and salinity.

New features in GENIE-M biogeochemistry model
First, the dependence of export production, J prod , on light and PO 4 in GENIE-1 was expanded in GENIE-M to include dependence on temperature, nutrient limitation by nitrate (NO 3 ) and CO 2 (aq), biomass turnover, and mixed layer depth following Doney et al. (2006): where the optimal nutrient uptake timescale τ is 15 days.Nutrient uptake is set to occur only above a fixed compensation depth z c of 100 m where photosynthesis is assumed to exceed respiration.For reference, OCMIP-2 protocol uses z c of 75 m.The ratio of z c to the mixed layer depth z ml would allow a bloom-like increase in production as the mixed layer shoals during the spring season, for example.We diagnose z ml in GENIE-M using the σ t density gradient (0.125) criterion (Levitus, 1982).
The temperature dependence term is given by: where T is temperature ( • C) as in HAMOCC (Maier-Reimer, 1993).The temperature dependence allows higher rates of nutrient uptake in warmer waters to account for universally observed temperature dependent metabolic rates.This dependence was evaluated in GENIE-1 before (Matsumoto, 2007) but is a permanent feature of GENIE-M.
In addition to PO 4 we included NO 3 and CO 2 (aq) as possible limiting nutrients: where K PO 4 , K NO 3 , and K CO 2 (aq) are half-saturation constants to be determined.Nitrate dependency is needed to examine the impact of increased river runoff of nitrogen over the industrial period in a future study.Another reason to incorporate NO 3 is that it plays a larger role as the limiting nutrient than PO 4 in the modern ocean.Since the ratio of total oceanic inventories of NO 3 to PO 4 is less than the typical elementary stoichiometry of N to P of 16:1 in phytoplankton, NO 3 should be more limiting on a global scale.Therefore, global production in GENIE-1, which is based only on PO 4 , leads to negative NO 3 concentrations at the surface when PO 4 uptake is directly related to NO 3 by the stoichiometry of 16.Export production in GENIE-1 based on PO 4 compared to NO 3 is larger by about 4%.As noted in Sect.5.2 below, the nutrient limitation in GENIE-M at this time is effectively entirely due to NO 3 .In the future we will add iron as a limiting nutrient, which will make Equation 4 more meaningful.
In the present form, GENIE-M has constant NO 3 inventory (i.e., without denitrification and nitrogen fixation).However, A. Ridgwell has already coded denitrification in BIOGEM and we in Minnesota have coded a simple nitrogen fixation scheme according to the distribution of N * (Gruber and Sarmiento, 1997), a quasi-conservative tracer used to infer the regions of nitrogen fixation.Our diagnostic scheme adds NO 3 lost by denitrification back to the system so as to preserve the initial oceanic inventory.
We have also developed the 15 N isotope to accompany these processes.However, these N-cycle processes are not activated at this time in GENIE-M, as they would re-quire significant calibration effort, which we will expend in the future when they become necessary to address the question at hand.Equation 4 also includes aqueous CO 2 as a nutrient, so that we may examine possible fertilization effect due to increasing atmospheric CO 2 content in the future.Under optimal light and nutrient conditions, CO 2 (aq) can limit the photosynthetic rate even with abundant DIC, if the diffusive CO 2 transport is sufficiently slow (Riebesell et al., 1993).
The light or solar irradiance limitation is given by: where I is the seasonally variable solar short-wave irradiance, which decays exponentially from the ocean surface with a 20 m depth scale, and K I (20 W/m 2 ) is the light limitation term (Doney et al., 2006).
The proxy for biomass B (umol/L) is the concentration of the limiting nutrient, determined in Eq. ( 4).The basic idea, following Doney et al. (2006), is that the higher the concentration of the limiting nutrient, the larger the phytoplankton biomass or population that can be supported.
Second, remineralization of labile particulate organic matter (POM) in GENIE-M is now expressed by its sinking rate and temperature dependent remineralization rate (Matsumoto, 2007).This is a more process oriented representation over the original expression in GENIE-1 that uses a predefined remineralization profile like the so-called "Martin curve" (Martin et al., 1987).The sinking rate in our formulation is a parameter to be determined by calibration.The temperature dependent remineralization rate follows the heuristic Q 10 =2 relationship, where the rate doubles for every 10 • C increase.It allows for a climate-carbon cycle feedback, which in the context of glacial-interglacial cycles can be important (Matsumoto, 2007).
Third, the artificial limitation on air-sea gas exchange that existed in CBS-GOLDSTEIN Version 6 but not described by Ridgwell et al. (2007) has been removed.11 4.3 Attempted but ultimately discarded improvements to GENIE-M Briefly we note four of our attempts to improve the physical climate model that in the end we did not adopt: seasonal wind stress forcings, deepening the Drake Passage, reducing the diffusivity of heat in the atmosphere at 60 o S, and reducing the air sea gas exchange coefficient.Results of some of these attempts are shown in Table 2. First, we used the ECMWF seasonal wind stress fields (Trenberth et al., 1989) in the hope of making GENIE-M truly seasonal.Unfortunately the NSGA-II objective tuning did not produce desirable results in terms of deep radiocarbon and uptake of transient tracers.Since these tracer distributions were sufficiently improved with our other modifications, we did not pursue seasonal wind forcing.
Both the Drake Passage deepening (Drake, Table 2) and reduced atmospheric diffusivities (ATMdiff, Table 2) were suggested as possibly helpful in improving the deep ocean ventilation by T. Lenton and R. Marsh in their experiences with GENIE-1.The Drake Passage deepening has the potential to homogenize the Deep South Atlantic and South Pacific by allowing a larger transport of Antarctic Circumpolar Current (ACC) through the passage.Reduced diffusivities have the effect of thermally isolating Antarctica from the rest of the globe and thereby cooling and forming Antarctic Bottom Water (AABW).As shown in Sect.5.1, we found that these did not significantly improve GENIE-M beyond the large improvements gained by our depth dependent K v profile.In our effort to improve simulation of deep ocean 14 C distribution, we also tried a lower gas exchange coefficient (Gasex, Table 2) than the original GENIE-1 coefficient based on the work of Wanninkhof (1992) that is the de facto standard in existing ocean carbon cycle models.A recent reanalysis of bomb 14 C inventory in the ocean however suggests that the Wanninkhof rate may be about 25% too high (Sweeney et al., 2007).Again, the large improvements gained by our depth dependent K v profile did not necessitate a lower gas exchange coefficient.

Calibration and control run of GENIE-M
One of the lessons of OCMIP-2 was that the uptake of anthropogenic transient tracers is quite variable in different ocean carbon cycle models because their physics, defined in the broadest sense (e.g., resolution, forcings, numerics, sea ice, seasonality, GM), was so diverse (Doney et al., 2004;Dutay et al., 2002;Matsumoto et al., 2004).
OCMIP-2 was not so much a comparison of the marine carbon cycle as the name implied but more an evaluation of different ocean physics using biogeochemical tracers (e.g., 14 C, 3 He, anthropogenic carbon, and CFC-11).Because many biogeochemical tracers have built-in clocks, they are arguable more useful than observed temperatures and salinities to evaluate the model ventilation of the ocean interior.This lesson would suggest that we calibrate the physical model of GENIE-M first and then the biogeochemistry model.This is the approach we take.
5.1 Physical model calibration 5.1.1First step: multi-objective NSGA-II tuning The goal of tuning here is to find a combination of physical model parameters theat minimize the mismatch between model-simulated fields and equivalent observed fields.
The observed fields are air temperature, air humidity, ocean temperature, and ocean salinity.As with many design problems, the non-linear response of a model to its parameters and the possible conflicts between the objectives make this a difficult problem to solve.In this work we apply a multi-objective optimization method to the problem which uses a population based algorithm to seek pareto-optimal solutions in the parameter space.For each field i an objective measure of the model's mismatch to the observational data is evaluated at the end of a simulation as where the squared difference between the model field s i and equivalent observational data S i is weighted by the variance in the observational data σ 2 i .The optimization process seeks to minimize the value of this function over each of the four physical fields of the model, producing a pareto-optimal set of parameter sets.A post-processing of the result set then yields an optimal version of the model.Previous parameter estimation techniques applied to the C-GOLDSTEIN class of problem include a Latin Hypercube sampling (Edwards and Marsh, 2005), the Ensemble Kalman Filter (Hargreaves et al., 2004), the proximal analytic centre cutting plane method (Beltran et al., 2005) and kriging (Price et al., 2007) which all seek to minimize a composite error function.The multi-objective method has the advantage that it avoids the need to select a priori the weighing factors used to evaluate the single error value.
For the purposes of the GENIE-M calibration, individual model runs were integrated over 5000 years to ensure the system reached quasi-equilibrium.With 16 vertical levels in the ocean, the typical execution time for a simulation was between 150 and 250 min CPU time on the range of compute resource available.This represents an increase in the CPU wall-time of a simulation by a factor of ∼4 over the C-GOLDSTEIN model studied in previous exercises.A standard application of the elitist NSGA-II algorithm (Deb et al., 2002) would therefore have required several weeks of compute time to achieve a high quality result.In order to perform the parameter estimation exercise in a timely fashion we therefore employed the NSGA-II method with surrogate modeling.
Previous work (Price et al., 2006) has shown that the use of surrogate models with the NSGA-II algorithm can reduce, by an order of magnitude, the total number of simulation years required for a high quality result in the calibration of the C-GOLDSTEIN composition.In the surrogate modeling approach one seeks to build a representation of the individual objective functions of the problem using Response Surface Modeling techniques.This provides cheap surrogate models of the underlying problem which can be extensively searched at significantly less cost than the true expensive functions.The main challenge with this method is to refine the surrogate models so that they provide a sufficiently accurate representation of the objective functions, at least in the optimal regions of the parameter space.
The NSGA-II algorithm was applied to the problem following Price et al. ( 2006) using krigs (Jones et al., 1998) (also known as Stochastic Process Models or Gaussian Processes) for the Response Surface Models.For the benefit of the reader we briefly describe the calibration process undertaken (Fig. 1).The method begins by evaluating an initial sampling of the parameter space using space filling Design of Experiments methods.In this case a LP τ (Sobol) (Statnikov and Matuzov, 1995) sequence was employed.Krig surfaces are generated that represent the response of the four objective measures of model fitness to data across the parameter space.For this exercise each krig was built using a maximum of 120 data points from the global data cache with each set of hyper-parameters tuned using a 5000 generation Genetic Algorithm (population size 50) followed by refinement using a Dynamic Hill Climb (5000 iterations).The NSGA-II algorithm is then applied over the surrogate models to perform extensive multi-objective searches over the design space.For this work a population size of 100 was evaluated over 150 generations of the algorithm with a maximum of 300 data points from the data cache being used in the evaluations of the stochastic process model.A feature of the krig models is that they can provide statistical measures of the uncertainties in their representation of the response function.These measures can be used to identify points of greatest RMS error in the krig and points where evaluation would be expected to give the greatest improvement.Three multi-objective searches are therefore carried out over the krigs to find (i) pareto optimal points in the design space, (ii) areas of maximum RMS error in the krigs and (iii) points of maximum Expected Improvement in the krigs.Update points are then selected and evaluated concurrently by running the expensive simulations at the chosen points in the parameter space.In this exercise, a total of 150 points were evaluated concurrently during each update phase.These were selected from the searches of the krigs in the ratio: 15 random points (to maintain diversity in the data cache), 30 pareto optimal points; 45 maximum RMS error points; and 60 maximum Expected Improvement points.The process is made possible using Condor high throughput computing (Thain et al., 2005) which en-15 ables the update evaluations of the expensive simulations to be executed concurrently across workstations that would otherwise sit idle.This work was carried out on the University of Southampton condor pool which is composed of ∼1100 laboratory work stations and office PCs running various versions of Windows operating system.The algorithm iterates over a specified number of updates, improving the training data for the krig hyper-parameters and increasing the accuracy of the stochastic process models over which the searches are performed.At the end of the process the pareto-optimal members of the global data cache are returned for further analysis.
Figure 2 shows the results of the optimization process applied to the GENIE-M model.A total of 137 members of the data cache comprise the Pareto front and are indicated by red dots in the figure.Following Price et al. (2006) we use the C-GOLDSTEIN weighting of the objectives to evaluate a single error function value for each data point.The progress of the algorithm, as measured by this single error function, is shown in the bottom left plot.The composite function weights the objectives by the reciprocal of the variance in the observational data and by the number of grid cells in the field.Since the observational data has not changed, the only difference in this measure for GENIE-M over C-GOLDSTEIN comes from the increase in the number of grid cells upon doubling the vertical layers.It is therefore noted that the doubling of the ocean levels in this version of the model has increased the bias of this measure further towards the ocean fields.The optimal point we have selected therefore sits closest to the origin of the T vs. S plot and achieves the best ocean interior at the expense of a less optimal surface humidity representation.
The version of C-GOLDSTEIN that was optimized using NSGA-II was the original version of Edwards and Marsh (2005) modified to include our new 16 vertical levels and GM parameter ssmax.The twelve "best" model parameters from NSGA-II are listed in Table 1.We obtain our equilibrium control run by running the model with the NSGA-II values for 4000 years (Control, Table 2).Metrics used to evaluate the equilibrium run are the distributions of temperature and salinity, Atlantic MOC, and Southern Ocean MOC (Table 2).The postindustrial transient control run is obtained by running the model from the preindustrial state, represented by the equilibrium run, to the year 1994 following the OCMIP-2 HISTORICAL protocol.In the protocol, the physical model state remains unchanged, while the atmospheric CFC-11 and pCO 2 concentrations are prescribed to follow observations, so that the total oceanic uptake of CFC-11 and anthropogenic CO 2 can be modeled and compared to data-derived estimates for the year 1994 (Table 2).These are also diagnostic of the physical model, because their oceanic uptake is determined to first order by constant, abiotic processes (solubility and vertical mixing).
Compared to GENIE-1, our equilibrium control run shows modest improvements in simulating the observed deep ocean ∆ 14 C as well as ocean temperature and salinity distributions (Control, Table 2).The deep ∆ 14 C is still too young though, especially in the Southern Ocean (−99%) and Pacific Ocean (−142%) compared to observations (−155% and −226%, respectively).Improvement in matching the observed temperatures is quite modest, because SST and global ocean temperatures in GENIE-1 were already reasonably good (r 2 of 0.9 and higher).The control run shows more improvement over GENIE-1 in ocean salinities, although r 2 for salinity is in general much lower than for temperature.However, we are dissatisfied with our transient control run in that it overestimates the transient tracer uptake (1.10×10 9 mol CFC-11 and 176 Pg of anthropogenic carbon) even more than GENIE-1 (0.92×10 9 mol CFC-11 and 171 Pg-C) .

Second step: targeted tuning
Given our original motivation, GENIE-M must show improvements in transient tracer metrics in particular.We therefore build on NSGA-II tuning with six targeted tunings (Season, K v prof, ATMdiff, Gasex, FWflux, and Drake) aided by some knowledge from experience about how they will affect the results (Table 2).
The single most important change that improved the transient tracer uptake was K v prof.This alone brought the model-predicted transient tracer uptake very close to data-based estimates: CFC-11 inventory is 0.68×10 9 mol in the model compared to 17 0.55±0.12×109 mol based on data (Willey et al., 2004); anthropogenic carbon inventory is 120 Pg-C in the model compared to 118±19 Pg-C based on data (Sabine et al., 2004).We note that although K v has ten times higher value at depth than above (see Sect. 4.1), the depth-weighted averaged K v is identical to K v from NSGA-II.The choice of K v prof is therefore not entirely ad hoc.The reduced transient uptake is a direct consequence of the low K v in the upper ocean in K v prof.It is an expected consequence, given that CFCs and CO 2 have relatively short timescale of air-sea gas equilibration, so their uptake is controlled predominantly by vertical exchange in the ocean rather than air-sea gas exchange kinetics.The reduced upper ocean K v has the added and significant benefit of making the deep ocean older with respect to ∆  2).One drawback in K v prof is reduced Atlantic MOC, which is now 9 Sv compared to 16 Sv in both GENIE-1 and the control run (Table 2).
In addition to K v prof, we made two modifications to arrive at our best model, which constitutes the physical model of GENIE-M.First, we implemented FWflux in order to restore the Atlantic MOC that was too low with K v prof alone.With FWflux, the Atlantic-to-Pacific freshwater flux is increased by 25% from the value given by NGSA-II (Table 1).As shown earlier for GENIE-1 (Marsh et al., 2007(Marsh et al., , 2004)), this has the effect of increasing the Atlantic MOC by increasing salinity and thus density of the Atlantic surface waters.Second, we implemented Season in order to simulate the seasonality and spatial coverage of sea ice (Fig. 3).In our control model and K v prof-only model, which lack seasonal variability in solar radiation, sea ice is largely permanent where it exists (i.e., 100% in Fig. 3a, c).With Season, polar sea ice extent is seasonal (Fig. 3b, d).In our best model (Fig. 3d), sea ice coverage around Antarctica is approximately 4.5×10 6 km 2 during summer and 36×10 6 km 2 during winter.These overestimate somewhat the satellite-derived sea ice climatology for the 1979-2000 period that shows summer coverage of 3-4×10 6 km 2 and winter coverage of 17-20×10 6 km 2 (National Snow and Ice Data Center, http://www.nsidc.colorado.edu).However, the percent change in sea ice coverage in the two seasons is comparable in GENIE-M and observation.
Even with the addition of FWflux and Season, our best physical model (GENIE-M) preserves the benefits of K v prof in terms of deep ventilation (consistent in ∆ 14 C) and transient tracer uptake (0.67×10 9 mol-CFC-11 and 113 Pg-C), while improving the Atlantic MOC, and realizing the seasonality of polar sea ice (Table 2).We did not implement the other three modifications (ATMdiff, Gasex, and Drake, Table 2), because they did not make appreciable improvements beyond the three that we implemented and we wanted to maintain as much objectivity as possible by minimizing changes made to NGSA-II optimization.

Biogeochemistry model calibration
We now calibrate our best physical model in terms of biogeochemistry.Our two main targets are global particulate organic carbon (POC) export and the global ocean distribution of dissolved oxygen (O 2 ), both important diagnostics of vertical nutrient supply and remineralization (Gnanadesikan et al., 2004).We also used NO 3 and ALK as targets but these are not as sensitive to change and therefore of secondary importance to POC export and O 2 .Model parameters to be determined are the half saturation constants in Eq. ( 4) and the sinking rate of POC.This is done by sweeping a parameter space defined by these parameters and seeking a combination of parameters that best reproduces the observed POC export and O 2 .We start by assuming that K NO 3 and K PO 4 are simply related by a stoichiometry of 16.This makes NO 3 rather than PO 4 the more important limiting nutrient, but otherwise, the two macronutrients are qualitatively very similar in behavior at this time.In the future, when iron, nitrogen fixation, and denitrification become operational, NO 3 and PO 4 will become more meaningful as independent nutrient tracers.As for K CO 2 (aq) , we use the value of 0.5×10 −6 mol kg −1 (Riebesell et al., 1993) and keep it constant, since little is actually known about CO 2 limitation and much of 19 what is known comes from the work of Riebesell.
Our parameter space is therefore defined by K NO 3 and the POC sink rate (Fig. 5).For the chosen ranges of these parameters, our model gives global POC export production between 6 and 11 Pg-C yr −1 (Fig. 5a).The range in data-based estimates using satellite products and inverse modeling is roughly 9 to 13 Pg-C yr −1 (Gnanadesikan et al., 2004;Laws et al., 2000;Schlitzer, 2002) with possibly significant uncertainties (Najjar et al., 2007).In our model, export production is sensitive to POC sinking rate within the parameter space, and it indicates that our sinking rate should be in the lower range to be consistent with data-based estimates.
Comparison of the global O 2 distribution between our model and observations (Levitus and Boyer, 1994a) indicates that spatial correlation is also higher for lower POC sinking rates (Fig. 5b).The r 2 value for the global ocean reaches as high as 0.7 for sinking rate of 50 m day −1 .
A closer look at the absolute value of dissolved O 2 in the upper ocean and deep ocean indicates that the choice of K NO 3 is also important (Fig. 6).Above water depths of 1500 m (Fig. 6a), deviation from observations becomes as small as 5 µmol kg −1 when K NO 3 is about 0.3×10 −5 mol kg −1 .Larger K NO 3 gives larger errors, as large as 30 µmol kg −1 .Below 1500 m (Fig. 6b), O 2 becomes significantly depleted with higher POC sinking rate and consequently more POC being remineralized in the deeper waters.
Given these constraints, we chose POC sinking rate of 50 m day −1 and K NO 3 of 0.34×10 −5 mol kg −1 for GENIE-M.Export POC and CaCO 3 production in GENIE-M are respectively 10.6 Pg-C yr −1 and 0.9 Pg-C yr −1 compared to 8.9 Pg-C yr −1 and 1.2 Pg-C yr −1 in GENIE-1.The r 2 value for the global O 2 distribution between model and data is 0.70 in GENIE-M compared to 0.56 in GENIE-1.In GENIE-M, the total inventory of DIC is 36.737Pg-C and of ALK is 3.2×10 18 mole-eq.

Discussion and summary
Significant improvements are achieved in GENIE-M over its predecessor GENIE-1.Whereas GENIE-1 had excessive ventilation by all accounts (∆ 14 C was too high in all basins, global CFC inventory was almost twice the observation, and anthropogenic carbon inventory was about 50 Pg-C too large), GENIE-M is consistent with all these data-based metrics (Table 2).Export POC production in GENIE-M is 10.6 Pg-C yr −1 , which is in good agreement with the data-based estimates.Dissolved O 2 in the ocean interior is modestly improved in GENIE-M over GENIE-1 in terms of model-data correlation.The improvements in interior ventilation rates as indicated by ∆ 14 C and dissolved O 2 are not unrelated, because the latter reflects a balance between the rate of ventilation that supplies O 2 to the interior and the production of POC that ultimately depletes O 2 .
It is important to note that these improvements were achieved while increasing the vertical resolution of the upper ocean, which has important implications.Again, the thickness of the topmost ocean layer is 45 m in GENIE-M compared to 175 m in GENIE-1.In GENIE-M, production occurs in the top two layers in 100 m and is modified by diagnosed mixed layer depth (Fig. 7a).The physical setting that defines production is thus better represented in GENIE-M.The reduced thickness of the topmost layer is also important in setting the right physical conditions for air-sea gas exchange.Tracers such as CFC-11 have relatively short gas equilibration time scale and surface waters are fairly close to being saturated, especially away from the polar sea ice.It is desirable for the CFC-saturated, topmost layer, to be appropriately thin in order to get the tracer inventory correct for the right reason.
In calibrating GENIE-M, we maintained objectivity and transparency as much as possible.We feel that the objective calibration of GENIE-1 (Edwards and Marsh, 2005;Ridgwell et al., 2007) gave it a rather unique place in the family of existing EMICs (Claussen et al., 2002), and we tried to maintain the uniqueness.It should be noted, however, that objective tuning is no panacea for model calibration, as Lenten 21 et al. (2006) demonstrated that hand-tuning can do better than objective tuning in some aspects GENIE-1.Our physical model calibration was based on NGSA-II optimization, and we made just two further modifications: Season, to achieve a better seasonal variability of polar sea ice and FWflux, to maintain an adequate vigor in the Atlantic MOC.In biogeochemistry calibration, we defined a parameter space defined by two key parameters, and chose a combination of these parameters that minimized the error in reproducing the dissolved O 2 distribution and global POC production.
We believe that GENIE-M, with significant improvements over its predecessor, makes it a well calibrated EMIC for studies of the global carbon cycle of both today and the recent past.The improved physical setting for surface production makes it suitable to examine the feedbacks of marine biology to climate change, a motivation that originally prompted the development of GENIE-M.
Finally it should be noted that amongst the number of anticipated future development efforts, incorporation of iron would likely have immediate benefits.Iron would help improve the spatial distribution of export production and thus interior O 2 distribution.
For example, Fig. 7b shows that the upper ocean of the eastern equatorial Pacific is low in O 2 as we expect but not low enough to initiate denitrification as observed.An accurate representation of the dissolved O 2 is critical for a reasonable marine nitrogen cycle, which in turn allows nitrogen and phosphate cycles to become decoupled.This would allow NO 3 and PO 4 along with iron to be meaningful limiting nutrients in GENIE-M.We anticipate continuing collaboration with other members of the larger GENIE community to share our codes and to avoid duplication in effort to develop the same tracer.GENIE-M: http://www.geosci-model-dev-discuss.net/1/1/2008/gmdd-1-1-2008-supplement.zip(c, d).1994 column inventories of CFC-11 in mole m −2 (e, f) and anthropogenic carbon in mole m −2 (g, h).Natural ∆ 14 C at 3000 m in ‰ (i, j).Gridded observations and data-derived estimates (Key et al., 2004;Levitus and Boyer, 1994a, b;Sabine et al., 2004;Willey et al., 2004) were regridded to the GENIE-M grid using the built-in regridding transformations in Ferret distributed by PMEL/NOAA (http://ferret.pmel.noaa.gov/Ferret/home).

Fig. 7 .
Fig. 7. GENIE-M results: (a) annually averaged, diagnosed mixed layer depth in meters and (b) dissolved O 2 in the upper 1000 m.

Table 1 .
GENIE tuning parameters.Vertical diffusion is 10 times greater at depth than at the surface, with an arctangent inflection at 2000 m and a depth-weighted average of 7.4×10