The integrated Earth System Model Version 1 (iESM v1): formulation and functionality

The integrated Earth System Model (iESM) has been developed as a new tool for projecting the joint human/climate system. The iESM is based upon coupling an Integrated Assessment Model (IAM) and an Earth System Model (ESM) into a common modeling infrastructure. IAMs are the primary tool for describing the human–Earth system, including the sources of global greenhouse gases (GHGs) and short-lived species, land use and land cover change, and other resource-related drivers of anthropogenic climate change. ESMs are the primary scientific tools for examining the physical, chemical, and biogeochemical impacts of human-induced changes to the climate system. The iESM project integrates the economic and human dimension modeling of an IAM and a fully coupled ESM within a single simulation system while maintaining the separability of each model if needed. Both IAM and ESM codes are developed and used by large communities and have been extensively applied in recent national and international climate assessments. By introducing heretofore-omitted feedbacks between natural and societal drivers, we can improve scientific understanding of the human–Earth system dynamics. Potential applications include studies of the interactions and feedbacks leading to the timing, scale, and geographic distribution of emissions trajectories and other human influences, corresponding climate effects, and the subsequent impacts of a changing climate on human and natural systems. This paper describes the formulation, requirements, implementation, testing, and resulting functionality of the first version of the iESM released to the global climate community.

6.0, and 8.5 W m⁻² (van Vuuren et al., 2011). Each scenario was produced by a different IAM using different assumptions about land-use change through the 21st century. The research design envisioned a literature in which many IAM teams would develop Representative Concentration Pathways (RCPs) using alternative underlying socioeconomic assumptions. This variety in turn would enable IAM researchers to explore uncertainty in the socioeconomic system driving emissions because, it was argued, any underlying socioeconomic system that produced a given radiative forcing pathway could be paired with the associated climate scenarios from the CMIP5 database (Moss et al., 2010; van Vuuren et al., 2011).
But as sophisticated as this interaction has become, it is still a one-way transfer of information from IAMs to ESMs (Fig. 1). It does not allow IAMs to easily examine the climate system consequences of changes in human decisions as represented in emissions pathways. Nor does it allow the changing climate system to affect the human components of energy, water resources, or land use in a systematic fashion. Finally, it does not allow an evaluation of how differences in human decision making might affect either climate outcomes or the actual impacts of a changing physical climate system. However, the emerging observable impacts of climate change mean that we can no longer assume that human energy and land systems that produce emissions are evolving under a static climate.
It is therefore clear that future work must enable the processes in these sectors to interact with each other and the climate system rather than remain as one-way transfers of information. If ESMs are to include better representations of the feedbacks of climate change on agriculture, land use, land cover, and terrestrial carbon cycle, as well as other human systems such as energy and the economy, then they will need the ability to incorporate the human system directly. Heretofore the tools have not existed for a fully consistent representation of the combined evolution of these two systems.
In order to advance beyond this paradigm, we have developed a new model framework, the integrated Earth System Model (iESM). The goal is to create a first-generation integrated system to improve climate simulations and enhance scientific understanding of climate impacts on human systems and important feedbacks from human activities to the climate system. The first version of the iESM described in this paper is designed to address three major science questions: (1) Is the present CMIP5 "parallel process" approach to climate assessment adequate? (2) Will human activities affect local and regional climate on scales that matter? (3) Will climate change itself affect human decision making and its implications for biogeochemical and biogeophysical processes at global scales?
The iESM is a new configuration of models previously operated separately. The iESM includes the human system components of an integrated assessment model called the Global Change Assessment Model (GCAM) (Kim et al., 2006; Calvin, 2011; Wise et al., 2014), the complete Community Earth System Model (CESM) (Hurrell et al., 2013), and the Global Land-use Model (GLM) (Hurtt et al., 2011) for rendering GCAM output onto the spatial grid and transforming land-use information for use by the Community Land Model (CLM) component of CESM (Lawrence et al., 2011; Oleson et al., 2013) (Fig. 2). GCAM and CESM are both community codes, and the resulting iESM is also being released to the global climate community. The iESM includes both one-way and two-way communication of fluxes and feedbacks among the components of the energy and land-use systems from GCAM, as well as the incorporation of their physical consequences for both biogeochemical and physical fluxes in CESM. This allows the investigation of the degree to which this linkage may change the evolution of the climate system over decades to a century. We have used the iESM to investigate the climate impacts on human systems and important feedbacks from human activities to the climate system. The iESM results on impacts and feedbacks are described in a series of earlier and companion papers (e.g., Jones et al., 2012, 2013).
This paper describes the scientific rationale for the construction of the iESM (Sect. 2), the component models assembled to create it (Sect. 3), the requirements on the assembly process (Sect. 4), the technical implementation of the model (Sect. 5), and the procedures used to validate the linkages among the component models and ensure the integrity of the coupled system (Sect. 6). The paper concludes with future plans for further extensions and applications of the iESM (Sect. 7).

Climate change impact on energy demand, supply, and production
Climate change can influence energy demand, supply, and production in several major areas. Energy demand for adaptation and mitigation measures may also increase under climate change (van Vuuren et al., 2012). Integrated assessment (IA) models can be used to explore consequences and responses of energy systems to climate change. In the IA modeling community, however, energy supply and demand are normally modeled based on historical conditions, and climate change impacts are rarely incorporated except in a static manner. Although some efforts have begun to explore climate change impacts on the energy system using IA models (Voldoire et al., 2007), only one-way coupling is usually employed, and the interactions between the energy system and climate are seldom addressed. Two-way coupling between human and Earth systems would be required to examine the impacts of climate change on (1) building energy use, (2) renewable energy potential, and (3) energy production (thermal power plants) and transmission, each of which is described in greater detail below.

Building energy use
Climate change can have important impacts on building energy systems through decreased heating and increased cooling. Previous studies are limited in addressing the effect of a changing climate on building energy demands while simultaneously considering other energy sectors in the underlying human systems. In recent years, the impacts of climate change on building energy use have been evaluated using IA models by constructing estimates of heating and cooling degree days from air temperature output by climate models (Isaac and van Vuuren, 2009; van Ruijven et al., 2011; Zhou et al., 2013, 2014; Yu et al., 2014). The feedback from the climate on the energy system was calculated from climate model output in advance using a one-way coupling scheme, and the impact of these changes in the energy system on climate was rarely considered in these studies. One exception is the study by Labriet et al. (2013), in which climate change and building energy use were fully coupled using IA and climate models. However, the spatial resolution of climate outputs from this coupled modeling system was low (5°), which may limit the understanding of climate change impacts on building energy use.
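As a minimal illustration of the degree-day construction mentioned above, the sketch below accumulates heating and cooling degree days from daily mean air temperatures. The 18 °C base temperature, the function name, and the sample temperatures are assumptions chosen for illustration; the cited studies may use different thresholds and may apply population or floor-area weighting.

```python
def degree_days(daily_mean_temps_c, base_c=18.0):
    """Return (heating, cooling) degree days for daily mean temperatures in deg C.

    base_c is an assumed balance-point temperature, not a value from the paper.
    """
    heating = sum(max(base_c - t, 0.0) for t in daily_mean_temps_c)
    cooling = sum(max(t - base_c, 0.0) for t in daily_mean_temps_c)
    return heating, cooling


# Hypothetical four-day record, then the same record under 2 deg C of warming.
baseline = [10.0, 18.0, 25.0, 30.0]
warmed = [t + 2.0 for t in baseline]

hdd_base, cdd_base = degree_days(baseline)
hdd_warm, cdd_warm = degree_days(warmed)
```

In this toy record, warming lowers the heating degree days and raises the cooling degree days, which is the direction of change described in the text.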

Renewable energy production
Renewable energy plays an important role in the energy system at the regional and global levels, and it can be influenced by climate change to a large extent. In current IA modeling efforts, the availability of renewable energy (i.e., wind, solar energy, and hydropower) and its economic potential are either modeled according to the historical condition (Zhang et al., 2010; Zhou et al., 2012) or exogenously quantified using proxies such as precipitation or runoff (Golombek et al., 2012). However, renewable energy resources, such as wind and hydropower, are dependent on the local climate that can be very different from current or historical conditions under climate change. For example, previous studies found that both wind speed and variability show changing trends in the historical time period (Holt and Wang, 2012; Zhou and Smith, 2013) that can impact wind energy potential. Climate change can also alter future photovoltaic and concentrated solar power energy output through changes in temperature and solar insolation (Crook et al., 2011). Hydropower potential can be influenced by precipitation and runoff changes under climate change, and previous studies found changes in hydropower potential under climate change globally and regionally (Hamududu and Killingtveit, 2012; de Lucena et al., 2009). Interactions between climate change and bioenergy are more complex because of changes in variables such as land use, which will in turn alter surface albedo and feed back on the climate (Schaeffer et al., 2006). The change of future renewable energy under climate change is rarely captured in current IA models, and the subsequent feedback of energy system change on climate systems has not been explored.

Energy production
Climate change also has important impacts on energy production, especially thermal power plants, which are influenced by the temperature of water used for cooling (Ruebbelke and Voegele, 2011, 2013) and which might also face limits to water availability in some cases. Increasing air and water temperature under climate change can reduce the efficiency of power plants. For example, a previous study found that a 1 °C increase in temperature can reduce the supply of nuclear power by about 0.5 % (Linnerud et al., 2011). In some extreme cases such as droughts and heat waves, power plants may not be able to meet temporary demand and may even shut down. Moreover, climate change effects such as extreme weather events and higher temperatures also influence transmission lines through disruption of infrastructure or reduction of efficiency. In previous studies, the impact of climate change on thermal power plants was normally evaluated without consideration of the changes in other sectors of the energy system (van Vliet et al., 2012; Foerster and Lilliestam, 2010; Ruebbelke and Voegele, 2013). Therefore, these studies of climate change impact on energy production are necessarily limited without a more comprehensive understanding of the human system. IA models provide the possibility to evaluate the impact of climate change on energy production in a comprehensive way. For example, a simple assumption has been made to evaluate the climate change impact on thermal efficiency of power plants in the study by Golombek et al. (2012), although there was still no feedback of changes in the energy system back to the climate in this study.
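To make the quoted sensitivity concrete, the sketch below applies the roughly 0.5 % loss per 1 °C as a linear derating factor. Extending that figure linearly to other temperature changes is an assumption made here for illustration only, and the function name and baseline supply value are hypothetical.

```python
def derated_supply(baseline_supply, delta_t_c, loss_per_degree=0.005):
    """Scale a baseline supply by an assumed linear loss per deg C of warming.

    loss_per_degree=0.005 corresponds to the ~0.5 % per 1 deg C figure quoted
    in the text; applying it linearly to other warming levels is an assumption.
    """
    factor = max(0.0, 1.0 - loss_per_degree * delta_t_c)
    return baseline_supply * factor


# Hypothetical baseline of 1000 supply units under 2 deg C of warming.
supply_after_warming = derated_supply(1000.0, 2.0)
```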

The Community Earth System Model (CESM)
The starting point for the team's development efforts was version 1.0 (now 1.1) of the Community Earth System Model (CESM). CESM is a community code and may be downloaded from the Community Earth System Model Project (2014) (URL in references).
The CESM uses a flexible coupler that couples the atmosphere, ocean, land, and ice component models. Components often use different grids, and the coupler performs the necessary interpolation of fluxes and state variables. The CESM system comprises the Parallel Ocean Program, version 2 (POP), the Community Land Model, version 4.0 (CLM 4.0), the Los Alamos sea-ice model (CICE), the Community Atmosphere Model, version 5 (CAM), and the Community Ice Sheet Model (CISM). POP and CICE are finite volume codes with semi-implicit and explicit time integration and are implemented on logically Cartesian meshes that are stretched to embed polar singularities in land regions and thereby remove these singularities from computation.
The CAM model has flexible formulations for atmospheric dynamics, and it has recently transitioned to the spectral finite element method coupled to an extensive suite of subgrid physical parameterizations in its standard configuration. CAM runs on unstructured quadrilateral grids. The CLM contains a suite of column process parameterizations running at each grid point with no communication between grid points. CLM 4.0 represents surface and subsurface water, energy, carbon, and nitrogen dynamics with a nested hierarchical sub-grid treatment that allows glaciers, lakes, urban areas, agricultural fields, forest, grassland, and shrubland to share space on each grid-cell. Incident radiation is inter-
In the fully-coupled configuration, the CICE and POP component models run with a nominal displaced-pole grid spacing of 1° (approximately 110 km at the equator and 30 km in polar regions) and, for POP, 42 levels in the vertical. The CAM and CLM models run with grids with 0.9° × 1.25° resolution with 30 and 10 vertical levels, respectively. CLM also includes a separate vegetation layer. The output of the CESM consists of monthly means of several hundred quantities, plus daily averages of a subset of these quantities and hourly output of some key variables.

Development of land-use and land cover change representation in CLM4
A mechanistic representation for the influence of land use and land cover change (LULCC) on carbon, nitrogen, water, and energy cycles was developed and implemented for the CMIP5 land-use harmonization (Oleson et al., 2010; Lawrence et al., 2012). This approach is designed to operate on the land-use data stream provided by the GLM code after translation from the four basic land cover types of GLM into the 18 plant functional types (PFTs) of CLM. The CLM LULCC approach recognizes net annual losses and gains of vegetated area for each PFT within each grid cell. Net loss is treated as a reduction in PFT area with biomass densities kept constant; net gain is treated as an increment in PFT area with the introduction of very low initial carbon density on the new area. For PFTs with existing area on a given grid cell, net gains in area extend the existing area and expand the existing biomass to cover the new area. This dynamic LULCC in CLM 4 is one of several anthropogenic forcing factors influencing global biogeochemical cycles and surface energy balance and has been extensively evaluated (Shi et al., 2011, 2013; Mao et al., 2012a, b, 2013).
For iESM, the time step of GCAM was reduced from 15 years to a 5-year standard with flexible time-step capability. This capability is important for scale consistency and compatibility with CESM code. In addition, the land component, which simulates supply of land products (food, energy, fiber), was completely reformulated to follow functional forms that define productivity as a function of geographic location, climatic conditions, and inputs, and thus made more consistent with physical earth system parameters (Wise and Calvin, 2011). A higher spatial resolution dataset was compiled to allow for land productivity simulation in 151 global regions (Kyle et al., 2011). Finally, the post-processing code to downscale human emissions of CO2 from the GCAM 14-region scale to a CAM-compatible grid was redeveloped and ported to the CESM by the iESM development team. The downscaling of short-term forcers is currently under development.
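The net-loss and net-gain rules for PFT area described above for the CLM LULCC approach can be sketched as simple bookkeeping on one PFT in one grid cell. This is one reading of the text rather than the CLM implementation: the function name, the units, and the `seed_density` parameter (standing in for the "very low initial carbon density") are assumptions, and "expanding the existing biomass to cover the new area" is interpreted here as spreading the existing carbon stock over the enlarged area.

```python
def apply_area_change(area, carbon_density, delta_area, seed_density=0.0):
    """Return (new_area, new_carbon_density) for one PFT in one grid cell.

    Hypothetical helper; area and density are in arbitrary consistent units.
    """
    new_area = area + delta_area
    if new_area < 0.0:
        raise ValueError("net loss exceeds the existing PFT area")
    if delta_area <= 0.0:
        # Net loss: area shrinks, biomass density on the remaining area is unchanged.
        return new_area, (carbon_density if new_area > 0.0 else 0.0)
    # Net gain: existing carbon plus the low-density carbon on the new area
    # is spread over the enlarged area, which dilutes the mean density.
    total_carbon = area * carbon_density + delta_area * seed_density
    return new_area, total_carbon / new_area


loss_case = apply_area_change(100.0, 5.0, -20.0)   # density stays at 5.0
gain_case = apply_area_change(100.0, 5.0, 25.0)    # stock of 500 over area 125
```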

The Global Land-Use Model (GLM)
The Global Land-Use Model (GLM) is a tool for computing annual, gridded, fractional land-use states and all underlying land-use transitions, including the age, area and biomass of secondary (recovering) lands and the spatial patterns of wood harvest and shifting cultivation, in a format designed for inclusion in Earth System Models (Hurtt et al., 2006). GLM computes these land-use patterns using an accounting-based method that tracks the fractions of cropland, pasture, urban area, primary vegetation, and secondary vegetation in each grid cell as a function of the land surface at the previous time-step. The solution of the model is constrained with inputs and data including historical reconstructions and future projections of land use (e.g., crop, pasture, and urban applications), wood harvest, and potential biomass and recovery rate. GLM is publicly available and may be downloaded from the University of Maryland Global Ecology Lab (2014) (URL in references).
GLM was selected as the primary tool to provide harmonized land-use datasets (Hurtt et al., 2011; Brovkin et al., 2013) for the CMIP5 experiments (Taylor et al., 2011) as part of the Fifth IPCC Assessment Report (IPCC, 2014). For this project GLM was used to compute the land-use states and transitions annually, for the years 1500-2100, using data from Integrated Assessment models for the years 1850-2100. GLM provided a continuous time series of land-use data at half-degree spatial resolution, consistent with both the historical data and the future data from a variety of IAMs, in a format that could be used by a variety of ESMs. Further information on this application of GLM is available from the Land-use Harmonization Project (2014; URL in references).
For use in iESM, GLM was modified to use GCAM data on a 5-year time-step and to accept data partitioned by GCAM's 151 agro-ecological zones (AEZs) instead of GCAM's 14 socioeconomic regions. In addition, GLM was altered to use the forest area data from GCAM and to spatially rearrange agricultural area within each AEZ to match potential forest area changes from GCAM.
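The accounting-based method described above, in which the land-use state of a grid cell follows from the previous state plus a set of transitions, can be sketched as follows. This is an illustrative toy and not the GLM code: the function name, the dictionary layout, and the example fractions are assumptions.

```python
LAND_TYPES = ("cropland", "pasture", "urban", "primary", "secondary")


def step_land_use(state, transitions):
    """Advance one grid cell by one time step.

    state: dict mapping each land type to its fraction of the cell.
    transitions: dict mapping (source, destination) to the fraction moved.
    """
    # Total fraction leaving each land type, checked against what is available.
    outflow = {name: 0.0 for name in LAND_TYPES}
    for (src, _dst), frac in transitions.items():
        if frac < 0.0:
            raise ValueError("transition fractions must be non-negative")
        outflow[src] += frac
    for name in LAND_TYPES:
        if outflow[name] > state[name] + 1e-12:
            raise ValueError(f"transitions out of {name} exceed its area")

    # Apply the transitions; the total fraction of the cell is conserved.
    new_state = dict(state)
    for (src, dst), frac in transitions.items():
        new_state[src] -= frac
        new_state[dst] += frac
    return new_state


# Hypothetical cell: some primary land is cleared for crops, some cropland is abandoned.
cell = {"cropland": 0.2, "pasture": 0.1, "urban": 0.0, "primary": 0.5, "secondary": 0.2}
moves = {("primary", "cropland"): 0.05, ("cropland", "secondary"): 0.02}
next_cell = step_land_use(cell, moves)
```

Tracking transitions rather than only net states is what lets such a scheme distinguish, for example, abandoned cropland recovering as secondary vegetation from untouched primary vegetation.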

Requirements for the coupling among GCAM, GLM, and CESM1
To ensure that the iESM is reliable, flexible, and extensible, its technical implementation follows from an extensive set of requirements that are detailed below.

Implementation of iESM as an extension of CESM
The primary goal of the development is to implement the iESM as an extension of the CESM to include a human dimension component. This requirement implies that the integrated assessment model is treated as a new component in CESM and the protocols applied to the five existing components are adopted for the human component as well. To conform with these protocols, the human dimension component has been integrated into CESM's software environment, including CESM's configure and build procedures, execution protocols, input and output conventions, and regression testing procedures. The execution protocols include CESM's procedures for synchronizing the coupling and time stepping of its various components and for exchange of fields among these components that conform with the conservation laws (e.g., conservation of mass) governing the dynamic evolution of the whole system.
The developers have also ensured that the iESM conforms to CESM's standards for repeatable experiments, including exact restarts and use of machine-independent representations for the initial, boundary, and restart data sets. CESM has adopted the Network Common Data Form (NetCDF) for these data sets to utilize its features for representation of numerical fields that can be transparently exchanged across computational platforms. This is complemented by the requirement that iESM conform to CESM's standards for hardware and software portability. This requirement helps ensure that experiments with iESM are, in principle, strictly repeatable assuming that the underlying software and hardware configuration has been validated by the CESM project. In practice, a precise description of the boundary and initial conditions, together with a detailed description of the model and its functionality, is needed to attain experimental reproducibility. To address this need, the functionality of the human dimensions component should be clearly and comprehensively documented. The documentation should encompass individual pieces such as GCAM, GLM, and the Land-use Translator (LUT) code, as well as the pre/post processing code that operates on the data exchanged within the human dimensions component.

Flexible modes of execution
The second principal goal is to incorporate and extend CESM's flexible modes of execution to iESM. The flexibility has two main dimensions: first, the trade-off between the physical completeness and complexity of the model and its execution speed; and second, the equivalence of two-way communication between components with the introduction of feedbacks through their joint interaction. The first type of flexibility is realized by incorporating several versions of each critical component that range from very simple to very complete representations of the component dynamics, with a corresponding range from inexpensive to intensive computational resource demands. The second type of flexibility is implemented by introducing versions of each component that either produce the same output state (e.g., a climatology read from a data file) regardless of the input state, or that compute an output state based on the input state combined with its evolution equations. The omission or inclusion of two-way communication corresponds to the omission or inclusion of feedbacks between the given component and the rest of the model system. Both types of flexibility are realized by incorporating three basic versions of each component known as "stub", "data", and "active" versions. The "stub" version is used primarily for automated testing of the system integration and performs some very rudimentary housekeeping functions in response to commands from the integration layer of the whole CESM. The "data" version produces a time-evolving state through spatial and/or temporal interpolation applied to a fixed time-dependent input read from data files. The same state is reproduced regardless of the evolution and dynamics of the remainder of the coupled system. This version is computationally inexpensive but does not include the two-way feedbacks between the given component and the rest of the system present in the real world. The "active" version produces a time-evolving state governed by its initial conditions, a representation of the fundamental dynamical equations that pertain to that component, and the boundary conditions supplied by the rest of the coupled model system. This version is computationally intensive but includes the two-way feedbacks present in the real world.
To conform with this protocol, the iESM includes stub and data versions of the human dimensions component, as well as the fully interactive assessment model GCAM. The stub and data versions are automatically tested to ensure that they are integrated and operating correctly using the same general test procedures applied to the existing components of CESM.
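The distinction among the "stub", "data", and "active" versions described above can be illustrated with three small classes that share one `run` interface. This is a conceptual sketch only: the class names, the `run(year, inputs)` signature, the `"field"` and `"forcing"` keys, and the toy dynamics are assumptions and do not reflect CESM's actual component interfaces.

```python
class StubComponent:
    """Placeholder for integration testing: accepts calls, exchanges nothing."""

    def run(self, year, inputs):
        return {}


class DataComponent:
    """Replays a prescribed time series and ignores what the coupled system sends."""

    def __init__(self, series):
        self.series = dict(series)
        self.years = sorted(self.series)

    def run(self, year, inputs):
        years = self.years
        # Hold the end values constant outside the prescribed range.
        if year <= years[0]:
            return {"field": self.series[years[0]]}
        if year >= years[-1]:
            return {"field": self.series[years[-1]]}
        # Linear interpolation in time between the bracketing data years.
        for y0, y1 in zip(years, years[1:]):
            if y0 <= year <= y1:
                w = (year - y0) / (y1 - y0)
                value = (1.0 - w) * self.series[y0] + w * self.series[y1]
                return {"field": value}


class ActiveComponent:
    """Evolves its own state in response to inputs, so feedbacks can occur."""

    def __init__(self, initial_state, sensitivity):
        self.state = initial_state
        self.sensitivity = sensitivity

    def run(self, year, inputs):
        # Toy evolution equation: the state responds to the incoming forcing.
        self.state += self.sensitivity * inputs["forcing"]
        return {"field": self.state}


data = DataComponent({2005: 1.0, 2010: 2.0})
same_a = data.run(2007, {"forcing": 0.0})
same_b = data.run(2007, {"forcing": 5.0})   # identical output: no feedback
```

Because all three classes expose the same call, a driver could swap one for another without changing the coupling code, which is the property the text attributes to the CESM component protocol.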

Bilateral exchange among components of the coupled system
CESM utilizes a set of standard protocols to implement bilateral exchange among components of the coupled system, and these protocols have been adopted for internal communications within the human component as well, including GCAM, GLM, the LUT code that prepares GLM output for input into CLM, and the associated interfaces. These protocols ensure that the modes of interaction and exchange among components are visible, reproducible, flexible, and extensible.
The visibility follows from the requirement that all fields are exchanged through a single, top-level, standardized communication mechanism. This mechanism is capable of recording all information input to and output from all model components, together with the operations performed by the coupling layer to enable the exchanges. This capability also ensures that the interactions are strictly reproducible, since all exchanges are managed and recorded by one standardized communication mechanism.
This mechanism can be configured at run time to add arbitrary numbers of fields to be exchanged among any given pair of components. This ensures that the communication protocol can support increasingly complete and complex interactions among components using the same well-tested framework, without the need for lengthy modifications to the underlying software.
iESM has adopted these conventions for exchanging information to integrate the functional parts within the human dimensions component and, ultimately, to couple the human dimensions component to other components in CESM. In the first implementation, the data passed between the human dimensions component and the rest of CESM are exchanged using data files to minimize the modifications to the existing CESM components. However, these data exchanges can be readily upgraded to the standard top-level interfaces, internal memory, and message passing adopted by the rest of CESM.
This solution automatically includes provisions for exchanging additional data, in particular the exchange of more or all of the forcing agents covered by the RCP handshake protocol (tntcat.iiasa.ac.at:8787/RcpDb/). The information exchanged at the interfaces between the human component and the rest of CESM depends on the precise experimental configuration. However, the interfaces themselves are invariant under changes in configuration to guarantee that a single set of communication software can be thoroughly and repeatedly tested and validated.

Methodologies to treat the ranges in spatial and temporal resolution across iESM
The integrated assessment model solves for the evolution of the human system using a fundamental assumption of market quasi-equilibrium, namely that the inputs and outputs into energy generation, food production, and land resources are balanced on sufficiently large spatial and temporal scales. The length and time scales required for the market equilibrium
This disparity introduces a requirement on the design of the iESM, namely to implement a flexible and extensible mechanism to handle differences in spatial and temporal resolutions between the human and physical components. To meet this requirement, iESM should include capabilities for temporal interpolation or accumulation (e.g., time averaging or other operations) in all the interfaces, depending on the ratio of time steps between the transmitting and receiving components linked by the interface. Similarly, spatial interpolation or accumulation should be included, with the recognition that some preprocessing may be needed to prepare input data files to manage regridding.
These capabilities are consolidated into the interfaces among the human component and the rest of the CESM system to avoid "hard wiring" any assumptions about gradations in resolutions into the components themselves. The efficient exchange of data across different spatial grids is highly contingent on efficient communication between components and within a single component on highly distributed and massively parallel supercomputers. The interfaces are therefore based upon a common foundation of communication infrastructure that has been optimized to maximize computational throughput. In turn, the exchange of data between components operating on very different timesteps introduces a strong dependency on the time management procedures for the whole coupled system. This dependency has been satisfied by completely prescribing the sequence of component execution, the interlaced calls to the interfaces, and the interpolation/accumulation operations in each interface call.
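The temporal accumulation and interpolation operations described above can be sketched with two small helpers: one that time-averages values from a fast component until a slow component is ready to receive them, and one that linearly interpolates a slow component's output onto annual steps. The class and function names, the choice of a simple mean, and the choice of linear interpolation are illustrative assumptions, not the operations used in the iESM interfaces.

```python
class TimeAverager:
    """Accumulate values from a fast component for delivery to a slower one."""

    def __init__(self):
        self._total = 0.0
        self._count = 0

    def accumulate(self, value):
        self._total += value
        self._count += 1

    def flush(self):
        """Return the mean since the last flush and reset the accumulator."""
        if self._count == 0:
            raise ValueError("no values accumulated since the last flush")
        mean = self._total / self._count
        self._total, self._count = 0.0, 0
        return mean


def interpolate_years(year0, value0, year1, value1):
    """Linearly interpolate a slow component's output onto annual steps."""
    span = year1 - year0
    return {
        year: value0 + (value1 - value0) * (year - year0) / span
        for year in range(year0, year1 + 1)
    }


# Fast-to-slow: average three hypothetical sub-period values into one.
averager = TimeAverager()
for sample in (1.0, 2.0, 3.0):
    averager.accumulate(sample)
period_mean = averager.flush()

# Slow-to-fast: spread a 5-year step (2005 to 2010) over annual values.
annual = interpolate_years(2005, 10.0, 2010, 20.0)
```

Keeping such operations in the interface layer, rather than inside either component, mirrors the stated goal of not hard-wiring resolution assumptions into the components themselves.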
While CESM is designed for hybrid execution in any combination of serial and/or parallel execution of its various components, in the initial version of iESM the human component is run in serial mode. This mode of operation is necessitated by the multi-year timestep of GCAM. Since the version of GCAM used in iESM runs as a single-threaded application while the rest of the CESM is inherently multi-threaded, the processor elements devoted to the non-human components are idle while GCAM is run for a single timestep. This introduces the risk that iESM utilizes computational resources much less efficiently than the parent CESM. It is therefore necessary to evaluate the relative cost of the human dimensions component to ensure it is not a performance or memory bottleneck and to refactor or parallelize code as needed. Fortunately, the overall CESM performance is only marginally impacted by the introduction of this serial code.

Dual use capability and single code repository for GCAM
GCAM and GLM, like the other components of CESM, are research codes and are therefore under continual development and extension by their primary developers and by the wider GCAM and land-use communities. Recent developments include significant new capabilities directly relevant to studies of human–Earth system interactions, for example the introduction of supply and demand for water resources (Hejazi et al., 2013, 2014). In order to ensure that the human dimension capabilities of iESM stay current with IA science, the iESM developers have chosen to enhance GCAM and GLM so that these models can run either in their standard stand-alone modes or as parts of the iESM. Once these enhancements are incorporated in the main GCAM and GLM repositories, GCAM and GLM will have dual-use capabilities as stand-alone models or elements of iESM, and these capabilities can be easily propagated to future versions with new scientific features of interest to both the GCAM and iESM communities. These future versions can then be extracted from the respective repositories to easily update iESM.
This design introduces several subsidiary requirements for the input to and output from GCAM and GLM. First, GCAM's and GLM's native input and output procedures must be extended as needed to perform file I/O in stand-alone mode to exchange data that is compatible with CESM. This in turn requires introducing input and output interfaces into GCAM and GLM that generalize the methods for information exchange to include message passing. As a result, the results from GCAM and GLM are indistinguishable whether using files or inline communication techniques to exchange data with the rest of iESM. One of the challenges in constructing iESM is the complexity of the historical land-use and land-cover data required for the downscaling operations performed by GLM. In order to create a much simpler and more robust run-time environment for the GLM component, several important modifications are necessary. These include collating and converting the numerous input and output data sets into a much smaller number of NetCDF files. It was also helpful to standardize GLM's control interface to provide a simple and robust way to manage GLM functionality. To reduce GLM's considerable memory demands, it was necessary to refactor its data and control structures. Because CESM must meet a requirement for exact (bit-for-bit) restarts, it was necessary to extend GLM's functionality to add a restart capability.

Reproduction of the offline-coupled implementation of iESM
To the extent feasible, it would be advantageous to have the coupled iESM reproduce the offline-coupled implementation using separate models. To meet this requirement, it is necessary to construct tests ensuring that the data exported by each interface agrees with the corresponding information exchanges in the offline-coupled implementation to the precision of the stand-alone implementation. In turn, these tests are based upon and therefore require a core level of state output and diagnostics to allow iESM to be validated against relevant observations and documented CESM/GCAM/GLM control runs.

Implementation of the coupling among GCAM, GLM, and CESM
The first phase of iESM code development was designed to update and codify the experimental protocol from CMIP5 to incorporate land use change and emissions of GHGs and SLS from GCAM into CESM, such that the models exchanged information at each time step rather than as a single, full-century pass at the start of the model future period (2005). The IAC is currently visible only to the land model when run in iESM mode and drives prognostic land use change. Because the functionality of GCAM-GLM is encapsulated within a CESM component, it can also be replaced by a data model, enabling testing with a range of integrated assessment models.
Code modifications were made to GCAM such that the model looks to CESM for instruction on when to begin each new time step. Thus, the first version of the coupled model operates by GCAM projecting land use, then CESM projecting climate and ecosystem change and returning productivity information to GCAM, which then incorporates that information into the land use decisions for the next time step.
The code has been tested and is running on leadership-class computing facilities at ORNL (the Titan Cray XK7) and NERSC (the Hopper Cray XE6 and Edison Cray XC30) and has also been tested and configured on the DOE IARP cluster at PNNL/UMD (Evergreen). The iESM code has also kept pace with current CESM versions, and was most recently updated (in summer 2013) to run with CESM 1.1.2, the most recent CESM release with a full carbon cycle spin-up available.
Scientific challenges were encountered in the design of the coupling between CESM and the IAC component, specifically with regard to faithfully representing CESM's land productivity passed into the IAC as well as capturing the land-use change being returned. These challenges were identified and solved through a series of soft-coupled runs in which the project team ran each model forward one time step at a time and passed model output between them, as well as a series of offline, CLM-only runs with the IAC enabled. In this fashion, the coupling steps were refined while the software development was under way. This first development phase focused on the land-use change components of the models. In parallel, algorithms to downscale GCAM GHG and SLS emissions have been developed and tested, and the code has been transferred to the development team.

General IAC implementation
The IAC is implemented like a standard component of CESM. The IAC component has stub, data, and active versions called SIAC, DIAC, and GIAC, respectively, that provide a range of functionalities and capabilities for the IAC component. The active IAC version (GIAC) is fully prognostic and runs the full suite of IAC subcomponents to produce dynamically varying land use/change data using carbon feedback scalars from CLM. The data component (DIAC) replaces the active GCAM/GLM submodels with data derived from an offline IAM/GLM control run. It is currently used for testing and model spin-up, but in principle it could be used to force CESM with an arbitrary scenario, for example one of the three CMIP5 RCPs generated by an IA model other than GCAM. The stub component (SIAC) serves the same purpose as a CESM stub model, namely to serve as a placeholder to satisfy interface requirements when the active or data component is not being run. The stub IAC is the default mode for CLM, which makes it bit-for-bit backward compatible with the current CLM.
Like other CESM components, IAC has routines to initialize its state, execute by evolving forward in time, and complete its operations by communicating its new state and generating history and restart (check-point) files. While these routines do not satisfy all aspects of the current CESM interface standard, they could be readily modified to do so in the future. The checkpoint/restart mechanism built for the IAC meets the CESM requirements for exact restarts to facilitate long integrations of the model system. Following the template of other CESM components, the IAC has a built-in clock, a top-level interface that mimics a CESM component, a centralized collection of control information implemented via a standard Fortran namelist, and a set of clock, grid, control and field parameters defined in a shared module for query by and exchange with other parts of the model system. All the coupling within the IAC is done via internal memory.
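The exact-restart requirement can be illustrated with a toy example. The sketch below is not the IAC restart code; the model update, the state layout, and the function names (`step`, `run`, `write_checkpoint`, `read_checkpoint`) are all hypothetical. It only shows the property being tested: a run that is interrupted, check-pointed, and restarted should end in the same state, bit for bit, as an uninterrupted run.

```python
import json


def step(state):
    """Advance a toy deterministic model by one time step (stand-in for a real component)."""
    return {"t": state["t"] + 1, "x": state["x"] * 1.0001 + 0.5}


def run(state, n_steps):
    """Integrate forward n_steps from the given state without modifying it."""
    for _ in range(n_steps):
        state = step(state)
    return state


def write_checkpoint(state):
    """Serialize the state; float.hex() preserves every bit of a double."""
    return json.dumps({"t": state["t"], "x": state["x"].hex()})


def read_checkpoint(text):
    """Rebuild the state from a checkpoint string."""
    data = json.loads(text)
    return {"t": data["t"], "x": float.fromhex(data["x"])}


initial = {"t": 0, "x": 1.0}
continuous = run(initial, 10)
# Run 4 steps, checkpoint, restart, then run the remaining 6 steps.
restarted = run(read_checkpoint(write_checkpoint(run(initial, 4))), 6)
print(continuous == restarted)
```

The design point in this sketch is that the checkpoint stores the full-precision state; writing a rounded decimal representation instead would generally break bit-for-bit agreement.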
While the IAC was initially implemented as a separate component in CESM, we have placed the IAC component beneath the land model, since all the coupling in the initial version of iESM involves the CESM land component. Because we are using a mature coupling strategy, we can easily reposition the IAC component as needed in the future. But for the moment, the IAC is implemented as an option in CLM, and therefore the IAC model resides in its own subdirectory within the main code base for CLM. The stub, active, or data mode of IAC is set via the CESM configuration files. When this mode is set to stub, the results from iESM are identical at the bit-for-bit level with the corresponding version of the conventional CESM. All the input data sets and namelist parameters for the IAC are set by enhanced versions of the namelist-generation procedures for CLM.
In the current iESM, the IAC is built as part of the compilation of the CLM code. The procedure that builds CLM calls scripts that build the IAC model. The IAC compilation is done for the stub, data, or active version of the IAC model depending on the mode specified by the user. Most of the IAC code is written in Fortran 90 or C, and leverages the CESM makefile. When the active IAC model is specified, GCAM is built via GCAM's build scripts that have been modified slightly to support coupling while retaining support for GCAM's implementation in C++. Currently, coupling between the IAC and CLM components is done via data files to leverage current CLM capabilities and to minimize changes to CLM. The IAC reads data from CLM history files at the start of a time step and writes data to a time-varying surface data set at the end of the IAC time step. Both sets of data evolve in time as the coupled system advances.

IAC design
The IAC component consists of five subcomponents, including the models GCAM and GLM and the interfaces IAC2GCAM, GCAM2GLM, and GLM2IAC between these models and the rest of the IAC component (Fig. 3). The sequence in which these subcomponents are invoked starts with IAC2GCAM, proceeds through GCAM, GCAM2GLM, and GLM, and concludes with GLM2IAC. Each sub-model is called in turn, processing CLM carbon information at the beginning of the sequence and eventually producing an updated land state that will be read by CESM throughout the model year (Fig. 4). The computational load of the IAC is dominated by GCAM and GLM, with the remaining subcomponents handling the processing needed to connect those models to each other and the rest of CESM. The IAC component includes the capability to read and write data between each step, thereby facilitating validation of each piece of code against corresponding offline versions and enabling detailed debugging for any differences revealed by the validation process. This validation and diagnostic capability has been implemented using NetCDF files to ensure the data exchanges are both self-descriptive and machine-independent.
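The calling sequence described above can be sketched as a simple pipeline. This is an illustration only: the five functions are hypothetical placeholders that merely tag a dictionary, the state keys are invented for the example, and the optional `dump` list stands in for the intermediate NetCDF files that the IAC can write between steps.

```python
def iac2gcam(state):
    """Placeholder: turn CLM carbon information into scalars for GCAM."""
    return {**state, "scalars": "carbon scalars derived from CLM"}


def gcam(state):
    """Placeholder: project regional land use."""
    return {**state, "regional_land_use": "areas per GCAM land unit"}


def gcam2glm(state):
    """Placeholder: harmonize and regrid GCAM output for GLM."""
    return {**state, "gridded_land_use": "annual half-degree fields"}


def glm(state):
    """Placeholder: compute land-use states and wood harvest."""
    return {**state, "glm_fields": "land-use and wood-harvest fields"}


def glm2iac(state):
    """Placeholder: translate GLM output to CLM surface data."""
    return {**state, "clm_surface_data": "time-varying land cover"}


# Order of invocation as described in the text.
IAC_SEQUENCE = [
    ("IAC2GCAM", iac2gcam),
    ("GCAM", gcam),
    ("GCAM2GLM", gcam2glm),
    ("GLM", glm),
    ("GLM2IAC", glm2iac),
]


def run_iac(state, dump=None):
    """Call each subcomponent in turn; optionally record the state after each step."""
    for name, component in IAC_SEQUENCE:
        state = component(state)
        if dump is not None:
            dump.append((name, dict(state)))  # copy, so later steps cannot alter it
    return state


snapshots = []
final_state = run_iac({"clm_carbon": "NPP and HR from CLM history"}, snapshots)
print([name for name, _ in snapshots])
```

Recording a snapshot after every step is what makes it possible to compare each subcomponent against its offline counterpart in isolation.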

IAC2GCAM
The IAC2GCAM interface translates and remaps gridded information from CLM on its terrestrial carbon state into regional scaling factors (scalars) for crop yields and ecosystem carbon densities used by the agriculture and land-use module internal to GCAM (Bond-Lamberty et al., 2014). The scalars represent our initial attempt to reconcile the separate carbon inventories either explicitly computed by or implicitly embedded via boundary data in the CLM, GCAM, GLM and interface routines. In this initial version of iESM, the input to IAC2GCAM is read from CLM history files and includes the fields listed in Table 1. The output consists of scalar fields for 27 crop and land-cover fields on each of GCAM's 151 land units. The remapping between the CLM grid and GCAM regions is accomplished by translating CLM carbon, defined in broad terms of vegetative functional types, to the 27 specific GCAM crop/land types that lie at the heart of its economic, energy and land-use parameterizations. In addition to mapping between different land representations, IAC2GCAM also handles the temporal interpolation and spatial aggregation that is needed to represent CLM's gridded data in terms of the annually averaged regional values that GCAM requires. The spatial regridding process is aided by an external data file that specifies the areal overlap of CLM grid points with the GCAM land units. The mapping of CLM carbon to scalars applied to GCAM above- and below-ground carbon is accomplished by averaging over the GCAM time step and then post-processing to remove outliers (Bond-Lamberty et al., 2014).
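The spatial aggregation step can be illustrated with a minimal sketch. It assumes that regional values are area-weighted means of gridded values, using an overlap table like the external file mentioned above, and that a scalar is the ratio of the current regional value to a baseline value. Both assumptions are made for illustration and are not taken from the iESM code; the actual procedure, including outlier removal, is described by Bond-Lamberty et al. (2014). The function names are hypothetical.

```python
import math


def regional_means(cell_values, overlap_area):
    """Area-weighted mean of gridded values for each region.

    cell_values[i]     : value in grid cell i
    overlap_area[r][i] : area of grid cell i lying inside region r
    """
    means = []
    for weights in overlap_area:
        total = sum(weights)
        if total > 0.0:
            weighted = sum(w * v for w, v in zip(weights, cell_values))
            means.append(weighted / total)
        else:
            means.append(math.nan)  # region has no overlapping cells
    return means


def carbon_scalars(current, baseline):
    """Assumed definition: scalar = current regional value / baseline regional value."""
    return [c / b if b != 0.0 else 1.0 for c, b in zip(current, baseline)]


# Two grid cells, two regions: region 0 covers both cells equally,
# region 1 lies entirely in the second cell.
values = [1.0, 3.0]
overlap = [[1.0, 1.0], [0.0, 2.0]]
means = regional_means(values, overlap)
print(means, carbon_scalars(means, [1.0, 2.0]))
```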

GCAM
The GCAM model produces worldwide land-use projections incorporating information about demographics, economics, resources, energy production, and consumption (Sect. 3.3.2). Integration into iESM requires modifications to GCAM, including the addition of lightweight interface routines to CESM and the provision to share data in its XML database with these interface routines. In the current version of iESM, the input into GCAM consists of 27 crop and land cover scalars. The output from GCAM to the rest of the IAC component comprises the land surface areas for crop, pasture, forest, and the amount of harvested wood carbon for each of GCAM's 151 land units.

GCAM2GLM
The GCAM2GLM interface serves to allocate GCAM output from 151 land units to the 0.5° GLM grid. In the process, it also harmonizes the GCAM output to provide a smooth transition from historical land-use data to future projections. The harmonization and regridding algorithms are based upon GLM historical simulations and the 2005 HYDE 3.0 historical land use data set (Klein Goldewijk et al., 2011). The inputs into the interface are the projections of crop, pasture, and forest area, as well as the amount of harvested wood carbon for 151 GCAM land units at the 5-year GCAM time step. The outputs from the interface are the areal extents of cropland, pasture, and forest at annual time steps on the GLM half-degree grid, together with a pre-processed version of the wood harvest data readied for spatial allocation within GLM. The GCAM2GLM processing is contingent on the climate-change scenario under consideration and has embedded priorities for how the fractional areas of crop, pasture, and forested land are allocated. For example, these priorities could dictate that agricultural expansion happens preferentially on forested lands. A mechanism for recording and readily altering these embedded allocation priorities should be included in future versions of iESM.
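The change in temporal resolution, from values at the 5-year GCAM time step to annual values, can be illustrated as follows. The sketch assumes simple linear interpolation between two GCAM time steps for a single land unit; the interpolation actually used in GCAM2GLM is not specified here, and the function name `annualize` is hypothetical. Harmonization and spatial allocation are not represented.

```python
def annualize(start_year, start_value, end_year, end_value):
    """Linearly interpolate a quantity to annual values between two GCAM time steps.

    Returns a dict mapping each year in [start_year, end_year] to its value.
    """
    span = end_year - start_year
    if span <= 0:
        raise ValueError("end_year must be later than start_year")
    return {
        start_year + k: start_value + (end_value - start_value) * k / span
        for k in range(span + 1)
    }


# Cropland area (arbitrary units) for one land unit at two GCAM time steps.
cropland = annualize(2005, 10.0, 2010, 20.0)
for year in sorted(cropland):
    print(year, cropland[year])
```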

GLM
In terms of its interactions with the rest of the current IAC components, the GLM model converts the annualized fractional land-use states output by GCAM2GLM into gridded data sets suitable for input into CLM, while also computing the spatial pattern of wood harvest area and the area of natural vegetation occupied by both primary and secondary vegetation. GLM converts the GCAM2GLM output data into a variety of fields on its native half-degree grid, nine of which are currently utilized in iESM including five wood-harvest categories (Table 2). GLM also calculates gross land-use/cover transitions within each year, but these are not used by CESM.
Integration into CESM has required extensive modifications to GLM, including the redesign of data structures to reduce memory requirements and to accommodate control by CESM of its temporal evolution. Other modifications include the addition of restart functionality, the introduction of a control interface, the conversion of all boundary data into NetCDF, and the provision for routing all input and output through the calling interface.

GLM2IAC
The GLM2IAC interface is tasked with converting the harmonized outputs of GLM to time-varying data sets for land cover and wood-harvest area in CLM's native input format. The translation of GLM state and harvest variables to CLM land cover is based on code (Lawrence et al., 2012) to process the CMIP5 RCPs (Moss et al., 2010; van Vuuren et al., 2011), as well as on the external tool called mksurfdat (a contraction of "make surface data") used to generate CLM boundary data for the standard CESM. Both codes were inlined into the IAC component and are run interactively. The original land translation code has been extensively modified to better capture the afforestation signal generated in GCAM and has been renamed the Land Use Translator (Di Vittorio et al., 2014).

CLM modifications
Although CLM and the rest of CESM require minimal modifications to incorporate the IAC component, CLM was modified to permit updates to its time-varying input surface data sets after its initialization phase. This modification required changes so that CLM rereads the time axis of the dynamic surface data set during its execution phase.

Time stepping
The IAC component advances in one-year time steps and is called at the start of each calendar year. During this call, every subcomponent in the IAC component is executed in order to prognose the time-varying CLM land surface data sets starting from the current CESM time step and ending one year into the future. To accomplish this, the IAC calculates the land surface for the time step one ahead, then CLM interpolates between the current and future land surface at its native 30 min time step. In between the yearly IAC time steps, the IAC component is called monthly from CLM to create an annual average of CLM NPP and HR values. The GCAM subcomponent can be integrated using either one or three sequential 5-year time steps. The default is to use a 5-year time step and interpolate the yearly data needed for the rest of the IAC subcomponents. Prior to each GCAM call, the IAC computes the carbon scalars that constitute the feedback between CESM and GCAM.
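The interplay of monthly and yearly calls can be sketched as an accumulator that is filled each month and emptied at each year boundary. This is a schematic only: the function name `annual_means` is hypothetical, the monthly values are treated as equally weighted (an assumption; months of different length might be weighted differently in practice), and the full IAC call is reduced to recording the annual mean that would be handed to it.

```python
def annual_means(monthly_values):
    """Accumulate monthly values and emit one mean per completed 12-month year.

    Each emitted mean represents the quantity that would be available to the
    yearly IAC call at the start of the following calendar year.
    """
    means = []
    total, count = 0.0, 0
    for value in monthly_values:
        total += value  # monthly call: accumulate only
        count += 1
        if count == 12:  # year boundary: hand the mean to the yearly step
            means.append(total / count)
            total, count = 0.0, 0
    return means


# Two model years of monthly NPP (arbitrary units).
npp = [1.0] * 12 + [2.0] * 12
print(annual_means(npp))
```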

Technical issues
Several technical requirements and protocols specific to large climate codes and CESM had to be introduced with the IAC component. The IAC component is bit-for-bit reproducible when rerun, and it restarts exactly from check-point files generated by previous runs. The IAC component is included in the CESM code repository and is tagged regularly in order to track code versions. A specific numerical experiment using the IAC in CESM can be described by specifying the model tag, the compset (which determines the model components), the grid, and a set of plain-text files that specify the features and input settings for the CESM components. The CESM configuration scripts have been augmented for iESM. To facilitate running the IAC with different CLM grids, many of the IAC settings are specified via namelist or read from files specified at run time. All the output data is written in NetCDF to ensure portability across computing platforms and to exploit the self-documenting features provided by this format. All variables are given explicit types, real variables are assigned to a type of double precision wherever possible, and the Fortran code complies with the CESM coding standard and is written in Fortran 90. Because GLM and GCAM are written in C and C++, Fortran/C interfaces have been implemented in several parts of the IAC component.

Validation of the coupling among GCAM, GLM, and CESM
One of the core requirements of the iESM design is to reproduce simulations conducted with the offline-coupled version of the same codes. Satisfaction of this requirement implies that the online-coupled simulations with iESM would be statistically indistinguishable from the offline-coupled simulations. Since the offline-coupled experiments have been configured to emulate the large number of simulations conducted using the same suite of codes for the CMIP5, successful reproduction of the offline-coupled runs would mean that the iESM user community could employ the large literature analyzing the CMIP5 runs to understand the baseline (or control) climatology and climate dynamics of iESM. Since iESM includes a variety of bug fixes and enhancements relative to the offline-coupled model configurations, the emulation will be only approximate.
The tests to verify the degree to which iESM reproduces the offline-coupled model have been conducted in three stages. First, with the exception of GLM, each component in iESM has been checked separately to show that, given the same input, the output of that component matches that of the corresponding component in the offline-coupled system to within the limits of machine precision (Sect. 6.1). In the case of GLM, there was extensive refactoring of the code as well as conversion of many boundary datasets to NetCDF that resulted in differences that were greater than roundoff. Second, the development team has compared key climate properties from the iESM and offline-coupled system and has shown that differences between the two simulations of these properties are statistically indistinguishable from internal variability (Sect. 6.2). Third, the iESM team has validated the land-use, land-cover change, and CESM climate generated from the newer coupled iESM experiment using the CESM standard model diagnostics as well as specially constructed and quite comprehensive diagnostics for each of the components. Application of these diagnostics is covered in the papers describing the various iESM experiments and will not be repeated here (e.g., Jones et al., 2012, 2013).

Verification of the interfaces among components
These tests consist of comparing offline runs of each sub-component of the offline-coupled implementation and online runs of iESM using the same forcing. To facilitate these tests, the iESM designers have allowed each of the components (GCAM, GLM, LUT, GCAM2GLM, etc.) to continue writing the state and diagnostic files that were output in the original offline models. Additionally, the data flowing between each of the component models was captured and written out in double-precision NetCDF format. The ease of tracking the data flowing between each component as well as the ability of the component developers to continue using trusted analysis tools have allowed the iESM team to verify that the results produced by the offline and online versions of each sub-component are, in general, identical to within the machine roundoff precision of the underlying calculations.
Once the individual pieces were validated, the team forced the IAC with prescribed CLM history output and compared the offline-coupled runs to the online runs with identical forcing. These simulations were designed to test that the feedbacks from the ESM to IA subsystems of the iESM are as identical as possible between the offline-coupled and online versions. Both the offline-coupled and online IAC systems were subjected to the same external forcing from CLM, and the resulting dynamic surface datasets from both IAC versions were then compared. The team was able to verify that the results were identical to single-precision roundoff.
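A comparison of this kind can be sketched as an element-wise check against a relative tolerance tied to single-precision machine epsilon. The example is a sketch under assumptions, not the test used by the iESM team: the function names are hypothetical, the tolerance of a few units of single-precision epsilon is chosen for illustration, and plain Python lists stand in for the NetCDF fields.

```python
import struct

FLOAT32_EPS = 2.0 ** -23  # machine epsilon for IEEE single precision


def to_float32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def agree_to_single_precision(a, b, ulps=4):
    """True if every pair of values agrees to within `ulps` units of float32 epsilon.

    The tolerance is relative to the larger magnitude of each pair; `ulps=4`
    is an illustrative choice.
    """
    if len(a) != len(b):
        return False
    return all(
        abs(x - y) <= ulps * FLOAT32_EPS * max(abs(x), abs(y))
        for x, y in zip(a, b)
    )


# "Offline" values in double precision, "online" values after a float32 round trip.
offline = [0.1, 123.456, 1.0e-3]
online = [to_float32(v) for v in offline]
print(agree_to_single_precision(offline, online))
print(agree_to_single_precision(offline, [v * 1.001 for v in offline]))
```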
Finally, this test has been repeated with consistent and uniform time synchronization between CLM and IAC. Since the original test (described above) was forced with prescribed data, it did not ensure that the temporal interactions between CLM and IAC were correctly reproduced in the online version relative to the implementation of the same interactions in the offline-coupled version. The team enhanced iESM to guarantee the same temporal interaction between CLM and the IAC in the two versions and also provided an alternative, reduced-length GCAM time step of 5 years' duration. The iESM also passed this more realistic test of its normal mode of operation, one in which there is cyclic two-way interaction between CLM and IAC coordinated by the master timing mechanism of the whole online model system.

Comparison of climate states from uncoupled and coupled versions of iESM
In order to test whether simulations from the offline-coupled and online iESM are statistically indistinguishable, we conducted a pair of integrations with these two versions of iESM based upon the RCP4.5 scenario. In these simulations, the copies of GCAM in both the offline-coupled and online versions are subjected to the same exogenous drivers and policy specifications that were used to create the original RCP4.5 scenario used in CMIP5. The two runs produce nearly identical future trajectories for global mean surface air temperature. To formally evaluate this, we projected the time- and space-varying surface air temperature trajectories from these two simulations onto the spatial warming fingerprint (Santer et al., 2004) derived from the CCSM4 RCP4.5 CMIP5 ensemble mean, yielding a time series of projection coefficients for each simulation. We performed the same projection for each of 6 CCSM4 RCP4.5 ensemble members in order to quantify model internal variability with respect to this metric. Variation between the offline-coupled and online-coupled simulations, either in terms of the spatial pattern of warming or overall warming trend, would cause these two trajectories to diverge. However, only 5.1 % of the coefficients differed by more than the 95 % confidence interval for unforced variability across the 6-member ensemble of CCSM4 RCP4.5 simulations (Fig. 5). The unforced variability for the ensemble is generated by small perturbations to the date used to extract initial conditions from the end of a historical simulation terminating at the present day, and the resulting variability is manifested by different synoptic-scale weather but identical global climate across the ensemble. This test demonstrates that global-mean differences between the simulations from the offline-coupled and online versions of iESM are statistically indistinguishable from weather-related noise.
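The projection test can be sketched in a few lines. The sketch assumes that the projection coefficient for one year is the area-weighted least-squares projection of the temperature field onto the fingerprint, and it takes the half-width of the confidence interval as a given number rather than deriving it from an ensemble. The function names, the toy fields, and the threshold are hypothetical; the definitions used for Fig. 5 follow Santer et al. (2004).

```python
def projection_coefficient(field, fingerprint, weights):
    """Area-weighted projection of one year's field onto a fixed spatial fingerprint.

    Assumed form: c = sum(w * T * F) / sum(w * F * F).
    """
    numerator = sum(w * t * f for w, t, f in zip(weights, field, fingerprint))
    denominator = sum(w * f * f for w, f in zip(weights, fingerprint))
    return numerator / denominator


def fraction_outside(coeffs_a, coeffs_b, half_width):
    """Fraction of years in which the two coefficient series differ by more than half_width."""
    exceed = sum(1 for a, b in zip(coeffs_a, coeffs_b) if abs(a - b) > half_width)
    return exceed / len(coeffs_a)


# Toy two-cell "globe": the field is exactly twice the fingerprint.
coefficient = projection_coefficient([2.0, 4.0], [1.0, 2.0], [1.0, 1.0])

# Toy coefficient series for two simulations and an illustrative threshold.
online = [1.0, 2.0, 3.0, 4.0]
offline = [1.0, 2.5, 3.0, 4.0]
print(coefficient, fraction_outside(online, offline, 0.25))
```

If the two simulations differ only by internal variability, the fraction returned by `fraction_outside` should be close to the nominal 5 % expected for a 95 % interval, which is the comparison made in the text.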

Conclusions
Several extensions to the iESM are already under development. First, capabilities have been developed for energy-sector components of the model to respond to climate change. These capabilities include developing the building sector so that demands for energy for heating and cooling are sensitive to temperature change (Zhou et al., 2013); developing thermoelectric plant sensitivity to ambient air temperature impacts on plant efficiency and water temperature impacts on plant operation; and developing model structure so that changes in climate (e.g., wind speed, solar irradiance) influence the supply curves of renewable energy sources (Zhou et al., 2012; Zhou and Smith, 2013). The development of these capabilities would be necessary for the eventual integration of climate information from CESM into the energy-sector operation of GCAM. Another area of development in iESM is the inclusion of supplies and demands for water, water management, and interactions of water resources with agriculture, the energy market, the hydrological cycle, and the rest of the climate system. New versions of GCAM fully track the water demands of energy and agriculture and incorporate a water-supply module that is sensitive to climate impacts (Hejazi et al., 2013, 2014). This major effort has positioned iESM to integrate water management and routing in subsequent phases of model development. Finally, the development of a new capability in GCAM to perform historical hindcast simulations (Chaturvedi et al., 2013) will enable the iESM model to be evaluated in terms of its ability to represent key land, water, and energy management decisions in response to historical driving conditions, as well as the climate implications of those decisions.
The first version of the iESM, however, already provides a significant new capability to the climate community. iESM represents the first coupled treatment of the human/climate system based on an IAM and ESM that both contributed to the most recent IPCC and US National Assessments and that support international communities of developers and investigators in integrated assessment and climate science. While iESM is designed to exploit the full capabilities of its parent models, it can be readily simplified and expanded due to its flexible and extensible architecture. The simplifications include inclusion or exclusion of human components, as well as potentially drastic reductions in the complexity and computational burden of the Earth system components by use of CESM's data modes. This capacity for faster execution helps ensure that iESM can be used to explore a large range of future scenarios of climate adaptation and mitigation in a manner that is both thorough and economical. The possible expansions include inclusion of other IAMs that conform to the RCP handshake protocol, incorporation of additional forcing agents from the human system that can alter the climate system, and extension to simulate the supply and demand of other major resources, e.g., water, that interact strongly with natural and societal processes. This capacity for extensibility helps ensure that the iESM can and will continue to evolve with the state of integrated assessment and climate science.

gridcell fraction that had wood harvested from primary forested land
gfvh2  gridcell fraction that had wood harvested from primary non-forested land
gfsh1  gridcell fraction that had wood harvested from mature secondary forested land
gfsh2  gridcell fraction that had wood harvested from young secondary forested land
gfsh3  gridcell fraction that had wood harvested from secondary non-forested land

assumption to hold are orders of magnitude larger than the corresponding scales used to solve the equations of motion for physical, chemical, and biogeochemical processes in the Earth system.
4.6 A simplified and robust run-time environment for the GLM component

The software development team acquired the GCAM and GLM model codes and incorporated them into the land node of CESM through a new component, the Integrated Assessment component (IAC).
The CESM configuration scripts for iESM include new compsets and new XML environment variables that specify items like the IAC mode (stub, data, active). The scripts have been further enhanced to incorporate several new libraries required by GCAM to support the open-source Berkeley DB XML (Oracle) database package with XQuery Access. These libraries include Berkeley DB XML, Berkeley DB, XQilla, and Xerces C++, which must be installed before the active IAC component can be run within CESM.

Figure 5. Projection coefficients for the online-coupled and offline-coupled model implementations for a pair of equivalent scenarios based on RCP4.5. The coefficients are derived by projecting the spatial pattern of annual mean surface air temperature onto the "fingerprint" of the surface air warming trend derived from the RCP8.5 ensemble mean. The fingerprint is taken to be the first empirical orthogonal function of the 96-year time series of RCP8.5 annual mean surface temperatures, scaled so that its mean value is 1 °C. Circles indicate values that differ between the online-coupled and offline-coupled simulations by more than the 95 % confidence interval of this same metric calculated for the RCP4.5 6-member ensemble.
cepted in a two-layer canopy, with vegetation, soil, snow aging, and black carbon impacts on albedo. Subsurface processes include vertically resolved biogeochemistry, options for carbon and nutrient cycle parameterization, and recently improved treatment of wetlands and permafrost dynamics. CISM is based upon the Glimmer model, an open source (GPL) three-dimensional thermomechanical ice sheet model designed to be interfaced to a range of global climate models.

Table 2. Fields output by GLM.