ADISM v . 1 . 0 : an adjoint of a thermomechanical icesheet model obtained using an algorithmic di ff erentiation tool

Introduction Conclusions References Tables Figures


Introduction
In recent decades, adjoint models have found broad application in the geosciences, perhaps most notably as components of atmospheric data assimilation systems (Lahoz et al., 2010).In atmospheric modeling, it is crucial to obtain an estimation of the Figures initial state of the atmosphere that is consistent with the model physics; otherwise the model solution will be dominated by fast, transient modes as the model atmosphere adjusts toward balance.In weather forecasting and climate reanalysis applications, the assimilation of data in a given time window is necessary.Both problems can be addressed by using variational assimilation methods, which have adjoint models at their core (Courtier et al., 1993).
Researchers have expressed growing interest in inverse methods in glaciology over the past 20 yr, and consequently adjoint models have been studied in relation to a number of glaciological problems.Perhaps the first instance of a glaciological adjoint model in the literature is due to MacAyeal (1992), who obtained the adjoint of a depth-integrated ice-sheet model analytically and used it to invert basal traction parameters for an Antarctic ice stream.More recently, analytical adjoints have been presented by, among others, Brinkerhoff et al. (2011) and Goldberg and Sergienko (2011): the former use an adjoint to assess the sensitivity of basal conditions under a Greenland outlet glacier to basal traction and geothermal heat flux, while the latter examine nonlinearities in the context of inverse methods.
In addition to their use in data assimilation, adjoint models may be used by themselves to calculate the sensitivity of models to a specified set of parameters.In glaciology, the future of the Greenland and Antarctic ice sheets is of particular interest; the Greenland ice sheet (GrIS), for example, would cause sea level to rise by approximately 7 m if it were to melt completely (Cazenave, 2006).In trying to understand the likely mass loss from the GrIS over the coming centuries, it would be helpful to understand the sensitivity of the ice sheet to various spatially varying forcing fields.Ordinarily, model sensitivities to such spatially varying parameters are obtained by perturbing the parameter at each model grid point in turn and calculating the sensitivity of the desired cost function (e.g., ice volume).
Adjoint models, when used for these sensitivity calculations, are much more efficient because they can calculate a sensitivity "map" in a single step.Other than by analytical methods, adjoint models may be obtained from the underlying implementation as Figures

Back Close
Full a numerical program through a technique called algorithmic differentiation (AD).Heimbach and Bugnion (2009) used the AD tool TAF (Transformation of Algorithms in Fortran: Giering and Kaminski, 2003) to obtain an adjoint of the SICOPOLIS icesheet model (Greve, 1997(Greve, , 2000)).The adjoint was applied to the GrIS and used to calculate the sensitivity of the ice-sheet volume to basal sliding and basal melt rate.
In this paper, we present the adjoint of a thermomechanical ice-sheet model obtained using the OpenAD source-to-source transformation tool (Utke et al., 2006).
OpenAD is developed at the University of Chicago and Argonne National Laboratory and is made available under an open source license.It can be downloaded from http://www.mcs.anl.gov/OpenAD/.Our motivation for this work is to study the sensitivity of the GrIS to changes in its boundary conditions, and we present some examples of results of output from this application.
The use of OpenAD in the present work has an important advantage compared with the previous study by Heimbach and Bugnion (2009).Their use of the commercial TAF AD tool meant that the adjoint model code could not be released to the community at large.In contrast, because OpenAD is freely available, we are able to provide the full source code of the forward model and its adjoint as a supplement to this paper.Our aim is to encourage glaciologists to use source-to-source translation by providing an accessible example that can be downloaded and compiled by any interested reader.
The paper is structured as follows.Section 2 presents the principles of AD.The forward thermomechanical model is described in Sect.3. The adjoint of this model obtained with the OpenAD tool is then described in Sect. 4. Section 5 describes the verification of the forward model; its adjoint is tested against some selected numerical approximations in Sect.Thus, the cost function depends indirectly on the control variables.The derivatives ∂C/∂u form the Jacobian of C. In practice, the cost function might be the volume of the ice sheet at a particular time or some other physical property of the system, and u might be the discrete values of a spatially varying parameter such as basal sliding parameter.
Sensitivities can be numerically approximated by perturbing each control variable u i in turn and calculating the corresponding elements of the Jacobian using, for example, the finite-difference method: The result becomes exact as ∆u i approaches zero, but the effects of truncation error limit the precision with which sensitivities can be obtained by this method in finite precision arithmetic.Furthermore, the model must be run for each perturbation, thus making the approach computationally expensive (proportional to the number of control variables).
For nonlinear models the approximation error can be substantial given that one generally does not know the perturbation that minimizes the error in the trade-off between approximation and truncation errors.Obtaining derivatives that are exact would clearly be advantageous.This is what the adjoint model allows, because it is derived by differentiating the forward-model equations to obtain expressions for the Jacobian.However, for the large and complex numerical models often used in glaciology, manually deriving the adjoint would be time-consuming and prone to error.Introduction

Conclusions References
Tables Figures

Back Close
Full Algorithmic differentiation is a technique that can be used to obtain sensitivities when the cost function C and the model it relates to are specified as a computer program.AD tools take as input the source code for the forward model and add additional data (to hold the derivative values) and logic (to compute the derivative values) to the original program.AD may be implemented by source-to-source transformation or by operator overloading (see, e.g.Griewank et al., 1996).In this paper we apply the source transformation approach implemented in the OpenAD tool.

Theoretical basis of algorithmic differentiation
We do not describe the mathematical framework of adjoint models in depth here; our aim is to give enough background to support descriptions of specific technical aspects given later.A detailed discussion of AD is given, for example, by Griewank and Walther (2008).For an introduction in the context of the geosciences, the reader may refer to Wunsch and Heimbach (2007) and Heimbach and Bugnion (2009).
We assume that the forward model takes the form (Naumann and Utke, 2004) where n and m are the spaces of input and output vectors x and y, respectively, and n and m are the number of elements in each vector.Scalar, first-order AD allows for computing projections of the Jacobian J at a given point in the domain.In forward mode the projection is ẏ = J ẋ for a given direction ẋ, and in reverse or adjoint mode the projection is x = J T ȳ for a given weight vector ȳ (cf. Lagrange multipliers).Note the connection to the usual adjoint operator definition with an inner product ., .when written as ȳ, J ẋ = J T ȳ, ẋ .
In practical terms AD assumes F is implemented as a program whose execution implies the execution of a sequence of elemental operations φ such as intrinsics (sin, cos, etc.) and arithmetic operators.
and similarly in reverse mode the propagation of adjoints implies . . .
The adjoints of the arguments u, v of φ are computed by using the adjoint of the return value r .Note that forward mode is also referred to as the tangent linear model.From a modeling perspective, the cost function would not be considered part of the forward model; but for the purpose of computing sensitivities with AD, we may view them as one mathematical mapping with a scalar valued output.Therefore we have F with the dimension of the output space m = 1, and we want to compute the gradient.Doing so in forward mode implies that one Jacobian projection yields one gradient element at a time and gives all gradient elements in one Jacobian projection.In other words, the forward-mode gradient has a computational cost factor of O(n) over the original model computation, just like the finite differences in Eq. ( 1), but still yields derivatives with machine precision.In comparison, the computational cost factor for the reverse-mode gradient does not depend on n at all and therefore is preferable.The main caveat is the order in which the partials of φ are referenced.From Eq. ( 3) it is clear that one can simply compute them when φ is executed in the forward model.In contrast, when considering the computation order implied by Eq. ( 4), it becomes apparent that (as the name "reverse mode" suggests) the partials are referenced in the order opposite that of the computation of the φ.Therefore one will have to devise a means to recover these values, for example by storing or recomputing them.The method used to achieve this is discussed in the next section.Introduction

Conclusions References
Tables Figures

Back Close
Full

Checkpointing
In theory one can store all the partial derivatives, or the arguments u, v , . . . to nonlinear φ needed to compute them, during the storing forward pass in preparation for the reverse pass.In practice the memory requirements for this store-all approach are frequently prohibitive.The other extreme, the recomputation from the initial state or recompute-all approach, implies an equally prohibitive computational overhead.
A trade-off between storage and recomputation is possible by checkpointing (e.g., Griewank, 1992;Grimm et al., 1996).In the context of a time-stepping scheme with t iterations as applicable in the ice-sheet model, we can rewrite Eq. ( 2) as the sequence of t successive applications of F to the the model state x starting with an initial state x 0 : For simplicity we ignore the cost function C and some initial setup.We may have enough main memory to store the values needed for partials for a sequence of k t time steps at a time.That means we have s = t k (+1) subsequences of length k with a possible tail sequence (the "+1") if there is a nonzero remainder t modk.Assuming the initial state x 0 is available anyway, one can in a simple scheme write to disk s − 2 checkpoints at iterations ik , i = 1, . .., s − 2.Then, as one reaches iteration k (s − 1) one starts the storing forward pass for the last subsequence, immediately followed by the reverse pass.Then one restores the checkpoint written at iteration k(s − 2), followed by the storing forward pass from k (s − 2) to k(s − 1), immediately followed by the reverse pass for these time steps, and so on going backwards.In this way one would trade a recomputation effort of roughly one additional forward model execution for (s − 2) checkpoints on disk and main memory to store values for partials for k instead of t time steps at a time.For large t the resulting s may require too much checkpointing disk space.In these cases one can use a hierarchical scheme in which one first writes fewer checkpoints in distances k k and then runs forward from the sparse checkpoints to write denser checkpoints until one reaches distance k .

Checkpointing with Revolve
Given a number s of subsequences and the number c of checkpoints that may be held (on disk) at any one time during the execution of the model and its adjoint, there is a binomial scheme that minimizes the recomputation overhead which is called the Revolve algorithm (Griewank and Walter, 2000).
The Revolve algorithm has been reimplemented in Fortran for use with the OpenAD tool.The recomputation overhead can be pre-computed as function of s and c.For the computation in this paper we allowed s ≤ 20 000 checkpoints but as the checkpoint size depends on the size of the state vector x, it, as well as k, has to be adapted to the computing environment and the dimension of the concrete problem being tackled.
Note that the computational overhead is an estimate that does not include the time for writing and reading the checkpoints under the assumption that those times are dominated by the time to compute the k time steps of the respective subsequence.Using the Revolve implementation included with OpenAD requires the time-stepping loop in the original code to be changed to a loop in which the action in the loop body is governed by the Revolve algorithm.

The OpenAD tool
OpenAD is a source-to-source translation tool, which is made available under an open source license (Naumann and Utke, 2013).The tool is under continued development and maintenance at the University of Chicago and Argonne National Laboratory.
The code analysis built into the OpenAD tool provides, among other information, the sets of program variables that are referenced or modified within a given subroutine and its callees.This information is easily used to generate code that writes and reads checkpoints.Note that these checkpoints are typically significantly smaller than the process snapshots taken as "system-level" checkpoints and thus are more akin to "application-level" checkpoints without the need to manually create them.To use these automated checkpoints, one wraps the time-stepping loop body into a separate Introduction

Conclusions References
Tables Figures

Back Close
Full subroutine.Similarly, if the time-stepping loop itself is also wrapped into a separate subroutine, then the OpenAD template mechanism can be used to replace the original time stepping loop with a loop that references the Revolve interfaces.These subroutine encapsulations have been done as permanent changes in the ADISM model code.

Forward ice-sheet model
This section describes the forward ice-sheet model (ISM).The model is a three-dimensional, finite-difference model, based on the established shallow ice approximation (SIA: Mahaffy, 1976;Jenssen, 1977).The two main components of the ISM solve, respectively, the evolution of ice thickness and temperature.The thickness calculation code was written from scratch but based on an existing model coded in Matlab (courtesy of T. Payne, personal communication, 2008), while the thermodynamic code is essentially the same as that contained in the Glimmer community ISM (Rutt et al., 2009), although minor changes were made for technical reasons.All the model code is written in Fortran 90.

Ice thickness calculation
The ice thickness equation is obtained by integrating vertically the continuity equation for an incompressible material (Van der Veen, 2002): where q = ūH is the horizontal mass flux, M is the surface mass balance rate, H the ice thickness, ū the depth-mean horizontal velocity, and ∇ is the horizontal gradient operator.Note that in the remainder of the paper, the symbols n, s, u and x have different meanings from the earlier section covering the adjoint method.
The SIA assumes that surface and bedrock gradients are small, so the only important stress components are horizontal shear stresses.We can approximate the horizontal 5260 Introduction

Conclusions References
Tables Figures

Back Close
Full shear stresses in x and y at level z as (Rutt et al., 2009) where ρ is the ice density, g is the acceleration due to gravity, and s is the surface elevation.The horizontal shear strain rate ˙ xz is and similarly for ˙ yz .In practice, we assume that ∂w/∂x and ∂w/∂y are small compared with the vertical gradients of horizontal velocity, and we neglect them.These strain rates are related to stresses by the flow law.We use the standard Glen-Nye flow law (Glen, 1952;Nye, 1953), such that for horizontal shear stresses where A is a temperature-dependent coefficient, n is a constant (usually taken to be 3), and the effective shear stress τ * is given by the second invariant of the stress tensor for the SIA The flow parameter A depends on temperature according to the Arrhenius relationship where R is the gas constant, Q is the activation energy, a is a temperature-independent constant of proportionality, and f is a flow enhancement factor that can be used Introduction

Conclusions References
Tables Figures

Back Close
Full to compensate for impurities and the development of anisotropic ice fabrics.The temperature T * is the absolute temperature T corrected relative to the pressure melting point (Rutt et al., 2009): where C t is the dependence of the melting point on pressure.
Combining the SIA stress balance Eq. ( 7) and the expression for strain rate Eq. ( 8) gives the vertical gradient of velocity: Integrating Eq. ( 13) with respect to z gives the vertical profile of u: where u(h) is the basal velocity and z is a dummy vertical coordinate.A simple parameterization of basal sliding is included in the forward model, such that the basal velocity is proportional to the gravitational driving stress at the bed: where B is the sliding parameter.The value of B can be made to depend on the presence of melt at the bed.Integrating Eq. ( 14) again gives the averaged ice velocity ū (Rutt et al., 2009): It is convenient to express the continuity Eq. ( 6) in terms of diffusivity D, with diffusivity given by

Numerical solution of thickness equation
The model is discretized on a rectangular domain whose nodes have a regular spacing in x and y, such that and similarly for y i with a node spacing of ∆y.
The continuity equation written in terms of diffusivity Eq. ( 17) is solved to step the model forward in time.The expression for diffusivity Eq. ( 18) is discretized as follows: where A * i,j represents the effective value of the flow law coefficient A for the ice column and is given by the vertical double integral in Eq. ( 18): where k is the vertical coordinate and the integration is performed by using the trapezium rule.5263 Introduction

Conclusions References
Tables Figures

Back Close
Full The x and y components of the flux (denoted q x and q y ) are defined halfway between the main grid points, such that The continuity Eq. ( 17) is discretized by using an implicit time step: where ∆t is the time step and t and t + 1 indicate the time steps.In the case of fluxes at time t + 1, the diffusivity is approximated as being equal to that at the current time step (D with Equation ( 24) can be thought of as a matrix equation of the form where A is a sparse matrix composed of the elements α i,j , β i,j , γ i,j , κ i,j and diagonal elements φ i,j , and b is the right-hand side of Eq. ( 24).Equation ( 26) is solved by using the UMFPACK solver (Davis, 2011).This solver computes the LU factors in the factorization stage, and Eq. ( 26) is then solved without iterative refinement.

Temperature evolution equation
The evolution of temperature in an ice mass is governed by three processes: thermal diffusion, advection, and strain heating.In common with many three-dimensional icesheet models, we define a new vertical coordinate (σ), such that σ = 0 at the ice surface and σ = 1 at the bed: For a complete description of the resulting coordinate transformations, the reader is referred to Pattyn (2003).
Expressed in sigma coordinates, the temperature equation is as follows: where T is the absolute temperature, k is the thermal conductivity of ice, and c p is the specific heat capacity of ice.The coordinate transformation introduces a number of terms due to the time-variation of the ice geometry: these can be thought of as a vertical grid velocity and are combined into the term w grid (Rutt et al., 2009).
Because the vertical gradients of temperature in an ice sheet tend to be significantly larger than the horizontal gradients, only vertical diffusion of heat is represented in the model.Both horizontal and vertical advection of heat are represented, however.The third term in Eq. ( 28) represents the strain heating effect.This is of key importance to the thermomechanical effects seen in ice sheets.

Numerical solution of temperature equation
The code of the temperature evolution equation comes from the Glimmer ice-sheet model (Rutt et al., 2009).Details of the discretization of each of the terms in Eq. ( 28) are given by Hagdorn et al. (2010).
The temperature is solved iteratively for each column of ice; temperatures not in the column are taken from the previous iteration.The remaining unknown temperatures (each of the terms on the left of Eq. 28) are discretized by using the Crank-Nicolson scheme to form a tridiagonal matrix equation.The temperature is set to never exceed the pressure melting point.

Adjoint model
The adjoint model of the forward thermomechanical model described in Sect. 3 was obtained with the OpenAD tool (Utke et al., 2006).In order for the code to be passed through the tool, some changes need to be made to the model code and the build system.Introduction

Conclusions References
Tables Figures

Back Close
Full

Preparation for use of OpenAD
The user has to supply information regarding the inputs and outputs, adapt the driver logic to make use of the derivatives and modify the build system to execute the transformation itself and compile/link the transformed source code.

Pragmas
The way in which OpenAD treats certain portions of the model source code is controlled by pragmas inserted into the code.Specifically, the treatment of a given subroutine in the context of the checkpointing scheme is determined by applying templates, that are specified by pragmas.
Likewise, in the source code, one has to identify which program variables correspond to the inputs and outputs of the model (as in Eq. 2).This step is accomplished by using pragmas to declare given program variables to be the independent and dependent variables of the model (the control parameters and cost function, respectively).The choice of independent and dependent variables depends on the application of the adjoint model; details are explained by Utke et al. (2013).

Makefiles and drivers
In common with most numerical models, ADISM has a top-level driver routine that governs model initialization, execution, and result retrieval.The model code is compiled and linked by using GNU Make, which coordinates the build procedure.For the computation of the derivatives, the driver routine has to be adjusted to trigger the start of computation and the retrieval of the computed sensitivities.Because a source transformation step is involved, the build procedure has to be expanded to perform the transformation steps and then build the model from the transformed sources.The transformation takes places in multiple stages related to the different tool components involved (Fig. 1b).Introduction

Conclusions References
Tables Figures

Back Close
Full Where OpenAD is being used for the transformation of a simple model, the transformation stages may be wrapped into a script.For nontrivial use, as in the case of ADISM, the transformation stages benefit from being kept explicit because the transformation can be modified at each of the stages.Examples are given by Utke et al. (2013).
The principal flow of operation is as follows: -The code of a model f is parsed by the Open64 front-end.This produces a binary file, which contains the Open64-internal representation in the so-called whirl format (f whirl ).
-This representation is analyzed by the OpenAnalysis component, and results are translated into a language-independent XML format called xaif (f xaif ).
-This language-independent representation is then transformed (f xaif ) by Xaifbooster, which is the core AD component, into either the adjoint or the tangent linear model.
-The whirl representation of this differentiated function, f whirl , is obtained.
-The Open64 front-end unparses into Fortran code the whirl representation to the differentiated function f .
In the case of ADISM, the driver routine was adapted to provide variants appropriate to both the adjoint and tangent linear models.We provide the code of both the original model and the adjoint in the Supplement to this paper, as well as the makefiles necessary to run OpenAD.

Changes to the source code
The OpenAD framework, its components, and the Fortran language standard itself are under continued development.in the user code.Ancillary logic or calls to third party libraries may require the use of stubs.Implementing workarounds and stubs results in certain changes to the model source code.

Language limitations
The OpenAD version used in this paper has a number of limitations with respect to the subset of Fortran 90 syntax that can be handled.These limitations are due to the use of the Open64/SL front-end; consequently, current OpenAD development work is focused on replacing Open64/SL with the Rose front-end (ROSE compiler infrastructure, 2013), with which many language limitations are eliminated.One aspect of Fortran 90 syntax not currently supported by OpenAD is the derived type.Derived types were used extensively in the Glimmer subroutines that form the thermal part of ADISM (Rutt et al., 2009).The Glimmer code was changed to eliminate derived types; instead, extra parameters were passed to subroutine calls.
A second unsupported Fortran 90 language construct is the where construct.Likewise, these were rewritten by using if blocks and loops.
In addition, the Fortran implied do statement, usually used to construct arrays in a compact way, was replaced by explicit do loops wherever it occurred.

Stubs
Not all the code that constitutes a numerical model is part of the computation to be differentiated.In our case this situation applies to the I/O logic implemented by the NetCDF library where the source code may not be in Fortran, may not be available, or may use language features not well supported by the transformation tool and thereby may unnecessarily complicate the transformation process.In such cases it is appropriate to have the subroutine in question (here from the NetCDF library) be represented by a stub, that is, a subroutine with the same interface and a body that represents only the simplified differentiable data dependencies of the parameters, Introduction

Conclusions References
Tables Figures

Back Close
Full Stubs are also used to implement adjoint propagation of high-level mathematical operations such as the linear solution of the ice thickness equation using UMFPACK (Davis, 2011).Here the subroutine parameters and therefore the stub have differentiable dependencies.Rather than trying to differentiate by brute force through the UMFPACK library and also recognizing that the recent versions of UMFPACK are implemented in C rather than Fortran, we inject a stub for the solve step.Prior to the final compilation of the transformed code we apply the OpenAD template mechanism to replace that stub with the logic needed to carry out the appropriate action during the forward sweep (the system solve) and the adjoint sweep (the system solve with the transpose).Not only does this approach permit the use of UMFPACK as a black-box library written in C, but it is also more efficient than the brute-force differentiation.

Memory optimization
In order to optimize memory use and computational cost, the Revolve checkpointing scheme (Sect.2.3) was used in the model.The code of the forward model was structured so that the main time loop was in a separate file.Checkpointing was used for subroutines in the time loop following the Revolve scheme.No checkpointing was used for those subroutines outside the main time loop, a technique also known as split-mode AD.

Build environment
The model runs in this paper were undertaken using an 64-bit Ubuntu-derived Linux distribution, using the GCC (gfortran) compiler version 4.7.In this paper, we use the term verification to refer to the process of checking that the forward model code has been implemented correctly.We do not address the question of validation, which concerns the extent to which the underlying mathematical model is a good analogue for the target system.In any case, the strengths and weaknesses of finite-difference SIA models are well known and are covered at length elsewhere (e.g., van den Berg et al., 2006;Blatter et al., 2011) ADISM is first verified against the well-known EISMINT2 benchmark (Payne et al., 2000).The model applied to the GrIS is then verified against the EISMINT3 benchmark (Huybrechts, 1997, unpublished).In both cases, the verification is made by comparison with the performance of the numerical models included in the original studies.

Comparison with EISMINT2 benchmark
A domain of 1500 km × 1500 km and a horizontal resolution of 25 km are used in the EISMINT2 benchmark.ADISM was configured according to Experiments A and H (Payne et al., 2000).In both experiments, a circular ice sheet is grown on a flat bed under idealized forcing.Experiment A is the most basic configuration, where basal sliding is disabled; in Experiment H, basal sliding is enabled for regions where the bed is at pressure melting point.
In both experiments, the surface mass balance M has a maximum value M max in a circular region at the center of the domain, but beyond that it decreases with distance from the center.It is expressed as distance.The surface temperature T surf is expressed as where T min = 238.15K is the minimum surface air temperature and S t = 1.67 × 10 −2 K km −1 is the gradient of air-temperature change with horizontal distance.
In Experiment H, the linear sliding law given in Eq. ( 15) was used, with B = 10 −3 m Pa −1 yr −1 .Basal sliding was enabled only where the basal ice is at pressure melting point.
For both experiments, the model was run forward from zero ice initial conditions for 200 ka.The verification was carried out by comparing values of ice divide thickness and basal temperature at the end of the run with those of the models that took part in the EISMINT2 benchmark (Payne et al., 2000) (Table 1).The range and mean are shown for the EISMINT2 models.Values of basal temperatures of the ADISM for both experiments (A and H) are within those of the EISMINT2 range.The value of ice divide thickness of Experiment A is 40 m lower (about 1.2 % lower than the actual value) than the EISMINT 2 range, while that of Experiment H is within this range.
From a physical point of view, Experiment H is more realistic than Experiment A, because it includes basal sliding where ice is temparate.Basal temperatures and ice divide thickness of Experiment H both being within the EISMINT2 range (Table 1) gives us confidence that the model is implemented correctly for the physically realistic case, which is desired for the application of the forward model to Greenland (Sect.5.2).
We note that the variation between models in Payne et al. (2000) is larger for Experiment H than it is for Experiment A. The reason is that the thermomechanical feedback between ice flow, temperature, and sliding causes instabilities.These instabilities manifest themselves as unsteady radial fast flow features embedded in the ice sheet (Payne and Baldwin, 2000), as are seen in the field of horizontal velocity when ADISM is configured according to Experiment H (Fig. 2a).Instabilities are also seen as spokes of warm and cold ice in the basal temperature field (Fig. 2b).Introduction

Conclusions References
Tables Figures

Back Close
Full

Comparison with EISMINT3 Greenland benchmark
As an example of an application to a real ice-sheet geometry, ADISM was configured for the GrIS.Again, it was necessary to verify that the model performed as expected for this geometry, and so a comparison was made with the GrIS configuration from the unpublished EISMINT3 benchmark (Huybrechts, 1997).
The EISMINT3 Greenland benchmark uses a spatial resolution of 20 km in x and y and a time step of 10 yr.The initial ice geometry and bed DEM are those of Letreguilly et al. (1991).The surface mass balance is calculated by using an annual positive degree day (PDD) model (Reeh, 1991); this requires inputs of annual precipitation, mean annual surface temperature (T a ), and mean summer temperature (T s ).The annual precipitation field is that compiled by Ohmura and Reeh (1991).The temperature forcing is specified as a pair of simplified parameterizations (Huybrechts and deWolde, 1999;Ritz et al., 1997), based on observational data (Ohmura, 1987): where s is the ice surface elevation (m) and Φ is latitude ( • N).We are aware that these data are neither the most recent nor the highest quality available and that the dynamic characteristics of the modeled ice sheet are sensitive to the choice of ice geometry and forcing climate (Stone et al., 2010).Nevertheless, we adopt them for the present work because they allow comparison with the EISMINT3 results.The EISMINT3 benchmark uses a flow enhancement factor of f = 3 in Eq. ( 11).Calving is taken into account by assuming ice is removed when the thickness is within 200 m of the floatation point.The geothermal heat flux is 0.05 W m −2 , and basal sliding is disabled.
The model was run forward from the measured geometry for 100 ka, and the surface compared with those modeled as part of the EISMINT3 exercise.Unfortunately, no original data were available for these model runs, and so comparison had to be 5273 Introduction

Conclusions References
Tables Figures

Back Close
Full undertaken using the plots published in Huybrechts (1997): in this respect, ADISM was seen to perform satisfactorily.The maximum surface elevation was within the range of those obtained in the EISMINT3 benchmark.The surface elevation and volume with and without basal sliding also compared well with present-day values (Table 2).From the verification tests in this section, we can have confidence in the ADISM as applied to the GrIS.

Testing the adjoint model
To test the adjoint model, we check the adjoint sensitivities against those calculated by using finite differences from the forward model (as in Eq. 1).Forward model sensitivities should not be confused with those of the forward mode (or tangent linear mode) of AD.
In contrast to the verification of the forward model, basal sliding is enabled in these model runs.A ten year time step was used in the calculation of sensitivities.
We compare sensitivities of both the EISMINT2 and EISMINT3 cases.In both cases, we run the forward model to a quasi-steady state before calculating sensitivities over a further 100 yr of model runs.The reason is that the model exhibits highly nonlinear behavior when run from an artificial initial state.Using an initial "spin-up" period allows the model to reach a more physically consistent state before sensitivities are calculated.In each case, basal sliding is included in both spin up and model runs.
Calculating approximate sensitivities by using the forward model is difficult because even small changes in the magnitude of the perturbation yield can different orders of magnitude and even sign changes of the derivative approximations.Using an adjoint model, one can compute machine-precision sensitivities with respect to that artificial initial state.However, the large gradients of the initial nonlinearities render the result useless for practical purposes.We compared sensitivities with the model configured according to Experiment A and H of the EISMINT2 benchmark.While the models in the EISMINT2 benchmark are run over 200 kyr, the thermal regime has reached steady state after about 150 kyr.Sensitivities were run from the state of the forward model after 150 kyr, which is run from zero ice thickness.Presented here is the comparison of sensitivities of the model configured according to the more realistic setup of Experiment H.The comparison of those of the model of Experiment A is also referred to for completeness.The adjoint and forward sensitivity maps of ice-sheet volume to surface mass balance for Experiment H are shown in Fig. 3a and b.Here, surface mass balance is a two-dimensional independent variable (covering the model domain).Each point on the sensitivity maps is the total change of volume at the end of the model run (100 yr) for a unit perturbation of surface mass balance at that point.
The difference between the adjoint and forward sensitivities (Fig. 3c) is negligible in the center (10 −9 × the actual value) and is about 0.2 % the actual value at the margins.This is good, considering the large gradient of sensitivity at the margins.This small difference is a reflection of the fact that the model as considered here is very stable, with relatively little nonlinear behavior, which would otherwise give rise to larger differences between the forward and adjoint sensitivities.
For Experiment A, the adjoint model also performed well, with a negligible difference at the center and 0.05 % near the margins.From these tests we can have confidence in the adjoint of the thermomechanical model, so that the adjoint of the thermomechanical model applied to the GrIS can be tested.Introduction

Conclusions References
Tables Figures

Back Close
Full Sensitivities are calculated starting from the state of the forward model applied to the GrIS after 100 kyr.Sliding is taken into account in the spin up run -this contrasts with the EISMINT3 case which does not include basal sliding.
When examining sensitivities, it is desirable that the ice sheet from which the model run starts is as close to the present day ice sheet as possible.Using a flow enhancement factor of 2 resulted in the optimal ice sheet; sensitivities were run from this state (Fig. 5b).The ice sheet could be tuned further to the present day ice sheet using data assimilation; this is beyond the scope of this paper, however.
The adjoint sensitivity of volume to surface mass balance (Fig. 4a) compares very well with respect to the forward model sensitivity (Fig. 4b).The difference between these is about 0.01 % at the center and up to 0.05 % in the northeastern region (Fig. 4c).The larger difference is evidence of the nonlinearity in the model.In this case a perturbation factor of 10 −8 was used.
We note here that in the forward sensitivity approximations, a uniform perturbation is often applied over the whole domain, causing variations in the approximation-totruncation trade-off.Even if the optimal trade-off could be found for the perturbation separately at every discretization point in the domain, the forward approximation would still lose half the significant digits.For reference, the initial spin-up state from which the spun-up sensitivities are run is shown in Fig. 5b, along with the observed surface elevation in Fig. 5a.
Because the adjoint sensitivities agree well with the approximate sensitivities calculated from the forward model, these validation tests give a high degree of confidence in the adjoint model as applied to the GrIS.Figures

Discussion
Motivation for this work arose from the computational efficiency of the adjoint model and the power of AD for obtaining sensitivities.The reality of this is demonstrated by the fact that the computational time of the GrIS adjoint sensitivities (Fig. 4a) was about 500 times less than that required to compute the sensitivities by using the forward model.
The adjoint model is also useful in that it gives exact sensitivities, which have no numerical error.Differences between the forward and adjoint sensitivities in Sect.6 are due to the choice of perturbation factor used in the forward sensitivities, the finitedifference approximation of forward sensitivities, and not the adjoint sensitivities.Full Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | 6. Section 7 presents some examples of results obtained by applying the adjoint model to the GrIS.Calculation of model sensitivities is the central purpose of adjoint models.For a given numerical model (hereafter referred to as the forward model), sensitivities are the partial derivatives of a certain objective function -the cost function C(x) -with respect to a set of control variables u, where x are the state variables of the forward model.
Discussion Paper | Discussion Paper | Discussion Paper | For each elemental r = φ(u, v , . ..) the propagation Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Language features are covered based on user demand and weighing their implementation effort in comparison with the effort for workarounds Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | if any.Prior to the final compilation of the transformed code, the stub routines are removed.If the stubs do not represent data dependencies, one need simply link to the original implementations in the NetCDF libraries.
) where (x s , y s ) is the location of the center of the domain, R e = 450 km is the radial distance from the center at which the surface mass balance becomes zero, and S b = 0.01 m yr −1 km −1 is the gradient of change of surface mass balance with horizontal Introduction Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | It is therefore crucial to conduct the adjoint experiments from a spun-up model state.
Discussion Paper | Discussion Paper | Discussion Paper | The purpose of this paper was the presentation and testing of an adjoint model.This work has enabled further study of adjoint ice-sheet sensitivities, with the development of an adjoint model using a freely accessible open source AD tool.Discussion Paper | Discussion Paper | Discussion Paper | Lahoz, W., Khattatov, B., and Menard, R. (Eds.):Data Assimilation, Springer, Berlin, Heidelberg, available at: http://www.springerlink.com/content/978-3-540-74702-4/contents/, last access: December 2012, 2010.5252 Letreguilly, A., Huybrechts, P., and Reeh, P.: Steady-state characterisitics of the Greenland ice sheet under different climates, J. Glaciol., 37, 149-157, 1991.Discussion Paper | Discussion Paper | Discussion Paper | Reeh, N.: Parameterization of melt rate and surface temperature on the Greenland ice sheet, Polarforschung, 59, 113-128, 1991.5273 Ritz, C., Fabre, A., and Letreguilly, A.: Sensitivity of a Greenland model to ice flow and ablation parameters: consequences for the evolution through the last glacial cycle, Clim.Dynam.Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper |

Fig. 5 .
Fig. 5. (a) Observed ice-sheet surface elevation (m).(b) Steady-state GrIS surface elevation (m) at start of adjoint sensitivity run.Red line is edge of ice-sheet margin. 50

Fig. 5 .
Fig. 5. (a) Observed ice-sheet surface elevation (m).(b) Steady-state GrIS surface elevation (m) at start of adjoint sensitivity run.Red line is edge of ice-sheet margin.

Table 1 .
Steady-state ice divide thickness (m) and basal temperatures ( • C) of the forward thermomechanical model (ADISM) and of models that took part in Experiments A and H of the EISMINT2 benchmark.Mean and range of values of EISMINT2 are shown.