We present a new capability of the ice sheet model SICOPOLIS that enables flexible adjoint code generation via source transformation using the open-source algorithmic differentiation (AD) tool OpenAD.
The adjoint code enables efficient calculation of the sensitivities of a scalar-valued objective function or quantity of interest (QoI) to a range of important, often spatially varying and uncertain model input variables, including initial and boundary conditions, as well as model parameters.
Compared to earlier work on the adjoint code generation of SICOPOLIS, our work makes several important advances:
(i) it is embedded within the up-to-date trunk of the SICOPOLIS repository – accounting for 1.5 decades of code development and improvements – and is readily available to the wider community;
(ii) the AD tool used, OpenAD, is an open-source tool;
(iii) the adjoint code developed is applicable to both Greenland and Antarctica, including grounded ice as well as floating ice shelves, with an extended choice of thermodynamical representations.
Several code refactoring steps were required; they are discussed in detail in an appendix, as they hold lessons for the application of AD to legacy codes at large.
As an example application, we examine the sensitivity of the total Antarctic Ice Sheet volume to changes in initial ice thickness, austral summer precipitation, and basal and surface temperatures across the ice sheet.
Simulations of Antarctica with floating ice shelves show that over 100 years of simulation the sensitivity of total ice sheet volume to the initial ice thickness and precipitation is almost uniformly positive, while the sensitivities to surface and basal temperature are almost uniformly negative. Sensitivity to austral summer precipitation is largest on floating ice shelves from Queen Maud to Queen Mary Land. The largest sensitivity to initial ice thickness is at outlet glaciers around Antarctica. Comparison between total ice sheet volume sensitivities to surface and basal temperature shows that surface temperature sensitivities are higher broadly across the floating ice shelves, while basal temperature sensitivities are highest at the grounding lines of floating ice shelves and outlet glaciers. A uniformly perturbed region of East Antarctica reveals that, among the four control variables tested here, total ice sheet volume is the most sensitive to variations in austral summer precipitation as formulated in SICOPOLIS.
Comparison between adjoint- and finite-difference-derived sensitivities shows good agreement, lending confidence that the AD tool is producing correct adjoint code.
The new modeling infrastructure is freely available at

An important ingredient to characterizing and quantifying our uncertainty in expected climate change outcomes is our understanding of ice sheet dynamics. A key approach for increasing such understanding is the development of more sophisticated ice sheet models.
Scientists have made significant strides in improving model sophistication, with the latest class of ice sheet models resolving all three dimensions of the ice sheet's internal stress balance (as opposed to previous classes of models, which employed various approximations to the stress field to save computational cost). However, while advances in computational glaciology have enabled us to simulate ice sheet behavior more accurately, remaining uncertainties in the range of independent input variables required for ice sheet simulations, in particular initial conditions, surface forcings, basal boundary conditions, and internal parameters, comprise crucial weaknesses in ice sheet – and ultimately climate system – prediction or projection

Ice flow critically depends on quantities that we either cannot easily measure (such as the friction or thermal forcing between ice and the bedrock below it), that parameterize subgrid-scale processes or empirical constitutive laws (such as the routing of meltwater or fracture propagation), or that we may never be able to measure in the present day (such as the rate of snowfall in the past). These unknown or uncertain variables can be construed as sets of parameters that we must infer or calibrate if we are to make projections with ice sheet models, and these parameters must both satisfy, by some measure, the assumed model physics and the sparsely made observations across such large bodies. In the language of optimal estimation and control theory, these parameters are referred to as control variables

If we are to integrate ice sheet model projections into societally relevant discussions on sea level rise, we may wish to know the sensitivity of ice-sheet-integrated or derived quantities of interest to a range of uncertain model inputs. For example, we wish to know how the total ice volume (above flotation) of an ice sheet is influenced by climatically relevant quantities (such as surface atmospheric forcings) or environmental variables (such as the melting at the base of an outlet glacier or floating ice shelf that drains an ice sheet). A computationally costly method for deriving such sensitivities might use individually made perturbations to the bottom melting rate at each location of the ice sheet’s base, in particular at its margins that are in contact with ocean water. This means that the ice sheet dynamics must be integrated throughout time for each simulation experiment in which a point-wise perturbation has been applied in order to assemble a sensitivity map across the entire domain to this control variable (basal melting). While the target of this approach remains of paramount importance – relating the output of an ice sheet model to poorly known inputs – the means are computationally expensive: understanding, for instance, the Antarctic Ice Sheet’s sensitivity to changes in melting or basal friction means simulating the entire ice sheet throughout time for every perturbation made at each point in the domain. In this case the computational cost of such a method scales with the dimension of the domain grid, and as such it is prohibitive.
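The cost argument above can be made concrete with a minimal sketch. The toy `forward_model` below is a hypothetical stand-in, not SICOPOLIS: its dynamics, the parameter values, and the names `melt` and `finite_difference_map` are assumptions chosen only for illustration. The sketch counts how many full forward integrations a point-wise finite-difference sensitivity map requires.

```python
import numpy as np

def forward_model(melt, n_steps=50, dt=0.1):
    """Toy stand-in for an ice sheet model (assumed dynamics, not SICOPOLIS).

    Integrates a 1-D 'thickness' field forward in time and returns a scalar
    quantity of interest, here the summed thickness ('volume').
    """
    h = np.ones_like(melt)  # hypothetical uniform initial thickness
    for _ in range(n_steps):
        # periodic neighbour coupling, nonlinear thinning, and local melt
        diffusion = np.roll(h, 1) - 2.0 * h + np.roll(h, -1)
        h = h + dt * (-0.1 * h**2 - melt + 0.05 * diffusion)
    return h.sum()

def finite_difference_map(melt, eps=1e-6):
    """One-sided finite differences: one extra forward run per grid point."""
    runs = 0
    base = forward_model(melt)
    runs += 1
    grad = np.zeros_like(melt)
    for i in range(melt.size):
        perturbed = melt.copy()
        perturbed[i] += eps  # point-wise perturbation of the control variable
        grad[i] = (forward_model(perturbed) - base) / eps
        runs += 1
    return grad, runs

melt = np.full(20, 0.01)
grad, runs = finite_difference_map(melt)
print(f"{melt.size} grid points required {runs} forward integrations")
```

In this sketch the number of forward runs grows as the number of grid points plus one, which is the scaling that makes the approach prohibitive on a continental-scale grid.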

Fortunately, adjoint models provide us with a means to this end whereby the computational cost of deriving sensitivity maps does not depend on the dimension of the control variable space. The adjoint model is in effect the transpose of the linearized operator of the ice sheet model.
Compared to the parent model, which propagates model inputs via the prognostic model state to model outputs, the adjoint propagates the dual of the ice sheet model state in reverse order, from the sensitivity of model outputs to sensitivities in the model inputs (which, for time-dependent models, amounts to a backward-in-time propagation of sensitivities). It thereby simultaneously calculates the sensitivity of a chosen quantity of interest (e.g., the volume of an ice sheet) with respect to the prescribed set of control variables (e.g., the basal melting beneath the ice, surface accumulation, or initial conditions). Thus, unlike the tangent linear model, which computes the impact of

Generally, adjoint models arise in at least two classes of geophysical investigations.

Adjoint-enabled optimization problems that are constrained by partial differential equations (PDEs), which are solved by using adjoint methods, may be posed in the following manner, beginning by formulating a scalar-valued cost function based on a least-squares model–data misfit subject to prior information on the uncertainty of the control variables:

PDE-constrained optimization seeks to find the gradient of

A model that can optimally reproduce the behavior of, e.g., an ice sheet throughout time with respect to observations possesses the advantage that model-derived predictions might be made with greater confidence, having been initialized by dynamics that are informed by spatiotemporal observations. In other words, the commonly termed “spin-up” of an ice sheet may produce more faithful projections when forced by optimally recovered initial and boundary conditions, and an optimal state estimate, which may be recovered by a time-dependent adjoint model. A model initialized and projected under such circumstances might better reproduce what can be inferred about its past state by observations, subject to the additional constraint of the assumed and (perhaps more subtle, but equally important) conserved model physics throughout time. Thus, the constrained optimization problem of recovering boundary and initial conditions, and the model's optimal internal state dynamics throughout space and time, might be approached first through the task of obtaining reliable adjoint sensitivities.

Beyond applications in optimization, the adjoint may also be widely applied to the comprehensive analysis of linear sensitivities (the subject of the work presented here) of QoIs to uncertainly known inputs (in particular, forcings or parameters) in nonlinear models.

The concept of the adjoint of a numerical model may be best understood in terms of the forward, original code construction and execution. If one wishes to know the sensitivity of some QoI (e.g., the volume of the Antarctic Ice Sheet) with respect to some model inputs or control variable (e.g., the average surface air temperature in July), one method of pursuing knowledge about such a sensitivity might be perturbing the control variable, in sequence, at each single point within the discretized domain and propagating the perturbation forward in time. The perturbation to the control variable results in a change in the QoI, and one can proceed to calculate the sensitivity of the QoI with respect to the control variable everywhere in the domain. Herein these are termed the finite differences

Adjoint models have been common in oceanic and atmospheric contexts

Compared to the literature cited above, and specifically in contrast to the work by

we develop the capability to generate the adjoint of a thermomechanically coupled ice sheet model by means of source-to-source transformation algorithmic differentiation (AD) using an open-source AD tool;

the code for source-to-source transformation is an up-to-date version of the SICOPOLIS model, which should enable easy maintenance of the AD capability and wider use by the interested research community;

compared to

Adjoint models developed by AD exploit the chain and product rules for the computation of the derivatives of a function (

Schematic of AD applied to a simple function,

Equation (

As soon as a numerical model is implemented as a code, it is in fact translated as a sequence or composition of elementary operations like those shown in Fig.

We begin with the ice sheet model SICOPOLIS (SImulation COde for POLythermal Ice Sheets) and sketch the development of its adjoint model from version 5-dev

SICOPOLIS employs four different thermodynamics representations: (1) a two-layer polythermal scheme, which allows for the computation and effects of liquid water within a warmer temperate layer; (2) a purely cold-ice scheme in which no liquid water is present; and (3, 4) two flavors of the one-layer enthalpy scheme that combine the physical adequateness of (1) with the greater numerical simplicity of (2)

SICOPOLIS simulates ice as a nonlinear viscous fluid by employing Glen's flow law

Basal sliding under grounded ice links the sliding velocity,

As described in Sect.

The adjoint of the ice sheet model SICOPOLIS is largely generated automatically by the application of the freely available source transformation tool OpenAD

The adjoint model of SICOPOLIS produced by OpenAD results in approximately 50 000 executable lines, represented in a much simplified schematic in Fig.

The exclusion of forward model code that the user knows will not be executed may significantly simplify the AD tool's dependency and flow control analysis, avoid spurious dependencies that the AD tool may detect, and lead to more streamlined source code for the adjoint.

Because of the reverse mode and requirement to store required variables in time-reversed order (e.g., those used for evaluating state-dependent conditions and nonlinear expressions), adjoint models will have a substantially larger memory footprint than their parent forward model
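The storage requirement of the reverse mode can be illustrated with a hand-written adjoint of a toy model. This is a sketch under stated assumptions: the update rule `h <- h - dt*a*h**2`, the parameter values, and the function names are hypothetical and unrelated to the code that OpenAD generates. The forward sweep records every state needed by the nonlinear derivative on a "tape", and the reverse sweep consumes the tape in time-reversed order.

```python
import numpy as np

def forward_with_tape(h0, n_steps=100, dt=0.1, a=0.1):
    """Toy nonlinear time stepping; returns the QoI J = sum(h) and the tape."""
    tape = []
    h = h0.copy()
    for _ in range(n_steps):
        tape.append(h.copy())  # state required later by the adjoint
        h = h - dt * a * h**2
    return h.sum(), tape

def adjoint(tape, dt=0.1, a=0.1):
    """Reverse sweep: propagate dJ/dh backward in time using stored states."""
    h_bar = np.ones_like(tape[0])  # dJ/dh at the final time, since J = sum(h)
    for h in reversed(tape):
        # transpose of the linearized step d(h - dt*a*h**2)/dh = 1 - 2*dt*a*h
        h_bar = h_bar * (1.0 - 2.0 * dt * a * h)
    return h_bar

h0 = np.linspace(0.5, 2.0, 8)
J, tape = forward_with_tape(h0)
grad_adj = adjoint(tape)  # sensitivities to all 8 initial values in one sweep

# finite-difference check at a single point
eps = 1e-6
hp = h0.copy()
hp[3] += eps
grad_fd = (forward_with_tape(hp)[0] - J) / eps
print(grad_adj[3], grad_fd)
```

The tape here holds one copy of the state per time step, so its size grows with both the state dimension and the number of steps, whereas the forward model alone needs only the current state.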

A number of algorithmic aspects of the code needed one-time editing or refactoring for OpenAD to be able to successfully parse the source code and provide correct adjoint code. For example, non-smooth functions – such as piecewise linear functions represented by

Because SICOPOLIS is capable of simulating many different aspects of ice flow at the continental scale, we have designed a set of configurations each focusing on particular aspects of the model so that the resulting adjoint values and patterns may be more readily interpreted. Where we could have applied more complicated relationships, for example in the initialization in temperature, geothermal flux, or calving laws, we have opted instead for simplicity, as the exhaustive examination of such choices in simulation is left to future studies. The adjoint values are calculated for specific configurations of the original forward code of SICOPOLIS.

A sample of comparisons between adjoint-derived and finite-difference-based sensitivities. All regions in column (2) refer to either selected point-wise perturbations

We simulate Antarctica for 100 years of model time with a 20 km horizontal resolution and 81 terrain-following vertical layers. The dynamic and thermodynamic time steps (which can be chosen to differ) were both set to 0.2 years, as this was found to be the most stable value for the forward model simulation. Land ice, floating ice, and ice streams are approximated by the shallow-ice approximation (SIA), shallow-shelf approximation (SSA), and shelfy-stream approximation (SStA) formulations, respectively, described in

We note that the operation underlying calving (see above) amounts to a conditional statement. From an AD perspective, the following steps occur: (i) derivative codes are generated for each condition; (ii) code to store and restore the required variable is added to properly evaluate the conditional derivative. For legacy code the operation may not be differentiable at the exact condition (see Appendix B for practical details). This should be taken into account when performing gradient-based optimization.
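The two steps listed above can be sketched by hand for a single conditional. The thickness-threshold rule, the value `h_min=50.0`, and all function names below are hypothetical and serve only to illustrate the idea; they are not taken from the SICOPOLIS source.

```python
def calve(h, h_min=50.0):
    """Hypothetical threshold rule: ice thinner than h_min is removed."""
    took_branch = h >= h_min  # branch decision, stored for the reverse sweep
    out = h if took_branch else 0.0
    return out, took_branch

def calve_adjoint(out_bar, took_branch):
    """Adjoint of calve: derivative of whichever branch was executed."""
    return out_bar * (1.0 if took_branch else 0.0)

def one_sided_fd(h, eps, h_min=50.0):
    """Forward finite difference of the calved thickness."""
    return (calve(h + eps, h_min)[0] - calve(h, h_min)[0]) / eps

for h in (80.0, 20.0, 49.5):
    _, branch = calve(h)
    print(h, calve_adjoint(1.0, branch), one_sided_fd(h, eps=1.0))
```

Away from the threshold the stored-branch derivative and the finite difference agree. When the perturbation crosses the threshold (here from 49.5 to 50.5), the finite difference picks up the jump between branches while the adjoint reports the derivative of the branch actually taken, which is the kind of mismatch a gradient-based optimization has to tolerate.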

The motivation for developing an adjoint of a numerical model stretches far beyond providing comprehensive sensitivity experiments; often, an adjoint model is developed so that the sensitivities may be used in a gradient-based PDE-constrained optimization problem to invert for uncertain initial conditions, boundary conditions, or model parameters, thereby producing a data-constrained estimate for the evolution of the state of the system.
Here, however, we are interested in understanding model sensitivities. We present the sensitivity of the volume of the Antarctic Ice Sheet with respect to several control variables as a proof of concept, rather than extending the work in the direction of optimization, which will be the subject of future studies. The purpose is to gain physical insight into the model's linear response characteristics and to ascertain correctness and interpretability of the adjoint.
The adjoint-derived sensitivities are compared to finite-difference perturbations, either at single points or over a patch of the domain that has been uniformly perturbed, to demonstrate that the adjoint model is sufficiently consistent with sensitivities derived via finite differences. Those comparisons are shown in Table

Adjoint sensitivities,

Logarithms of the absolute value of adjoint sensitivities,

Figures

The sensitivity of total Antarctic ice volume to the initial ice thickness compares well with the calculated finite-difference-based value (Table

The pattern of the January (austral summer) precipitation adjoint values largely mirrors that of the ice thickness, with several distinctions. The order of magnitude is much larger, ranging instead between

Sensitivities to surface and basal temperature (Fig.

Over the 100-year simulation, high sensitivities to surface and basal temperature at the margins extend inward toward the middle of the ice sheet following glacier drainage basins (Fig.

Table

Columns (5) and (6) in Table

The utility of this comparison is to convert the sensitivities into meaningful quantities that can be compared against each other to assess, for example, which control variable impacts the cost function the most given a perturbation of expected magnitude, in addition to providing another metric by which we may measure the adequacy of the adjoint model of SICOPOLIS.
The percent difference between

Lastly, the adjoint model of SICOPOLIS runs serially and completed 100 years of model runtime in 20, 75, and 600 min of wall clock time on a Linux box (Intel Xeon CPU E5-2650 at 2.00 GHz) for resolutions of 64, 40, and 20 km, respectively. The results shown in Figs.

The results presented here are not meant to be exhaustive. Rather, they present initial adjoint sensitivity applications of the newly AD-enabled SICOPOLIS model, underscore the interpretable nature of adjoint-derived sensitivity fields, and are presented as a proof of concept for further investigation. They invite users to take advantage of this new infrastructure for their science applications. We leave an exhaustive study of sensitivities to different control variables in SICOPOLIS to future work, as here we only wish to examine a few important dynamic and thermodynamic controls and assess the validity of the adjoint model.

As a measure of the adjoint model's correctness, we compared gradients obtained from the adjoint model with those computed via finite-difference perturbations. Adjoint values compared acceptably against finite differences for ice thickness as well as surface and basal temperatures, with less than 10 % deviation. Austral summer precipitation adjoint values showed a larger disagreement with finite differences, of up to 12 %. Part of the larger discrepancy may be due to the fact that the cost values (total Antarctic Ice Sheet volume) are very large, which amplifies numerical noise for sensitivity fields that are very small. Ice sheet volume changes calculated by the adjoint model and finite differences disagree more, although the largest discrepancy occurred with the smallest overall volumes calculated (both surface and basal temperature) and is thus likely, again, to be affected by numerical noise arising in the calculation. Control variables related to the conservation of mass equation provided the best agreement across measured metrics (ice thickness for point-wise sensitivities and precipitation for finite volume calculations). This is readily explained by the primarily linear nature of precipitation changes (seen as a volume flux) in changing total ice volume.
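One way in which a very large cost value can degrade a finite-difference estimate is floating-point cancellation, sketched below. The numbers are assumptions for illustration only: `V0` is merely of the order of magnitude of a continental ice volume in cubic meters, `s_true` is a hypothetical small sensitivity, and the linear cost function is not the SICOPOLIS cost.

```python
V0 = 2.5e16   # illustrative large cost value (assumed order of magnitude)
s_true = 3.0  # hypothetical small sensitivity of the cost to the control

def cost(x):
    """Full cost: large background value plus a small response."""
    return V0 + s_true * x

def cost_anomaly(x):
    """Same response, evaluated relative to the background value."""
    return s_true * x

def percent_deviation(adjoint_value, fd_value):
    """Percent difference of an adjoint value relative to a finite difference."""
    return 100.0 * abs(adjoint_value - fd_value) / abs(fd_value)

eps = 1.0
fd_full = (cost(eps) - cost(0.0)) / eps                  # subtracts two huge numbers
fd_anom = (cost_anomaly(eps) - cost_anomaly(0.0)) / eps  # no cancellation

print("finite difference on full cost:   ", fd_full)
print("finite difference on anomaly cost:", fd_anom)
print("deviation of exact value from full-cost FD (%):",
      percent_deviation(s_true, fd_full))
```

Double-precision numbers near `V0` are spaced several units apart, so the small response is partly lost when the two large cost values are subtracted. Evaluating the response as an anomaly avoids the cancellation in this toy case, which suggests that part of a reported percent deviation can reflect the finite-difference reference rather than the adjoint.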

The general similarity between ice thickness and precipitation adjoint sensitivities (Fig.

In a related manner, the overall similarity of the surface and basal temperature sensitivities is reassuring as both of these are components in the same conservation of energy equation. Both fields of sensitivities delineate the drainage basins of glaciers and ice shelves, with very small sensitivity in the center of the ice sheet that increases by orders of magnitude toward the coasts. The surface temperature sensitivities more uniformly affect total ice volume over the ice shelves, while the basal temperature sensitivities indicate that positive perturbations in basal temperature at the grounding lines of glaciers and ice shelves have a larger effect on total ice volume and that, when compared with each other, variations in basal temperature are more powerfully felt across the Antarctic Ice Sheet. This seems to indicate that changes in ocean temperature at the grounding lines around Antarctica have much more potential to have a lasting impact on the volume of the ice sheet than temperature changes in the atmosphere.
However, this conclusion must be tempered by the fact that our current simulation of the surface of ice does not account for complex surface, englacial, or subglacial hydrology, e.g., meltwater ponding and induced catastrophic failure, as has been observed in the past at the Larsen B Ice Shelf, for example

Algorithmic differentiation relies on algorithms being differentiable, line by line, in a code. Numerical disagreement can accumulate for even simple reasons, such as the use of piecewise linear functions represented algorithmically by

This work presents a new capability of the ice sheet model SICOPOLIS to enable flexible adjoint code generation using the open-source AD tool OpenAD. The flexibility is afforded by allowing a wide range of choices of model domains, numerical algorithms for specific configurations, and the control variables (independent variables) and quantities of interest (dependent variables; cost functions) defined when generating the adjoint code. We demonstrate the utility, correctness, and interpretability of adjoint-derived sensitivity maps for Antarctic-wide simulations, with the total volume of the Antarctic Ice Sheet chosen as the quantity of interest. We compute this quantity's sensitivity to initial and boundary conditions over a 100-year simulation from the present day. Examining, ascertaining, and understanding the information contained in such sensitivity maps, which are formally gradients of scalar-valued functions with respect to model inputs, is a useful and natural first step in the use of these sensitivities in gradient-based optimization problems, which will be the subject of future work. Such work, enabled by the adjoint model of SICOPOLIS, could include understanding how different parameterizations of precipitation, melting, and other interesting higher-order processes of ice flow affect quantities of interest.

One suggested outcome of the sensitivity analysis is that, as a controlling variable, mean monthly applied summer precipitation influences the total integrated Antarctic Ice Sheet volume more than the initial ice geometry or surface and/or basal temperatures do for representative values of perturbation in each of these variables. Another hypothesized (and perhaps unsurprising) relationship derives from a comparison between the surface and basal ice temperatures: changes in basal temperature, particularly at grounding lines, affect total ice volume much more than those in surface temperature.

Much remains to be learned and further examined in the context of this model, including the degree to which results may be applicable to other models. Our results are specific to a given configuration of SICOPOLIS, with emphasis placed on the initial use of simple parameterizations for (often) the most interesting aspects of ice flow, including how basal melting and firn compaction are represented (both processes would be affected by the control variables chosen here). Our metrics of model validity evaluated point-wise show that the adjoint model is mostly accurate to within 10 % compared to sensitivities obtained via the finite-difference method. One likely reason for larger disagreements in some of the calculated metrics may be the regimes of very weak sensitivities, in which case numerical noise becomes a leading factor in the inferred differences.

Another cost function may be formulated as a model–data misfit based on, for example, the modeled versus observed spatiotemporal ice elevation change. Additionally, over-reliance on inherently non-differentiable piecewise linear functions for important aspects of surface mass balance terms may introduce discrepancies that could be minimized with the use of smoother functions or smooth implementations of parameterization schemes. These are valid and important aspects of code that are not easily addressed

As glaciologists strive to make ever more confident projections of the future behavior of ice sheets, tools that rigorously determine the relationship between often poorly known input parameters and important model outputs are increasingly needed. SICOPOLIS-AD is one such tool that is freely available to the cryosphere community

Here we present the results of a 100-year sensitivity study of Greenland Ice Sheet volume to basal ice temperature. This is added as an Appendix as Greenland sensitivities have been produced previously by

The forward simulation of Greenland is configured in much the same way as for Antarctica, with an emphasis on simplicity for proof of concept. Unless otherwise stated below, choices of numerical schemes, physical parameterizations, and forcing approaches are the same. We simulate Greenland for 100 years from the present day at a 10 km horizontal resolution with 81 terrain-following vertical layers. The dynamic and thermodynamic time steps take the value of 0.5 years. The dynamics now are only SIA, as we have restricted our simulation to grounded ice. The thermodynamic formulation is again via the conventional enthalpy method, with ice initialized at a constant temperature of

Comparison between adjoint-derived (column 3) and finite-difference-derived (column 4) point-wise sensitivities for Greenland ice volume as QoI (symbols as in Table 1). All regions in column (2) refer to points from Fig.

Adjoint sensitivities,

Figure

Interestingly, whereas in Antarctica the ice thickness sensitivities were almost entirely positive, substantial portions of the Greenland Ice Sheet lose volume when perturbed positively in ice thickness, a phenomenon previously inferred by

Sensitivity to precipitation, as in the case of Antarctica, is almost entirely positive and again dwarfs the other control variables tested here by many orders of magnitude. The overall larger magnitude of basal temperature sensitivities compared to surface temperatures is consistent with the Antarctic simulation. Completion of Greenland serial simulations on a Linux box (Intel Xeon CPU E5-2650 at 2.00 GHz) took 5, 10, and 140 min for horizontal resolutions of 40, 20, and 10 km, respectively. The results shown here are for 10 km resolution.

We made several modifications to SICOPOLIS to enable source transformation and differentiation via OpenAD.
The changes that were made enabled efficient AD in some cases and overcame some limitations of the AD tool used in
others. The modifications are guarded by C preprocessor (CPP) directive

Loop variables are not updated within the loop

The loop condition does not use

The loop condition’s left-hand side consists only of the loop variable

The stride in the update expression is fixed

The stride is the right-hand side of the top level

The loop body contains no index expression with variables that are modified within the loop body

SICOPOLIS contained several cases of statements injected to break out of loops that cause them not to
be simple. To differentiate non-simple loops correctly, OpenAD stores which array indices are actually used
per loop iteration. This approach causes significant memory usage and performance loss.
Therefore, we removed the

To differentiate the above formulation efficiently, an AD tool must not naively differentiate through the solver code.
OpenAD uses its

When SICOPOLIS uses the SOR solver for a system of linear equations wherein the matrix storage is in compressed sparse row (CSR) format, arrays are represented by

SICOPOLIS is free and open-source software available through a persistent Subversion repository that is hosted by the FusionForge system AWIForge of the Alfred Wegener Institute for Polar and Marine Research (AWI) in Bremerhaven, Germany (

The AD tool used to generate adjoint source code is OpenAD. A snapshot non-revocable code archive of OpenAD can be downloaded at

LCL, SHKN, and PH developed the adjoint code of SICOPOLIS; RG originally developed SICOPOLIS, provided insight on the model results, and helped host the freely available version of the code.

The authors declare that they have no conflict of interest.

We thank Laurent Hascoët, Mauro Perego, and Lizz Ultee for their careful reviews, thoughtful comments, and valuable suggestions, which helped to improve the paper.

This research has been supported by the U.S. Department of Energy, Office of Science (grant no. DE-AC02-06CH11357 and SC0008060), the U.S. National Science Foundation (grant no. 1750035), the Japan Society for the Promotion of Science (KAKENHI grant nos. JP16H02224, JP17H06104 and JP17H06323), and the Japanese Ministry of Education, Culture, Sports, Science and Technology (MEXT) through the Arctic Challenge for Sustainability (ArCS) project (program grant number JPMXD1300000000).

This paper was edited by Alexander Robel and reviewed by Laurent Hascoet, Lizz Ultee, and one anonymous referee.