SICOPOLIS-AD v1: an open-source adjoint modeling framework for ice sheet simulation enabled by the algorithmic differentiation tool OpenAD

We present a new capability of the ice sheet model SICOPOLIS that enables flexible adjoint code generation via source transformation using the open-source algorithmic differentiation (AD) tool OpenAD. The adjoint code enables efficient calculation of the sensitivities of a scalar-valued objective function or quantity of interest (QoI) to a range of important, often spatially varying and uncertain model input variables, including initial and boundary conditions, as well as model parameters. Compared to earlier work on the adjoint code generation of SICOPOLIS, our work makes several important advances: (i) it is embedded within the upto-date trunk of the SICOPOLIS repository – accounting for 1.5 decades of code development and improvements – and is readily available to the wider community; (ii) the AD tool used, OpenAD, is an open-source tool; (iii) the adjoint code developed is applicable to both Greenland and Antarctica, including grounded ice as well as floating ice shelves, with an extended choice of thermodynamical representations. A number of code refactorization steps were required. They are discussed in detail in an Appendix as they hold lessons for the application of AD to legacy codes at large. As an example application, we examine the sensitivity of the total Antarctic Ice Sheet volume to changes in initial ice thickness, austral summer precipitation, and basal and surface temperatures across the ice sheet. Simulations of Antarctica with floating ice shelves show that over 100 years of simulation the sensitivity of total ice sheet volume to the initial ice thickness and precipitation is almost uniformly positive, while the sensitivities to surface and basal temperature are almost uniformly negative. Sensitivity to austral summer precipitation is largest on floating ice shelves from Queen Maud to Queen Mary Land. The largest sensitivity to initial ice thickness is at outlet glaciers around Antarctica. Comparison between total ice sheet volume sensitivities to surface and basal temperature shows that surface temperature sensitivities are higher broadly across the floating ice shelves, while basal temperature sensitivities are highest at the grounding lines of floating ice shelves and outlet glaciers. A uniformly perturbed region of East Antarctica reveals that, among the four control variables tested here, total ice sheet volume is the most sensitive to variations in austral summer precipitation as formulated in SICOPOLIS. Comparison between adjointand finite-difference-derived sensitivities shows good agreement, lending confidence that the AD tool is producing correct adjoint code. The new modeling infrastructure is freely available at http://www.sicopolis.net (last access: 2 April 2020) under the development trunk. Published by Copernicus Publications on behalf of the European Geosciences Union. 1846 L. C. Logan et al.: SICOPOLIS-AD: an adjoint method

Abstract. We present a new capability of the ice sheet model SICOPOLIS that enables flexible adjoint code generation via source transformation using the open-source algorithmic differentiation (AD) tool OpenAD. The adjoint code enables efficient calculation of the sensitivities of a scalar-valued objective function or quantity of interest (QoI) to a range of important, often spatially varying and uncertain model input variables, including initial and boundary conditions, as well as model parameters. Compared to earlier work on the adjoint code generation of SICOPOLIS, our work makes several important advances: (i) it is embedded within the upto-date trunk of the SICOPOLIS repository -accounting for 1.5 decades of code development and improvements -and is readily available to the wider community; (ii) the AD tool used, OpenAD, is an open-source tool; (iii) the adjoint code developed is applicable to both Greenland and Antarctica, including grounded ice as well as floating ice shelves, with an extended choice of thermodynamical representations. A number of code refactorization steps were required. They are discussed in detail in an Appendix as they hold lessons for the application of AD to legacy codes at large. As an example application, we examine the sensitivity of the total Antarctic Ice Sheet volume to changes in initial ice thickness, austral summer precipitation, and basal and surface temper-atures across the ice sheet. Simulations of Antarctica with floating ice shelves show that over 100 years of simulation the sensitivity of total ice sheet volume to the initial ice thickness and precipitation is almost uniformly positive, while the sensitivities to surface and basal temperature are almost uniformly negative. Sensitivity to austral summer precipitation is largest on floating ice shelves from Queen Maud to Queen Mary Land. The largest sensitivity to initial ice thickness is at outlet glaciers around Antarctica. Comparison between total ice sheet volume sensitivities to surface and basal temperature shows that surface temperature sensitivities are higher broadly across the floating ice shelves, while basal temperature sensitivities are highest at the grounding lines of floating ice shelves and outlet glaciers. A uniformly perturbed region of East Antarctica reveals that, among the four control variables tested here, total ice sheet volume is the most sensitive to variations in austral summer precipitation as formulated in SICOPOLIS. Comparison between adjoint-and finite-difference-derived sensitivities shows good agreement, lending confidence that the AD tool is producing correct adjoint code. The new modeling infrastructure is freely available at http://www.sicopolis.net (last access: 2 April 2020) under the development trunk.

Introduction
An important ingredient to characterizing and quantifying our uncertainty in expected climate change outcomes is our understanding of ice sheet dynamics. A key approach for increasing such understanding is the development of more sophisticated ice sheet models. Scientists have made significant strides in improving model sophistication, with the latest class of ice sheet models resolving all three dimensions of the ice sheet's internal stress balance (as opposed to previous classes of models, which employed various approximations to the stress field to save computational cost). However, while advances in computational glaciology have enabled us to simulate ice sheet behavior more accurately, remaining uncertainties in the range of independent input variables required for ice sheet simulations, in particular initial conditions, surface forcings, basal boundary conditions, and internal parameters, comprise crucial weaknesses in ice sheet -and ultimately climate system -prediction or projection (Goelzer et al., 2018;Seroussi et al., 2019). As such, ice sheet modeling is facing similar issues of robust model initialization for prediction as those faced by the climate modeling community at large (e.g., Meehl et al., 2014;Balmaseda, 2017).
Ice flow critically depends on quantities that we either cannot easily measure (such as the friction or thermal forcing between ice and the bedrock below it), that parameterize subgrid-scale processes or empirical constitutive laws (such as the routing of meltwater or fracture propagation), or that we may never be able to measure in the present day (such as the rate of snowfall in the past). These unknown or uncertain variables can be construed as sets of parameters that we must infer or calibrate if we are to make projections with ice sheet models, and these parameters must both satisfy, by some measure, the assumed model physics and the sparsely made observations across such large bodies. In the language of optimal estimation and control theory, these parameters are referred to as control variables (Gelb, 1974).
If we are to integrate ice sheet model projections into societally relevant discussions on sea level rise, we may wish to know the sensitivity of ice-sheet-integrated or derived quantities of interest to a range of uncertain model inputs. For example, we wish to know how the total ice volume (above flotation) of an ice sheet is influenced by climatically relevant quantities (such as surface atmospheric forcings) or environmental variables (such as the melting at the base of an outlet glacier or floating ice shelf that drains an ice sheet). A computationally costly method for deriving such sensitivities might use individually made perturbations to the bottom melting rate at each location of the ice sheet's base, in particular at its margins that are in contact with ocean water. This means that the ice sheet dynamics must be integrated throughout time for each simulation experiment in which a point-wise perturbation has been applied in order to assemble a sensitivity map across the entire domain to this control variable (basal melting). While the target of this approach remains of paramount importance -relating the output of an ice sheet model to poorly known inputs -the means are computationally expensive: understanding, for instance, the Antarctic Ice Sheet's sensitivity to changes in melting or basal friction means simulating the entire ice sheet throughout time for every perturbation made at each point in the domain. In this case the computational cost of such a method scales with the dimension of the domain grid, and as such it is prohibitive.
Fortunately, adjoint models provide us with a means to this end whereby the computational cost of deriving sensitivity maps does not depend on the dimension of the control variable space. The adjoint model is in effect the transpose of the linearized operator of the ice sheet model. Compared to the parent model, which propagates model inputs via the prognostic model state to model outputs, the adjoint propagates the dual of the ice sheet model state in reverse order, from the sensitivity of model outputs to sensitivities in the model inputs (which, for time-dependent models, amounts to a backward-in-time propagation of sensitivities). It thereby simultaneously calculates the sensitivity of a chosen quantity of interest (e.g., the volume of an ice sheet) with respect to the prescribed set of control variables (e.g., the basal melting beneath the ice, surface accumulation, or initial conditions). Thus, unlike the tangent linear model, which computes the impact of one input perturbation on all model outputs at the cost of one execution (directional derivative), the adjoint model computes the sensitivity of one output quantity of interest (QoI) with respect to all model inputs (gradient). This is useful not only for understanding the sensitivity of some scientifically or societally interesting quantity to model inputs, but further (and perhaps more interesting) enables the recovery of other forcing or initial conditions (e.g., initial ice sheet geometry or rate of snow accumulation throughout time and space) through formal inversion.

Algorithmic differentiation (AD) and its uses
Generally, adjoint models arise in at least two classes of geophysical investigations.

PDE-constrained, gradient-based optimization
Adjoint-enabled optimization problems that are constrained by partial differential equations (PDEs), which are solved by using adjoint methods, may be posed in the following manner, beginning by formulating a scalar-valued cost function based on a least-squares model-data misfit subject to prior information on the uncertainty of the control variables: where x 0 , x b , and x(t i ) are the initial, background, and timevarying model state at time t i , respectively; y i is the set of observations at time t i and E i (x(t i )) is a projection of the model state at time t i to space of observations y i (or the datamodel misfit). C pr is a prior error covariance matrix (usually diagonal as its structure is not fully known), and C err is an observational error covariance matrix. This form is typical for a problem in which uncertain initial conditions are subject to variation to reduce the model-data misfit, and with prior information available (x b and C pr ). The formulation may be readily extended to include variation of boundary conditions or model parameters, and the collective set of all such input variables that are being subject to variation are called control variables (see Wunsch and Heimbach, 2007, for a more comprehensive treatment). PDE-constrained optimization seeks to find the gradient of J with respect to the control variables (here x 0 ), subject to the requirement that the (in general nonlinear) model L is fulfilled, rendered by a set of partial differential equations that step the state x from time t i to t i+1 , i.e., x(t i+1 ) = L (x(t i )).
( 2) This problem is efficiently solved by means of the Lagrange multiplier method (Wunsch and Heimbach, 2007). Lagrange multipliers have a direct interpretation as dual-state adjoint sensitivities or gradients of the cost function (Eq. 1) with respect to the control variables, ∂J ∂x 0 . They are used to seek a state of the system, x * , that is tolerably close to the minimum of J . As such, gradient information is essential in recovering the optimal x * , which minimizes model-data misfit as presented by Eq. (1). This procedure is done in an iterative fashion, with an initial guess of the state x 0 that is successively updated to achieve optimal control variables that produce an optimal prognostic model state throughout the model's simulation.
A model that can optimally reproduce the behavior of, e.g., an ice sheet throughout time with respect to observations possesses the advantage that model-derived predictions might be made with greater confidence, having been initialized by dynamics that are informed by spatiotemporal observations. In other words, the commonly termed "spin-up" of an ice sheet may produce more faithful projections when forced by optimally recovered initial and boundary conditions, and an optimal state estimate, which may be recovered by a timedependent adjoint model. A model initialized and projected under such circumstances might better reproduce what can be inferred about its past state by observations, subject to the additional constraint of the assumed and (perhaps more subtle, but equally important) conserved model physics throughout time. Thus, the constrained optimization problem of recovering boundary and initial conditions, and the model's optimal internal state dynamics throughout space and time, might be approached first through the task of obtaining reliable adjoint sensitivities.

Sensitivity analysis
Beyond applications in optimization, the adjoint may also be widely applied to the comprehensive analysis of linear sensitivities (the subject of the work presented here) of QoIs to uncertainly known inputs (in particular, forcings or parameters) in nonlinear models. Errico and Vukicevic (1992) and Marotzke et al. (1999) provide example applications in the context of atmosphere and ocean modeling, respectively. For a general scalar-valued function J , now termed the quantity of interest (QoI), the tangent linear model (TLM) L of a given (in general nonlinear) model L (Eq. 2) acts as a directional derivative, as it propagates small perturbations in the control variable, δx, to corresponding perturbations in the QoI, δJ . In turn, the adjoint model (ADM), formally the transpose L T of the tangent linear model, propagates the sensitivities of the QoI to each component of the control space, i.e., the partial derivatives of the augmented space of all control variables in reverse order of the execution of the forward model L (which, for time-dependent models, amounts to the integration backward in time of the adjoint model equations). Thus, whereas the TLM produces directional derivatives of J , the adjoint produces the gradient of J . A detailed treatment in the context of ice sheet modeling is provided by Heimbach and Bugnion (2009) and Goldberg and Heimbach (2013). For the present purpose we summarize the way in which these sensitivities are formally obtained by way of algorithmic differentiation in the following.

Formal reverse mode of AD
The concept of the adjoint of a numerical model may be best understood in terms of the forward, original code construction and execution. If one wishes to know the sensitivity of some QoI (e.g., the volume of the Antarctic Ice Sheet) with respect to some model inputs or control variable (e.g., the average surface air temperature in July), one method of pursuing knowledge about such a sensitivity might be perturbing the control variable, in sequence, at each single point within the discretized domain and propagating the perturbation forward in time. The perturbation to the control variable results in a change in the QoI, and one can proceed to calculate the sensitivity of the QoI with respect to the control variable everywhere in the domain. Herein these are termed the finite differences J x of the QoI J with respect to the (in general vector-valued) control variable x in the direction x, where J is calculated as in Eq. (1). An adjoint model code may be demonstrated as acceptable or reliable if the finite-difference-derived sensitivities approximate the adjoint-derived sensitivities (within some tolerance); for example, for a centered finite difference we obtain where ε is the magnitude of the perturbation to the control andn = δx ||δx|| is the unit vector pointing in the direction of δx. In this work ε varies depending on the examined control. Other finite-difference schemes may of course be employed, but for the purposes of this work we have selected the central difference for simplicity.
Adjoint models have been common in oceanic and atmospheric contexts (Talagrand and Courtier, 1987;Thacker and Long, 1988;Errico and Vukicevic, 1992) for decades. The method's popularity has been increasing steadily. MacAyeal et al. (1991) provide us with the earliest use of adjointmodel-derived sensitivities of a simplified ice stream model. Observed velocities were used to invert for optimal basal friction parameters. While that study employed a simplified version of the Stokes equations and lacked time dependence, other researchers have since undertaken the task of using adjoint sensitivities in a variety of ice-related applications with more complexity. Inversions of basal (Gillet-Chaulet et al., 2012;Goldberg and Sergienko, 2011;Joughin et al., 2009;Morlighem et al., 2010Morlighem et al., , 2013Vieli and Payne, 2003) and rheological (Larour et al., 2005;Khazendar et al., 2007) parameters were produced based on regional implementations of steady-state models of the Antarctic Ice Sheet, with varying degrees of approximations to the internal stress balance. Brinkerhoff and Johnson (2013) and Perego et al. (2014) used the method with an emphasis on producing optimal initial conditions for prediction. Waddington et al. (2007) provide one of the few examples to date for inferring surface boundary conditions (accumulation rates) from internal radar layer observations. The extended problem of inversion in the context of transient ice sheet models was considered by Goldberg and Heimbach (2013), Goldberg et al. (2015Goldberg et al. ( , 2016, and Larour et al. (2014). Use of higher-order derivatives (in particular, Hessian) to extend the inverse problem to the problem of quantifying uncertainties in the inferred parameters was treated by Petra et al. (2012) and Isaac et al. (2015). Mosbeux et al. (2016) provided an intercomparison of two assimilation methods for inferring basal parameters by comparing a sequential and an adjoint approach.
Compared to the literature cited above, and specifically in contrast to the work by Heimbach and Bugnion (2009) and Goldberg and Heimbach (2013), our work presents several technical (model development) and scientific (application of the capability developed) novel aspects: 1. we develop the capability to generate the adjoint of a thermomechanically coupled ice sheet model by means of source-to-source transformation algorithmic differentiation (AD) using an open-source AD tool; 2. the code for source-to-source transformation is the an up-to-date version of the SICOPOLIS model, which should enable easy maintenance of the AD capability and wider use by the interested research community; 3. compared to Heimbach and Bugnion (2009) the current SICOPOLIS model (5-dev) has several choices of thermodynamic representations and the ability to augment the shallow-ice approximation for grounded ice with a shallow-shelf approximation for ice floating in the ocean. This study will derive continent-wide sensitivities of changes in Antarctic ice volume to changes in sub-ice-shelf melt rates.
An inherent problem in the numerical simulation of ice dynamics is the nonlinearity of the forward model. This arises due to the nonlinear dependence of viscosity on a stress or strain rate formulation in ice. Because of this complication, hand-coded adjoints can be as labor-intensive (and error-prone) to develop as their nonlinear parent model. As an alternative to hand-coding the adjoint model, algorithmic differentiation (AD) provides a method for obtaining adjoint codes via rigorous exploitation of the chain (and product) rule (Griewank and Walther, 2008;Forth et al., 2012; Naumann, 2012) (http://www.autodiff.org, last access: 2 April 2020). AD has been used in an array of applications in the geosciences and computational fluid dynamics and has one substantive advantage over handwritten adjoint codes: it is flexible. Changes in the prescribed QoI, the control variables, or the underlying assumed and discretized model physics may lead to adjoint models of different structure. As models become more complicated due to time dependence and the inclusion of improved representation of ice physics, accurate, hand-coded adjoint solutions may be even more difficult to derive. In such contexts, AD methods provide a powerful alternative means for producing adjoint solutions to time-dependent problems that are up to date with respect to their parent forward model code. Adjoint models developed by AD exploit the chain and product rules for the computation of the derivatives of a function (J ) with respect to a set of input variables (u). For simplicity, here we only consider the case in which the control variables consist of the initial state of the model, u = x(t 0 ) = x 0 (for a complete treatment, see Wunsch and Heimbach, 2007). To demonstrate how a scalar QoI, J , is related to a control vector u, consider the following timedependent statement of the problem, where t ∈ (t 0 , t f ) represents marching the model forward through discretized time steps: where x(t f ) is the model's state at the end of the simulation, L represents the overall mapping of the control vector to the final state of the model, and L is the nonlinear system of equations (or forward model) applied successively to the initial state of the model. The subscripting of L refers to the time marching of the model, where t f = t N t , and L n maps the model state at time n to n + 1. As our interest here is to show how gradients are generated by this method, consider then how linear perturbations to the control space result in changes to the cost function, J , through a Taylor series expansion: Assuming O(δJ 2 ) is negligible, and considering another simplifying case in which J only depends explicitly on the final state x(t f ), δJ is shown (in the forward sense) to be where · , · is the inner product and L n = ∂x(t n+1 ) ∂x(t n ) is the tangent linear model of M, a linearization of L at time t n about x 0 . It follows from Eq. (6) that the adjoint model L T (where L T is the transpose of L) equivalently defines δJ : Equation (7) demonstrates that the sought gradient, ∂J ∂x , is computed by projecting the cost function to the model's final state, ∂J ∂x(t f ) , and mapping it backward in time, ultimately to the dependence of the model on its (user-selected) input or control variables. Figure 1 presents a small example of the computational flow of the full nonlinear forward sweep (blue arrow and variables) and the adjoint (reverse) mode of OpenAD (black arrow) applied to a single model, F : is sought (see Heimbach and Bugnion, 2009, for a related schematic of the relationship between forward and adjoint code via source-to-source transformation AD).
As soon as a numerical model is implemented as a code, it is in fact translated as a sequence or composition of elementary operations like those shown in Fig. 1, with a single line representing one single algorithmic step. Via AD methods, then, the tangent linear and adjoint of a numerical model are provided by exhaustive application of the chain and product rules, line by line, to the model. The forward (Sect. 2.1) or reverse (adjoint) mode of the model may be thought of as the composition in forward or reverse order of the Jacobian matrices and their transpose of the full forward code's lineby-line algorithmic elements: that is, L in Eq. (6) and L T in Eq. (7). In SICOPOLIS-AD, the entire forward code is composed of many lines of simple functions, like F, in sequence (blue downwardpointing array on the left). OpenAD provides ∇F by relating the partials of t1 and t2 to intrinsically differentiable functions, like sin(), here the red text, p1 (black upward-pointing array on the right). The partial derivatives of F are computed via the writing to memory of intermediate partial quantities, like dy (the adjoint or dual of y) , dt1, and dt2. Thus, the sought sensitivity of a QoI, F, is related to input parameters a, b, and c in this algorithmic (albeit much simplified) manner. . SICOPOLIS is open source and written in Fortran; it has a relatively long and stable history (Greve, 1997). SICOPOLIS has remained a relevant and powerful tool for the cryosphere community and continues to participate in model intercomparison exercises (e.g., Goelzer et al., 2018;Seroussi et al., 2019). The model couples ice sheet dynamics and thermodynamics (solving for the ice thickness, extent, velocity, temperature, water content, and age) over three-dimensional domains including, among others, Greenland, Antarctica, paleo-ice sheets on Earth, and the polar caps of Mars. It employs simplified versions of the threedimensional Stokes equation (internal stress balance). These include the shallow-ice approximation for ice resting on land (Hutter, 1983;Morland, 1984), the shallow-shelf approximation for ice floating in the ocean (Morland, 1987;MacAyeal, 1989;Weis et al., 1999), and the shelfy-stream approximation for fast-flowing ice streams with limited coupling to the bed (Bernales et al., 2017). For a detailed treatment of the numerical methods employed in the model, readers are referred to Greve and Calov (2002), Blatter (2009, 2016), and Bernales et al. (2017).
SICOPOLIS employs four different thermodynamics representations: (1) a two-layer polythermal scheme, which allows for the computation and effects of liquid water within a warmer temperate layer; (2) a purely cold-ice scheme in which no liquid water is present; and (3, 4) two flavors of the one-layer enthalpy scheme that combine the physical adequateness of (1) with the greater numerical simplicity of (2) (Aschwanden et al., 2012;Greve and Blatter, 2016). In all cases, horizontal diffusion of the thermodynamic fields (temperature, water content, or enthalpy) is neglected. The solvers employed use an implicit discretization scheme for the vertical derivatives and an explicit scheme for the horizontal derivatives.
SICOPOLIS simulates ice as a nonlinear viscous fluid by employing Glen's flow law (Glen, 1955) amended as in Greve and Blatter (2009) where η is the ice viscosity, T is the temperature difference relative to the pressure-melting point, σ e is the effective shear stress, and σ 0 is a small constant used to prevent singularities when σ e is very small. n is the flow law exponent (taken as 3), and A is a temperature-and pressure-dependent rate factor (Cuffey and Paterson, 2010) that is modified in temperate regions containing liquid water following Lliboutry and Duval (1985).
Basal sliding under grounded ice links the sliding velocity, v b , to the basal shear traction, τ b , and the basal normal stress, N b (counted positive for compression), in the form of a Weertman-Budd-type sliding law (e.g., Weertman, 1964;Budd et al., 1984): where C b is the sliding coefficient, and p and q are the sliding law exponents.

Adjoint model of SICOPOLIS generated with OpenAD
As described in Sect. 1.1, the construction of an adjoint model of a nonlinear, time-dependent forward model often presents a formidable task when solved analytically or handcoded (e.g., Goldberg and Sergienko, 2011;Gillet-Chaulet et al., 2012;Morlighem et al., 2013;Isaac et al., 2015). In previous works, the variational forward and adjoint equations are derived first and then discretized. As an alternative, AD produces adjoint code through the differentiation of source code using source transformation or operator overloading tools. As the standard of numerical models (in various contexts) has risen to more complicated physical representations, the use of AD has become increasingly popular (e.g., Heimbach and Bugnion, 2009;Goldberg et al., 2016;Morlighem, 2018, using source transformation -Brinkerhoff andJohnson, 2013;Larour et al., 2014;Hoffman et al., 2018, using operator-overloading and code-templating).
The adjoint of the ice sheet model SICOPOLIS is largely generated automatically by the application of the freely available source transformation tool OpenAD  developed at Argonne National Laboratory, the University of Chicago, and Rice University (http://www.mcs.anl.gov/ OpenAD, last access: 2 April 2020). It is a flexible and modular tool that parses a given model written in Fortran to generate a Fortran version of the model's adjoint code. As opposed to adjoint models that are analytically derived and discretized, the adjoint model of SICOPOLIS furnished by AD yields not one unique adjoint code but many possible configurations of the adjoint model, whose constituent parts are selected at compile time by options in the header files. Thus, there no single, core adjoint code; there is instead a flexible adjoint code that evolves in tandem with the forward model of SICOPOLIS and can accommodate updated empirical relationships and boundary conditions.
The adjoint model of SICOPOLIS produced by OpenAD results in approximately 50 000 executable lines, represented in a much simplified schematic in Fig. 1 by the composition of the blue, red, and black algorithmic steps. Like many complex, time-evolving geophysical models, SICOPOLIS comes with a range of choices of model configuration, in particular numerical schemes, which the user may choose from. As a matter of convenience, the preferred implementation is to make all of these choices (or options) available at runtime to minimize the need for recompiling the model. The same convenience is available, in principle, to the AD-generated adjoint model. The control flow analysis of the AD tool identifies all possible flows of forward model execution and produces corresponding adjoint flow paths. However, close to 2 decades of experience with the application of AD to complex, time-evolving geophysical models, all of which have a range of numerical schemes that users may choose from (Heimbach et al., 2002(Heimbach et al., , 2005Forget et al., 2015), has shown that for the specific application of adjoint modeling, it is preferable to remove code that will not be executed in a given application from adjoint code generation (and subsequent compilation). The two main reasons for proceeding in this manner are the following.
i. The exclusion of forward model code that the user knows will not be executed may significantly simplify the AD tool's dependency and flow control analysis, avoid spurious dependencies that the AD tool may detect, and lead to more streamlined source code for the adjoint.
ii. Because of the reverse mode and requirement to store required variables in time-reversed order (e.g., those used for evaluating state-dependent conditions and nonlinear expressions), adjoint models will have a substantially larger memory footprint than their parent forward model (Heimbach et al., 2005). Memory requirements may be significantly increased if the adjoint model is required to keep track of a large range of conditional branches for execution.
For these practical considerations, removing nonused forward model code at the time of adjoint code generation and subsequent compilation has proven to be highly preferable (although not strictly required). It is implemented here via C preprocessor (CPP) options that are enabled or disabled prior to generating the adjoint code (and prior to compilation time). We note that the implementation keeps runtime parameters and flags in place such that the forward model default to keep all code available at runtime is not compromised. By pairing SICOPOLIS with the source transformation tool OpenAD, the adjoint model of SICOPOLIS may be generated automatically for a large variety of forward model configurations (including detailed choices of model domain, numerics, control variables, and QoI). A number of algorithmic aspects of the code needed onetime editing or refactoring for OpenAD to be able to successfully parse the source code and provide correct adjoint code. For example, non-smooth functions -such as piecewise linear functions represented by if statements or absolute values -are inherently non-differentiable and sometimes required special treatment before the adjoint could be obtained by AD. Because of its importance in the development of a forward model that works properly within the framework of the AD tool, we have devoted a detailed description in Appendix B of the aspects of SICOPOLIS that required code refactoring. Further technical details on how to set up, compile, and run reference configurations are documented in a quick-start manual (Logan et al., 2019).

Example application: Antarctic Ice Sheet volume sensitivities
Because SICOPOLIS is capable of simulating many different aspects of ice flow at the continental scale, we have designed a set of configurations each focusing on particular aspects of the model so that the resulting adjoint values and patterns may be more readily interpreted. Where we could have applied more complicated relationships, for example in the ini-tialization in temperature, geothermal flux, or calving laws, we have opted instead for simplicity, as the exhaustive examination of such choices in simulation is left to future studies.
The adjoint values are calculated for specific configurations of the original forward code of SICOPOLIS.

Antarctic model configuration
We simulate Antarctica for 100 years of model time with a 20 km horizontal resolution and 81 terrain-following vertical layers. The dynamic and thermodynamic time steps (which can be chosen to differ) were both set to 0.2 years, as this was found to be the most stable value for the forward model simulation. Land ice, floating ice, and ice streams are approximated by the shallow-ice approximation (SIA), shallowshelf approximation (SSA), and shelfy-stream approximation (SStA) formulations, respectively, described in Bernales et al. (2017). Ice thickness evolves freely and without adjustment. Solutions to the SSA portion of SICOPOLIS are aided by invoking the external Library of Iterative Solvers (LIS; https://www.ssisc.org/lis/, last access: 2 April 2020). Thermodynamics are formulated by the conventional enthalpy scheme (Sect. 2.1). We use Glen's flow law (Eq. 8) with a stress exponent n = 3, a residual stress σ 0 = 10 4 Pa, uniform flow enhancement factors E = 5 for grounded ice and E = 1 for floating ice, and a rate factor A(T ) as in Cuffey and Paterson (2010). Horizontal and vertical advection in the temperature and age equations are discretized via a first-order upstream stencil of interpolated velocities and advection terms on the main grid, and topography gradients are evaluated with a fourth-order discretization. The ice temperature is initialized as a uniform value of −10 • C, as the goal of this exercise is a proof of concept and not an exhaustive examination of all aspects of the Antarctic Ice Sheet. For the same reason, a uniform geothermal flux of 55 m W m −2 is applied. Parameterization of the mean annual and mean January surface temperatures is according to Fortuin and Oerlemans (1990), and the applied surface temperature is held constant throughout the simulation. Accumulation is applied at present-day levels throughout the simulation (Le Brocq et al., 2010;Arthern et al., 2006). The fraction of solid precipitation is a linear function of the monthly mean surface temperature according to Marsiat (1994). Surface ablation is parameterized by the positive degree day method, and rainfall is assumed to run off instantly. Floating ice is removed at calving fronts for thicknesses less than 30 m. The parameters for the basal sliding law (Eq. 9) are chosen as C b = 11.2 m yr −1 Pa −1 , p = 3, and q = 2. Basal melting under floating ice is set to 30 m water equivalent per year around the grounding zone (adjacent grounded and floating grid points) and zero elsewhere, for simplicity. Sea level is constant, and there is no special treatment of subglacial hydrology. The initial geometry is taken from the present day (Fretwell et al., 2013). No isostatic bedrock adjustment is simulated in our configuration. No special pretreatment or Table 1. A sample of comparisons between adjoint-derived and finite-difference-based sensitivities. All regions in column (2) refer to either selected point-wise perturbations δx i = e i (with unit vector e i of the ith component) or perturbations applied over an extended box delineated in Fig. 1a. Columns (3) and (5) are adjoint-derived quantities, while columns (4) and (6) are derived via finite-difference perturbations either at a single point in the domain or integrated over a patch, as shown in Fig. 2a. Validation of the adjoint model is sought by comparing finite differences (4) and adjoint values (3) in a percent deviation metric (7). Column (7) is calculated as |col. (4) × 100. Point-wise ice thickness as well as surface and basal temperatures compare well, with a percent deviation of less than 10 %. Summer precipitation has a higher disagreement, at 12 %. Patch-wide comparisons have a considerably higher percent deviation. We also performed a test for the Greenland Ice Sheet, shown in more detail in Appendix A.

Variable
Region (from Fig. 2a) "spin-up" of the model is applied prior to the 100-year simulation; rather, the above conditions are set and the model evolves for 100 years. We note that the operation underlying calving (see above) amounts to a conditional statement. From an AD perspective, the following steps occur: (i) derivative codes are generated for each condition; (ii) code to store and restore the required variable is added to properly evaluate the conditional derivative. For legacy code the operation may not be differentiable at the exact condition (see Appendix B for practical details). This should be taken into account when performing gradientbased optimization.

Results
The motivation for developing an adjoint of a numerical model stretches far beyond providing comprehensive sensitivity experiments; often, an adjoint model is developed so that the sensitivities may be used in a gradient-based PDEconstrained optimization problem to invert for uncertain initial conditions, boundary conditions, or model parameters, thereby producing a data-constrained estimate for the evolution of the state of the system. Here, however, we are interesting in understanding model sensitivities. We present the sensitivity of the volume of the Antarctic Ice Sheet with respect to several control variables as a proof of concept, rather than extending the work in the direction of optimization, which will be the subject of future studies. The purpose is to gain physical insight into the model's linear response characteristics and to ascertain correctness and interpretability of the adjoint. The adjoint-derived sensitivities are compared to finite-difference perturbations, either at single points or over a patch of the domain that has been uniformly perturbed, to demonstrate that the adjoint model is sufficiently consistent with sensitivities derived via finite differences. Those comparisons are shown in Table 1. Lastly, compared to Heimbach and Bugnion (2009), we present this work for the novel application of examining Antarctic-wide adjoint-generated sensitivity maps. To the authors' knowledge, such a presentation has not been formally examined heretofore. Figures 2 and 3 respectively show the raw and logarithm of the absolute value (log 10 | • |) of the adjoint sensitivities. We have chosen to present the adjoint values in both ways so that the general pattern and sign of the adjoint values are readily apparent (Fig. 2) as is the order of magnitude of the adjoint values (Fig. 3), which can vary widely across the Antarctic Ice Sheet depending on the control variable. Further, we have chosen the locations shown in Fig. 2a so that the included dynamics and solvers invoked in the code can be tested in three different and important regions and regimes: location 1, the fast-moving Thwaites Glacier, which directly discharges into the Amundsen Sea Embayment; location 2, the middle of the Ross Ice Shelf; and location 3, Slessor Ice Stream, which feeds the Ronne-Filchner Ice Shelf. Testing the agreement between adjoint and finite-difference values at these locations offers a broad sense of the performance of the adjoint model in very different and important environments and dynamical regimes across the Antarctic Ice Sheet. The control variables we have selected to test involve either initial (ice thickness) or boundary conditions (summer precipitation, surface and basal temperature) and are independent inputs to either the conservation of mass (ice thickness and summer precipitation) or conservation of enthalpy (surface and basal temperature) equations.
The sensitivity of total Antarctic ice volume to the initial ice thickness compares well with the calculated finite-  Table 1. difference-based value (Table 1, column 7), differing only by about 1 % at the fast-moving Slessor Ice Stream (Fig. 2a,  location 3). Thickness sensitivities are relatively uniform and positive across the ice sheet over the 100-year simulation, except for a few outlet glaciers. A positive adjoint value in ice thickness indicates that a positive perturbation in ice thickness leads to a positive change in total volume, and vice versa. The Antarctic Ice Sheet is shown to have almost entirely positive adjoint values, as shown in Fig. 2a, except for a few marginal outlet glaciers. These few outlet glaciers that display negative ice thickness adjoint sensitivities contrast with other areas of ice discharge, notably the large floating ice shelves, which do not show any negative adjoint values. Figure 3a shows that the order of magnitude of this field of adjoint values is between 10 8 and 10 9 m 2 , except for several very sensitive outlet glaciers, including Thwaites and Pine Island Glacier, glaciers in Marie Byrd, Oates, and Wilkes Land, and Byrd Glacier.
The pattern of the January (austral summer) precipitation adjoint values largely mirrors that of the ice thickness, with several distinctions. The order of magnitude is much larger, ranging instead between 10 15 and 10 17 m 2 (Fig. 3b). Outlet glaciers in Marie Byrd, Oates, Wilkes, and Queen Mary Land exhibit weaker sensitivities compared to the average Antarctic-wide summer precipitation sensitivities. Portions of floating ice shelves from Queen Maud eastward all the way to Queen Mary Land show the highest sensitivities overall to summer precipitation, while the larger ice shelves exhibit some of the lowest sensitivity to summer precipitation across the entire continent, almost an order of magnitude lower than the floating ice fringing the coast between Queen Maud and Queen Mary Land, from 10 17 to 10 16 m 2 . Similar to the sensitivities to ice thickness, precipitation sensitivities are almost entirely positive, and the very lowest sensitivities are largely at the ice fronts (Fig. 3b). Table 1 shows less agreement in the January precipitation field calculated in the middle of the Ross Ice Shelf at point 2 (Fig. 2a), approximately a 12 % difference.
Sensitivities to surface and basal temperature ( Fig. 2c and  d) show much more structure than those of ice thickness and precipitation, with extended regions of positive and negative sensitivities. The sensitivities of total ice sheet volume to surface and basal temperature are largely negative, with the most negative values at the margins of the ice sheet and approaching zero toward the interior. The orders of magnitude of the surface and basal temperature sensitivities ( Fig. 3c and d) are comparable to each other, with maximum values of approximately 10 10 m 3 • C −1 .
Over the 100-year simulation, high sensitivities to surface and basal temperature at the margins extend inward toward the middle of the ice sheet following glacier drainage basins ( Fig. 3c and d). There are two distinct differences between the sensitivities to surface and basal temperature seen in the adjoint fields. First, the highest sensitivities to basal temperature are higher than the highest sensitivities to surface temperature, indicating that the total ice sheet volume is in general more sensitive to changes in the applied basal temperature of the ice rather than at the surface in SICOPOLIS. Second, Fig. 3d shows that the location of those most sensitive areas to changes in basal temperature are at the grounding lines of ice shelves and glaciers, while the most sensitive areas to changes in surface temperature are all across the surface of the floating portions of ice, with the sensitivity increasing (becoming more negative) toward the ice fronts. The adjoint values of surface and basal temperature compare well with finite-difference-based sensitivities, differing by about 4 % and 8 %, respectively. Table 1 also shows the results of a finite volume change calculation performed for the tile shown in Fig. 2a, region 4. Each control variable shown in column (1) is perturbed, as in the single point location finite differences, by ±5 % of the initial value of that field.
Columns (5) and (6) in Table 1 are computed in the following ways: V adj = 4 ∇J ·n dA is the area-integrated (or area-summed for discretized domains) volume change us-ing the adjoint-derived gradient information, where 4 is the area of the tile in Fig. 2a corresponding to the domain of control variable x. V fd = V (X+δX)−V (X−δX) 2 is the finitedifference-derived volume change, where X + δX is taken over the entire tile in region 4. In creating a sub-domain of Antarctica over which to calculate these finite volume changes, we selected an area of uniform sign. Areas with a great deal of sign variation might be more difficult to interpret since the adjoint values would tend to cancel each other.
The utility of this comparison is to convert the sensitivities into meaningful quantities that can be compared against each other to assess, for example, which control variable impacts the cost function the most given a perturbation of expected magnitude, in addition to providing another metric by which we may measure the adequacy of the adjoint model of SICOPOLIS. The percent difference between V adj and V fd over the 100-year time integration over tile 4 is largely higher than the point-wise measurements, ranging as high as 57 % for volume changes due to basal temperature perturbations uniformly in tile 4. Summer precipitation compares well, however, with a 6 % difference. Perhaps more interesting, the calculations in columns (5) and (6) suggest that, overall, summer precipitation has the largest impact on total ice sheet volume, with an approximate volume change of 10 18 m 3 , compared against initial ice sheet thickness as well as surface and basal temperature perturbations, which at the lowest resulted in a volume loss of 10 3 or 10 4 m 3 over 100 years. This result helps to explain the largest relative differences between adjoint and finite-difference sensitivities. These are large where the sensitivities are very small compared to the QoI, strongly suggesting that numerical noise plays an important role in either of these sensitivity calculations.
Lastly, the adjoint model of SICOPOLIS runs serially and completed 100 years of model runtime in 20, 75, and 600 min of wall clock time on a Linux box (Intel Xeon CPU E5-2650 at 2.00 GHz) for resolutions of 64, 40, and 20 km. The results shown in Figs. 2 and 3 are for 20 km resolution. To facilitate the adjoint computation SICOPOLIS-AD uses checkpointing, which is discussed in Appendix B.

Discussion
The results presented here are not meant to be exhaustive. Rather, they present initial adjoint sensitivity applications of the newly AD-enabled SICOPOLIS model, underscore the interpretable nature of adjoint-derived sensitivity fields, and are presented as a proof of concept for further investigation. They invite users to take advantage of this new infrastructure for their science applications. We leave an exhaustive study of sensitivities to different control variables in SICOPOLIS to future work, as here we only wish to examine a few important dynamic and thermodynamic controls and assess the validity of the adjoint model.
As a measure of the adjoint model's correctness, we compared gradients obtained from the adjoint model computed via finite-difference perturbations. Adjoint values compared acceptably against finite differences for ice thickness as well as surface and basal temperatures, with less than 10 % deviation. Austral summer precipitation adjoint values saw a larger disagreement with finite differences, of up to 12 %. Part of the higher discord may be due to the fact that the cost values (total Antarctic Ice Sheet volume) are very large, emphasizing numerical noise for sensitivity fields that are very small. Ice sheet volume changes calculated by the adjoint model and finite differences disagree more, although the largest discrepancy occurred with the smallest overall volumes calculated (both surface and basal temperature) and are thus likely, again, to be affected by numerical noise arising in the calculation. Control variables related to the conservation of mass equation provided the best agreement across measured metrics (ice thickness for point-wise sensitivities and precipitation for finite volume calculations). This is readily explained by the primarily linear nature of precipitation changes (seen as a volume flux) in changing total ice volume.
The general similarity between ice thickness and precipitation adjoint sensitivities ( Fig. 2a and b) is reassuring, as ice thickness and precipitation are both terms in the conservation of mass equation and as such are closely linked al-gorithmically. In particular, we might expect adjoint sensitivities to be linearly related. Similarly, we might expect the sensitivity to summer precipitation to be much larger than the sensitivity to initial ice thickness, as a perturbation in an initial condition is applied only once at model initialization, while a (constant in time) perturbation in the surface boundary condition is iterated for every time step throughout the 100 years of model runtime. One of the largest apparent differences between Fig. 2a and b is the appearance of high sensitivity on the floating portions of ice off the coasts between Queen Maud Land eastward to Queen Mary Land. Assuming that the "direct" linear effect of increased precipitation (volume flux) has the same effect everywhere on increasing the ice sheet volume, the difference between thin floating ice shelves and the ice sheet interior may be explained by the dynamical effect of increased ice shelf buttressing (i.e., reduced mass flux through the grounding line) as a consequence of ice shelf mass accumulation. This underscores the adjoint model's ability to accumulate local effects (direct effect on volume by thickening) and nonlocal effects mediated by ice sheet dynamics (thickening increases buttressing).
In a related manner, the overall similarity of the surface and basal temperature sensitivities is reassuring as both of these are components in the same conservation of energy equation. Both fields of sensitivities delineate the drainage basins of glaciers and ice shelves, with very small sensitivity in the center of the ice sheet that increases by orders of magnitude toward the coasts. The surface temperature sensitivities more uniformly affect total ice volume over the ice shelves, while the basal temperature sensitivities indicate that positive perturbations in basal temperature at the grounding lines of glaciers and ice shelves have a larger effect on total ice volume and that, when compared with each other, variations in basal temperature are more powerfully felt across the Antarctic Ice Sheet. This seems to indicate that changes in ocean temperature at the grounding lines around Antarctica have much more potential to have a lasting impact on the volume of the ice sheet than temperature changes in the atmosphere. However, this conclusion must be tempered by the fact that our current simulation of the surface of ice does not account for complex surface, englacial, or subglacial hydrology, e.g., meltwater ponding and induced catastrophic failure, as has been observed in the past at the Larsen B Ice Shelf, for example (Glasser and Scambos, 2008). In other words, the enormous difference in magnitude between sensitivity to summer precipitation and sensitivity to surface or basal temperature may change in other locations of the ice sheet where different physics are included (and must be differentiated in the sensitivity calculation).
Algorithmic differentiation relies on algorithms being differentiable, line by line, in a code. Numerical disagreement can accumulate for even simple reasons, such as the use of piecewise linear functions represented algorithmically by if statements (Appendix B presents a discussion on how unstructured or non-smooth code introduces error in adjoint codes developed by AD). Differentiable programming has recently emerged as a new programming paradigm in a physical system (Liao et al., 2019) and is certainly recognized to be of value for applications such as this work, wherein parametric sensitivities are being explored.

Conclusions
This work presents a new capability of the ice sheet model SICOPOLIS to enable flexible adjoint code generation using the open-source AD tool OpenAD. The flexibility is afforded by allowing a wide range of choices of model domains, numerical algorithms for specific configurations, and the control variables (independent variables) and quantities of interest (dependent variables; cost functions) defined when generating the adjoint code. We demonstrate the utility, correctness, and interpretability of adjoint-derived sensitivity maps for Antarctic-wide simulations, with the total volume of the Antarctic Ice Sheet chosen as the quantity of interest. We compute the quantities' sensitivity to initial and boundary conditions over a 100-year simulation from the present day. Examining, ascertaining, and understanding the information contained in such sensitivity maps, which are formally gradients of scalar-valued functions with respect to model inputs, is a useful and natural first step in the use of these sensitivities in gradient-based optimization problems, which will be the subject of future work. Such work, enabled by the adjoint model of SICOPOLIS, could include understanding how different parameterizations of precipitation, melting, and other interesting higher-order processes of ice flow affect quantities of interest.
One suggested outcome of the sensitivity analysis is that, as a controlling variable, mean monthly applied summer precipitation influences the total integrated Antarctic Ice Sheet volume more than the initial ice geometry or surface and/or basal temperatures do for representative values of perturbation in each of these variables. Another hypothesized (and perhaps unsurprising) relationship derives from a comparison between the surface and basal ice temperatures: changes in basal temperature, particularly at grounding lines, affect total ice volume much more than those in surface temperature.
Much remains to be learned and further examined in the context of this model, including the degree to which results may be applicable to other models. Our results are specific for a given configuration of SICOPOLIS, with emphasis placed on the initial use of simple parameterizations for (often) the most interesting aspects of ice flow, including how basal melting and firn compaction are represented (both processes would be affected by the control variables chosen here). Our metrics of model validity evaluated point-wise show that the adjoint model is mostly accurate to within 10 % compared to sensitivities obtained via the finite-difference method. One likely reason for larger disagreements in some of the calculated metrics may be the regimes of very weak sensitivities, in which case numerical noise becomes a leading factor in the inferred differences.
Another cost function may be formulated as a modeldata misfit based on, for example, the modeled versus observed spatiotemporal ice elevation change. Additionally, over-reliance on inherently non-differentiable piecewise linear functions for important aspects of surface mass balance terms may introduce discrepancies that could be minimized with the use of smoother functions or smooth implementations of parameterization schemes. These are valid and important aspects of code that are not easily addressed (Hascoët and Utke, 2016). We have described in some the detail code refactorization steps required for SICOPOLIS to comply with code parsing and analysis steps undertaken by Ope-nAD in Appendix B. Many of the issues described in the Appendix are frequently encountered when subjecting legacy code to AD or when considering the development of new code that should be subjected to AD. The Appendix thus provides insights for coding best practices in the context of AD beyond the application to SICOPOLIS.
As glaciologists strive to make ever more confident projections in the future behavior of ice sheets, tools that rigorously determine the relationship between often poorly known input parameters and important model outputs are increasingly needed. SICOPOLIS-AD is one such tool that is freely available to the cryosphere community (Logan et al., 2019) and, as demonstrated here, can help elucidate relationships between model inputs and outputs that were previously unknown or untested.

Appendix A: Greenland ice volume sensitivities
Here we present the results of a 100-year sensitivity study of Greenland Ice Sheet volume to basal ice temperature. This is added as an Appendix as Greenland sensitivities have been produced previously by Heimbach and Bugnion (2009), albeit with a different AD tool, Transformation of Algorithms in Fortran (TAF; , for a version of the SICOPOLIS model that is more than a decade old and that was not maintained as part of the SICOPOLIS main development trunk. Since then, SICOPOLIS has been updated to include a more state-of-the-art representation of thermodynamics via the enthalpy method, and work here represents an advance on what was presented before. The forward simulation of Greenland is configured in much the same way as for Antarctica, with an emphasis on simplicity for proof of concept. Unless otherwise stated below, choices of numerical schemes, physical parameterizations, and forcing approaches are the same. We simulate Greenland for 100 years from the present day at a 10 km horizontal resolution with 81 terrain-following vertical layers. The dynamic and thermodynamic time steps take the value of 0.5 years. The dynamics now are only SIA, as we have restricted our simulation to grounded ice. The thermodynamic formulation is again via the conventional enthalpy method, with ice initialized at a constant temperature of −10 • C. The constitutive law and physical parameters are exactly the same as in the Antarctic case, including the flow enhancement factor, geothermal flux, and all parameters for the sliding law. The ice initial geometry is from Bamber et al. (2013). Surface temperature is from Ritz (1997) and is held constant throughout. The monthly precipitation fields are created with the regional energy and moisture balance model REMBO using the setup described in Robinson et al. (2010) taken on the grid provided by Bamber et al. (2001). The temperature and humidity boundary conditions are from Uppala et al. (2005). These monthly climatological fields are averaged over 1958-2001 and applied as lateral boundary conditions to REMBO. Table A1. Comparison between adjoint-derived (column 3) and finite-difference-derived (column 4) point-wise sensitivities for Greenland ice volume as QoI (symbols as in Table 1). All regions in column (2) refer to points from Fig. A1a. Column (5) is a percent deviation metric, which is calculated as |col. (4)

Variable
Region (from Fig. A1a) July precipitation 1 2.72 × 10 16 2.72 × 10 16 3.37 × 10 −3 Surface temperature 2 −5.61 × 10 4 −5.57 × 10 4 7.18 × 10 −1 Basal temperature 2 −3.80 × 10 6 −3.80 × 10 6 8.20 × 10 −3 Ice thickness 3 4.86 × 10 2 4.89 × 10 2 5.04 × 10 −1 Figure A1 shows the total Greenland Ice Sheet volume sensitivity to the initial condition of ice thickness and the boundary conditions in July (boreal summer) for precipitation as well as surface and basal temperatures. Table A1 shows that, in general, the Greenland simulation performs much better than the Antarctic simulation, with all of the percent deviations between adjoint values and finite-differencebased gradients less than 1 %. We attribute this to the lack of SSA dynamics involved and the accompanying use of an external solver library.
Interestingly, whereas in Antarctica the ice thickness sensitivities were almost entirely positive, substantial portions of the Greenland Ice Sheet lose volume when perturbed positively in ice thickness, a phenomenon previously inferred by Heimbach and Bugnion (2009) and thus a robust feature of SIA models. This could be due to the dynamic drawdown of glaciers that experience a sudden increase in driving stress due to the increase in ice thickness. The increase in driving stress leads to increases in velocity, which, when subjected to the land-ice-only mask for the SIA dynamics used in this setup of Greenland, results in the immediate cutoff of ice.
Sensitivity to precipitation, as in the case of Antarctica, is almost entirely positive and again dwarfs the other control variables tested here by many orders of magnitude. The overall larger magnitude of basal temperature sensitivities compared to surface temperatures is consistent with the Antarctic simulation. Completion of Greenland serial simulations on a Linux box (Intel Xeon CPU E5-2650 at 2.00 GHz) took 5, 10, and 140 min for horizontal resolutions of 40, 20, and 10 km, respectively. The results shown here are for 10 km resolution.  Table A1.

Appendix B: Modifying SICOPOLIS
We made several modifications to SICOPOLIS to enable source transformation and differentiation via OpenAD. The changes that were made enabled efficient AD in some cases and overcame some limitations of the AD tool used in others. The modifications are guarded by C preprocessor (CPP) directive ALLOW_OPENAD and do not affect the original behavior of SICOPOLIS in any way. Below, we discuss the noteworthy changes.
-Data types. SICOPOLIS determines the number of bits for its data types at runtime through the calls selected_int_kind() and kind(). Because OpenAD requires full knowledge of the types for static analysis, it does not support this behavior. We therefore determine the number of bits per type separately for the machine being used and specify the value directly in the code.
-Unstructured code. Restricting code to comply with "simple loops" is common in models subject to AD and is good coding practice in general, as it supports compiler optimization of loops.
-Non-smoothness. Non-smoothness in the underlying mathematics of a model can be caused by the use of the absolute value, ceiling, and floor functions. Non-smooth models can be non-differentiable at a few or many points of the input space. Techniques such as piecewise linear differentiation and the absnormal form have been studied to differentiate non-smooth applications (Streubel et al., 2014). While SICOPOLIS employs all three functions, they are either used to index into lookup tables or used in portions not differentiated by OpenAD. Because OpenAD does not include abs, ceiling, and floor as functions within its intrinsic library, we created custom subroutines of these functions to be differentiated.
-The sqrt function. The derivative of √ x is 1 √ x . When x = 0.0, the result is a kink in the adjoint model and the appearance of NaN in the adjoint computation. The intended behavior of the adjoint model is to treat the derivatives as 0.0. Therefore, wherever the function sqrt() appears in SICOPOLIS, we use a conditional to check if the input to sqrt() is 0.0, and in those cases we use 0.0 instead of calling sqrt().
-Array declaration and array assignments. SICOPOLIS uses dynamic memory allocation for some arrays in the code. Because the handling of dynamic memory and pointers by source transformation AD tools such as OpenAD remains a topic of active research, we replaced the dynamic allocation with static allocation.
SICOPOLIS uses constructs such as where and elsewhere to elegantly assign values to array elements. Because OpenAD does not support these constructs, we rewrote them using loops and if statements.
-Intent of variables. SICOPOLIS passes the indices of two-and three-dimensional arrays as arguments with intent(in) to subroutines that act upon particular portions of the array. When these variables are not the type of active variable (usually real(dp)), their declarations must be changed to intent(inout). For variables that are active OpenAD changes the intent automatically.
-Solvers. SICOPOLIS employs an array of solvers depending on the domain (e.g., Greenland versus Antarctica) or physics chosen by the user: a successive overrelaxation (SOR) solver, a tridiagonal solver, and (for Antarctic domains) the library of iterative solvers (LIS) for computing a system of linear equations: A · x = b → x := solve(A, b).
To differentiate the above formulation efficiently, an AD tool must not naively differentiate through the solver code. OpenAD uses its template mechanism instead to encode the formulation below (Giles, 2008) to compute the adjointsĀ andb fromx using the original solver call: A T ·b =x →b := solve(A T ,x), A := −x T ·b.
When SICOPOLIS uses the SOR solver for a system of linear equations wherein the matrix storage is in compressed sparse row (CSR) format, arrays are represented by lgs_a_value (values), lgs_a_index (indices), and lgs_a_ptr (pointers). While the symbolic differentiation of the solver can be handled as above, the formation of the CSR representation requires us to change the type of the indices into real(dp) so that the indices are stored in the forward sweep for use in the reverse sweep.
-Checkpointing. For adjoint models, the memory requirement to compute the adjoint information is proportional to the operation count of the model being differentiated. We found that the memory requirements of the adjoint model of SICOPOLIS for even small number of time steps will quickly exceed the available memory of most machines. Therefore, we implemented a binomial checkpointing scheme using the library revolve (Griewank and Walther, 2000). This approach uses the recomputation of time steps in the original model to reduce the memory requirements of the adjoint model.
Code availability. SICOPOLIS is free and open-source software available through a persistent Subversion repository that is hosted by the FusionForge system AWIForge of the Alfred Wegener Institute for Polar and Marine Research (AWI) in Bremerhaven, Germany (https://swrepo1.awi.de/, AWI FusionForge, 2020). Detailed instructions for obtaining and compiling the code are at http://www. sicopolis.net (SICOPOLIS.net, 2020). The adjoint generation capability of SICOPOLIS is a part of the main trunk of the current developmental version (5-dev). The development and tests were performed using SICOPOLIS v5-dev (revision 1414), tagged as SICOPOLIS-AD v1. It can be specifically downloaded at https:// swrepo1.awi.de/svn/sicopolis/tags/ad-v1 (last access: 2 April 2020) (using "svn checkout" with "anonsvn" as the username and password). A snapshot non-revocable code archive of SICOPOLIS-AD v1 has been created at https://doi.org/10.5281/zenodo.3686393 (Logan et al., 2020). The AD tool used to generate adjoint source code is OpenAD. A snapshot non-revocable code archive of OpenAD can be downloaded at https://doi.org/10.5281/zenodo.3361744 . Detailed instructions on how to download and build the tool are at https://www.mcs.anl.gov/OpenAD/ (last access: 2 April 2020). Technical details on how to set up, compile, and run reference configurations of SICOPOLIS-AD are documented in a quick-start manual (Logan et al., 2019), a version of which has also been placed in the SICOPOLIS-AD Zenodo archive.
Author contributions. LCL, SHKN, and PH developed the adjoint code of SICOPOLIS; RG originally developed SICOPOLIS, provided insight on the model results, and helped host the freely available version of the code.