A Gaussian process emulator for simulating ice sheet-climate interactions on a multi-million year timescale: CLISEMv1.0

. On multi-million year timescales, fully coupled ice sheet - climate simulations are hampered by computational limitations, even at coarser resolutions and when using asynchronous coupling schemes. In this study, a novel coupling method CLISEMv1.0 (CLimate-Ice Sheet EMulator version 1.0) is presented where a Gaussian process emulator is applied to the 10 climate model HadSM3 coupled to the ice sheet model AISMPALEO. The temperature and precipitation fields from HadSM3 are emulated to feed the mass balance model in AISMPALEO. The sensitivity of the evolution of the ice sheet over time is tested with respect to the number of predefined ice sheet geometries the emulator is calibrated on. Additionally, the model performance is evaluated to the formulation of the ice sheet parameter (being either ice sheet volume, either ice sheet area, or both) and to the coupling time. Sensitivity experiments are conducted to explore the uncertainty introduced by the emulator. 15 Also different lapse rate adjustments are used between the relatively coarse climate model and the much finer ice sheet model topography. It is shown that the ice sheet evolution over a million-year timescale is strongly sensitive to the definition of the ice sheet parameter and to the number of predefined ice sheet geometries. With the new coupling procedure, we provide a computationally efficient framework for simulating ice sheet-climate interactions on a multi-million year timescale that allows for a large number of sensitivity tests.


Introduction
Earth System Models provide the state-of-the-art for quantifying feedbacks between the different components of the climate system on a decadal to centennial timescale (Eyring et al., 2016). On millennial to multi-millennial timescales, Earth System Models of Intermediate Complexity are used to explore the feedbacks in the climate system between the ice sheets, the atmosphere and the ocean (Eby et al., 2013;Van Breedam et al., 2020). Those fully coupled models, even at coarser resolution, 35 are computationally very expensive and other techniques have been proposed to simulate ice sheet-climate interactions on a (multi) million-year timescale.
The basic asynchronous method, also called the direct asynchronous method is a simple and straightforward coupling technique to address the different response times between ice sheets and the atmosphere (Pollard, 2010). The direct asynchronous method 40 consists in running a climate model, typically a General Circulation Model (GCM), for a few decades until steady state and then use the relevant climatic information over the polar regions as input to an ice sheet model. The ice sheet model is typically run for a few thousand years before the climatic information is updated (e.g. DeConto and Pollard, 2003;Gasson et al., 2016).
This procedure is repeated for the entire time span of interest. The indirect asynchronous coupling, also called the matrix method or the GCM lookup table (Pollard, 2010) is based on predefined, idealised GCM snapshots that span the possible 45 forcing during the entire simulation period. The matrix look-up table is more sophisticated, with the creation of a matrix of climate model output from the extremes of the forcing and some intermediate climate states in between (Ladant et al., 2014;Stap et al., 2017;Berends et al., 2018). The coupling procedure linearly interpolates the climatic fields from precursor climate model runs.

50
An alternative approach is to consider a Gaussian process emulator. A Gaussian process emulator is a statistical model that fits a Gaussian process to data in order to link input with output fields of a model, generally referred to as the simulator (Andrianakis and Challenor, 2012). Emulators have been used for a number of applications in climate science, for instance as a tool to predict the future climate evolution (Levermann et al., 2020) or sea-level rise as a result of land ice melting (Edwards et al., 2021), based on large ensembles of simulations with each different model input parameters. It is also a useful technique 55 to couple different components of the climate system that would require large computational resources such as an atmosphereocean coupling (Tran et al., 2019). An emulator has been used to assess the sensitivity of the climate during the Pleistocene (Araya-Melo et al., 2015) and the late Pliocene . In these simulations, the ice sheets are static and defined by different ice sheet geometries. So far, an emulator has not been used to study the climate system including dynamic ice sheets. 60 Here a Gaussian process emulator is presented that is calibrated on the climatic output from the climate model HadSM3 to force an ice sheet model in order to predict the climate over Antarctica during the late Eocene. The late Eocene to Eocene-3 Oligocene Transition is chosen because of the large contrast in continental glaciation and large variations in climate forcing such as CO2 concentrations at this time (Pagani et al., 2011;Zhang et al., 2013). The ice sheet model runs continuously over a 65 multi-million year period and passes the ice sheet geometry information (referred to as the ice sheet parameter) to the emulator (statistical representation of the climate model). The emulator calculates the climatic variables based on the prescribed external forcing (carbon dioxide concentration and orbital parameters) and the actual ice sheet parameter and returns temperature and precipitation data to the ice sheet model. This coupling procedure is novel, but various implementation choices may influence the result: the approach for lapse rate adjustment, the coupling time between ice sheet and climate, and the definition of 70 emulator input variables. Also, the number of GCM experiments on which the emulator is tuned might have an influence on the predicted climate. The key questions to be answered can be summarized as follows: 1. The ice sheet parameter is defined by a number that represents the influence of the ice sheet in the climate system (Araya-Melo et al. 2015. The ice sheet mainly influences the local climate by its distinct albedo, 75 by its height and by its freshwater input into the ocean (not taken into account in this study). Therefore, it is not trivial how the ice sheet parameter should be defined in a single number. Is ice volume a proper way to define the ice sheet parameter? Does ice area represent the climatic changes better? Or is it best to calibrate the emulator based on both ice volume and ice area? 2. The emulator needs a number of input ice sheet geometries to simulate the climate for a range in orbital parameters 80 and CO2 values for the given ice sheet geometry. How many ice sheet geometries and climate model experiments are needed? Does the spacing between different ice sheet geometries influence the model performance?
3. The lapse rate adjustment between a coarse resolution climate model and the high-resolution ice sheet model is usually applied by a constant value for the moist adiabatic lapse rate over the domain. Common values are 5˚C per km (Ladant et al., 2014), 6.5˚C per km (Löfverström et al., 2015), 7˚C per km (Thompson and Pollard, 1997) or 8˚C per km 85 (Berends et al., 2018). The lapse rate above ice-covered regions is found to be 4.9 ˚C per km, smaller than the typical values for the moist adiabatic lapse rate (Gardner et al., 2009). Moreover, the near surface lapse rate varies spatially and temporarily between diurnal and seasonal cycles as opposed to the free adiabatic lapse rate that has a rather constant value (Marshall et al., 2006). What is the influence of using a different lapse rate on the ice sheet evolution? 4. With asynchronously coupled climate-ice sheet model runs one generally updates the climatic information every 90 several thousand years only, given the long response time of the ice sheets and the computational limits. However, the choice of the coupling time might have an influence on the ice sheet evolution over time. What is the optimal coupling time to have a realistic, yet efficient model running ? 5. What is the uncertainty introduced by the emulator and what is its influence on the coupled ice sheet-climate simulations? 95 4

Model description
In this section, the new coupling method CLISEMv1.0 is described together with the climate model HadSM3 and the ice sheet model AISMPALEO. CLISEMv1.0 is calibrated on climatic output from HadSM3 and provides the forcing fields (monthly temperature and precipitation) for the ice sheet model. The ice sheet model AISMPALEO returns the ice sheet volume and/or area to CLISEMv1.0. 100

Climate model HadSM3
The climate model HadSM3 (Williams et al., 2001) is an atmosphere-slab ocean General Circulation Model (GCM). It has a resolution of 2.5˚ in latitude and 3.75˚ in longitude, with 19 levels in the vertical (Gordon et al., 2000). The MOSES-1 scheme is chosen as the land surface scheme (Cox et al., 1999) with a tundra-like albedo on the Antarctic continent where no ice is 105 present and an albedo for snow where ice is present. Sea surface temperatures are reconstructed based on a best estimate from Evans et al. (2017) for the late Eocene in order to calibrate the corrective heat fluxes from the slab ocean model. These corrective heat fluxes represent the seasonal deep water exchange and horizontal heat transport that is present in the real ocean.
The oceanic heat fluxes are exchanged between the atmosphere and the slab ocean model in the mixed-layer, which is 50 m thick in our simulations. This way, realistic sea surface temperatures are simulated for the different climate model simulations. 110 The model is chosen because of its good performance over the Antarctic ice sheet for the present-day and the Last Glacial Maximum (Connolley and Bracegirdle, 2007;Maris et al., 2012). Moreover, HadSM3 is a computational efficient climate model that allows performing a large number of experiments (Valdes et al., 2017). The paleogeographic reconstruction for the simulations is based on the method presented in Baatsen et al. (2016) and makes use of the Gplates software (Müller et al., 115 2018). The paleogeographic reconstruction represents the continental configuration at 39 Ma to be representative for the late Eocene. As a result, the Antarctic continent has a slightly different position compared to the present-day. The bedrock topography for Antarctica is derived from the Wilson et al. (2012) maximum bedrock elevation reconstruction ( Figure 1).
Simulated temperatures for a warm orbital configuration (maximum austral summer insolation) and a cold orbital configuration (minimum austral summer insolation) are shown in Fig. 2a and 2b, respectively. 120

Antarctic ice sheet model AISMPALEO
The Antarctic ice sheet model AISMPALEO is a three-dimensional thermomechanical ice sheet/ice shelf model (Huybrechts and de Wolde, 1999) used for simulating ice sheet dynamics during periods in the past (Goelzer et al., 2016a(Goelzer et al., , 2016b and the future (Seroussi et al., 2020. The Shallow Ice Approximation is used to calculate the grounded ice flow, which is a result of internal deformation and basal sliding where the pressure melting point is reached. The model 125 comprises a component taking into account the solid Earth response due to ice loading. This component consists of a rigid 5 elastic plate on top of a viscous asthenosphere to allow for deviations from local isostatic loading. The surface mass balance is computed using the Positive Degree Day (PDD) method (Janssens and Huybrechts, 2000), where the yearly sum of daily average temperatures above 0 ˚C is used to determine the melt potential. The standard deviation of the mean daily temperature is 4.2 ˚C, representing random weather fluctuations and the daily cycle. The difference in snow and ice albedo in the ice sheet 130 model is taken into account by using a PDD factor for snow melting of 0.003 m ice equivalent (i.e.) per ˚C per day and a PDD factor for ice melting of 0.008 m i.e. per ˚C per day. Monthly mean temperature and precipitation are used from HadSM3 to drive the PDD model. The rain limit is chosen at 1 ˚C and determines whether precipitation falls as snow or as rain. Meltwater retention allows runoff to be retarded and/or to eventually refreeze in the snowpack. Ice shelf formation is included and calculated using the Shallow Shelf Approximation. Ice shelves start to form when the grounding line reaches the coast and the 135 influx of ice from the continent exceeds the ablation (surface ablation and basal melting). A constant basal melt rate of 1 m per year is used in all the simulations. The ice sheet model is run at a resolution of 40 km to allow for long integrations.

CLISEMv1.0: set-up and calibration
The Gaussian Process (GP) Principal Component Analysis (PCA) emulator (Wilkinson, 2010;Bounceur et al., 2015; used in this study is a statistical representation of the climate model HadSM3 (the simulator). The emulator is 140 calibrated on a relatively small number of climate model runs and aims to predict the climate for any combination of climatic forcing of the original climate model runs. To allow for reliable predictions, the initial climate model runs need to fill the entire multi-dimensional input space. In our case, the multi-dimensional input space consists of the orbital parameters (eccentricity e, obliquity ε and longitude of perihelion ϖ), the carbon dioxide concentration forcing, and the ice sheet size and extent defined by the ice sheet parameter. The theoretical background has already been discussed in detail in previous papers (Araya-Melo et 145 al., 2015;Bounceur et al., 2015;Lord et al., 2017) and here the focus is on the implementation of the dynamic ice sheet component.
It is recommended to have at least 10 experiments per input parameter (Loeppky et al., 2009). Therefore, the recommended minimum number of experiments with our five input parameters (three orbital parameters, CO2 concentration and ice sheet 150 parameter) would be 50. Since the atmosphere-slab ocean model is time-efficient, it is chosen to run 100 climate model runs with 5 variable forcing parameters. The ice sheets have a very distinct climatic imprint compared to the orbital parameters and the CO2 level, which all result in smooth climatic fields. Because of the large difference in albedo between ice and tundra at the edge of the ice sheet, the climatic imprint of a certain ice sheet geometry has a sharp boundary. The number of ice sheet geometries taken into the model design of the emulator might therefore have a large impact on the performance of the emulator. 155 To test the impact of the ice sheet parameter, four different emulators are constructed based on a different number of predefined ice sheet geometries or based on a different spread of the ice sheet geometries. The different emulators are named according to the number of ice sheet geometries in the model design, being 8, 12 and 20 for respectively EMULATOR_8, EMULATOR_12a, EMULATOR_12b and EMULATOR_20. 6 160 Except for the number of predefined ice sheet geometries, also the spread of the different prescribed geometries is varying between the different emulators, depending on whether the ice sheet parameter is defined by ice area or by ice volume.
EMULATOR_8, EMULATOR_12a and EMULATOR_20 have a good spread between the different ice sheet geometries in terms of ice volume and ice area. EMULATOR_12b is well defined for ice volume, but poorly defined by ice sheet area as there are several experiments with the same ice sheet area yet different ice sheet geometry. (see Table A1 in Appendix for the 165 experimental parameter values). This way, the influence of the spread in ice sheet volume/ice area on the emulated climate is investigated. The spacing of the different ice sheet geometries is expected to be crucial for medium sized ice sheets, because they constitute a transition zone towards a fully glaciated continent. EMULATOR_20 has the smallest spacing of ice sheet geometries around the crucial medium sized ice sheets, separated at the minimum distance that corresponds to the resolution of the climate model. The maximum ice sheet geometry in the model design for EMULATOR_12 is smaller than the maximum 170 ice sheet geometry in the model design for EMULATOR_20 and EMULATOR_8. The objective designing EMULATOR_12 is to evaluate to what extent the emulator can still be used in an extrapolation regime beyond the largest ice sheet geometry.
A model design with 100 GCM experiments is constructed where each experiment has a different combination of orbital parameters, CO2 concentration values and ice sheet geometry (see Table A1 in Appendix). Insolation values are well 175 approximated as a linear combination of the eccentricity and longitude of perihelion (Loutre, 1993) and therefore, the terms esinϖ and ecosϖ in combination with the obliquity ε are used for the orbital parameter variation in the model design. The range of orbital parameters is taken from Laskar et al. (2004) for the period 40 Ma to 33 Ma. The eccentricity has a maximum value during the period between 40 Ma and 33 Ma of 0.063 and the obliquity is sampled in the range of 22 -24.5˚. The CO2 interval ranges from 1150 ppmv to 550 ppmv, roughly equivalent to 4xCO2 to 2xCO2. The ice sheet parameter consists of 8, 12 or 20 180 predefined ice sheet geometries. They are constructed based on preliminary steady-state ice sheet model runs for a range of different climatic forcings (for EMULATOR_12a and EMULATOR_12b) or from ice sheet geometry snap shots during the build-up of a continental scale ice sheet (for EMULATOR_8 and EMULATOR_20). The final ice sheet geometries are chosen to provide a range from an almost ice-free Antarctic continent up to a fully glaciated continent. Tundra is present between the ice sheet margin and the coast. The parameter combinations are constructed using a Latin hypercube design where the 185 minimum Euclidean distance between two parameter combinations is maximised ( Figure 5). With this model design, 100 GCM runs are performed until the climate (atmosphere and slab ocean) is in steady-state with the forcing (40 years).
The design matrix of input data D (n x p) has 100 rows (number of experiments) and 5 (esinϖ, ecosϖ, ε, CO2, ice volume or ice area) or 6 columns (esinϖ, ecosϖ, ε, CO2, ice volume, ice area). Each simulation performed by the climate model is 190 characterized by a row of matrix D, called the input vector xi. The climatic output (temperature and precipitation) where HadSM3 is a function of the input vector is called f(xi) from all 100 experiments is saved in the matrix Y. Each column of Y contains the output for one experiment. Here, the matrix Y only contains climatic output data on the ice sheet model grid (201x201 grid points) because our interest is the climate evolution over Antarctica.

195
The climatic output is modelled as a Gaussian process, defined by a mean function m(x) and a covariance function V(x,x').
The prior mean function is defined by a linear combination of a set of linear regression functions (Eq. 1), where β is a vector of regression coefficients corresponding to the mean function and h(x) is a vector of known regression functions of the inputs.
The covariance function consists of the correlation function and the scaling value σ 2 (Eq. 2). The squared exponential correlation function is chosen with the inclusion of the so called nugget ν and correlation length δ hyperparameters (eq. 3). 200 The nugget was originally meant to deal with sampling variability of the simulator output, but even when this sampling variability is small, it is effective to prevent numerical instability and to compensate for inadequate correlation priors (Andrianakis and Challenor, 2012;Araya-Melo et al., 2015). The correlation length δ can be understood as a measure of how quickly the model output is changing as a function of the input (Wilkinson, 2010). The larger the distance between two input vectors, the quicker the correlation goes to zero. 205 210 The formalism followed here is Bayesian and the prior mean and prior covariance functions (Eq. 1-3) are updated (Eq. 4-5) based on the climate model output data. All values of β are a priori equally probable and we assume a vague conjugate prior (β,σ 2 ) that is proportional to σ 2 . The posterior distribution of the model data (temperature and precipitation) is a Student's t distribution with n-q degrees of freedom (which is close to Gaussian) with a mean m*(x) and covariance V*(x,x'): A PCA is performed on the climatic output and between 17 and 20 components are kept before calibrating the emulator (see model code for PC, length scales and nugget values). The Gaussian process model with the posterior means and variances is 220 applied to each principal component. Once the PCA emulator is calibrated (by optimizing the nugget and length scales), the scores of each principal components can be estimated for arbitrary input values, with an associated covariance that effectively measures the prediction uncertainty.
As described before, four different emulators are constructed (EMULATOR_8, EMULATOR_12a, EMULATOR_12b and 225 EMULATOR_20) and each emulator is calibrated with either ice volume, ice area or with both ice volume and ice area as the ice sheet parameter. This gives a total of 4 x 3 = 12 distinct calibrated emulators to simulate the ice sheet evolution. Calibration of an emulator is achieved by adjusting the length scales δ, the nugget ν (uncertainty band) and the number of Principal Components (PCs) in order to minimize the root mean square error between the simulated and predicted climatic fields in leave-one-out experiments. It is assumed that the sampling error introduced by model variability is almost negligible because 230 we are using a slab-ocean climate model version where the internal climate variability is small and the climate states quickly converge to a mean value. Therefore, the adopted nugget is chosen to be 0.001 (small non-zero uncertainty band around the data) for the temperature and 0.01 for the precipitation emulation. The emulator calibration is done for precipitation and temperature data for each month. During the calibration process, the number of PCs is varied between 5 and 25. It is chosen to keep 20 PCs to explain the observed variation in temperature and 17 PCs to explain the variation in precipitation, since these 235 numbers gave the best emulator performance (as explained further on).
R's Nelder-Mead optimization function (Nelder and Mead, 1965) is used to maximise the likelihood of the emulator (Kennedy and O'Hagan, 2000;Lord et al., 2017). We obtain a low length scale for the ice sheet parameter (between 0.02 and 0.5). When looking at the summer temperature for all GCM runs, there is a smooth, almost linear dependency between the ice sheet 240 parameter and the simulated summer temperatures ( Figure 6) and therefore we increased the length scale for the ice sheet parameter. The optimization function is used to get the correlation lengths for the orbital parameters and the CO2 forcing right and the correlation length for the ice sheet parameter was manually chosen to be 1.2 for all emulators.
The emulator performance is determined by the variance of the emulator and the reliability of the emulator. The variance is a 245 measure of the uncertainty of the mean predictions of the emulator. The reliability of the emulator determines how well the emulator is calibrated (i.e. how well the emulator estimates its own uncertainty). Ideally, the emulator is well-calibrated and the uncertainty is low. The calibration is investigated by leave-one-out experiments where the simulated temperature is predicted based on the calibrated emulator while leaving out one experiment at a time. The results are visualised in Fig. 7 where the number of grid points that is predicted within 1 (grey) to 4 (red) standard deviations from the simulated temperature 250 is given for each of the GCM runs for EMULATOR_20. Overall, all emulators perform well, since > 68 % of the grid points are predicted within 1 standard deviation. The calibration based on ice volume and ice area separately shows a very similar reliability and uncertainty. Even though the variance for the emulators calibrated both on ice area and on ice volume is lower, it has a lower reliability because it struggles to capture the output dependency on both variables simultaneously.

255
The spatial difference between the simulated and the predicted climatic fields is visualised in Fig. 8. January temperatures are shown for experiment xaemdk that performs poorly and for experiment xaemdv that performs quite well. These experiments are run with the second smallest and smallest ice sheet geometry, respectively. The predicted temperatures are warm-biased up to 8˚C for the tundra region and cold-biased up to 10 ˚C for the ice-covered region for experiment xaemdk. The bias is much smaller for experiment xaemdv with errors less than 2 ˚C over most of the Antarctic continent. Other experiments that perform 260 bad such as xaemdg and xaembb have a very high eccentricity, high obliquity and summer during aphelion or perihelion. They have the most extreme insolation values and lay at the edge of the experimental design, which may explain why the emulator does a more poor job at predicting the simulated temperatures. .

Coupling procedure between AISMPALEO and the emulator
Due to computational limitations, the coupling procedure between ice sheets and climate on a multi-million year timescale is 265 always asynchronous. However, the emulator-ice sheet coupling leaves the possibility for a very short coupling time because once the emulator is calibrated, it can be run in standalone modus (i.e. without the need to use the simulator). In these simulations, the ice sheet model is initialized from an ice-free state. After a predefined time (1000 years in the standard experiments), the ice sheet model passes the ice sheet parameter (the actual ice volume, ice area or both ice volume and ice area) to the emulator and the emulator provides temperatures and precipitation as a function of the orbital parameters, CO2 270 forcing and the ice sheet parameter. In our simulations, temperature and precipitation are interpolated to the ice sheet model grid using a bilinear interpolation scheme, in order to have smooth climatic fields. In the standard experiments, a constant lapse rate correction of 5 ˚C per km is applied between HadSM3 and AISMPALEO. The climatic information is lapse rate corrected for the nearest input ice sheet geometry in terms of ice volume or ice area.

Sensitivity of the ice sheet evolution to the model set-up 275
In this section, the sensitivity of the ice sheet evolution is tested for the different emulators. The influence of the different number of ice sheet geometries in the model design is investigated in combination with how the ice sheet parameter is defined.
In addition the sensitivity of the ice sheet evolution is tested to the coupling time and to the lapse rate adjustment between the coarse climate model and the much finer ice sheet model. The performance of the four different emulators is assessed. The ice sheet evolution is forced over a 3 million year time period with the real orbital forcing from 38 Ma to 35 Ma (Laskar et al., 280 2004) and CO2 scenarios assuming a linear decrease in concentrations from around 980 ppmv to 720 ppmv.

Sensitivity to the definition of the ice sheet parameter
The ice sheet parameter represents the shape and area of the ice sheets. In previous studies (Araya-Melo et al., 2015;, it is defined by indexing the different ice sheet geometries that have been used to generate the simulation outputs. . Yet, there are several other options on how to define the ice sheet parameter. Here, it is proposed to define the ice sheet 285 parameter by quantifying the ice sheet volume, the ice sheet area or by a vector combining both aspects. This is done for the four different experiment designs. The ice sheet area and ice sheet volume are good parameters to define the ice sheet's influence on climate because the first parameter affects the local albedo and the latter has an influence on the elevation and 10 hence on the temperatures through adiabatic cooling. In case the ice sheet parameter is defined by both the ice sheet volume and the ice sheet area, both variables are calculated after each iteration of the ice sheet model. 290 Figure 9a shows the ice sheet evolution for the four emulators calibrated on ice volume. EMULATOR_12a, EMULATOR_12b and EMULATOR_20 show the transition towards a continental scale ice sheet in a very narrow CO2 interval of 845 to 875 ppmv. On the other hand, EMULATOR_8 does not seem to show any sensitivity to the CO2 forcing during the 3 million yearlong simulation (and also not on a longer timescale). The reason is that the prescribed ice sheets are separated too much in the 295 initial climate model runs. Because of the large difference in albedo between ice and tundra, the prescribed ice sheets create a sharp boundary at the ice sheet margin that is visible in the temperature field. If insufficient prescribed ice sheet geometries are used, the ice sheet does not grow enough to see the temperature regime obtained when HadSM3 is run with the next input ice sheet geometry. Consequently, the emulated temperatures remain too high at the ice sheet margin. It appears that the threshold on the number of needed ice sheet geometries is somewhere between 8 and 12. Using 20 input ice sheet geometries 300 does not change the model performance much.
Another option is to calibrate the emulator based on the ice area, which is more directly linked to the albedo. The glaciation threshold for EMULATOR_12a and EMULATOR_20 shows a very similar sensitivity to CO2 of about 860 ppmv (Figure 9b).
EMULATOR_8 grows immediately to a medium sized ice sheet and cannot grow further towards a continental scale ice sheet 305 for the reasons already quoted. EMULATOR_12b was poorly defined on ice area (several ice sheet geometries had a similar area, but different geometry) and the ice sheet grows immediately towards a continental scale. Therefore, in addition to having sufficient ice sheet geometries, a good spacing of the ice sheet parameter is another requirement to use an emulator for coupled ice sheet-climate simulations.

310
The simulated temperatures during the first 100 kyr of the simulations are visualised during a strong insolation maximum after 47,000 years and a strong insolation minimum after 60,000 years ( Figure 10) for all four emulators calibrated on ice volume.
The corresponding ice sheet sizes are given in Figure A5 (appendix). The temperature patterns are very similar, especially above the tundra regions. The main difference in simulated temperatures between the different emulators is caused by the size of the ice sheet, which in turn is determined by the number and spacing of ice sheets used for the calibration. 315 When the coupling is based on both the ice volume and the ice area, the transition to a fully glaciated climate gives very distinct results for each of the emulators (Figure 9c). EMULATOR_12a shows the transition to a continental scale glaciation for a similar CO2 threshold than for the emulators tuned on ice volume or ice area of around 890 ppmv. EMULATOR_12b shows the transition to a fully glaciated continent immediately when the simulations start, because of the poor definition on ice area. 320 Remarkably, the transition to a fully glaciated continent for EMULATOR_20 happens for a much lower CO2 threshold of 765 ppmv. However, the reliability of EMULATOR_20 calibrated both on ice area and ice volume is lower than the reliability of the emulator on either ice volume or ice area (see section 2.3), even though the variance is smaller when additional information on the ice sheet parameter is added. The poor emulation originates from calibrating the emulator based on 6 variables, while actually only 5 input forcing parameters are reasonably independent. The additional information on the ice sheet parameter is 325 strongly correlated in most cases, but not everywhere because the spread between ice volume and ice area is not equal. This is visualised in Fig. 11 where three different schematic ice sheet geometries are shown with their respective ice volume and ice area (normalized). For the second ice sheet geometry, the ice area increases with 0.8 units, while the ice volume increases with 0.2 units. Oppositely, the next ice sheet geometry is defined by an ice volume increase of 0.8 units and an ice area increase of 0.2 units. In EMULATOR_20, the smallest prescribed ice sheet geometries have a significant increase in ice area while the ice 330 volume increase is relatively smaller. On the other hand, the largest prescribed ice sheet geometries have a much larger increase in ice volume than ice area (Figure 4). For EMULATOR_12a, the relative increase in ice area and ice volume is more equal and therefore, the reliability and the performance of the emulator is better. EMULATOR_8 grows to a fully glaciated continent for a CO2 threshold around 810 ppmv. As mentioned earlier, the lack of sufficient ice sheet geometries is also visible here with complete growth and decline close to the glaciation threshold. 335 The difference in ice sheet area and ice sheet volume evolution during the build-up of an ice sheet is further visualised in Fig.   12. When the ice sheet starts growing, both the ice sheet area and the ice volume increase. However, the timing between ice sheet volume growth and ice sheet area growth are not fully synchronous. When the climate is cool during an austral summer insolation minimum, the area where the mass balance is positive increases and first the area where ice is accumulating 340 increases. It takes more time for the ice volume to adjust and by the time the ice volume reaches a maximum, the ice area starts decreasing again during an austral summer insolation maximum. This can be seen in Fig. 12c where the first snapshot of the ice sheet geometry shows a relatively large area with ice, but ice volume at that moment is negligible. The ice area will start decreasing towards an insolation maximum, while the ice volume is still growing in response to the large area with a positive mass balance. The delay of the ice volume response compared to ice area is even more clear when an ice sheet is growing from 345 bare bedrock to a continental scale ice sheet (Figure 12a and 12b).
These remarks suggest a possible improvement that would consist in a better experiment design with ice sheets spanning a 2-D space more optimally. On the other hand, the difference in relative magnitude of ice volume and ice area during the buildup of an ice sheet also suggests that it is not easy to create an optimal set of variables for the model design. 350

Sensitivity to the coupling time between AISMPALEO and the emulator
The coupling time in the first set of experiments is 1000 years. The sensitivity of the ice sheet evolution to the coupling time is tested by applying five different coupling times, ranging from 10 years to 2000 years. The smallest time step is in the same order of magnitude as the time step used in the ice sheet model. This way, it can be regarded as an example of a direct coupling between the climatic component and the ice sheet model. Fig.13 shows the ice sheet evolution during one precession cycle for 355 the five coupling times. The climatic information for the largest time step of 2000 years is updated eleven times during this interval and the ice sheet evolution clearly responds stepwise. Another observation is that the higher the coupling time step, the more delayed the ice sheet responds to the forcing and the lower the amplitude of the ice sheet volume. Decreasing the coupling time step results in a smoother ice volume evolution. The differences between a coupling time of 500 years, 250 years and 10 years are becoming very small, suggesting that the solution converges. To make a compromise between model 360 efficiency and model accuracy, we opted for performing the multi-million year sensitivity simulations with a coupling time step of 500 years as a lower limit and 2000 years as an upper limit.
The coupling time is doubled and halved to respectively 2000 and 500 years to test its influence on the ice sheet evolution for EMULATOR_12a and EMULATOR_12b calibrated with ice volume as the ice sheet parameter (Figure 14). Halving the 365 coupling time step increases the computational time with about 40 %, while doubling the coupling time decreases the computational time with the same percentage. When the coupling time decreases, the ice sheet volume has a larger amplitude and is slightly more sensitive to changes in insolation.
For EMULATOR_12a, the glaciation threshold is more sensitive to CO2 changes when the coupling time is decreased. The 370 difference in glaciation threshold between a coupling time step of 500 years and 1000 years is negligible, but the difference with a coupling time step of 2000 years is about 30 ppmv. The continental-scale glaciation for EMULATOR_12b with a coupling time step of 1000 years occurs for lower CO2 values than for a coupling time step of 500 and 2000 years due to the complex interaction between ice sheet response and forcing. The shorter the coupling time, the more sensitive the ice sheet is to the forcing. For a coupling time step of 1000 years, the ice sheet grows more than for a coupling time step of 2000 years, 375 but does not decline as much as for a coupling time step of 500 years and therefore grows quicker to the fully glaciated state.
Comparing the glaciation threshold with respect to the CO2 forcing between the different coupling time steps, there is a decreasing difference between EMULATOR_12a and EMULATOR_12b for a decreasing coupling time step. The difference in glaciation threshold is about 40 ppmv for a coupling time step of 2000 years, 30 ppmv for a coupling time step of 1000 380 years and 10 ppmv for a coupling time step of 500 years, suggesting that the solution converges for decreasing coupling time steps.

Sensitivity to the lapse rate adjustment between HadSM3 and AISMPALEO
The lapse rate is the change in temperature with elevation and its value is highly dependent on the moisture in the air. Since the climate model is relatively coarse compared to the ice sheet model, we need to apply a lapse rate correction to the elevation 385 difference between the climate model and ice sheet model. The lapse rate is, in reality, both temporarily and spatially variable, and here we test different model choices. One experiment includes temporally variations in the lapse rate correction, and another experiment includes both spatial and temporal variations. The temporally variable lapse rate is calculated as the average 13 near-surface lapse rate for all grid points on the Antarctic continent for each month (Figure 15 and Figure A1). The spatially variable lapse rate is included by calculating the local near-surface lapse rate (dT/dZ) simulated by HadSM3 for the 4 adjacent 390 ice covered grid points for each month. The average monthly lapse rate varies roughly between the wet adiabatic lapse rate during summer (December-January) and the dry adiabatic lapse rate in winter (June-July). These values are higher than the constant lapse rate that was applied in the standard experiments and therefore, the lapse rate corrected temperatures are lower for a growing ice sheet. The resulting ice sheet evolution shows a more stepwise change towards full glaciation (Figure 16).
The larger temperature difference between the climate model and the ice sheet model makes it harder for the ice sheet to grow 395 further until a threshold is reached and a large area is cold enough for snow accumulation.

Bayesian sensitivity analysis with EMULATOR_100b calibrated with ice volume
The additional value of the use of an emulator for coupled ice sheet-climate simulations is that the mean climate predictions come with the estimate of its variance and that two different predictions have a covariance. Here, the uncertainties caused by the emulator variance are explored in order to sample climate trajectories. The covariance between output points given by the 400 emulator is used to update the mean and variance of a climate prediction at a given iteration (e.g. time iteration i) of the ice sheet model, given the climate used at the previous point iteration (i-1). It is then possible to sample the updated distribution.
This provides a climate sample at iteration i and the procedure continues to obtain climate samples at iteration i+1 and so forth.
The process yields a sample climate trajectory. Strictly speaking, the emulated climate at iteration i is correlated with the outputs at iteration i-1, i-2, etc. and all of them should be used to update the mean and variance at iteration i. However, since 405 the Gaussian process emulator has an exponential decaying correlation function which is short-ranged (in contrast to a powerlaw), it is expected that the covariance structure of emulated climate trajectories that is associated to the emulator variance is effectively captured by the first-order autocorrelation. Now that the general principle is explained, more mathematical details about the procedure are provided with specific attention 410 to the fact that a PCA emulator is used. At any time step i, the current temperature is estimated on the basis of principal components. To this end, the emulator mean is computed for each PC score, given the orbital parameters, the CO2 concentration and the ice level at the current time step m(xi) and at the previous time step m(xi-1), along with the computed covariances associated with these elements, denoted V(xi,xi), V(xi-1,xi) and V(xi-1,xi-1), The mean and covariance at time step i are then updated given the scores for the corresponding principal component Ti-1 which was effectively applied at time step i-1. The 415 PC score at time step i which will finally be applied to the ice sheet model is drawn from this distribution. The procedure is repeated for each PC score. The temperature field reconstructed from these PC scores is further perturbed by a random field with variance equal to the residual variance not captured by the principal components. This approach provides us with a random draw of the temperature field consistent with the information provided by the PCA emulator (Eq. 7-9). 420 The uncertainty of the emulator is explored by performing 50 Monte Carlo simulations including the variance on 425 EMULATOR_12b calibrated with ice volume. Since temperature and precipitation are emulated separately for each month, a choice had to be made on which parameter is most decisive for ice sheet growth. It appears that summer temperature has a main control on the evolution of the ice sheet over time. Therefore, the uncertainties in the emulated January temperature are explored and emulated precipitation in a similar manner as the previous experiments. The temperatures of the other months are reconstructed based on the annual cycle of the 100 input experiments. First, the mean temperature for each month and for 430 each input ice sheet geometry is calculated. Then, the annual cycle with respect to the January temperatures is calculated for each input ice sheet geometry ( Figure 17). The mean difference between each month and the emulated January temperature is applied to calculate the temperature of the other months. By applying a constant temperature anomaly compared to the January temperature, the temperature of the ice-free regions in January is overestimated for the June temperatures when these regions are snow-covered. This has a minor influence on the results because ice melt does not occur during the austral winter months 435 anyway. So, the emulator is almost identical to EMULATOR_12b, except that covariances are used to sample trajectories around the mean. If the variances are assumed to be zero at all time steps, the mean trajectory already presented in Fig. 9a is approximated (slight difference due to the application of the annual cycle to the January temperature).
The resulting ice sheet evolution over time is shown in Fig. 18 for a  uncertainty in the ice sheet evolution is also comparable for both coupling time steps with a difference between the lowest and highest glaciation threshold of about 25-50 ppmv for more than 95 % of the simulations. Striking is the asymmetry in glaciation 15 threshold for the experiments including the variance in comparison with the reference experiment ( Figure 18). Most experiments including the variance predict the glaciation threshold to happen for lower CO2 values than the reference experiment based on the mean temperature prediction. The reason is that the ice sheet in these simulations lose mass during 455 an insolation maximum for a longer time and therefore does not manage to grow above the mean prediction. Fig. 18b also shows that the ice sheet is growing (peaks are higher) and melting (the ice sheet is not that stable) faster for a coupling time of 2000 years compared to a coupling time of 500 years. The main mechanism is the slow response time of the ice sheet that has more time to grow during an insolation minimum and more time to decay during an insolation maximum for a larger coupling time step. 460

Discussion
The aim of the new coupling technique CLISEMv1.0 is to create an efficient and accurate way to model ice sheet-climate interactions on time scales beyond what directly coupled models can achieve. In that sense, we build further on previous modelling attempts such as the asynchronous coupling method and the matrix method. The basic asynchronous method has the advantage that you do not need to have any prior information on the possible ice sheet geometries. A strong disadvantage 465 is that this method does not allow for sensitivity experiments at a reasonable computing time since the whole chain of ice sheet model -GCM runs would have to be repeated. Nevertheless, this method has been very popular in paleoclimatic studies during all time periods in geological history where ice might have been present, from the Paleozoic ice houses (Horton et al., 2007;Lowry et al., 2014;Pohl et al., 2016) over the Eocene-Oligocene Transition (DeConto and Pollard, 2003), Miocene (Gasson et al., 2016) to the Quaternary ice ages (Charbit et al., 2002;Herrington and Poulsen, 2012). 470 With CLISEMv1.0, the forcing uncertainty is explored with preliminary GCM snapshots. The emulation of climate and precipitation is akin to Kriging or geospatial interpolation, but in 5 or 6-dimensional space with a large number of climate model runs. It is not the same as linear interpolation as the posterior mean includes a term that absorbs deviations from linearity.
Crucially, the emulator comes with estimates of covariance, which measures the uncertainty introduced by using the emulator. 475 Checking that this uncertainty is consistent with leave-one-out experiments is a key aspect of the emulator evaluation. A socalled "nugget" allows to introduce variability directly explained by the model inputs, such as model internal variability (Andrianakis and Challenor, 2012), but in our design this nugget is a small numerical value. For this reason, the use of a GP emulator is also not completely equal to interpolating the raw model output from the climate model. This is in contrast to the climate matrix method, that consists of a limited number of GCM runs for the end members in the forcing and linearly 480 interpolates the climatologies based on the actual ice sheet geometry (Gasson et al., 2016;Stap et al., 2017;Berends et al., 2018).
Whether the asynchronous coupling method, the matrix look-up table or the GP emulator is used to simulate ice sheet -climate interactions on (multi)-million year timescales, a choice on the coupling time has to be made and that will affect the outcome. 485 When using the matrix look-up table or the emulator, the result will also be dependent on the choice of the model design and on the number of GCM runs. For the GP emulator, additional choices need to be made about the length scale, nugget and the covariance function to use. The more complexity is added to the method, the more uncertainties might arise. On the other hand, the GP emulator provides a posterior covariance which provides an objective criterion to verify that it is well calibrated and to evaluate the introduced uncertainties. 490 A common problem for the emulator and the matrix look-up table method, where the ice sheet parameter is defined by a single number, is that there is no control on the regions where ice starts to grow. The problem can be addressed by describing the ice sheet location and geometry with a vector of several dimensions. In reverse, the definition of this vector and the experiment design have to be thought such as to provide a reasonably orthogonal experiment design, in order to avoid the issues 495 experienced in this study by attempting to calibrate the emulator both on ice volume and ice area simultaneously. Optionally, ice sheets could be described with additional variables such as shape factors that are relatively independent on the other ice sheet parameters. We leave the suggestion of creating other ice sheet variables to improve the emulator performance for future work.

500
To have a properly working emulator including dynamic ice sheets, it is crucial to have a good spacing between the input ice sheet geometries and sufficient different ice sheet geometries. Five different emulators showed to have a good set-up: EMULATOR_12a, EMULATOR_12b and EMULATOR_20 tuned on ice volume, and EMULATOR_12a and EMULATOR_20 tuned on ice area ( Figure 19). EMULATOR_8 included too little input ice sheet geometries to make the ice sheet grow to a continental scale ice sheet . EMULATOR_12b tuned on ice area did not produce reliable results because several 505 input ice sheet areas had a different geometry, yet similar area. The glaciation threshold for each of these five emulators ranges between 845 ppmv and 875 ppmv. Overall, taking into account the radiative forcing of carbon dioxide, the differences in glaciation threshold are very small between the different ways of calibrating the emulators (see video supplement for the ice sheet geometry evolution of EMULATOR_12a, EMULATOR_12b and EMULATOR_20 calibrated on ice volume).

510
It appears that the more prescribed ice sheet geometries are used in the emulator, the larger the amplitude of ice sheet growth during a cold orbital configuration. However, this has no impact on the CO2 threshold to continental scale glaciation. The maximum ice sheet extent is also dependent on the largest predefined ice sheet geometry. EMULATOR_12a and EMULATOR_12b had a smaller largest predefined ice sheet geometry than EMULATOR_20. The extrapolation of the emulated climatic variables lead to a climatic state that was too cold, allowing for a larger ice sheet than EMULATOR_20 that 515 was calibrated on a larger range of prescribed ice sheet geometries.
Our simulations show that using a coupling time step of 500 years instead of 2000 years results in a quicker ice sheet response to the forcing. Such small coupling time steps are not common for multi-million year simulations with three-dimensional ice sheet models coupled to a climate model. Gasson et al. (2016) used an asynchronous coupling method to simulate the ice sheet evolution during the Miocene with a coupling step of 2000 years, while Stap et al. (2017) used a coupling time step of 500 520 years for a one-dimensional ice sheet model forced by a climate model. Also the value for the lapse rate correction clearly has an influence on the ice sheet evolution over time and the emulator allows a number of sensitivity experiments. We have attempted to include a more realistic lapse rate that follows the seasonal and spatial variations. In HadSM3, the monthly average lapse rate over the Antarctic ice sheet ranges between 7 ˚C per km in January (summer) to 10.5 ˚C per km in July (winter). The lapse rate over the Greenland ice sheet during the LGM had a similar range from ~5.5 ˚C per km during summer 525 to 9.5 ˚C per km during winter (Erokhina et al., 2017). This near-surface lapse rate is influenced by atmospheric boundary processes, the surface type (snow, tundra) and the atmospheric circulation (Kageyama et al., 2005).

Conclusions
In this study, the computational efficient coupler CLISEMv1.0 that provides climatic fields for simulating ice sheet-climate interactions on a multi-million year has been described together with its sensitivity to the implementation and an uncertainty 530 analysis. CLISEMv1.0 estimates the climate as a function of the orbital parameters, the CO2 forcing and the ice sheet parameter where each forcing is defined by a single number. The ice sheet parameter is either defined by the ice sheet area, the ice sheet volume or by both the ice sheet area and ice sheet volume.
A set of different emulators was constructed to investigate the influence of the number of prescribed ice sheet geometries in 535 the model design on the coupled ice sheet-climate simulations. The number of precursor ice sheet geometries has a large effect on the ice sheet sensitivity to CO2 and orbital forcing, because of its large climatic imprint caused by the high albedo of ice.
Also, the spread of the ice sheet geometries has been shown to have a significant impact on the performance of the coupled ice sheet-climate simulations and has a larger effect than the definition of the ice sheet parameter. When there is an equal spread between the ice sheet area and the ice sheet volume of the input ice sheet geometries, the threshold to continental scale 540 glaciation occurs in a very narrow CO2 window of 860 ± 15 ppmv.
Once the emulator is well-calibrated, the emulator -ice sheet coupling method is very suitable to use for performing ice sheetclimate simulations on a multi-million year timescale and to use for sensitivity tests. Here we tested the sensitivity of the ice sheet evolution to the coupling time and to the lapse rate adjustment. Our results indicated that the glaciation threshold to the 545 CO2 forcing converges for a decreasing time step. Also, shortening the coupling time slightly increases the sensitivity to CO2 forcing. The shorter the coupling time, the larger the ice sheet grows during an insolation minimum and the more the ice sheets shrinks during an insolation maximum. This might have large consequences for paleoclimatic studies implementing 18 asynchronous coupling techniques, where the coupling time is usually on the order of several millennia. The elevation differences between coarse climate models and high resolution ice sheet models are usually corrected for by applying a constant 550 lapse rate correction. The value of this lapse rate correction has a larger effect than the coupling time and we propose to take the real lapse rate correction that is observed in the climate model output.
The emulator-ice sheet coupling method is applied here for idealized CO2 scenarios for the time period between 38 Ma and 35 Ma ago. In these simulations, temperature and precipitation are emulated to drive the mass balance of the ice sheet model. 555 CLISEMv1.0 is a useful tool to investigate the ice sheet evolution during all major climatic transitions of the Cenozoic where the interaction between orbital parameters and CO2 variations are thought to have played a significant role or even to investigate the existence of pre-Cenozoic glaciations throughout the Phanerozoic.

Author contribution
JVB designed the coupling method between the emulator and the ice sheet model. MC developed the Gaussian process 560 emulator. PH developed the ice sheet model code. JVB wrote the manuscript with contributions from all co-authors.

Code and data availability
The code for the coupler CLISEMv1.0 between the climate and the ice sheet model is available on Zenodo: https://doi.org/10.5281/zenodo.5245156. Also a video supplement showing the ice sheet evolution for the three best performing emulators calibrated on ice volume is available on Zenodo: https://doi.org/10.5281/zenodo.5242914. All data used 565 in this paper is available upon request.

Competing interests
PH is topical editor of GMD.
3) Figure 15: Average monthly near-surface lapse rate in HadSM3 over the Antarctic continent for each of the 12 ice sheet geometries. Ice sheet geometry 1 is the largest ice sheet and ice sheet geometry 12 is the smallest ice sheet geometry.
835 Figure 16: Sensitivity of the ice sheet evolution to the application of the lapse rate correction using the emulator calibrated with ice volume for EMULATOR_12a (green) and EMULATOR 12b (blue).