TRAPPIST-1 Habitable Atmosphere Intercomparison (THAI). Motivations and protocol version 1.0

Upcoming telescopes such as the James Webb Space Telescope (JWST), or the Extremely Large Telescope (ELTs), may soon be able to characterize, through transmission, emission or reflection spectroscopy, the atmospheres of rocky exoplanets orbiting nearby M dwarfs. One of the most promising candidates is the late M dwarf system TRAPPIST-1 which has seven known transiting planets for which Transit Timing Variation (TTV) measurements suggest that they are terrestrial in nature, with a possible enrichment in volatiles. Among these seven planets, TRAPPIST-1e seems to be the most promising candidate to have habitable surface conditions, receiving ~66 % of the Earth's incident radiation, and thus needing only modest greenhouse gas inventories to raise surface temperatures to allow surface liquid water to exist. TRAPPIST-1e is therefore one of the prime targets for JWST atmospheric characterization. In this context, the modeling of its potential atmosphere is an essential step prior to observation. Global Climate Models (GCMs) offer the most detailed way to simulate planetary atmospheres. However, intrinsic differences exist between GCMs which can lead to different climate prediction and thus observability of gas and/or cloud features in transmission and thermal emission spectra. Such differences should preferably be known prior to observations. In this paper we present a protocol to inter-compare planetary GCMs. Four testing cases are considered for TRAPPIST-1e but the methodology is applicable to other rocky exoplanets in the Habitable Zone. The four test cases included two land planets composed with a modern Earth and pure CO2 atmospheres, respectively, and two aqua planets with the same atmospheric compositions. Currently, there are four participating models (LMDG, ROCKE-3D, ExoCAM, UM), however this protocol is intended to let other teams participate as well.


Introduction
M dwarfs are the most common type of stars in our galaxy and rocky exoplanets orbiting M dwarf stars will likely be the first to be characterized with upcoming astronomical facilities such as the James Webb Space Telescope (JWST). Ultra-cool dwarfs (T < 2700 K) are a sub-stellar class of late M-dwarfs and represent nearly 15% of astronomical objects in the stellar neighborhood of the Sun (Cantrell et al., 2013). Their smaller size compared to other stellar types allows easier detection of rocky exoplanets in close orbits, and this potential was recently realized by the discovery of the TRAPPIST-1 system .
Located about 12 pc away TRAPPIST-1 has seven known planets, and is one of the most promising rocky-planet systems for follow-up observations due to the depths of the transit signals Luger et al., 2017). Transit Timing Variation (TTVs) measurements of the TRAPPIST-1 planets suggest a terrestrial composition likely enriched in volatiles, and possibly water . Also, it has been found that three planets (TRAPPIST-1 e, f and g) are in the habitable zone (HZ, Kopparapu et al., 2013) where surface temperatures would allow surface water to exist Wolf, 2017Wolf, , 2018Turbet et al., 2018).
TRAPPIST-1 is an active M dwarf star (O'Malley-James and Kaltenegger, 2017;Wheatley et al., 2017;Vida and Roettenbacher, 2018) which offers an environment very hostile to the survival of planetary atmospheres. However, Bolmont et al. (2017) and Bourrier et al. (2017) argued that depending on their initial water contents, the TRAPPIST-1 planets could have retained some water presently. Assuming that this water has remained in sufficient quantity, TRAPPIST-1e may be able to maintain habitable conditions (locally or globally around the planet) through a very large set of atmospheric configurations (Wolf, 2017;Turbet et al., 2018;Grootel et al., 2018, and references therein). The first attempt to characterize those planets through transmission spectroscopy has been conducted by de Wit et al. (2016de Wit et al. ( , 2018 using the Hubble Space Telescope (HST) for the six innermost planets. Their analysis suggests that the TRAPPIST-1 planets do not have a cloud/haze free H 2 dominated atmosphere and that a large set of high mean molecular weight atmospheres are possible, such as thick N 2 , O 2 , H 2 O, CO 2 , or CH 4 dominated atmospheres. Using laboratory measurements and models Moran et al. (2018) have also shown that H 2 dominated atmospheres with cloud/haze can also be ruled out. Note that the uncertainties of these HST observations were very large, on the order of hundreds of parts per million (ppm) and further investigations with future facilities such as JWST (Barstow and Irwin, 2016;Morley et al., 2017) will be needed to determine the nature of atmospheres heavier than hydrogen.
Upstream of future JWST characterization of TRAPPIST-1e, it is important to derive constraints on its possible atmosphere to serve as a guideline for the observations. For this purpose, 3-D Global Climate Models (GCMs) are the most advanced tools . However, GCMs are very complex models and their outputs can vary from one model to another for a variety of reasons. GCM intercomparisons have been widely used by the Earth science community. For instance the Coupled Model Intercomparison Project (CMIP) initiated in 1995 and currently in its version 6 (Eyring et al., 2016), focuses on the dif-ferences in GCM responses to forcings for anthropogenic climate change. While exoplanets receive considerable attention from climate modelers, and atmospheric data from Earth-like worlds may be imminent, to our knowledge only one intercomparison of planetary GCMs has been published (Yang et al., 2019). They found significant differences in global surface temperature between the models for planets around M-dwarf stars due to differences in atmospheric dynamics, clouds and radiative transfer. However, Yang et al. (2019) concerns planets near the inner edge of the HZ and focuses on highly idealized planetary configurations. Note that another model intercomparison have been run for the exoplanet community: the Palaeoclimate and Terrestrial Exoplanet Radiative Transfer Model Intercomparison Project (PALAEOTRIP). The protocol of this experiment is described in Goldblatt et al. (2017) and aims to compare a large variety of radiation codes used for paleoclimate or exoplanets sciences, to identify the limit conditions for which each model can produce accurate results. Information and timeline about PALEOTRIP can be found at http://www.palaeotrip.org/.
The motivation behind the TRAPPIST Habitable Atmosphere Intercomparison (THAI), is to highlight differences among GCM simulations of a confirmed exoplanet, TRAPPIST-1e, that is potentially characterizable in the near term (with JWST or groundbased facilities), and to evaluate how these differences may impact our interpretations of retrievals of its atmospheric properties from delivered observables. Our objective is also to provide a clear protocol intended for other GCMs to join the intercomparison, which is therefore not only limited to the GCMs presented in this paper. Results of the intercomparison will be presented in a following paper. In this paper, the motivations, including a presentation of TRAPPIST-1e and of the GCMs, are presented in section 2. In section 3 we present the THAI protocol describing all the parameters to be set up in the GCM. In Section 4, we list the model parameters to be provided in order for a given model to be comparable to other GCM simulations. A summary is given in section 5.

Motivations for a planetary GCM intercomparison
Global Climate Models (GCMs) are 3-dimensional numerical models designed to represent physical processes at play in planetary atmospheres and surfaces. They are the most sophisticated way to model the atmospheres and oceans of real planets.
GCMs can be seen as a complex network of 1-D time-marching climate models connected together through a dynamical core (see description below). Each 1-D column contains physical parameterizations for radiative transfer, convection, boundary layer processes, cloud macroscale and microscale physics, aerosols, precipitation, surface snow and sea ice accumulation, and other processes, at varying levels of complexity.
The motivation behind this experimental protocol is to evaluate how some of the differences between the models can impact the assessment of the planet's habitability and its observables through transmission spectroscopy and thermal phase curves with upcoming observatories such as JWST. The intercomparison protocol was designed to evaluate three possible sources of differences between the models listed below: 1. The dynamical core: The dynamical core is a numerical solver of the hydrodynamic equations on the (rotating) planetary sphere. It calculates the winds that transport atmospheric gases, clouds, aerosols, sensible and latent heat, and momentum from one atmospheric column to another.

The radiative transfer:
Each model has its own radiative transfer working assumptions and may use different spectroscopic databases and even different versions of the same spectroscopic database (e.g., HITRAN), collision-induced absorption (CIA), line-by-line versus correlated-k distribution (Lacis and Oinas, 1991), line cutoff, spectral resolution, etc.

The moist physics:
The treatment of water in all of its thermodynamic phases is critical for the simulation of habitable planets. In particular cloud and convection process are a significant source of differences between climate models, and these differences are often exacerbated when modeling exoplanets around M-dwarf stars (Yang et al., 2014(Yang et al., , 2019. Note that a particular emphasis will be given on the differences of cloud properties between the models because they may have a large impact on the strength of the spectral signatures simulated by current radiative transfer tools ). Yet a sufficient understanding of 3D cloud fields is needed to provide realistic observational constraints to observers. It is therefore crucial to address these potential differences between the GCMs. By publishing our protocols in advance of the intercomparison work, we hope that other teams will also use this protocol to compare their own GCM with the four GCMs of this study.

The TRAPPIST-1e benchmark
TRAPPIST-1e is up to now one of the best habitable planet candidates for atmospheric characterization through transmission spectroscopy with JWST. Therefore, it is also an obvious candidate for an experimental protocol for GCM intercomparison. In Table 1 we summarize the TRAPPIST-1e parameters used in the THAI project based on Grimm et al. (2018).

Atmospheric configurations
For THAI, we have chosen a set of four planetary configurations with increasing complexity. We have chosen to start with benchmark cases of dry-land planets with N 2 -and CO 2 -dominated atmospheres respectively, which will allow us to assess atmospheric dynamical + boundary layer, and CO 2 radiative transfer differences. Next we conduct aquaplanet simulations of N 2 and CO 2 dominated atmospheres respectively, providing characteristic cold and warm habitable states for TRAPPIST-1e.
By gradually increasing the complexity of our simulations, we hope to be able to parse out meaningful differences between atmospheric dynamical + boundary layer, radiative transfer, and moist physical processes. The motivation for each of these cases is described below: -Benchmark case 1 (Ben1): In this case, constituted of 1 bar of N 2 only and 400 ppm of CO 2 , the purpose is to test the differences of the planetary boundary layer (PBL) schemes, the dynamical core and the associated heat redistribution between the different models. Note that N 2 -N 2 CIA should be included.
-Benchmark case 2 (Ben2): In this case, constituted of 1 bar of CO 2 , we test the PBL schemes and dynamical core differences as well as the CO 2 radiative transfer.
-Habitable case 1 (Hab1): In this case, constituted of a modern Earth-like atmosphere of 1 bar of N 2 and 400 ppm of CO 2 , the dynamical core, the clouds and atmospheric processes are tested together. It is also the most widespread benchmark for habitable planets in the literature (Barstow and Irwin, 2016;Morley et al., 2017;Lincowski et al., 2018).
-Habitable case 2 (Hab2): In this case, constituted of 1 bar of CO 2 , the dynamical core, the CO 2 radiative transfer assumption and the clouds and atmospheric processes are tested. This case is likely representative of the early Earth (during the Hadean epoch), early Venus, and early Mars, at a time when Martian valley networks and lakes were formed (Haberle et al., 2017;Kite, 2019).
We have therefore two lands planets (Ben1 & Ben2) and two aqua planets (Hab1 & Hab2). Note that Ben1 & Hab1 share the same atmospheric composition of bar of 1 bar of N 2 and 400 ppm of CO 2 and Ben2 & Hab2 share the same atmospheric composition of 1 bar of CO 2 . The protocol is therefore symmetrical with respect to the atmospheric composition between the land planets and aqua planets.
In each case, it is crucial to start each simulation with the same initial conditions. The simplest approach is then to start with an isothermal atmosphere. For THAI, we fixed the initial surface and atmosphere temperature at 300 K. The atmospheric configurations for the two benchmark (dry land) cases and two habitable cases are listed in Table 2, first horizontal block.
Note that for Ben2 initial results indicate that some models feature cold trap temperatures on the night-side slightly below the CO 2 condensation point at 1 bar (194 K). However, because not all the models include CO 2 condensation, it should be disabled in the models that allow it. Ben2 is thus to be viewed as a idealization for the sake of study. Initial results indicate that Hab1 is representative of a cool, largely ice covered world but with liquid water in the substellar region. Hab2 is significantly warmer than Hab1, owing to a strong CO 2 greenhouse effect and the water vapor greenhouse feedback, and is representative of a temperate habitable world. The amount and variability of clouds and the strength of the atmospheric processes should be enhanced providing a more challenging comparison than in Hab1. Trappist-1e, aquaplanet, 1 bar N 2 + 400 ppm CO 2 ExoCAM ROCKE3D LMDG UM In Fig. 1 we show results from preliminary simulations on case "Hab1" conducted with four different GCMs; UM, LMDG, ROCKE-3D and ExoCAM. We show surface contours for surface temperature, thermal emitted radiation (TOA) and reflected stellar radiation (TOA). We can see significant differences in the maximum, minimum and mean values of these parameters between the models. For such a complex atmosphere it is difficult to disentangle the effects leading to these differences.
However, it seems clear that the patterns of thermal emitted and reflected stellar TOA fluxes are strongly influenced by the cloud patterns produced by each respective model. Here we have shown preliminary outputs to demonstrate the feasibility of the described experiments. In depth analysis of these simulations will be discussed in a following manuscript in preparation.

Surface
The surfaces considered in THAI (Table 2,  albedos are fixed at a constant value of 0.25. Note that the sea ice/snow albedo parameterization is a common source of discrepancy between the models. Some models, like ROCKE-3D, account for the spectral dependence of the sea ice albedo over multiple bands and variations due to snowfall, aging, depth and melt ponds while other models, such as LMDG, compute the wavelength-dependent albedo of water ice / snow from a simplified albedo spectral law, calibrated to get an ice / snow bolometric albedo of ∼ 0.25 around an ultra-cool star like TRAPPIST-1 (Joshi and Haberle, 2012;von Paris et al., 2013;Shields et al., 2013). Differences in sea ice albedo have been found to have a large impact on planetary climate and habitability . However, for the sake of this intercomparison, this discrepancy can be easily avoided by fixing the sea ice and snow albedo at a constant bolometric value of 0.25.

Model spatial resolutions and time steps.
The model spatial resolution is an important parameter because every process taking place at a sub-grid level would be parameterized and those parameterizations often diverge between the models. Similarly the model time steps control the numerical stability and accuracy. However, the choices for those are fundamental to how each model operates under a given parameterization and arbitrary fixing these parameters may prevent some model to correctly and fairly perform the intercomparison. In addition, models should be compared using the specifications that they commonly use for exoplanet studies. Therefore, for the sake of the THAI, we do not impose the model spatial resolution nor time steps. Note that we however recommend (but this is not a requirement) the radiative time step (a parameter much more flexible than the others among the models) to be set up at 1800 s. This value should provide a good coupling of the radiation with temporal changes to the atmosphere without slowing down too much the model.
We also ask the contributing scientists to disable the sub-grid gravity wave parameterizations in their model. Indeed, all the models do not have implemented a gravity wave parameterization and some have prescribed or predicted gravity wave formation, tuned for Earth topography and meteorology. Therefore, to avoid differences in atmospheric dynamics especially above the tropopause, we recommend to not include the sub-grid gravity wave parameterizations in this intercomparison. Gravity waves whose wavelengths are greater than the model grid are explicitly resolved in the models and do not need to be modified.
Note that under the requirements of the protocol, the atmospheric simulation of TRAPPIST-1e may actually not represent what each individual model can simulate with all their parameterizations fully activated. This is especially true for the sea ice and snow albedo parameterization. Therefore, complementary to the Hab1 case, we propose the Hab1* which should be simulated with the commonly used model parameterizations fully activated. Therefore, only the requirements on the atmospheric composition (1 bar of N 2 and 400 ppm of CO 2 ) and the planet and star properties of Table 1 are constrained for Hab1*. Heat capacity (J/m 3 /K) 2 · 10 6 4 · 10 6 Thermal inertia (J/m 2 /K/s 2 ) 2000 12000 Momentum roughness length (m) 0.01 0.01 Heat roughness length (m) 0.001 0.001 Depth of the subsurface / ocean (m) > 3 100 Cautions: disable sub-grid gravity wave parameterization disable CO 2 condensation

Outputs
To compare the difference between models of a particular (instantaneous) output variable, both the average and standard deviation over the specified frequency and number of orbits for the case will be computed. Four categories of outputs frequently of the cloud fraction and the mass mixing ratio for the liquid, ice and both combined. Also in these two cases, the spatial and temporal variability is much weaker than in Hab1/Hab1* & Hab2. Therefore, to mitigate the amount of data we choose to only output data for ten consecutive orbits (with a 6 hour output frequency). Concerning Hab1/Hab1* & Hab2, we can see in Figure   9 2 that weather patterns modulate the surface temperature and cloud water column of Hab1 on a period nearly equal to 10 orbits.
Also Hab2 (hotter than Hab1) has a more efficient heat transport and is therefore more homogeneous in temperature but the cloud variability is very important. Therefore, more orbits (100) are needed in order to smooth out this variability. A summary of the output parameters is given in Table 3.  (Wolf and Toon, 2015). Surface temperature for Hab1 and cloud water column for the 2 cases vary by a couple of tens of percents on a timescale of 10 orbits due to weather patterns.
All the simulations should have reached radiative equilibrium at TOA at ∼ ±1 W · m −2 . If such limit can't be achieved by the model, the radiative equilibrium can be established if no discernible trend are observable in the last 10 year average global mean temperature.
To facilitate comparison between each GCM, we ask the contributing scientists to provide their outputs in netCDF format.
The contributing scientist will be able to upload their data on a public permanent repository at https://thai.emac.gsfc.nasa.gov/ dataset/thai after requesting an IP address authorization to Thomas Fauchez (thomas.j.fauchez@nasa.gov). Table 3. Instantaneous fields to be output by the GCM. For each diagnostic, the mean value and the standard deviation are computed from data output at the specified frequency and number of orbits for the case. OLR and ASR correspond to outgoing longwave radiation (at TOA) and absorbed shortwave radiation (at TOA), respectively, SW and LW correspond to shortwave and longwave, respectively, CF corresponds to cloud fraction and MMR at mass mixing ratio. The main objective of THAI is to highlight how differences in atmospheric profiles produced by each GCM are going to impact the predictions of atmosphere detectability and observational constraints for habitable planet targets such as TRAPPIST-1e (Morley et al., 2017;Fauchez et al., 2019). Therefore, in addition to the parameters of Table 3, we will emphasize the differences between the models in term of the planet's climate and habitability with a particular attention on the cloud coverage.
Also, the objective will be to identify and quantify the differences on the simulated JWST observations, through simulated transmission spectra (in NIRSpec prism and MIRI LRS ranges) and thermal phase curves (in MIRI LRS range) due to the differences of atmospheric profiles (temperature, pressure and gas mixing ratios) output by each GCM. The planetary spectrum generator (PSG, Villanueva et al. (2018)) will be used to simulate transmission and emission spectra. The comparison of the spectra for Hab1 & Hab2 cases will therefore highlight the sensitivity of model characteristics to predict transmission spectra of habitable planets.
Note that while additional simulations with a simple Newton cooling model, a 1-D column model, or with cloud radiative effects disabled would help to better understand the differences due to the dynamical cores and cloud physics, they would also dramatically increase the computationnal time, amount of data and effort. THAI aims to be easily reproducible and not time consuming in order to reach many GCM user groups. The five simulations propose in THAI should be enough to understand the main differences between the GCMs and their impact on the observables. THAI could also be used as a benchmark for future GCM intercomparisons that specifically aim to understand each differences between the models.

Summary
THAI is an intercomparison project of planetary GCMs focused on the exciting new habitable planet candidate, TRAPPIST-1e. Because rocky exoplanets in the Habitable Zone of nearby M dwarfs have the highest chance to be the first Earth-size exoplanets to be characterized with future observatories, TRAPPIST-1e is currently the best benchmark we could think of to compare the capability of planetary GCMs. In this first paper we have presented the planet and GCM parameters to be used in this experiment which already has four GCMs onboard (LMDG, ROCKE-3D, ExoCAM and UM), but we hope more GCMs will join the project. The results of the comparison of these four models will be given in a second paper and a THAI workshop is planned for fall 2020.
Data availability. No data has been used.
Competing interests. No competing interests are present.

Author contribution
T.J.F. lead the THAI project and has written the manuscript. E.T.W ran the simulation for Fig 1. and plotted the figures. Every author contributed to the development of the THAI protocol, to the discussions and to the editing of the manuscript.