Extreme Events Representation in CMCC-CM2 Standard and High Resolution General Circulation Models

The recent advancements in climate modelling partially build on the improvement of horizontal resolution in different components of the simulating system. A higher resolution is expected to provide a better representation of climate variability, and in this work we are particularly interested in the potential improvements in representing extreme events of high temperature and precipitation. The two versions of the CMCC-CM2 model used here adopt the highest horizontal resolutions available within the latest family of global coupled climate models developed at CMCC to participate in the CMIP6 effort. The main aim of this study is to document the ability of the CMCC-CM2 models to represent the spatial distribution of extreme events of temperature and precipitation over the historical period, comparing model results to observations (ERA5 reanalysis, MSWEP and CHIRPS observations). For a more detailed evaluation we use both 6-hourly and daily time series to compute indices representative of intense and extreme conditions. In terms of mean climate, the two models are able to realistically reproduce the main patterns of temperature and precipitation. The high resolution version (¼ degree horizontal resolution) of the atmospheric model provides better results than the standard resolution one (one degree), not only in terms of means but also in terms of intense and extreme events of temperature defined at daily and 6-hourly frequency. This is also the case for average and intense precipitation. On the other hand, extreme precipitation is not improved by the adoption of a higher horizontal resolution.


The numerical experiments
The CMCC general circulation model has been developed in several configurations (Cherchi et al. 2019). The model uses as its atmospheric module the CAM atmospheric component (CAM4, Neale et al. 2010) in its grid point configuration. We will not go into a detailed description here, but since it is relevant to our discussion of precipitation biases, we note that the deep convection scheme is the one developed by Zhang and McFarlane (1995), modified following Richter and Rasch (2008) and Raymond and Blyth (1986, 1992). The scheme is based on a plume ensemble approach in which it is assumed that an ensemble of convective scale updrafts may exist whenever the atmosphere is conditionally unstable in the lower troposphere. Moist convection occurs only when there is convective available potential energy (CAPE), and parcel ascent from the sub-cloud layer acts to destroy the CAPE at an exponential rate using a specified adjustment time scale. In other words, the deep convection scheme is triggered based on a minimum positive threshold of CAPE, as in the standard version of the CAM5 model (Wang and Zhang, 2013).
The two models that are the object of this study differ only in the horizontal resolution of their atmospheric component (CAM4), which is one degree in HR (the standard resolution one) and ¼ degree in VHR (the high resolution one). The ocean and sea-ice components are the same in the HR and VHR models: a ¼ degree horizontal resolution version for both ocean (NEMO3.6, Madec & the NEMO team, 2016) and sea-ice (CICE4, Hunke & Lipscomb, 2008). The land model (CLM4.5, Oleson et al., 2013) is implemented on the atmospheric model grid. The basics of the coupling between the different components are described in Fogli and Iovino (2014). The single components of the coupled model are described in detail in Cherchi et al. (2019); additional studies based on the latest-generation CMCC GCMs can be found in Scoccimarro et al. 2017a, Scoccimarro et al. 2020, and Bellucci et al. 2021.
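The CAPE-based trigger and exponential adjustment described above can be illustrated with a minimal sketch. This is not the Zhang and McFarlane implementation used in CAM4: the function name `relax_cape` and the numerical values of the threshold and of the adjustment time scale are assumptions chosen only for illustration.

```python
import math

# Illustrative values only (assumptions, not taken from the model configuration).
CAPE_THRESHOLD = 70.0  # J/kg, minimum positive CAPE required to trigger convection
TAU = 3600.0           # s, adjustment time scale


def relax_cape(cape, dt, tau=TAU, threshold=CAPE_THRESHOLD):
    """Return CAPE (J/kg) after a time interval dt (s).

    If CAPE does not exceed the threshold, deep convection is not
    triggered and CAPE is returned unchanged. Otherwise CAPE is
    consumed at an exponential rate set by the time scale tau.
    """
    if cape <= threshold:
        return cape
    return cape * math.exp(-dt / tau)


# After one adjustment time scale, CAPE is reduced by a factor e.
cape_after_one_tau = relax_cape(1000.0, TAU)
```

In this simplified picture a shorter `tau` removes instability faster, while the threshold only decides whether the scheme is active at all.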
No changes are applied in terms of parameterization choices, and related tuning parameters, moving from HR to VHR, in order to be compliant with the HighResMIP protocol. Also, the two model versions use the same number of vertical levels in both the atmosphere (26) and ocean (50) components. The complete set of experiments run with these two models is described in Haarsma et al. 2016. In the current analysis we investigate the hist-1950 HighResMIP experiment as described in section 2.3.

Re-analyses and observations for comparison
Since we aim to characterize different types of extreme events, we consider both 6-hourly and daily time series for the computation of the percentiles (see 2.3) for the chosen climate parameters.
Since there are many known issues with ERA5 precipitation (Rivoire et al., 2021; Hu et al., 2020; Crosset et al. 2020), for the evaluation of the model performance in representing the precipitation distribution we rely on the MSWEP version 2 observational dataset (Beck et al. 2019). The Multi-Source Weighted-Ensemble Precipitation (MSWEP) global precipitation dataset is available at a 3-hourly temporal resolution, covering the period from 1979 to the near present, with a horizontal resolution of 0.1 degrees. The dataset takes advantage of the complementary strengths of gauge-, satellite-, and reanalysis-based data to provide reliable precipitation estimates over the globe.
For a more exhaustive evaluation of the precipitation distribution, we also take advantage of the CHIRPS (Climate Hazards group Infrared Precipitation with Stations) daily observational dataset. Version 2.0 of the CHIRPS database covers a quasi-global (50°S-50°N, 180°E-180°W) domain at ¼ degree resolution and provides gridded daily precipitation time series from 1981 to near-present. This dataset merges three types of information: global climatology, satellite estimates, and in situ observations (Funk et al. 2015).

Methodology
The period used to compare the simulated temperature (tas) distribution to the observations is 1950-2014. On the other hand, due to the shorter period available for the MSWEP and CHIRPS datasets, the precipitation (pr) distribution is evaluated over the common period between the observations and the historical model run (1981-2014). This time period is sufficiently long to capture the temporal variability at the global scale (Schindler et al. 2015). Typically, warm extremes are computed based on maximum daily temperature, but in this work we want to verify the potential improvements induced by the increased resolution in the representation of extreme temperature events defined at two different time frequencies (daily and 6-hourly).
For this reason we investigate the distribution of daily and 6-hourly average temperature (tas), instead of maximum daily temperature.
Model averages and the 99th/90th percentiles (99p/90p hereafter) are computed on the native grid, and the results are then compared to ERA5 or observational datasets by linearly interpolating the re-analysis (or observations) onto the model grid. The kind of interpolation introduces very small differences in the fields (not shown). We denote events belonging to the 99p as "extreme events" and the ones belonging to the 90p as "intense events" (Scoccimarro et al. 2016). Two seasons are considered, December to February (DJF hereafter) and June to August (JJA hereafter), representative of the boreal winter and summer, respectively.
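The two steps described above (percentiles on the native grid, then linear interpolation of the reference data onto the model grid) can be sketched for a single grid point and a one-dimensional latitude axis. The helper names `percentile` and `interp_linear`, the linear-interpolation percentile definition, and all numbers are assumptions for illustration; the actual analysis works on full two-dimensional fields and its exact percentile estimator is not specified here.

```python
import bisect


def percentile(values, q):
    """Percentile q (0-100) of a 1-D series, using linear interpolation
    between the two closest ranks."""
    data = sorted(values)
    if not data:
        raise ValueError("empty series")
    pos = (len(data) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def interp_linear(x_new, x_src, y_src):
    """Linearly interpolate y_src (given on ascending x_src) to x_new.
    Points outside the source range take the nearest end value."""
    out = []
    for x in x_new:
        if x <= x_src[0]:
            out.append(y_src[0])
        elif x >= x_src[-1]:
            out.append(y_src[-1])
        else:
            i = bisect.bisect_right(x_src, x) - 1
            w = (x - x_src[i]) / (x_src[i + 1] - x_src[i])
            out.append(y_src[i] + w * (y_src[i + 1] - y_src[i]))
    return out


# Synthetic time series at one grid point: "intense" and "extreme" thresholds.
series = list(range(101))
p90 = percentile(series, 90)
p99 = percentile(series, 99)

# Synthetic reference 99p field on its own latitudes, moved to model latitudes.
obs_lat = [0.0, 0.5, 1.0, 1.5, 2.0]
obs_p99 = [30.0, 31.0, 32.0, 31.0, 30.0]
model_lat = [0.25, 1.25]
model_p99 = [31.0, 33.0]
obs_on_model = interp_linear(model_lat, obs_lat, obs_p99)
bias = [m - o for m, o in zip(model_p99, obs_on_model)]
```

The bias is always evaluated on the model grid, so the model percentiles themselves are never interpolated.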

Where results are expressed as a percentage fraction (Figure S17 only), precipitation is shown only for regions where the seasonal average of precipitation is higher than 0.5 mm/d, to avoid misleading percentage differences over dry domains (Scoccimarro et al. 2013).
The comparison with CHIRPS precipitation data is performed at the daily frequency only.
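The 0.5 mm/d threshold used to exclude dry regions from percentage differences can be sketched as a simple mask. The function name `percent_bias` is hypothetical, and using the reference seasonal mean (rather than the model one) to define dry cells is an assumption, since the text does not specify which field is tested.

```python
def percent_bias(model, obs, seasonal_mean, min_mean=0.5):
    """Percentage difference (model - obs) / obs for each grid cell.

    Cells whose seasonal mean precipitation (mm/d) is not higher than
    min_mean are masked with None, so that very small denominators over
    dry domains do not produce misleadingly large percentages.
    """
    out = []
    for m, o, mean in zip(model, obs, seasonal_mean):
        if mean > min_mean and o != 0:
            out.append(100.0 * (m - o) / o)
        else:
            out.append(None)
    return out


# Two synthetic cells: a wet one (kept) and a dry one (masked).
result = percent_bias(model=[12.0, 0.3], obs=[10.0, 0.1], seasonal_mean=[2.0, 0.2])
```

Without the mask, the dry cell in this example would show a 200% difference for an absolute error of only 0.2 mm/d.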

Representation of extreme events of temperature
In this section modelled extreme temperature is compared to the ERA5 reanalysis. Figure 1 shows the DJF 99th percentile of [...]
Moving to the 6-hourly based extreme events, the fraction of land affected by a positive bias higher than 5 °C is larger than in the daily statistics, especially for the HR model during JJA (Figure 4). The positive bias over the north-western part of South America during JJA reaches 9 °C in HR and is only partially reduced in VHR; during the same season, the positive bias of the same order of magnitude over the central and eastern United States is not improved by the increased resolution. Similar but less pronounced patterns are found for the averaged temperature, as shown in supplemental Figures S1-S2, and for the representation of intense events (Figures S7-S10).
[...] associated with extreme events with respect to the total precipitation: this metric is obtained by accumulating the water of all the events more intense than the 99p and normalizing it by the total amount of precipitation in the considered period (season by season). Figure S17 shows that both models capture this metric reasonably well in both seasons compared to MSWEP, but the VHR model tends to overestimate it over the Southern Hemisphere, except for the Australian domain. In particular, the strong positive bias of DJF average precipitation over Australia (up to 4 mm/d, Figure S3, lower panels) cannot be attributed to the positive bias (higher than 15 mm/d, Figure 5, lower panels) found for extreme events, but must be associated with a right shift of the remaining part of the precipitation distribution, more pronounced for the non-extreme events, as partially confirmed by the positive bias in the 90p metric over the same season (Figure S11).
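The metric described above (water accumulated by events more intense than the 99p, normalized by the total precipitation of the season) can be sketched for a single grid point as follows. The names `percentile` and `extreme_fraction` are hypothetical, and computing the 99p over all time steps of the season (rather than over wet time steps only) is an assumption.

```python
def percentile(values, q):
    """Percentile q (0-100) of a 1-D series, using linear interpolation
    between the two closest ranks."""
    data = sorted(values)
    if not data:
        raise ValueError("empty series")
    pos = (len(data) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def extreme_fraction(precip, q=99.0):
    """Fraction of total precipitation delivered by events above the
    q-th percentile of the series."""
    total = sum(precip)
    if total == 0:
        return 0.0
    threshold = percentile(precip, q)
    extreme_water = sum(p for p in precip if p > threshold)
    return extreme_water / total


# Synthetic season: 99 light events of 1 mm and one event of 101 mm.
season = [1.0] * 99 + [101.0]
fraction = extreme_fraction(season)
```

In this synthetic season the single heaviest event delivers about half of the total water, so `fraction` is close to 0.5 even though the event represents only 1% of the time steps.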

Summary and conclusions
The CMCC-CM2-HR4 and CMCC-CM2-VHR4 models are state-of-the-art fully coupled climate models, participating in different Model Intercomparison Projects within the 6th Coupled Model Intercomparison Project (CMIP6). CMCC-CM2-HR4 has a horizontal resolution typical of most of the models involved in CMIP6, while CMCC-CM2-VHR4 has a horizontal resolution that is standard for the models involved in the High-Resolution Model Intercomparison Project (HighResMIP). In this paper we highlight the ability of the two models to represent extreme climate conditions, based on daily and 6-hourly time series, comparing modelled temperature and precipitation distributions to the observed ones. In order to have a gridded dataset representative of the observed climate at the daily and 6-hourly time frequency, we used the ERA5 reanalysis for temperature and MSWEP observations for precipitation. For the precipitation analysis we also reinforce our investigation on the basis of the CHIRPS daily observations. It is well known that the representation of precipitation extreme indices is more dependent on the horizontal resolution than what we expect for temperature extreme indices (Wei et al. 2019). Nevertheless, on average, the highest resolution CMCC model (VHR) is better than the lower resolution model (HR) in representing average, intense (90p) and extreme (99p) events of temperature, both in terms of patterns and magnitude. This is true for daily and 6-hourly based statistics. Also, VHR results are in good agreement with the CMIP6 multi-member average of daily intense and extreme temperature indices (Scoccimarro and

Code and Data availability
The code for the CMCC-CM2-HR4 and CMCC-CM2-VHR4 climate models is available on the Zenodo repository (URL: https://zenodo.org/record/5499856#.YTs5Bh2xVZP, doi: 10.5281/zenodo.5499856). The data for the two models are available through the ESGF data portal (Scoccimarro et al. 2017b and Scoccimarro et al. 2017c, respectively). ERA5 reanalysis data are available through the Copernicus data portal (https://climate.copernicus.eu). The CHIRPS observational dataset is available through the data storage of the University of California in Santa Barbara (https://www.chc.ucsb.edu/data/chirps).

Author contribution
ES, AB and DP implemented the two model versions and ran the simulations. PGF supported the implementation of the aerosol input management routines. TL prepared the radiative forcing files and supported the model output postprocessing.
ES prepared the manuscript with contributions from all co-authors.