Extreme Events Representation in CMCC-CM2 High and Very-High Resolution General Circulation Models

Recent advancements in climate modelling partially build on the improvement of horizontal resolution in different components of the simulating system. A higher resolution is expected to provide a better representation of climate variability, and in this work we are particularly interested in the potential improvements in representing extreme events of high temperature and precipitation. The two versions of the CMCC-CM2 model used here adopt the highest horizontal resolutions available within the latest family of global coupled climate models developed at CMCC to participate in the CMIP6 effort. The main aim of this study is to document the ability of the CMCC-CM2 models to represent the spatial distribution of extreme events of temperature and precipitation over the historical period, comparing model results to observations (ERA5 reanalysis and CHIRPS observations). For a more detailed evaluation we investigate both 6-hourly and daily time series for the definition of the extreme conditions. In terms of mean climate, the two models are able to realistically reproduce the main patterns of temperature and precipitation. The very-high resolution version (1/4 degree horizontal resolution) of the atmospheric model provides better results than the high resolution one (one degree), not only in terms of means but also in terms of extreme events of temperature defined at daily and 6-hourly frequency. This is also the case for average precipitation. On the other hand, extreme precipitation is not improved by the adoption of a higher horizontal resolution.

(HighResMIP, Haarsma et al., 2016) was designed to understand the role of the horizontal resolution. In this paper, we present an analysis based on two versions of the GCM developed at CMCC (CMCC-CM, Cherchi et al., 2019), which we use for two simulations of the present climate differing only in the atmospheric horizontal resolution: HR, with a horizontal resolution of 1 degree, and VHR, with a resolution of 1/4 of a degree. The two models are described in detail in the next section. The difference between the results obtained with the two versions of the model allows us to evaluate the impact of the model horizontal resolution on the temporal distribution of temperature and precipitation events compared to observations. It has been shown that the horizontal resolution can affect the representation of extreme events in state-of-the-art climate models (Van Haren et al., 2015; Iles et al., 2020). Besides, Demory et al. (2020) have shown that high resolution models, when implemented with a resolution similar to VHR, achieve skill comparable to state-of-the-art Regional Climate Models in reproducing precipitation distributions over Europe. However, such analyses have employed rather low frequency data, and short-duration high-intensity precipitation events can easily escape detection if high-frequency data are not used (Meredith et al. 2020, Scoccimarro et al. 2015). In this paper we present both a daily and a high-frequency analysis using 6-hourly data from the experiments, comparing model results to data from a reanalysis data set with comparable horizontal resolution (ERA5, Hersbach et al. 2020). The importance of evaluating extreme events at the sub-daily scale resides in the impact of such events on human health and on both urban and rural environments (Wehner et al. 2021).
The work is organized as follows: Sect. 2 describes the data and the methodology adopted, Sect. 3 and Sect. 4 describe the evaluation of the model ability to represent the distribution of temperature and precipitation events, respectively, and Sect. 5 summarises and concludes the work.
[...] degree in VHR. The land model (CLM4.5, Oleson et al., 2013) is implemented on the atmospheric model grid. The basics of the coupling between the different components are described in Fogli and Iovino (2014). The single components of the coupled model are described in detail in Cherchi et al. (2019); additional studies based on the CMCC GCM can be found in Scoccimarro et al. (2017a) and Bellucci et al. (2021). No changes are applied in terms of parameterization choices (and related tuning parameters) moving from HR to VHR. Also, the two model versions use the same number of vertical levels in both the atmosphere (26) and ocean (50) components. The complete set of experiments run with these two models is described in Haarsma et al. (2016).

Re-analyses and observations for comparison
The model performance is evaluated by comparing results to the European Centre for Medium Range Weather Forecasts (ECMWF) ERA5 re-analyses (Hersbach et al. 2020; Andersson and Thepaut, 2008), with 137 hybrid sigma/pressure (model) levels in the vertical, and the top level at 0.01 hPa. The data used in the paper (two-meter temperature, hereafter "temperature", and precipitation) can be obtained from the Copernicus Climate Data Store (CDS) at https://cds.climate.copernicus.eu up to hourly frequency. The horizontal resolution is close to that of the higher resolution (VHR) model employed here (1/4 degree). Since we aim at the characterization of different types of extreme events, we consider both 6-hourly and daily time series for the computation of the percentiles (see Sect. 2.3) of the temporal distributions of the chosen climate parameters. It is important to note that the improvement of the ERA5 reanalysis with respect to the previous ERA-Interim product (Dee et al. 2011) is due not only to the increased resolution but also to the addition of new integrated observation and aircraft data covering the recent decades, assimilated by the 4D-Var algorithm. The precipitation in ERA5 is generated by both large scale parameterizations (Forbes & Ahlgrimm, 2014; Forbes & Tompkins, 2011; Tiedtke, 1993) and a convection scheme (Bechtold et al., 2008; Hirons et al., 2013; Tiedtke, 1989).
For a more exhaustive evaluation of the precipitation distribution, we also take advantage of the high resolution CHIRPS (Climate Hazards group Infrared Precipitation with Stations) daily observational data set. Version 2.0 of the CHIRPS database provides gridded daily precipitation time series over a quasi-global (50°S-50°N, 180°E-180°W) domain, at 1/4 degree resolution, from 1981 to near-present. This dataset merges three types of information: global climatology, satellite estimates, and in situ observations (Funk et al. 2015). Since the observed precipitation is not assimilated into the ERA5 reanalysis until 2009, a comparison of model precipitation with CHIRPS, in addition to the ERA5 product, is necessary.

Methodology
The period used to compare the simulated temperature (tas) and precipitation (pr) distributions to the observations is 1950-2014. This period is sufficiently long to capture the temporal variability at the global scale (Schindler et al. 2015). Model averages and the 99th percentile (99p hereafter) are computed on the native grid, and the results are then compared to ERA5 by linearly interpolating the re-analysis (and/or CHIRPS observations) onto the model grid. The grid differences are minor and therefore the interpolation introduces only very small differences in the fields. We denote events belonging to the 99p as "extreme events" (Scoccimarro et al. 2016). Two seasons are considered, December to February (DJF hereafter) and June to August (JJA hereafter), representative of the boreal winter and summer, respectively.
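As an illustration of the regridding step, the following is a minimal sketch of linear (bilinear) interpolation of a reference field onto a single point of the model grid. It is not the authors' implementation: the function names, the nested-list field layout and the toy values are assumptions made for the example, the coordinate axes are assumed ascending with at least two points each, and the periodic wrap in longitude is not handled.

```python
from bisect import bisect_right


def bracket(axis, x):
    """Return (i, w) such that x is approximately axis[i]*(1-w) + axis[i+1]*w.

    The axis must be ascending with at least two points. Values outside
    the axis range are clamped to the nearest edge.
    """
    if x <= axis[0]:
        return 0, 0.0
    if x >= axis[-1]:
        return len(axis) - 2, 1.0
    i = bisect_right(axis, x) - 1
    w = (x - axis[i]) / (axis[i + 1] - axis[i])
    return i, w


def bilinear(field, lats, lons, lat, lon):
    """Interpolate field[lat_index][lon_index] linearly to the point (lat, lon)."""
    i, wy = bracket(lats, lat)
    j, wx = bracket(lons, lon)
    # Interpolate along longitude on the two bracketing latitude rows.
    row_a = field[i][j] * (1.0 - wx) + field[i][j + 1] * wx
    row_b = field[i + 1][j] * (1.0 - wx) + field[i + 1][j + 1] * wx
    # Then interpolate between the two rows along latitude.
    return row_a * (1.0 - wy) + row_b * wy


# Hypothetical 2 x 2 reference field and a model grid point at its centre.
reference = [[0.0, 1.0], [2.0, 3.0]]
value_at_centre = bilinear(reference, [0.0, 1.0], [0.0, 1.0], 0.5, 0.5)
```

In practice this operation would be applied to every model grid point and is normally delegated to a dedicated regridding tool; the sketch only makes the weighting explicit.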
Percentiles computed at the daily time frequency are obtained based on a sample of 5850 (90 days x 65 years) events, while the percentiles computed at the six-hourly time frequency are obtained based on a sample of 23400 (90 days x 65 years x 4 six-hourly data in a day) events. Model versus ERA5 biases are shown as differences for temperature, expressed in degrees Celsius [°C], and as percentage differences for precipitation. The precipitation biases are shown only for regions where the seasonal average of precipitation is higher than 0.5 mm/d, to avoid misleading percentage differences over dry domains (Scoccimarro et al. 2013). The comparison with CHIRPS precipitation data is performed at the daily frequency only, for the shorter period 1981-2014 covered by this dataset; therefore the percentiles are obtained based on a sample of 3060 (90 days x 34 years) events.
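The sample sizes and bias definitions above can be sketched at a single grid point as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the percentile uses linear interpolation between order statistics (the text does not specify the estimator), and the dry-region mask is assumed here to be applied to the seasonal mean of the reference data set.

```python
import math


def percentile(values, q):
    """q-th percentile (0-100), linear interpolation between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("empty sample")
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1.0 - frac) + ordered[hi] * frac


def sample_size(days_per_season, years, steps_per_day=1):
    """Number of events entering the percentile computation."""
    return days_per_season * years * steps_per_day


def temperature_bias(model_value, reference_value):
    """Temperature bias as a plain difference (degrees Celsius)."""
    return model_value - reference_value


def precipitation_bias_percent(model_value, reference_value,
                               seasonal_mean, dry_threshold=0.5):
    """Precipitation bias in percent, or None where the domain is too dry.

    Assumption: seasonal_mean is the reference seasonal average in mm/d.
    """
    if seasonal_mean <= dry_threshold:
        return None
    return 100.0 * (model_value - reference_value) / reference_value


n_daily = sample_size(90, 65)         # daily, 1950-2014
n_6hourly = sample_size(90, 65, 4)    # six-hourly, 1950-2014
n_chirps = sample_size(90, 34)        # daily, 1981-2014
```

Returning None for dry points mirrors the masking described in the text; a gridded implementation would use a masked array or NaN instead.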

Representation of extreme events of temperature
In this section modelled extreme temperature is compared to the ERA5 reanalysis. Figure 1 shows the 99th percentile of ERA5 temperature time series at the daily (left panels) and 6-hourly (right panels) frequency, for DJF (upper panels) and JJA (lower panels) [...]
Moving to the 6-hourly based extreme events, the fraction of land affected by a positive bias higher than 5 °C is larger than for the daily statistics, especially for the HR model during JJA (Figure 3). The positive bias over the north-western part of South America, during JJA, reaches 9 °C in HR and is only partially reduced in VHR; during the same season, the positive bias of the same order of magnitude over the central and eastern United States is not improved by the increased resolution. Similar but less pronounced patterns are found in the average temperature, as shown in supplemental Figure S1.

Representation of extreme events of precipitation
Following the same structure as in the previous section, the modelled extreme precipitation is here compared to the ERA5 reanalysis (from Figure 4 to Figure 6) for both daily and 6-hourly statistics, and then to the CHIRPS data set (Figure 7) for daily statistics only. Figure 4 shows the ERA5 seasonal extreme precipitation for DJF (upper panels) and JJA (lower panels) during the historical period. The left panels of Figure 4 refer to the 99th percentile computed on daily time series, while the right panels refer to the same percentile computed on 6-hourly time series. The higher magnitude of extreme events in the 6-hourly results (Figure 4, right panels) compared to the daily statistics (Figure 4, left panels) is visible almost everywhere, but it is more pronounced over the Tropics. This is where convective processes are expected, and it is well known that convective precipitation tends to be short lived, while long-duration intense events (from 12 hours to 3 days) are often associated with synoptic weather systems and tend to have larger spatial scales (Chan et al. 2014, Scoccimarro et al. 2015). While reanalysis results are shown in millimeters per day (mm/day), the model biases are shown as percentage change with respect to the ERA5 reanalysis (see Section 2.3 for details). In terms of average precipitation, the VHR model shows less pronounced biases than the HR model (Figure S2). In particular, during DJF, the negative bias over the northern part of South America is reduced from -80% to less than -50%, while the positive bias over the western United States, South Africa and Australia is almost halved. During JJA, the bias tends to be less pronounced in both models, and the differences between the two are mainly located over Peru, Bolivia and Brazil, ranging from about -80% in the HR model to values closer to zero, even positive over a small portion of the domain, in the VHR model.
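The larger magnitude of 6-hourly extremes where rainfall is short lived can be illustrated with a toy example. The sketch below is only a demonstration under assumptions: the two synthetic days, the function name and the choice of expressing 6-hourly values as rates in mm/day are all hypothetical, and the series maximum is used in place of a high percentile to keep the example short.

```python
def daily_means(six_hourly_rates, steps_per_day=4):
    """Average consecutive 6-hourly rates (mm/day) into daily rates (mm/day)."""
    if len(six_hourly_rates) % steps_per_day != 0:
        raise ValueError("series length must be a whole number of days")
    return [
        sum(six_hourly_rates[i:i + steps_per_day]) / steps_per_day
        for i in range(0, len(six_hourly_rates), steps_per_day)
    ]


# Hypothetical days with the same daily total but different duration:
# a short convective burst and a steady synoptic event.
convective_day = [0.0, 40.0, 0.0, 0.0]
synoptic_day = [10.0, 10.0, 10.0, 10.0]

series_6h = convective_day + synoptic_day
series_daily = daily_means(series_6h)

peak_6h = max(series_6h)        # dominated by the convective burst
peak_daily = max(series_daily)  # both days look identical at daily scale
```

At the daily scale the two days are indistinguishable, whereas the 6-hourly series retains the intensity of the short-lived event, which is the behaviour the text attributes to convective regions.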
A different behavior is found when focusing on daily extreme precipitation events. No particular differences between the biases at the two resolutions are found in the Northern Hemisphere, while the VHR model tends to overestimate the 99th percentile of the daily precipitation distribution in the Southern Hemisphere in both seasons, especially within the Tropics (Figure 5). Similar patterns emerge for the 6-hourly based extreme precipitation (Figure 6), but with a less pronounced overestimate in VHR over the Tropics, compared to HR results.
To corroborate our results in terms of precipitation biases, we computed the same statistics using the CHIRPS observational daily dataset in place of ERA5, for averages (Figure S3) and extreme events (Figure 7). The biases computed with respect to the CHIRPS dataset are very similar to those already described based on ERA5, but with a reduced magnitude (Figure 7 compared to Figure 5) for extreme events in both models, during JJA, along the Tropics.
The worsening of the extreme precipitation bias moving from the HR to the VHR model is also associated with a deterioration in the representation of the fraction of precipitation associated with extreme events with respect to the total precipitation: Figure S4 shows that both models capture this metric reasonably well in both seasons compared to ERA5, but the VHR model tends to overestimate this fraction over the Southern Hemisphere, especially during DJF, except for the Australian domain. In particular, the strong positive bias of DJF average precipitation over Australia (up to 140%, Figure S3, upper panels) cannot be attributed to the positive bias (about 50%, Figure 7, upper panels) found for extreme events, but must be associated with a right shift of the remaining part of the precipitation distribution, more pronounced for the non-extreme events. In fact, the potential contribution of the positive bias in extreme events to the bias in the average precipitation is also partially offset by the model tendency to halve the fraction of water attributable to extreme events over this domain, compared to the observed fraction (Figure S4).
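The fraction of precipitation associated with extreme events can be sketched at a single grid point as follows. This is a hedged illustration, not the metric exactly as computed for Figure S4: the function names are hypothetical, the percentile estimator is an assumption, and events are counted as extreme when they are at or above the 99p threshold, a convention the text does not specify.

```python
import math


def percentile(values, q):
    """q-th percentile (0-100), linear interpolation between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("empty sample")
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1.0 - frac) + ordered[hi] * frac


def extreme_fraction(precip, q=99.0):
    """Share of total precipitation delivered by events at or above the q-th percentile."""
    total = sum(precip)
    if total == 0:
        return 0.0
    threshold = percentile(precip, q)
    extreme_total = sum(p for p in precip if p >= threshold)
    return extreme_total / total


# Hypothetical series: 99 dry days and a single 10 mm/day event.
single_event = [0.0] * 99 + [10.0]
fraction_single = extreme_fraction(single_event)

# Hypothetical series: amounts of 1, 2, ..., 100 mm/day.
ramp = [float(v) for v in range(1, 101)]
fraction_ramp = extreme_fraction(ramp)
```

A model that halves this fraction relative to observations, as described for Australia, delivers proportionally more of its rainfall through non-extreme events even if its 99p is too high.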

Summary and conclusions
The CMCC-CM2-HR4 and CMCC-CM2-VHR4 models are state-of-the-art fully coupled climate models, participating in different Model Intercomparison Projects within the 6th Coupled Model Intercomparison Project (CMIP6). CMCC-CM2-HR4 has a horizontal resolution typical of most of the models involved in CMIP6, while CMCC-CM2-VHR4 has a horizontal resolution standard for the models involved in the High-Resolution Model Intercomparison Project (HighResMIP). In this paper we highlight the ability of the two models to represent extreme climate conditions, based on daily and 6-hourly time series, comparing modelled temperature and precipitation distributions to the observed ones. In order to have a gridded data set representative of the observed climate at the 6-hourly time frequency we used the ERA5 reanalysis, and for the precipitation analysis we also reinforce our findings on the basis of the CHIRPS daily observations. On average, the highest resolution model (VHR) is better than the lower resolution model (HR) in representing average and extreme events of temperature, both in terms of patterns and magnitude. This is true for daily and 6-hourly based statistics. The described differences between the computed daily and 6-hourly biases in temperature statistics are very similar for the HR and VHR models. This result suggests that the horizontal resolution is not the cause of such differences. Consequently, the worsening of model biases in high frequency (6-hourly) temperature statistics derives from deficiencies of the current version of model components and parameterizations in representing high-frequency processes.
Regarding the precipitation distribution, the VHR model performs better in representing averages, but more pronounced biases appear in VHR compared to HR when focusing on extreme events, with a more evident degradation in the daily statistics compared to the 6-hourly ones. This latter result reduces the confidence we usually attribute to the highest horizontal resolution in modelling extreme precipitation, and is consistent with recent findings (Bador et al. 2020) suggesting that the highest resolution models tend to produce more pronounced extremes than lower resolution ones, and that many of them show lower skill, both in terms of intensity and spatial distribution, at higher resolution compared to their corresponding lower resolution version. This emphasizes the need to focus not only on the horizontal resolution to improve the model ability to represent the climate system, but also on physics and tuning. In particular, in the highest resolution model analysed here (VHR), the tuning parameters were kept unchanged moving from the HR to the VHR version, in order to comply with the protocol of the PRIMAVERA EU project.
The different biases obtained at daily and 6-hourly time frequencies also suggest that, for the setup of model physics and tuning, we need to consider the event distributions at different time frequencies, to take into account the representation of the different processes responsible for the extreme conditions emerging at the different frequencies (Scoccimarro et al. 2015). The poor performance of climate models in representing extreme precipitation is not improved in the latest CMIP6 generation models compared to the previous CMIP5 generation, and in this work we have shown that this is even more evident moving to the highest resolution version of the CMCC-CM2 model adopted for HighResMIP, consistent with multi-model analyses performed at the same horizontal resolution (Bador et al. 2020).

Code and Data availability
The code of the CMCC-CM2-HR4 and CMCC-CM2-VHR4 climate models is available on the Zenodo repository (URL: https://zenodo.org/record/5499856#.YTs5Bh2xVZP, doi: 10.5281/zenodo.5499856). The data of the two models are available through the ESGF data portal (Scoccimarro et al. 2017b and Scoccimarro et al. 2017c, respectively). ERA5 reanalysis data are available through the Copernicus data portal (https://climate.copernicus.eu). The CHIRPS observational data set is available through the data storage of the University of California, Santa Barbara (https://www.chc.ucsb.edu/data/chirps).

Author contribution
ES, AB and DP implemented the two model versions and ran the simulations. PGF supported the implementation of the aerosol input management routines. TL prepared the radiative forcing files and supported the model output postprocessing.
ES prepared the manuscript with contributions from all co-authors.