Towards an objective assessment of climate multi-model ensembles. A case study: the Senegalo-Mauritanian upwelling region

Climate simulations require very complex numerical models. Unfortunately, they typically present biases due to parameterizations, choices of numerical schemes, and the complexity of many physical processes. Beyond improving the models themselves, a way to improve the performance of the modeled climate is to consider multi-model combinations. In the present study, we propose a method to select the models that yield an efficient multi-model ensemble combination. We used a neural classifier (Self-Organizing Maps) associated with a multiple correspondence analysis to identify the models that best represent a target climate property. We can thereby determine an efficient multi-model ensemble. We illustrated the methodology with results focusing on the mean sea surface temperature seasonal cycle over the Senegalo-Mauritanian region. We compared 47 CMIP5 model configurations to available observations. The method allows us to identify an efficient multi-model combination of 12 climate models. The future decrease of the Senegalo-Mauritanian upwelling proposed in recent studies is then revisited using this multi-model selection.


In this study, we present a methodology aimed at selecting a coherent sub-ensemble of the models involved in the Coupled Model Intercomparison Project, Phase 5 (CMIP5) that best represents specific observed characteristics. While the future evolution of the global climate is subject to great changes and great uncertainty (Collins et al., 2014), the most common way to predict the evolution of the climate is to run climate models that include fully coupled atmosphere-ocean-cryosphere-biosphere modules. Due to their low resolution, and the fact that they use different parameterizations of the physics and different numerical schemes, and sometimes include or neglect different processes, these models have some marked biases in specific regions. They also have different responses to an imposed increase of atmospheric greenhouse gases, which partly explain their mean climate biases. This variety of models allows us to assess the uncertainty of present climate representation when compared to observations and, by studying their dispersion, to roughly estimate the uncertainty of the response to future climate change.

For several generations of climate models, it has been shown that for a large variety of [...] their study also highlighted a large uncertainty due to model biases in this region. The method we have developed selects a subset of the CMIP5 ensemble based on the capability of the climate models to reproduce the SST seasonal cycle observed during the historical period in key sub-regions. These sub-regions are identified by a neural classifier. The method leads us to rank the different models and to determine an efficient multi-model combination for the analysis of the Senegalo-Mauritanian upwelling and projections of its behavior in global warming conditions.
The paper is structured as follows: section 2 presents the different climate models and the climatological observations used in the study, together with the region of interest. The classification method is described in section 3 and applied to the extended region. Section 4 presents a qualitative analysis that groups the different climate models into clusters with similar performance. Section 5 investigates the results of the method applied over a smaller area, [...]

This study is based on the CMIP5 (Coupled Model Intercomparison Project Phase 5) database. We use the output of 47 simulations listed in Table 1. The models are evaluated over the historical period defined as [...] (Rayner, 2003). The main results regarding the future of the upwelling were shown to be independent of the validation dataset, primarily because the models' biases and the inter-model differences were much larger than the differences between the validation datasets. The methodological and oceanographic results presented in this study are thus expected to depend only very weakly on the target dataset.

In section 6, the model selections are used to characterize the response of the upwelling to climate change. This response is characterized in terms of SST anomalies as well as wind intensity. For wind intensity, the simulated wind stress is compared to the TropFlux reanalysis.

[...] region and the return branch of the subtropical gyre in the northwestern part. Therefore, we first study the representation of the SST seasonal cycle intensity in the different climate models over a relatively large region that includes part of the Canary current in the north and the Guinea dome in the south. The so-called "extended region" is defined by a rectangular box extending from 9°W to 45°W and from 5°N to 30°N (Fig. 1). In a second step, we will carry out the same analysis and classification of the models within a much more focused (hereafter zoomed) region, namely [16°W-28°W and 10°N-23°N] (Fig. 1). All the results below will first be shown for the extended region. Comparison with the focused region will be done in section 4.

[...] efficiently. In the present section, we describe the methodology we developed to score the different climate models with respect to the observations. In section 4, we will tentatively group the different climate models into blocks having the same behavior by using a Multiple [...]

[...] of a SOM and the topological order is achieved through a minimization process using a learning data set, here from the observations. The cost function to be minimized is of the form: [...] indexes the neurons of the SOM map, [...] is the allocation function that assigns [...] cycle" the vector z in the following.

We used a SOM-map to summarize the different SST seasonal cycles present in the "extended region". We found that 120 prototypes (or neurons) can accurately represent the 743 vectors of D. This reduction (or vector quantization) is made by using a rectangular SOM-map of 30 x 4 neurons.
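To illustrate the vector quantization described above, the following is a minimal pure-Python sketch of an online SOM on a rectangular grid. It is not the implementation used in this study: the Gaussian neighborhood, the exponential decay of the learning rate and of the neighborhood radius, the initialization on randomly drawn data vectors, and all function and parameter names (`train_som`, `best_matching_unit`, `sigma_start`, etc.) are illustrative assumptions.

```python
import math
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(prototypes, z):
    """Index of the prototype (neuron) closest to the vector z."""
    return min(range(len(prototypes)),
               key=lambda c: squared_distance(prototypes[c], z))


def train_som(data, rows, cols, n_epochs=50, sigma_start=None,
              sigma_end=0.5, lr_start=0.5, lr_end=0.01, seed=0):
    """Train a rows x cols SOM on `data` (a list of equal-length vectors).

    Assumed settings (not taken from the paper): Gaussian neighborhood,
    exponentially decaying learning rate and neighborhood radius.
    Returns the list of rows * cols prototype vectors.
    """
    rng = random.Random(seed)
    grid = [(r, k) for r in range(rows) for k in range(cols)]
    # Initialize each prototype on a randomly drawn data vector.
    prototypes = [list(rng.choice(data)) for _ in grid]
    if sigma_start is None:
        sigma_start = max(rows, cols) / 2.0
    n_steps = n_epochs * len(data)
    step = 0
    for _ in range(n_epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            frac = step / max(n_steps - 1, 1)
            sigma = sigma_start * (sigma_end / sigma_start) ** frac
            lr = lr_start * (lr_end / lr_start) ** frac
            z = data[i]
            bmu_r, bmu_k = grid[best_matching_unit(prototypes, z)]
            for c, (r, k) in enumerate(grid):
                # Neighborhood weight decreases with distance on the map.
                d2 = (r - bmu_r) ** 2 + (k - bmu_k) ** 2
                h = math.exp(-d2 / (2.0 * sigma ** 2))
                prototypes[c] = [w + lr * h * (x - w)
                                 for w, x in zip(prototypes[c], z)]
            step += 1
    return prototypes
```

Under these assumptions, calling `train_som(D, rows=30, cols=4)` on the 743 seasonal-cycle vectors would return 120 prototypes, and `best_matching_unit` would assign each grid point's seasonal cycle to its neuron.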
We then reduced the number of neurons in order to facilitate their interpretation in terms of geophysical processes. For this, we applied a HAC using the Ward dissimilarity (Jain and Dubes, [...]

The typical SST climatological cycles for each region-cluster are presented in Fig. 2b together with their related error bars. We note that the region-clusters are well identified, their typical climatological annual cycles of SST being well separated. Furthermore, the 7 region-clusters are spatially coherent and have a definite geophysical significance.
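The grouping of SOM prototypes into a small number of region-clusters can be sketched as follows, taking HAC to denote a hierarchical agglomerative clustering. This is a simplified illustration, not the procedure actually run in this study: it uses a naive greedy search, gives each prototype unit weight, and ignores any map-neighborhood constraint. The function name `ward_clusters` is hypothetical. The merging cost is the standard Ward criterion, i.e. the increase in within-cluster sum of squares, n_a n_b / (n_a + n_b) times the squared distance between the two centroids.

```python
def ward_clusters(vectors, n_clusters):
    """Greedy Ward agglomeration of `vectors` down to `n_clusters` groups.

    At each step, the two clusters whose fusion least increases the
    within-cluster sum of squares are merged.
    Returns one integer label (0 .. n_clusters - 1) per input vector.
    """
    clusters = [{"members": [i], "centroid": list(v)}
                for i, v in enumerate(vectors)]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                na = len(clusters[a]["members"])
                nb = len(clusters[b]["members"])
                d2 = sum((x - y) ** 2
                         for x, y in zip(clusters[a]["centroid"],
                                         clusters[b]["centroid"]))
                cost = na * nb / (na + nb) * d2  # Ward merging cost
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        ca, cb = clusters[a], clusters[b]
        na, nb = len(ca["members"]), len(cb["members"])
        merged = {
            "members": ca["members"] + cb["members"],
            # Size-weighted mean of the two centroids.
            "centroid": [(na * x + nb * y) / (na + nb)
                         for x, y in zip(ca["centroid"], cb["centroid"])],
        }
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append(merged)
    labels = [None] * len(vectors)
    for k, cluster in enumerate(clusters):
        for i in cluster["members"]:
            labels[i] = k
    return labels
```

Applied to the 120 prototypes with `n_clusters=7`, such a routine would return one region-cluster label per neuron; each grid point then inherits the label of its best-matching neuron.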

For the extended region under study, 7 therefore appears to be an adequate cluster [...]

The aim is now to find the model(s) that best fit the "observation field". A heuristic approach is to compare the patterns of the region-clusters of the CMIP5 models with those of the "observation field" by visual inspection. This kind of approach was proposed in Sylla et al. (2019), and we indeed immediately see that some models fit the "observation field" better than others. Nonetheless, this method remains very subjective.

In the following, we present a more objective approach. We use the previous [...] for example). None of these models is ranked among the best models, with a score greater than 60%. As indicated above, this representation gives a very synthetic view of the structure of the seasonality of the SST cycle in each of the models, potentially a very useful guide for climate modelers to rapidly identify major biases.

In the (PC1, PC2) plane, the shorter the distance between two models, the more similar [...]

Two models (models 7 and 25) have a better skill than Model-group 4 and Model-All.

These two models are very close to the observations on the first two axes of the MCA (Fig. 4). It [...]

The classification presented above relies largely on the ability of the models to represent the offshore seasonal cycle of the SST. In the following, we propose to test the classification over a much smaller area in order to focus the analysis on the upwelling area. This "zoomed upwelling region" is shown in Fig. 1.

As for the extended region, we partitioned the observations of the zoomed upwelling region with a SOM (ZSOM in the following) followed by a HAC. We then applied a new MCA to regroup the climate models. We carried out an analysis similar to that performed in section 4. We [...] indeed represents a four-class picture fairly consistent with the observed structure (Fig. 7).

Important biases nevertheless remain. In particular, the ZRegion-clusters 2 and 4 characterizing the [...]

It is notable that all the models forming ZModel-group 2 are included in Model-group 4.

For a more precise assessment, we can also project the entire Model-group 4, identified as the best multi-model ensemble over the extended region, on the ZSOM (Fig. 9, right). We notice that [...]

Fig. 12 shows the difference of the SST seasonal cycle amplitude between these two periods. The general behavior is that the SST cycle amplitude will reduce in the upwelling region.

[...] with a cluster on the SOM map and consequently to a region-cluster on the geographical map. We built a similarity criterion by counting the number of grid points in a region-cluster of a given model matching the same region-cluster defined by processing the observation field. We then computed the ratio between that matching number and the number of pixels of the [...] study is easy to use but is far less informative than the vector-skill whose 7 components are the skills associated with the 7 sub-regions.
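The similarity criterion described above can be sketched as follows. Since the denominator of the ratio is not fully specified in the text available here, this sketch assumes it is the number of grid points of the observed region-cluster. The scalar score, taken as the overall fraction of matching grid points, is likewise an assumption, as are the function names and the use of 1-based cluster labels.

```python
def vector_skill(obs_labels, model_labels, n_clusters):
    """Per-cluster skill of a model against the observation field.

    obs_labels, model_labels: region-cluster label (1 .. n_clusters)
    of each grid point, in the same grid-point order.
    Component k is the fraction of grid points of observed cluster k
    that the model also assigns to cluster k (assumed denominator).
    """
    if len(obs_labels) != len(model_labels):
        raise ValueError("label lists must cover the same grid points")
    skills = []
    for k in range(1, n_clusters + 1):
        n_obs = sum(1 for o in obs_labels if o == k)
        n_match = sum(1 for o, m in zip(obs_labels, model_labels)
                      if o == k and m == k)
        # A cluster absent from the observations has no defined skill.
        skills.append(n_match / n_obs if n_obs else float("nan"))
    return skills


def scalar_skill(obs_labels, model_labels):
    """Overall fraction of grid points with matching labels (assumed)."""
    if len(obs_labels) != len(model_labels):
        raise ValueError("label lists must cover the same grid points")
    n_match = sum(1 for o, m in zip(obs_labels, model_labels) if o == m)
    return n_match / len(obs_labels)
```

For the extended region, `vector_skill(obs, model, 7)` would return the 7 components discussed above, which show in which sub-regions a model succeeds or fails, information that a single scalar score averages away.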

Such a multi-model ensemble selection makes it possible to sample a set of models in order to obtain a more realistic climatology over the region of interest. The response of the upwelling to climate change given by the different multi-model ensembles is quite robust in the sense that they give similar qualitative answers. However, a too-selective ensemble of models may lead to noisy patterns. A compromise thus has to be found: a large number of models leads to smoothed biases and unrealistic patterns, but also damps the characteristics of the selection. On the other hand, selecting the most realistic models may yield spurious biases in the ensemble mean.
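As a minimal sketch of how a multi-model ensemble mean over a selected subset could be formed, the following averages the fields of the chosen models grid point by grid point. The equal weighting of models, the flat-list representation of a field, and the names `ensemble_mean`, `fields` and `selected` are assumptions made for illustration, as are the model names in the usage note.

```python
def ensemble_mean(fields, selected):
    """Equal-weight mean of the selected models' fields.

    fields: dict mapping a model name to its field, given as a flat
            list of values on a common grid.
    selected: names of the models retained in the ensemble.
    """
    if not selected:
        raise ValueError("at least one model must be selected")
    chosen = [fields[name] for name in selected]
    if len({len(field) for field in chosen}) != 1:
        raise ValueError("all fields must be on the same grid")
    n = len(chosen)
    # zip(*chosen) groups the values of all models at each grid point.
    return [sum(values) / n for values in zip(*chosen)]
```

For instance, `ensemble_mean(fields, ["model_a", "model_b"])` averages two hypothetical models only, whereas passing every key of `fields` gives the mean over all models. Comparing the two is one simple way to see how the selection changes the ensemble pattern.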

As discussed in the introduction, different criteria have been used for extracting some efficient [...] based on the representation of several distinct regional behaviors. In spite of several subjective choices, including the studied domain and the statistical metrics, we argue that this method is a step towards an objective selection of models, based on a quantitative assessment rather than a qualitative analysis of maps of performance.

The methodology is general and can be adapted to any climate or oceanographic phenomenon.

[...] Fig. 2, 3, 6, 7 and 8.