On a new assessment method for long-term chemistry-climate simulations in the UTLS based on IAGOS data: application to MOCAGE CCMI-REFC1SD simulation

A wide variety of observation data sets are used to assess long-term simulations provided by chemistry-climate models (CCMs) and chemistry-transport models (CTMs). However, the upper troposphere–lower stratosphere (UTLS) has hardly been assessed in these models so far. Observations performed in the framework of IAGOS (In-service Aircraft for a Global Observing System) combine the advantages of in situ airborne measurements in the UTLS with an almost global-scale sampling, a ∼20-year monitoring period and a high frequency. Although a few model assessments have used the IAGOS database, none of them has yet taken advantage of the full set of dense, high-resolution cruise data. The present study proposes a method to compare this large IAGOS data set to long-term simulations used for chemistry-climate studies. For this purpose, a new software tool (named Interpol-IAGOS) projects all IAGOS data onto the 3D grid of the chosen model with a monthly resolution, since chemistry-climate models generally provide 3D outputs as monthly means. This provides a new IAGOS data set (IAGOS-DM) mapped onto the model's grid and time resolution. As a first application, the REF-C1SD simulation generated by the MOCAGE CTM in the framework of CCMI phase-1 has been evaluated during the 1994–2013 period for ozone (O3) and the 2002–2013 period for carbon monoxide (CO). This comparison is exclusively based on the grid cells sampled by IAGOS; the assessed model output (MOCAGE-M) is thus obtained by applying a corresponding mask onto the grid. First, climatologies are derived from the IAGOS-DM product. Good correlations are reported between IAGOS-DM and MOCAGE-M spatial distributions. As an attempt to analyse MOCAGE-M behaviour in the upper troposphere (UT) and the lower stratosphere (LS) separately, UT

https://doi.org/10.5194/gmd-2020-328 Preprint. Discussion started: 13 October 2020. © Author(s) 2020. CC BY 4.0 License.


Introduction
Chemistry-climate models (CCMs) and chemistry-transport models (CTMs) are essential tools for understanding the atmospheric composition, for providing information where measurements are lacking, and for predicting the future evolution of air composition. Assessing and reducing uncertainties in the processes controlling its past and future changes can be achieved by comparing an ensemble of simulations from different models that share the same simulation setup. Among the model intercomparison projects, the main goal of the Chemistry-Climate Model Initiative (CCMI: Eyring et al., 2013) lies in reducing the uncertainties in the multi-model projections involving stratospheric ozone, tropospheric composition and climate change, but also in a better understanding of the atmospheric processes relevant to these topics. CCMI has taken over from both SPARC CCMVal (Chemistry-Climate Model Validation: SPARC, 2010), focused on the stratosphere, and IGAC ACCMIP (Atmospheric Chemistry-Climate Model Intercomparison Project: Lamarque et al., 2013), dealing mainly with tropospheric composition. In this framework, a set of simulations has been designed to address its objectives. Among them, the REF-C1SD experiment aims at assessing the ability of the models to reproduce the actual atmospheric composition for the recent climate time period. The task for each participating model thus consisted in simulating as realistically as possible the tropospheric and stratospheric compositions in the last decades, following a common protocol.

Several studies have assessed the ability of REF-C1SD experiments, or previous similar simulations of air composition under recent climate conditions, to reproduce the mean tropospheric and/or stratospheric composition, using monthly mean climatologies from observation data sets as a reference, mostly from space. Froidevaux et al. (2019) based the evaluation of the REF-C1SD run from the CESM1-WACCM model on zonal monthly means of the stratospheric ozone column, using the Microwave Limb Sounder on the Aura satellite (Aura-MLS) and the multi-satellite data set merged in the framework of the GOZCARDS (Global OZone Chemistry And Related trace gas Data records for the Stratosphere) project. As described in Young et al. (2018), tropospheric ozone fields provided by the ACCMIP participating models have been assessed, referring to zonally averaged mixing ratios from the Tropospheric Emission Spectrometer (Bowman et al., 2013), and tropospheric ozone

Only a few studies have compared observations (in situ measurements or from space) with CCMI or similar simulations while focusing on the upper troposphere–lower stratosphere (UTLS). However, the latter is a key region regarding both the ozone (O3) radiative forcing (Riese et al., 2012) and the stratosphere-troposphere exchanges (STEs) that substantially influence tropospheric ozone levels (e.g. Tao et al., 2019), albeit with a high uncertainty due to their different representations in models (Stevenson et al., 2006). Smalley et al. (2017) (Williams et al., 2019). In addition to ozonesondes, aircraft measurements from different campaigns were used in the evaluation of the REF-C1SD simulations from the model CESM1 CAM4-Chem (Tilmes et al., 2016).

Among available observation data sets, the commercial aircraft measurements from the on-going IAGOS European Research Infrastructure (In-service Aircraft for a Global Observing System: Petzold et al., 2015, http://www.iagos.org) are well designed to study ozone and CO over the long term, notably in the UTLS (Cohen et al., 2018). IAGOS observations started in August 1994 for ozone and in December 2001 for CO. They are characterized by a high spatio-temporal resolution and a wide coverage, with most data gathered at cruise levels (9–12 km above sea level). Thus, the IAGOS database is suited to assessing long-term simulations in this altitude range. Recently, its ozone data have been used to evaluate simulations from the models CESM1 CAM4-Chem (Tilmes et al., 2016) and GEOS-Chem (Hu et al., 2017) during the periods 1995–2010 and 2012, respectively. Tilmes et al. (2016) used only the IAGOS measurements gathered in the vicinity of Narita airport (Japan), and the comparison made by Hu et al. (2017) only spread over 2 years, whereas IAGOS ozone data have been available since 1994 and cover a wide area, especially in the northern mid-latitudes from Western North America to East Asia. Gaudel et al. (2015) performed an evaluation of the MACC (Monitoring Atmospheric Composition and Climate) reanalysis over Europe during 2003–2010, using IAGOS O3 and CO measurements. However, this comparison was carried out using frequent simulation outputs, so their methodology is not adapted to the assessment of the 3D outputs from the REF-C1SD simulations, which are monthly averages. Consequently, the IAGOS cruise data in the UTLS have been used neither as a whole ensemble nor to derive a monthly climatology for the evaluation of long-term chemistry-climate simulations. This is what we propose in the present paper.
To compare REF-C1SD simulations against IAGOS data, interpolating the simulation outputs onto the high-resolution observations is not possible because of the coarse spatio-temporal resolution of the REF-C1SD outputs. It would be computationally very expensive and not meaningful to interpolate monthly mean model data onto very high frequency (a few seconds) IAGOS measurement locations. Alternatively, the comparison can be performed after mapping the high-resolution IAGOS data onto the model grid, on a monthly basis. Several gridding methods already exist for in situ measurements. Some of them consist in interpolating the neighbouring measurement points onto each grid point (e.g. New et al., 2000). However, this requires storing all the measurement locations for a whole month. Such methods are thus convenient only for measurements with regular locations, and their use on the IAGOS database would be computationally expensive as well. Variational methods are also widely employed (e.g. Bourassa et al., 2005) but they concern integration, which is not our purpose. The present study aims at providing a new methodology designed to generate a gridded monthly data set from the IAGOS measurements, in order to evaluate REF-C1SD types of simulations. We also propose a set of relevant diagnostics for the model evaluation against IAGOS data mapped onto the model grid. These diagnostics originate from Cohen et al. (2018), who studied climatologies and trends in ozone and CO based on the analysis of almost the entire IAGOS database. The use of such a high spatial and temporal resolution data set makes it possible to account for inter-regional differences that could not be highlighted with zonal means. Projecting it onto a model grid is well suited to the constraint of working on monthly outputs from multi-decadal simulations like REF-C1SD. In order to demonstrate the interest of the new methodology and its associated diagnostics, we perform the assessment on one of the REF-C1SD simulations, that of the MOCAGE CTM.

In Sect. 2, we briefly describe the IAGOS observations, the CCMI model intercomparison project, the MOCAGE CTM that we use in this study, and its configuration for the REF-C1SD simulation. In Sect. 3, we present the methodology proposed to map the IAGOS data set onto the model grid at a monthly resolution, the chosen statistical metrics for model evaluation and the different assessment diagnostics. In Sect. 4, we present a first application of this methodology to the evaluation of the MOCAGE REF-C1SD simulation. Strengths and weaknesses of the methodology and the chosen diagnostics are discussed.
Conclusions are given in Sect. 5.

The European Research Infrastructure IAGOS (Petzold et al., 2015, http://www.iagos.org) provides in situ measurements on board several commercial aircraft. The observations used hereafter have been performed in the framework of the on-going IAGOS-Core program that followed the MOZAIC program. Ozone (resp. CO) measurements started in August 1994 (resp. December 2001), based on a UV (resp. IR) absorption technology, with an accuracy of 2 ppb (resp. 5 ppb), a precision of 2% (resp. 5%) and a time resolution of 4 s (resp. 30 s). Further information about the instruments can be found in Marenco et al. (1998) and Thouret et al. (1998) for O3, and in Nédélec et al. (2003) for CO. Nédélec et al. (2015) present a more recent evaluation of both ozone and CO instruments in the frame of IAGOS.

The IAGOS observations (referring to the IAGOS-Core database hereafter) frequently sample the whole troposphere near airports, measuring vertical profiles during ascent and descent phases, and the UTLS during the cruise phases, mostly in the northern mid-latitudes where most of the flight observations are gathered. In these latitudes, a recent analysis of O3 and CO climatologies and trends based on almost two decades of IAGOS cruise measurements has been performed in Cohen et al. (2018). In addition to global climatologies, the same analysis also focused on eight well-sampled regions in the UT and the LS separately. In order to generate results comparable with the latter, this study focuses on the same time period (1994–2013) and, where relevant, on the same regions.

The CCMI project and the REF-C1SD experiment
CCMI phase-1 gathers a community of 18 chemistry-climate models (CCMs) and two CTMs, whose description is given in the review of Morgenstern et al. (2017). A series of experiments has been designed to model tropospheric and stratospheric air composition for past, present and future climates. For each experiment, a common protocol is recommended to all participating models. Amongst the CCMI simulations, the REF-C1SD reference experiment aims at modelling as realistically as possible the day-to-day tropospheric and stratospheric compositions in a recent climate, using specified dynamics (SD). For this purpose, as described in Eyring et al. (2013), the simulations are driven by (or nudged towards) dynamical reanalysis data sets (typically ERA-Interim or MERRA), and extend from 1980 until 2010. For this long-term simulation, the 3D output fields of species concentrations are archived as monthly means.

The MOCAGE model (MOdèle de Chimie Atmosphérique à Grande Echelle: Josse et al., 2004; Guth et al., 2016) is an offline global chemistry-transport model (CTM). The chemical scheme couples the RACM (Regional Atmospheric Chemistry Mechanism: Stockwell et al., 1997) and the REPROBUS (REactive Processing Ruling the Ozone BUdget in the Stratosphere: Lefèvre et al., 1994) schemes, corresponding to tropospheric and stratospheric chemistry, respectively.

Methodology
The objective of the proposed methodology is to make possible the comparison between the full IAGOS database and the 3D monthly mean volume mixing ratios from CTM and CCM simulations. Our approach consists in distributing the IAGOS observations, performed every 4 s, onto a given model grid. A first application is proposed on the MOCAGE REF-C1SD run, characterized by a ∼200 km horizontal resolution in the mid-latitudes and a ∼800 m vertical resolution in the UTLS. In order to account for the off-centre position of each measurement inside a given cell, we chose a first-order reverse linear interpolation, as described in Sect. 3.1 and illustrated in Fig. 1. The subsequent gridded monthly means are derived using weighted averages, as described in Sect. 3.2, and are directly comparable to the model monthly mean outputs.

Figure 1. [...] to locate the model's grid points (orange crosses) closest to this location; (c) to calculate a weighting coefficient for each dimension (α and β), depending on the distance between the measurement point and the "bottom-left" grid point; (d) the calculation of the weight for the four closest grid points. As indicated in the colour scale on the right, this weight ranges between 0 and 1.
(2018), this has been done with respect to Ertel potential vorticity (PV) and applied in eight northern mid-latitude regions selected because of their high level of sampling by IAGOS. The methodology used is explained in Sect. 3.4.

Reverse interpolation of a given measurement point on the model grid
At a given point where IAGOS measured a mixing ratio C_obs(X) for a species X, the algorithm presented here locates its position on the model grid defined by its longitude, latitude and hybrid σ-pressure coordinates. More precisely, we locate the model grid point which is the closest west and south of, and below (in altitude), the observation point, and which corresponds to the i-th, j-th and k-th grid point coordinates respectively. As shown in Fig. 1c, a normalized weighting coefficient is then computed for each dimension (coefficients α, β, γ), increasing linearly with the distance between the measurement point and the (i, j, k) grid point. Note that the vertical coefficient γ is derived from log-pressure coordinates. Finally, a resulting 3D weight is computed for each of the eight closest cells. Noting I, J and K the variable indexes belonging to the sets {i, i+1}, {j, j+1} and {k, k+1} respectively, we define the functions f_I, g_J and h_K, whose values depend on α, β and γ respectively, such that:

$$ f_i(\alpha) = 1 - \alpha, \quad f_{i+1}(\alpha) = \alpha; \qquad g_j(\beta) = 1 - \beta, \quad g_{j+1}(\beta) = \beta; \qquad h_k(\gamma) = 1 - \gamma, \quad h_{k+1}(\gamma) = \gamma $$

The resulting weight for each of the grid points surrounding the measurement location is thus defined as the following product:

$$ w_{I,J,K} = f_I(\alpha) \, g_J(\beta) \, h_K(\gamma) $$

In this way, as illustrated in Fig. 1d, for a given cell (I, J, K) amongst the eight closest ones, this weight decreases with the distance between the measurement point and the model grid point. Note that since the simulation outputs are monthly averages, we use the monthly mean surface pressure for determining the hybrid σ-pressure on the 47 vertical grid levels for a given model longitude/latitude. Although the surface pressure can show an important intra-monthly variability, we verified that a 30 hPa change at the surface would cause a variation weaker than 2 hPa on a given vertical grid level in the UTLS. Although caution is needed when treating low-altitude measurements, the monthly resolution of the surface pressure field thus has a negligible impact on the distribution of the IAGOS cruise data onto the model vertical grid.
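The weighting scheme described above can be illustrated with a short Python sketch. It assumes that each per-dimension function takes the value 1 − coefficient at the lower index and the coefficient itself at the upper index, which is the simplest linear form consistent with a weight that decreases with distance. The function names (`log_pressure_fraction`, `reverse_interp_weights`) are hypothetical and do not come from the Interpol-IAGOS software.

```python
import math
from itertools import product


def log_pressure_fraction(p, p_below, p_above):
    """Vertical coefficient gamma in log-pressure coordinates.

    p_below is the pressure of level k (below the aircraft, higher pressure),
    p_above the pressure of level k+1. Returns 0 at level k and 1 at level k+1.
    """
    return math.log(p_below / p) / math.log(p_below / p_above)


def reverse_interp_weights(alpha, beta, gamma):
    """Distribute one measurement onto its eight surrounding grid points.

    alpha, beta, gamma are the normalized distances (between 0 and 1) from
    the (i, j, k) grid point in longitude, latitude and log-pressure.
    Returns a dict mapping index offsets (di, dj, dk) to weights.
    """
    for name, coeff in (("alpha", alpha), ("beta", beta), ("gamma", gamma)):
        if not 0.0 <= coeff <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {coeff}")
    weights = {}
    for di, dj, dk in product((0, 1), repeat=3):
        f = alpha if di else 1.0 - alpha   # assumed form of f_I
        g = beta if dj else 1.0 - beta     # assumed form of g_J
        h = gamma if dk else 1.0 - gamma   # assumed form of h_K
        weights[(di, dj, dk)] = f * g * h
    return weights
```

With this form, the eight weights of a single measurement always sum to 1, so each observation contributes exactly one "equivalent" data point to the grid.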

Deriving the monthly mean values from observations
The weight coefficients defined above correspond to one single observation data point. To obtain monthly averages from the observation data set, the last step consists in summing up all the values measured in the vicinity of the (i, j, k) grid point for each month. Thus, for a given grid point (i, j, k), we define n as the index of a measurement performed in its vicinity during the considered month, C_obs,n(X) as the corresponding mixing ratio for the species X, w_n as the weight attributed to this measurement at (i, j, k), and N as the total number of measurements performed in this vicinity. The monthly value of the X mixing ratio at (i, j, k) is then derived with the equation:

$$ \overline{C}_{i,j,k}(X) = \frac{\sum_{n=1}^{N} w_n \, C_{\mathrm{obs},n}(X)}{\sum_{n=1}^{N} w_n} $$

where the denominator is equivalent to the number of measurement points performed in the (i, j, k) grid cell during the chosen month. Hereafter, we refer to it as N_eq.
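As an illustration of this step, the following Python sketch accumulates weighted sums per grid cell and month, then returns the weighted mean and the equivalent sample size N_eq. It is a minimal sketch under the assumption that the monthly value is the weight-normalized average of the observations; the class name `MonthlyGridAccumulator` and the key layout are hypothetical.

```python
from collections import defaultdict


class MonthlyGridAccumulator:
    """Accumulate weighted observations per (year, month, i, j, k) cell."""

    def __init__(self):
        self._weight_sum = defaultdict(float)    # sum of weights, i.e. N_eq
        self._weighted_sum = defaultdict(float)  # sum of weight * mixing ratio

    def add(self, cell, weight, mixing_ratio):
        """Add one observation's contribution to a grid cell."""
        self._weight_sum[cell] += weight
        self._weighted_sum[cell] += weight * mixing_ratio

    def n_eq(self, cell):
        """Equivalent number of measurements in the cell (0.0 if unsampled)."""
        return self._weight_sum.get(cell, 0.0)

    def mean(self, cell):
        """Weighted monthly mean, or None where the cell was never sampled."""
        total = self._weight_sum.get(cell, 0.0)
        if total == 0.0:
            return None
        return self._weighted_sum[cell] / total
```

Because only two running sums are kept per cell, the individual measurement locations never need to be stored for the whole month.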
In the end, this method yields monthly fields of IAGOS O3 and CO mixing ratios (or any other variable measured by IAGOS, [...] Note that the measurement points on the MOCAGE vertical levels below level 28 (∼360 hPa) are considered as corresponding to ascent or descent phases of the flights. These measurements are not processed, since they are only available in small areas close to airports. Levels 27 and 28 also correspond to these phases but include cruise measurements above elevated lands, since [...]

Methodology for the assessment of the climatologies

Filtering conditions
For the climatological part of this study, we chose to perform a seasonal and a yearly analysis. Avoiding sampling biases where and when IAGOS-DM data (counted as N_eq) are not numerous enough requires that the seasonal sample N_eq reaches a minimum threshold (noted N_thres) to be selected. We chose to set this N_thres limit depending on latitude, to account for the varying grid box area, and on the chemical tracer, to account for the shorter period of CO measurements compared to O3. N_thres therefore decreases with latitude following a cosine function, similarly to the model horizontal grid cell areas. The reference threshold N_thres,ref corresponds to O3 measurements for gridbox areas during a given season, over the whole period. It has been set to N_thres,ref = 100 as a compromise between sampling robustness and a large enough amount of data in the IAGOS-DM sample. Accounting for the shorter CO measurement period compared to O3, the corresponding N_eq threshold for this species is derived by applying a factor 0.6, leading to 60. Last, the reference filter is defined seasonally. The filters defined here are thus quadrupled for yearly climatologies.
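The filtering rule can be sketched in Python as follows. This is an illustration only: it assumes that the cosine scaling is normalized to 1 at the equator, which the text does not state explicitly, and the function name `n_eq_threshold` and its arguments are hypothetical.

```python
import math

# Species factors: 1.0 for O3 (reference), 0.6 for CO (shorter record).
SPECIES_FACTOR = {"O3": 1.0, "CO": 0.6}


def n_eq_threshold(lat_deg, species="O3", yearly=False, n_ref=100.0):
    """Minimum N_eq required for a grid cell to enter a climatology.

    Assumption: the threshold scales as cos(latitude), equal to n_ref
    at the equator for seasonal O3 climatologies.
    """
    threshold = n_ref * math.cos(math.radians(lat_deg)) * SPECIES_FACTOR[species]
    # The seasonal filter is quadrupled for yearly climatologies.
    return 4.0 * threshold if yearly else threshold


def is_selected(n_eq, lat_deg, species="O3", yearly=False):
    """True if the cell's equivalent sample size reaches the threshold."""
    return n_eq >= n_eq_threshold(lat_deg, species, yearly)
```

Under this assumption, a cell at 60° latitude needs half the equivalent sample size of an equatorial cell, in line with its roughly halved horizontal area.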

Statistical metrics for assessing the climatologies
Quantifying a simulation assessment requires the use of statistical parameters. This paragraph aims at defining the chosen metrics and at justifying this choice. Pearson's coefficient is a key result from linear regressions. It is used to quantify the correlation between two signals. If we call (m_i) and (o_i), with i ranging from 1 to N, the lists of modelled and observed values respectively, their correlation is defined as:

$$ r = \frac{\frac{1}{N} \sum_{i=1}^{N} (m_i - \bar{m})(o_i - \bar{o})}{\sigma_m \, \sigma_o} $$

where \bar{m} and \bar{o} are the mean values and σ_m and σ_o their respective standard deviations. Quantifying total biases and mean errors is also essential in a model assessment. However, the use of the absolute mean bias and root mean square error (RMSE) may not be relevant for climatological purposes because of a strong influence from observed outliers. In our context, another drawback lies in the strong vertical O3 gradient near and above the tropopause. It tends to induce a strong absolute bias with respect to the tropospheric mixing ratios, since it makes the O3 absolute mean bias and RMSE depend mainly on the highest vertical grid cells. The normalized bias metric (and associated standard error) is chosen for a better representativeness of biases for both low and high mixing ratios. The modified normalized mean bias (MNMB) and the fractional gross error (FGE) are respectively defined as:

$$ \mathrm{MNMB} = \frac{2}{N} \sum_{i=1}^{N} \frac{m_i - o_i}{m_i + o_i} $$

and

$$ \mathrm{FGE} = \frac{2}{N} \sum_{i=1}^{N} \left| \frac{m_i - o_i}{m_i + o_i} \right| $$

Methodology for assessing the seasonal cycles in the UT and in the LS

A second part of this assessment targets the behaviour of the model in the UT and the LS. The diagnostics we use for this purpose are adapted from Cohen et al. (2018). In the latter study, based on Thouret et al. (2006), the tropopause for each individual IAGOS measurement was defined as the 2 PVU isosurface derived from the ECMWF operational analysis, with a 3-hour resolution, before deriving monthly means in the two layers. The present work is based on monthly gridded fields, including potential vorticity (PV). Consequently, determining whether a given cell is mostly composed of tropospheric or stratospheric air masses is achieved with a monthly resolution. For this purpose, we use the PV from the dynamical field (based on ERA-Interim), yielding 6-hourly PV values which were averaged monthly to match the simulation outputs. A given grid point is then considered as belonging to the UT if its monthly PV is lower than 2 potential vorticity units (PVU), and to the LS if the PV is greater than 3 PVU. The cells whose PV ranges between 2 and 3 PVU are considered as belonging to the transition zone separating the two layers and are not selected. In order to enhance the distinction between the UT and the transition zone, the first model level below the 2 PVU threshold is also filtered out from the UT. The 2 PVU threshold is derived from a log-pressure interpolation between the grid points. We also filter out the grid boxes where this PV classification is not [...] We estimated that a supplementary 40 ppb interval would limit an excessive filtering of the monthly grid-cell values.
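The PV-based sorting described above can be sketched for a single model column. This is a simplified illustration, not the authors' implementation: it assumes that the "first model level below the 2 PVU threshold" is the uppermost level with PV below 2 PVU lying directly under a level with higher PV, and it leaves out the log-pressure interpolation of the 2 PVU surface and the additional ozone-based filter. The function name `classify_column` and the label strings are hypothetical.

```python
def classify_column(pv_profile):
    """Label each level of one column from its monthly mean PV (in PVU).

    pv_profile is ordered from the lowest to the highest model level.
    Returns a list of labels: "UT", "LS", "TRANSITION" or "FILTERED".
    """
    labels = []
    for pv in pv_profile:
        if pv < 2.0:
            labels.append("UT")
        elif pv > 3.0:
            labels.append("LS")
        else:
            labels.append("TRANSITION")

    # Remove from the UT the level sitting directly below the 2 PVU surface.
    for idx in range(len(labels) - 1):
        if labels[idx] == "UT" and labels[idx + 1] != "UT":
            labels[idx] = "FILTERED"
    return labels
```

For a profile such as `[0.5, 1.2, 1.8, 2.5, 4.0, 6.0]`, the third level is removed from the UT, the fourth falls in the transition zone, and the last two are attributed to the LS.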
As in Cohen et al. (2018), we focus our analysis on the seasonal cycles for eight regions in the northern mid-latitudes that are well sampled by IAGOS. Their coordinates and their corresponding sampling are detailed in Table 1 in Cohen et al. (2018). Because of the 2° × 2° horizontal grid resolution in the simulation, we applied a 1° eastward or northward shift on the odd-coordinated edges. The subsequent regions defined in this paper are shown in Fig. 2. For each of them, the monthly means are calculated by averaging the gridded monthly means separately in the UT and the LS. The latter values were defined as described in Sect. 3.1 and 3.2.

In the previous study, the regional monthly means with N_eq lower than 300 were filtered out. Here, due to the loss of data caused by the monthly resolution, we lowered this minimum threshold to 150 in order to keep taking the less sampled regions into account, such as Western North America and Siberia. Still, we kept the criterion from Cohen et al. (2018) which required at least 7 days between the first and last measurements, so that the averages are likely to be representative of a synoptic timescale. Following the same study, the computation of the seasonal cycles is based on the years exhibiting seven available months or more, distributed over at least three seasons. This criterion avoids biases linked to the inter-seasonal differences in the sampling, thus ensuring a good representativeness of the whole year.

[...] be underestimated (resp. overestimated) in all vertical grid levels in summer (resp. winter). Note that the discontinuity over Greenland is due to its topography causing a steep elevation of the vertical grid levels.
In Fig. 4, CO also shows a good correlation between the two data sets, notably with the same maxima and minima locations. However, the CO mixing ratio is generally overestimated by the model, especially over East Asia and India. In the northern mid-latitudes, the seasonal climatologies in Figs. A5–A8 generally show an overestimation in winter and spring and a less visible underestimation in summer and fall.

In this section, we attempt to evaluate the simulation in the UT and the LS separately, focusing on the seasonal cycles. For this, we sort both data sets between the two layers as explained in Sect. 3.4. As a first step, before comparing the simulation to the observations, we analyse the impact of mapping IAGOS onto the MOCAGE grid on a monthly basis. For this purpose, two versions of the IAGOS data set are used. Hereafter, IAGOS-HR refers to the high-resolution IAGOS data synthesized in Cohen et al. (2018), where every single measurement was categorized as belonging to the UT (P_TP + 15 hPa < P < P_TP + 75 hPa), the transition layer or the LS (P < P_TP − 15 hPa), and where regional monthly means were derived by averaging every value measured above the defined region. In contrast, IAGOS-DM refers to the new product presented in this paper, i.e. the IAGOS data distributed onto the model's grid, directly comparable to the simulation. Note that IAGOS-HR seasonal cycles were computed on the original regions' coordinates, but the changes induced by a 1° difference are expected to be negligible, based on the geographical sensitivity tests mentioned in Cohen et al. (2018).

The comparison between the two IAGOS products in terms of seasonal cycles is shown in Figs. 7 and 8, for O3 and CO respectively. They are shown with their corresponding interannual variability (IAV), defined as a year-to-year standard deviation. In Fig. 7, both IAGOS versions show a summertime O3 maximum in the UT and a springtime maximum in the LS. A lessened contrast between the UT and the LS is observed in IAGOS-DM. In the UT, the O3 volume mixing ratio and its interannual variability are higher in IAGOS-DM than in IAGOS-HR for the winter and fall seasons (∼60 ± 20 ppb compared to ∼50 ± 10 ppb), whereas they are similar in spring and summer. In this layer, the most important differences between the two versions thus take place during the lower-ozone seasons. In the LS, the O3 amounts are lower in IAGOS-DM (∼125–400 ppb) than in IAGOS-HR (∼150–450 ppb) during the whole year. Two main reasons explain the lower O3 amounts in the LS and the higher amounts in the UT in IAGOS-DM compared to IAGOS-HR. The first is the projection of IAGOS observations, which have a very fine vertical resolution, onto the MOCAGE vertical grid with its ∼800 m vertical resolution. The second is that a monthly PV cannot describe the day-to-day variations of the tropopause altitude, whereas these variations can be important for sorting the data points between the two layers. In other words, by using the monthly mean PV from the simulation, some of the IAGOS measurement points may be attributed to the LS while being in the UT (or in the tropopause layer) and [...]

In Fig. 8, the CO seasonal cycles in the UT are consistent between IAGOS-HR and IAGOS-DM, with a generally low difference, a common springtime maximum, and a consistent inter-regional variability: a higher CO level in the two regions on the Pacific coast (Northwest America and Northeast Asia), higher summertime amounts in Siberia and Northeast Asia, and lower CO levels in the two southernmost regions (the West Mediterranean basin and Middle East). Note that the PV monthly resolution leads to a lessened sampling in the UT in IAGOS-DM. In the North Atlantic region, where aircraft trajectories cover a narrow altitude range, the resulting seasonal cycle was incomplete, so we chose to exclude it from the figure.

We now assess the MOCAGE-M seasonal cycles by comparing them to IAGOS-DM. As complements to Fig. 7, statistical results are given in Table 3. Note that inter-regional averages have been computed only to synthesize the assessment and to provide quantifications that confirm some features seen in the figures. As they are similar to zonal averages, they are not meant to have a geophysical significance. A qualitative summary is also provided in Table 4. In the UT, MOCAGE-M shows a [...] the stratospheric influence on the UT is overestimated in the simulation. The inter-regional averages shown in Table 3 confirm the significant difference between the two data sets in the UT, both in O3 mixing ratio (103 ± 6 ppb in MOCAGE-M compared to 68 ± 8 ppb in IAGOS-DM) and in seasonality (r = 0.17). In the LS, the simulation reproduces the cycles well, including the seasonality (r = 0.87 as shown in Table 3), the magnitude, the amounts of ozone (199 ± 25 ppb compared to 229 ± 44 ppb from IAGOS-DM) and the inter-regional differences. The latter are characterized in both data sets by lower ozone levels in the two southernmost regions (West Mediterranean basin and Middle East) and higher ozone levels in the two northernmost regions (Western North America and Siberia). Without the noisy signal characterising Western North America and the West Mediterranean basin in IAGOS-DM, the springtime interannual variabilities spread from ∼200 ppb up to ∼400 ppb in both data sets, showing another point well reproduced by the model. On a yearly basis, however, according to Table 3, the model tends to underestimate ozone IAV on average by a factor 1.8.

[...] Table 3. However, the simulation overestimates the CO mixing ratios in the two Pacific coast regions, and the seasonal maxima generally take place during late winter to early spring in the simulation, earlier than the observed mid-spring maxima. The seasonal minima are in phase with the observations. In the LS, the seasonal cycles [...] concerns the summer season, in contrast to the UT, suggests that summertime convection also plays a non-negligible role.

Summary and conclusions
We developed a methodology that makes the IAGOS database ready to assess chemistry-climate long-term model simulations [...] to locate strengths and weaknesses of the model, but also for the whole UTLS grid cells for the purpose of a bulk comparison that could be repeated on other model simulations.

Another step consists in a comparison of the seasonal cycles between IAGOS and the MOCAGE simulation in the upper troposphere (UT) and the lower stratosphere (LS). It relies on the use of a calculated monthly mean PV field to define a UT and an LS separated by a transition layer, following the same principle as in Thouret et al. (2006). The mean seasonal cycles have been compared over the eight well-sampled regions defined and analysed in Cohen et al. (2018). The application to the assessment of this REF-C1SD experiment by MOCAGE is preceded by an analysis of the changes induced in IAGOS seasonal cycles by the projection onto the model monthly grid. As expected, going from IAGOS-HR to IAGOS-DM systematically leads to an increase (resp. decrease) in upper-tropospheric (resp. lower-stratospheric) O3, to an increase in lower-stratospheric CO and generally to a slight decrease in upper-tropospheric CO. The use of a monthly mean PV field and the ∼800 m vertical reso- [...]

The present methodology could easily be applied to CCMI REF-C1SD simulations from other models, both for an intermodel comparison and for assessing CCMI products against the IAGOS database, notably intermodel-averaged fields. More broadly, it can be used on a wide range of long-term simulations, including CCM free runs, in order to perform climatological comparisons. Furthermore, the assessment illustrated in this study is based on two chosen applications of our methodology, i.e. [...]

MOZAIC-IAGOS database is supported by AERIS (CNES and INSU-CNRS). Data are also available via AERIS web site www.aeris-data.fr.
Yann Cohen acknowledges the University of Toulouse for providing administrative support for his PhD.
Financial support. This research has been supported by the Occitanie region and Météo-France.
Competing interests. The authors declare that they have no conflict of interest.