The AROME-WMED re-analyses of the first Special Observation Period of the Hydrological cycle in the Mediterranean experiment

Nadia Fourrié1, Mathieu Nuret1, Pierre Brousseau1, Olivier Caumont1, Alexis Doerenbecher1, Eric Wattrelot1, Patrick Moll1, Hervé Bénichou2, Dominique Puech1, Olivier Bock3, Pierre Bosser4, Patrick Chazette5, Cyrille Flamant6, Paolo Di Girolamo7, Evelyne Richard8, and Frédérique Saïd8
1CNRM, Université de Toulouse, Météo-France, CNRS, Toulouse, France
2Météo-France, Toulouse, France
3IGN, Univ. Paris Diderot, Paris, France
4ENSTA Bretagne Lab-STICC UMR CNRS 6285 PRASYS Team, Brest, France
5LSCE, Gif sur Yvette, France
6Laboratoire Atmosphères Milieux Observations Spatiales, Sorbonne Université, Université Paris-Saclay and CNRS, Paris, France
7Scuola di Ingegneria, Università della Basilicata, Italy
8Laboratoire d’Aérologie, Université de Toulouse, CNRS, UPS, Toulouse, France
Correspondence: Nadia Fourrié (nadia.fourrie@meteo.fr)

The aim of this paper is to review the main characteristics of the AROME-WMED re-analysis versions in terms of data assimilation and forecast, and to compare them with their real-time counterpart. The outline of the paper is as follows. Section 2 compares the configurations of the two AROME-WMED re-analyses and of the real-time version. The datasets assimilated in the re-analyses are specified in section 3. Section 4 evaluates the assimilation and the forecasts with respect to various observations. The qualitative and quantitative precipitation evaluation of the three AROME-WMED versions for the Intensive Observation Period 8 (IOP8) case study is discussed in section 5. Conclusions are given in section 6.
2 Description of the AROME-WMED model

Model configurations
The AROME-WMED model strongly relies on the AROME-France model, which is the Météo-France operational limited-area model (Seity et al., 2011; Brousseau et al., 2016). This model is based on a non-hydrostatic equation system (Bénard et [...] using the so-called ECOCLIMAP database at 1 km resolution (Masson et al., 2003). The topography is extracted from the Global 30 Arc-Second Elevation Data Set (GTOPO30, http://eros.usgs.gov/products/elevation/gtopo30.html) database for the real-time version (SOP1) and the first re-analysis (REANA1). In the second re-analysis (REANA2), the Global Multi-resolution Terrain Elevation Data 2010 (GMTED2010; Danielson and Gesch, 2011) database was used. A mean difference of -21 m was found between the orography interpolated onto the AROME-WMED grid from GMTED2010, used in REANA2, and the one interpolated from GTOPO30, used in the REANA1 and SOP1 versions (Fig. 1).
Lateral boundary conditions are provided by the Météo-France global NWP system ARPEGE (Courtier et al., 1991). For REANA2, the ARPEGE forecasts benefit from a maximum of assimilated data, since they start from analyses with a longer cut-off than those used for REANA1 and SOP1. Once per day, a 54-h forecast is run at 00 UTC for both re-analyses, compared to the 48-h forecast range of the real-time version. This allows the comparison of 24-h forecast precipitation with raingauges, which are mainly available for the [...]
The background error covariance matrix (the so-called B matrix) is a key component of the variational assimilation system, as it weights and spreads the impact of the observations in the analysis. As in AROME-France, a climatological background error covariance matrix is used; it has been computed from an AROME-WMED data assimilation ensemble using the ensemble approach proposed by Brousseau et al. (2011). In the real-time version and in the first re-analysis, the background error covariance matrix was computed over a 1-week period in October 2010, characterized by convective systems over southern France and Catalonia. For the second re-analysis, the background error covariance matrix was computed over a longer period taken from the HyMeX special observation period (17 to 31 October 2012): this new B matrix is more representative of the meteorological conditions encountered. Comparing the error variance spectra of both matrices (see, for example, the spectra at around 600 hPa in Figure 2), it appears that, for all parameters, the REANA2 error variances are smaller at the smaller horizontal scales of the model and larger at the larger scales. This is due to meteorological situations involving fewer small-scale features than during the October 2010 period used to estimate the B matrix for SOP1 and REANA1. These changes in the variance spectra have two consequences. First, for temperature and specific humidity (resp. vorticity and divergence), the increase (resp. decrease) occurs at the scales where the variance spectra reach their maximum, which leads to a general increase (resp. decrease) of the spectrally averaged background errors (Figure 3) in the new B matrix. This means that, given the same background and the same observation, the analysis fits the temperature and humidity (resp. wind) observations more closely (resp. less closely) with the REANA2 B matrix than with the SOP1/REANA1 one.
Secondly, horizontal correlation length scales are slightly longer in REANA2 than in REANA1 and SOP1, which allows each observation to modify the analysis over a horizontally more extended area.
The other components of the background error covariances (i.e. vertical correlations and cross-correlations between the different analysed model fields) are similar for both B matrices (not shown).
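The effect of the background error variance on the fit to observations can be illustrated with a scalar analysis, which is only a toy version of the variational system and not the AROME-WMED implementation. In this minimal sketch, the function name `scalar_analysis` and all numerical values are hypothetical; the only ingredient taken from the text is that a larger background error standard deviation gives more weight to the observation.

```python
def scalar_analysis(background, observation, sigma_b, sigma_o):
    """Single-variable analysis with one collocated observation.

    sigma_b and sigma_o are the background and observation error
    standard deviations. Returns the analysis value and the weight
    (gain) given to the observation.
    """
    gain = sigma_b**2 / (sigma_b**2 + sigma_o**2)
    analysis = background + gain * (observation - background)
    return analysis, gain


if __name__ == "__main__":
    # Same background and same observation, two hypothetical
    # background error standard deviations.
    background, observation, sigma_o = 10.0, 12.0, 1.0
    for sigma_b in (0.5, 1.0):
        analysis, gain = scalar_analysis(background, observation, sigma_b, sigma_o)
        print(f"sigma_b={sigma_b}: gain={gain:.2f}, analysis={analysis:.2f}")
```

With the larger value of `sigma_b` the analysis moves closer to the observation, which mirrors the behaviour described above for temperature and humidity with the REANA2 B matrix; the reverse holds when `sigma_b` decreases, as for the wind-related fields.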

Observations common to all AROME-WMED versions
Both the REANA1 and REANA2 re-analyses used all available data with no time constraint (cut-off), contrary to the SOP1 (real-time) version. These observations come from radiosondes (including mobile sites along the French Mediterranean coast), surface stations and buoys, aircraft and wind profilers. Satellite data are dominant in the analysis, contributing more than 50% of the assimilated data flow, since a large part of the domain is over the sea. Satellite data comprise infrared and microwave radiances from polar-orbiting satellites, radiances from SEVIRI on board Meteosat Second Generation (MSG), surface winds from scatterometers over the Mediterranean Sea and atmospheric motion vectors.
The GNSS (Global Navigation Satellite System) Zenith Total Delay (GNSS-ZTD) observations from the EUMETNET EIG GNSS water vapour programme (E-GVAP) network are assimilated as well. Another major data source is the French Doppler radar network (around 18 radars in the AROME-WMED domain), which provides Doppler winds (Montmerle and Faccani, 2009) and reflectivities, from which relative humidity profiles are derived (Caumont et al., 2010; Wattrelot et al., 2014). The density of these data is weather dependent, i.e. it depends on the presence of rain. Fourrié et al. (2015) provide complementary information about the assimilated data.

Observations specific to REANA2
In addition, new datasets and reprocessed observations were assimilated in REANA2. Table 2 summarises the main differences between the two re-analyses in terms of assimilated observations. The GNSS zenith total delays from the reprocessed dataset available in the HyMeX database have been used. The methodology for their assimilation is described in Mahfouf et al. (2015). All available GNSS data were reprocessed homogeneously with a single software package and with more precise satellite orbits and clocks, and additional sites were taken into account (e.g. Sardinia). This led to a better coverage, as shown in Fig. 4, especially over France, the Iberian Peninsula and Italy. Furthermore, an updated static bias correction for each pair (GNSS station, analysis centre) was computed for the REANA2 version. Data from BLPBs (temperature, humidity and wind) were assimilated in both re-analyses, REANA1 and REANA2. The raw data were averaged over periods of approximately 20 minutes. Moreover, to guarantee the consistency of such data, the averaging was only performed over periods corresponding to stabilized flight segments. In REANA2, temperature data were discarded during daytime due to radiative bias and model errors in the boundary layer.
High vertical resolution radiosondes, where available (in France, including dedicated HyMeX mobile soundings, and at some sites in Spain, as shown in Fig. 4), were used instead of the classical TEMP messages assimilated in the SOP1 and REANA1 versions, as proposed in Ingleby et al. (2016). This leads to an increased data flow (100 to 150 data per profile instead of 30 for the TEMP message); extra sounding sites were also processed, such as L'Aquila (a research centre in Italy) and Biscarrosse, a French military site close to the Atlantic coast. Data from several Spanish Doppler radars (Valencia, Barcelona, Murcia, Almeria and Palma) were also used in the second re-analysis after a careful quality control. Wind profiler data were also carefully checked in order to remove spurious signals. Humidity profiles retrieved from ground-based and airborne lidars were processed. Two ground-based research lidars were available [...] also assimilated according to the method described in Bielli et al. (2012); these data were thinned to a 15 km horizontal resolution to avoid horizontal error correlation problems in the data assimilation process. The observation errors assigned to the new observation types were deduced from the monitoring of the standard deviation of the differences between observations and background simulations; they are displayed in Figure 5. Some differences are visible for the lidar data: the observation error for Leandre II data is smaller than the others, and the error assigned to WALI is slightly larger than those of BASIL and TEMP. For temperature and wind, the assigned observation errors are the same for dropsondes, radiosondes and profilers; the aircraft data errors are larger.
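The 15 km horizontal thinning mentioned above can be sketched as a simple box selection. This is a hedged illustration only: the paper does not state how the retained observation is chosen, so keeping the observation closest to the centre of each box is an assumption, as are the function name `thin_observations` and the use of projected coordinates already expressed in kilometres.

```python
import math


def thin_observations(points, box_km=15.0):
    """Keep one observation per box_km x box_km box.

    points: iterable of (x_km, y_km, value) in a projected plane.
    Assumption: within each box, the observation closest to the box
    centre is retained (the selection rule is not given in the paper).
    """
    kept = {}
    for x, y, value in points:
        ix, iy = math.floor(x / box_km), math.floor(y / box_km)
        centre_x, centre_y = (ix + 0.5) * box_km, (iy + 0.5) * box_km
        dist = math.hypot(x - centre_x, y - centre_y)
        if (ix, iy) not in kept or dist < kept[(ix, iy)][0]:
            kept[(ix, iy)] = (dist, (x, y, value))
    return [obs for _, obs in kept.values()]


if __name__ == "__main__":
    # Hypothetical humidity values (g/kg) along a flight track.
    track = [(1.0, 1.0, 8.2), (7.0, 7.0, 8.0), (16.0, 1.0, 7.5)]
    print(thin_observations(track))
```

The first two points fall in the same 15 km box, so only one of them is kept; the third lies in a neighbouring box and is retained.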

The amount of assimilated data per observation type for each AROME-WMED analysis version is given in Fig. 6. The number of assimilated data in REANA1 (red bars) is slightly higher than in the SOP1 version (black bars). This is explained by the fact that all available observations were assimilated, and not only those present in real time in the Météo-France database: 9% additional data were thus assimilated in REANA1 compared to SOP1. REANA2 (blue bars) assimilated 24% more data than SOP1 and 13% more than REANA1. The increase mainly comes from radiosondes (higher resolution and additional sites), profilers, satellite radiances, scatterometer wind estimates, surface parameters and ground-based GNSS data. However, although five Spanish Doppler radars were included in REANA2, fewer radar data were assimilated as a consequence of a revised tuning of the statistics.
Examples of the assimilated data distribution for a rainy day (26 September 2012) and a non-rainy day (5 October 2012) are shown in Figure 7. First of all, satellite data contribute most to the observational set. The distribution varies with the weather conditions (rainy/non-rainy). For the rainy day, radar data represent 6% of the total, and the percentage of satellite data falls from 63.5% on the non-rainy day to 50% on the rainy day, because infrared measurements (SEVIRI and IASI) are strongly affected by the presence of clouds and are then discarded. The proportion of radiosonde data also increases for the rainy day (twice the amount of the non-rainy day). This large increase for 26 September 2012 is explained by the fact that the DTS was activated, resulting in an increased frequency of radiosonde launches at specific sites.

4 Assimilation results

As a first validation step, the performance of the data assimilation systems of the three AROME-WMED versions was evaluated using the analysis (AN) and first-guess (FG, the 3-h forecast) departures from the assimilated observations. These departures provide information on the analysis increment for AN and on the quality of the very short-range forecast for FG. Some of these statistics (mean and RMS) are plotted in Figure 8 for observations related to humidity and in Figures 9 and 10 for wind. The observation samples differ between the AROME-WMED versions, because the quality check based on the difference between the observation and its model equivalent may accept or reject an observation depending on the background value. In addition, some observation types, such as lidar observations or Spanish radars, are assimilated only in REANA2. For the radiosondes and the wind profilers, the real-time observations were replaced in REANA2 with high-resolution data and reprocessed data, respectively.
First of all, for all observation types, the RMS of the AN departures is always smaller than that of the corresponding FG departures, as expected for a well-performing assimilation process.
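As an illustration of the departure statistics used in this section, the sketch below computes the mean and RMS of observation-minus-model differences for a first guess and for an analysis. The function name `departure_stats` and the sample values are hypothetical, and the model equivalents are assumed to have already been computed at the observation locations.

```python
import math


def departure_stats(observations, model_equivalents):
    """Mean and RMS of the departures (observation minus model equivalent)."""
    departures = [o - m for o, m in zip(observations, model_equivalents)]
    if not departures:
        raise ValueError("at least one observation is required")
    n = len(departures)
    mean = sum(departures) / n
    rms = math.sqrt(sum(d * d for d in departures) / n)
    return mean, rms


if __name__ == "__main__":
    # Hypothetical mixing ratios (g/kg) at three observation points.
    obs = [1.0, 2.0, 3.0]
    first_guess = [0.0, 2.0, 4.0]
    analysis = [0.5, 2.0, 3.5]
    print("FG departures (mean, RMS):", departure_stats(obs, first_guess))
    print("AN departures (mean, RMS):", departure_stats(obs, analysis))
```

In this made-up sample the analysis is drawn towards the observations, so the RMS of the AN departures is smaller than that of the FG departures, which is the behaviour expected from a well-performing assimilation.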
As SOP1 and REANA1 use the same background error statistics, the results of these two sets are very close, and the slight differences are mainly explained by the different numbers of assimilated observations. For REANA2, the use of different background error covariances and of additional observations has direct consequences on these statistics. For radiosoundings in the troposphere, AN departures are smaller for humidity (Figure 8, first row) but larger for wind (Figure 9, first row), due to the variations of the background error standard deviation described in section 2.2: an increase for specific humidity and temperature (the background is less trusted and the resulting analysis is closer to the observations), and a decrease for vorticity and divergence, which are directly related to the wind field (the background is more trusted and the resulting analysis is farther from the observations). In both cases, this has a positive effect: for these two fields the subsequent 3-hour forecasts are closer to the observations, as indicated by lower FG departures, even for the wind, for which the RMS of the analysis departures is higher. This result is enhanced by the use of high vertical resolution radiosondes, which increase the number of observations and allow a better comparison with the background than the TEMP messages. For specific humidity, the RMS of the AN and FG departures is reduced by 30% and 15%, respectively, between 1000 and 600 hPa. For wind, the differences are smaller and reach +20% for the AN departures and -10% for the FG departures. The impact of the changes in background error statistics is also visible for wind measurements from aircraft (Figure 9, second row), whose number is similar in the three experiments, and for radial velocities from Doppler radars (Figure 10). The REANA2 AN departures are slightly larger than those of SOP1 and REANA1, but the subsequent FG departures are smaller for REANA2 than for REANA1 and SOP1 between 800 and 300 hPa. The reduction in humidity AN departures is less obvious for radar reflectivities (Figure 8, second row). These results suggest that the use of background error statistics more representative of the studied period allows for a better use of the observations.
Figure 8. First-guess (FG, solid lines) and analysis (AN, dashed lines) departures against radiosoundings (mixing ratio, g/kg; row 1), against humidity derived from Doppler radars (humidity, percent; row 2), and against lidars and dropsondes (mixing ratio, g/kg, only for REANA2; row 3). Columns correspond to the mean departure (left), the root mean square departure (middle) and the number of observations (right). In the first two rows, black curves are for SOP1, red for REANA1 and blue for REANA2. Orange lines are for Spanish radars in REANA2. The computation period extends from 05 September 2012 to 05 November 2012.
Statistics on AN and FG departures are also informative about the quality of the additional observations assimilated only in REANA2. For the second re-analysis, numerous wind profilers were reprocessed and their number increased from 1,000 to 4,000 observations at 700 hPa (Figure 9, third row). Their better quality leads to a decrease of the FG departures and a reduction of the AN departures, despite a higher background error for wind.
Concerning the lidars (Figure 8, third row), it is worth noting that the RMS of the background departures for BASIL and Leandre is very similar to the values obtained with radiosondes (Fig. 8, first row), which indicates data of comparable quality. WALI exhibits larger differences, which are very likely linked to the fact that the lidar was located over land near the coast of Menorca Island, whereas the nearest AROME-WMED grid point is located over the Mediterranean Sea. This may introduce a discrepancy in the computation of the model equivalent, especially in the lowest levels of the atmosphere (boundary layer). It should also be mentioned that lidar data represent a very small fraction of the total number of assimilated data.
Dropsondes exhibit a larger humidity bias and larger RMS differences (more than 2 g/kg between 800 and 1000 hPa) than radiosoundings (1.5 g/kg). Dropsonde measurements are therefore farther from the model values. This might be explained by the dropsonde sampling strategy, with launches close to convective areas, which samples areas of low predictability and leads to larger humidity differences between the model and the observations. However, the AN departures are not affected by these differences in the FG departures.
Lastly, statistics for the Spanish radar observations are compared to those of the French network (Figure 8, row 2, for humidity derived from reflectivities, and Figure 10 for the wind). Radar observations over Spain were available below 6000 m as a consequence of the sampling strategy. The Spanish radar FG departures are larger than the French ones for Doppler winds below 2000 m (Fig. 10) and for reflectivities (Fig. 8, row 2). In particular, the latter exhibit a stronger dry bias (i.e. observation - background > 0), which could be explained by a different observation preprocessing for the Spanish radars (for example, to take into account the attenuation of the radar signal by precipitation). Although the AN departures are larger for reflectivities, they remain very close to those of the French radars for radial velocity.

Surface parameter analysis and forecast
The surface observations used for the evaluation were extracted from the HyMeX database, which gathers the surface synoptic [...] bias for each forecast range, which is a positive impact due to the modifications of the orography in REANA2. The standard deviation of the forecast error, which increases with the forecast range, is also slightly reduced up to the 18-h forecast range, and this reduction is statistically significant according to a bootstrap test. A bias reduction is also noticed for the 2-m relative humidity, together with a very small gain in standard deviation (up to the 9-h forecast range). On the contrary, no real difference between the three systems is noticeable in the biases of the 10-m wind. The relative improvement in forecast RMS error brought by REANA2 is larger than that of REANA1 (more than 3% for temperature and humidity at the 3-h forecast range and 1% for the wind). The benefit varies as a function of the forecast range and persists up to the 30-h forecast range (except for the wind).
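The bootstrap procedure behind the significance statement is not detailed in the text. As a hedged illustration, the sketch below assumes a paired percentile bootstrap on the difference of error measures between two experiments; the function name `bootstrap_mean_diff_ci`, the number of resamples and the sample values are all hypothetical. A difference is considered significant at the chosen level when the confidence interval excludes zero.

```python
import random


def bootstrap_mean_diff_ci(errors_a, errors_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for mean(errors_a - errors_b).

    errors_a and errors_b are paired samples (same dates and stations).
    Pairs are resampled with replacement; the seed makes the result reproducible.
    """
    if len(errors_a) != len(errors_b) or not errors_a:
        raise ValueError("paired, non-empty samples are required")
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    rng = random.Random(seed)
    n = len(diffs)
    means = []
    for _ in range(n_resamples):
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical absolute 2-m temperature errors (K) for two experiments.
    reference = [1.9, 2.1, 1.7, 2.4, 2.0, 1.8, 2.2, 2.3]
    candidate = [1.6, 1.9, 1.6, 2.0, 1.8, 1.7, 1.9, 2.1]
    low, high = bootstrap_mean_diff_ci(reference, candidate)
    print(f"95% interval for the mean error reduction: [{low:.2f}, {high:.2f}]")
```

Resampling pairs rather than the two samples independently preserves the day-to-day correlation between experiments verified against the same observations, which is the reason for the paired design in this sketch.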

Upper level atmosphere/troposphere forecast
The forecast quality of the various AROME-WMED versions is first assessed against radiosonde observations. Figure 12 gathers the RMS differences between AROME-WMED forecasts and radiosondes for temperature, relative humidity and wind at the 24-h, 36-h and 48-h ranges. Overall, the scores of the forecasts starting from the re-analyses are improved compared to those starting from SOP1. REANA1 improves the temperature forecast above 500 hPa at 24-h, the wind is improved over the whole troposphere [...] where REANA1 provides improved forecasts. In addition, the REANA1 forecast is better than the SOP1 one, but generally to a lesser extent than REANA2.
At the 48-h range, REANA1 and REANA2 improve the temperature forecast above 700 hPa; the humidity forecasts are not improved, but the wind forecast is improved above 600 hPa.
Figure 13. Correlation (upper panels) and standard deviations (bottom panels) of total integrated water vapour between AROME-WMED analyses (left panels) or forecasts (right panels) and GNSS observations.
The AROME-WMED model was also assessed using integrated water vapour (IWV) obtained from Version 1 data of GNSS ground-based stations. IWV was indeed found to be linked with heavy precipitation, a maximum being observed before heavy precipitation events and a drop in its value occurring during the maximum of precipitation (Bock et al., 2016). Results are presented in Fig. 13. As these data are assimilated in REANA2 (and not in SOP1 and REANA1), the highest correlation (0.99) is found for each of the eight analysis times of REANA2. More than 32000 collocations were available to perform these computations. As expected, the REANA1 and SOP1 correlations are lower (around 0.97); the maximum is observed for the 00 UTC analysis slot and the minimum in the afternoon at 15 UTC. The standard deviation of the differences between IWV analyses and observations is lower for REANA2 (between 1.1 and 1.2 mm) than for SOP1 and REANA1 (above 1.8 mm). The standard deviation is maximum at the 15 UTC analysis slot (above 2 mm).
Concerning the forecast quality, as expected the IWV correlation between forecasts and observations decreases as the forecast range increases (from 0.99 down to 0.9 at 54-h). The largest score decrease is noticed in the very short forecast ranges.

A diurnal cycle of the score is also found (local minima at +15 hour and +39 hour ranges); REANA1 is characterized by a slightly higher correlation than SOP1 and the gain of REANA2 against REANA1/SOP1 is noticeable up to 24-h. The same conclusions apply for the standard deviation.
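The two IWV scores discussed above, the correlation and the standard deviation of the model-minus-observation differences, can be computed as in the following minimal sketch. The function name `correlation_and_std_diff` and the sample values are hypothetical, and population (not sample) statistics are assumed, since the normalisation used in the paper is not stated.

```python
import math


def correlation_and_std_diff(model, observed):
    """Pearson correlation and standard deviation of (model - observed)."""
    n = len(model)
    if n != len(observed) or n < 2:
        raise ValueError("two samples of equal length (>= 2) are required")
    mean_m = sum(model) / n
    mean_o = sum(observed) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(model, observed)) / n
    var_m = sum((m - mean_m) ** 2 for m in model) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    correlation = cov / math.sqrt(var_m * var_o)
    diffs = [m - o for m, o in zip(model, observed)]
    mean_d = sum(diffs) / n
    std_diff = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / n)
    return correlation, std_diff


if __name__ == "__main__":
    # Hypothetical collocated IWV values (mm).
    iwv_model = [18.2, 25.1, 30.4, 22.0, 27.5]
    iwv_gnss = [17.5, 26.0, 29.0, 23.1, 27.0]
    corr, std = correlation_and_std_diff(iwv_model, iwv_gnss)
    print(f"correlation={corr:.3f}, std of differences={std:.2f} mm")
```

Because the mean difference is removed before computing the standard deviation, a constant bias between model and observations affects neither score; it has to be examined separately.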
These results are confirmed over the sea by the validation against GNSS ZTD data (Figure 15) derived from a GNSS sensor on board the Marfret-Niolon ship (Fig. 14). These data, which were not assimilated, represent an interesting independent source of validation. The data set is made of 418 measurements collected from 9 September 2012 at 00 UTC to 1 November 2012 at 21 UTC, mainly in the western Mediterranean part of the AROME-WMED domain. Due to the small amount of data available, the results are noisy. Nevertheless, it is noteworthy that, for REANA2 compared to SOP1 and REANA1, the correlation between forecasts and observations is higher and the standard deviation is lower up to the 24-h forecast range. For the three simulations, a diurnal cycle of the ZTD bias exists. A stronger positive (moist) bias can be seen for the early forecast ranges of REANA2. At longer ranges the bias is more or less similar in the three simulations.

Surface precipitation
The evaluation is carried out with the 24-h accumulated precipitation (from 05 September to 05 November 2012) from the HyMeX database available in July 2017 (version4). These data were checked before computing the scores. Only surface stations with daily precipitation for the full period (i.e. with an uninterrupted series) were taken into account. A good coverage is obtained over France, Italy and Spain (Fig. 16). REANA2 seems to yield more precipitation than the other versions, especially over elevated terrain. This is confirmed by the frequency bias computed against raingauges over the whole AROME-WMED domain (Fig. bias). This bias is improved for small thresholds (<1 mm/24h) in REANA2, and these results are statistically significant. The degradation for thresholds exceeding 1 mm/24h in REANA2 is not significant according to the bootstrap test, due to the lower number of observations. Even though the general precipitation pattern is similar in the three versions, some differences can be noticed. For example, the maximum precipitation over Sardinia is not located at the same place. In REANA2 this local maximum is located in the eastern part of the island, whereas in REANA1 it is located over the Eastern part. In addition, more precipitation is simulated over the sea in the Gulf of Lion in REANA2. The rainfall amount accumulated over the 2-month period shows a moister bias for REANA2 compared to REANA1 (not shown) and SOP1, mainly over [...]
Figure 18 shows the Equitable Threat Scores (ETS, definition given in the appendix of Ebert (2008)) and the frequency bias for the 24-h accumulated precipitation, computed with all data available in version4 for Spain, France and Italy. The closer the ETS is to 1, the better the forecast. Over Spain, the ETS is improved for both re-analyses and the gain is seen up to the 20 mm/24h threshold. The ETS for small thresholds (up to 1 mm/24h) is improved with REANA2 over France, but no improvement is seen over Italy.
In the re-analyses, the frequency bias decreases up to the 5 mm/24h threshold over Spain and France and only for small thresholds (less than 1mm/24h) over Italy. For large thresholds, the frequency bias is larger for REANA2 than for the other two AROME-WMED versions. These results are in agreement with the overall accumulation of precipitation found in Figure 16.
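For readers unfamiliar with these categorical scores, the sketch below computes the ETS and the frequency bias from a 2x2 contingency table built at a given precipitation threshold, using the standard definitions of these scores. It is not the verification code used for the paper: the function names, the "greater than or equal to" convention for exceeding a threshold and the sample values are assumptions.

```python
def contingency_table(forecast, observed, threshold):
    """Count hits, misses, false alarms and correct negatives.

    An event is defined here as a 24-h amount >= threshold (assumed convention).
    """
    hits = misses = false_alarms = correct_negatives = 0
    for f, o in zip(forecast, observed):
        forecast_yes, observed_yes = f >= threshold, o >= threshold
        if forecast_yes and observed_yes:
            hits += 1
        elif observed_yes:
            misses += 1
        elif forecast_yes:
            false_alarms += 1
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives


def ets_and_frequency_bias(hits, misses, false_alarms, correct_negatives):
    """Equitable Threat Score and frequency bias from a 2x2 contingency table."""
    total = hits + misses + false_alarms + correct_negatives
    if total == 0 or hits + misses == 0:
        raise ValueError("at least one observed event is required")
    # Number of hits expected by chance for a random forecast.
    hits_random = (hits + misses) * (hits + false_alarms) / total
    denominator = hits + misses + false_alarms - hits_random
    ets = (hits - hits_random) / denominator if denominator != 0 else float("nan")
    frequency_bias = (hits + false_alarms) / (hits + misses)
    return ets, frequency_bias


if __name__ == "__main__":
    # Hypothetical 24-h accumulations (mm) at four raingauges.
    forecast = [0.0, 2.0, 5.0, 0.5]
    observed = [0.0, 3.0, 0.2, 1.5]
    table = contingency_table(forecast, observed, threshold=1.0)
    print("table:", table)
    print("ETS, frequency bias:", ets_and_frequency_bias(*table))
```

A frequency bias above 1 means that the event is forecast more often than it is observed, which is how the larger REANA2 values at large thresholds should be read.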

IOP8 qualitative evaluation
As illustrated in the quantitative forecast evaluation section, small improvements in Quantitative Precipitation Forecasts (QPF) are noticed for REANA2 with respect to the previous simulations. Such improvements in REANA2 can be found for specific periods of the HyMeX campaign. This is the case for IOP8, which took place over two days, from 28 to 29 September 2012.
The key pattern of this IOP was a cut-off low centred to the south-west of the Iberian Peninsula (28 September, 00 UTC) moving [...] At low levels, on 29 September at 00 UTC, a weak complex surface low was positioned over the Gulf of Lion, associated with the cut-off low as analysed by the global model ARPEGE. This cut-off low drove a moist south-easterly flow on its north-eastern flank towards the French Mediterranean coast, reinforced by the orography (Cevennes ridge, which induced a barrier effect as shown in Buzzi et al. (2003)). On 29 September, this pressure minimum triggered heavy rainfall with embedded convection over the Gulf of Lion (morning) and later over the northern part of Catalonia and the western part of Cevennes-Vivarais. Daily precipitation amounts reaching 100 mm/24h were recorded in the coastal zones along an axis from northern Catalonia to the Cevennes area, depicted by the red line extending from 40 [...]. Such rainfall amounts were also observed over the north-eastern part of the Gulf of Lion in the 3B42 TRMM estimates (Fig. 19d), which compare well, qualitatively and quantitatively, with in situ measurements over land. In the real-time version and the first re-analysis, the daily accumulated precipitation amounts exceeding 50 mm/day are shifted too far westward when compared to raingauges (Fig. 20a and b). The maximum rainfall amount, located over the Gulf of Lion, is better positioned, though overestimated, in the second re-analysis (Fig. 20c). The ETS was computed for the various forecasts (00-24 and 24-48 hour ranges) valid for 29 September (00-24 UTC period). The score was also computed for the 06-30 hour forecast range (corresponding to the 24-hour period from 29 September 06 UTC to 30 September 06 UTC).
Figure 21 presents these ETS curves. In general, the re-analyses (1 or 2) perform better than the real-time version of AROME-WMED. Surprisingly, the ETS is better for the 24-48-h forecast range than for the shorter (00-24-h) one. This degradation of the short-range forecast could originate from a spin-up in the very short ranges, which degrades the predicted precipitation during the first hours of the forecast.
The positive impact on QPF may be linked to the better simulation of the deepening of the surface pressure low in the [...] 00 UTC forecast initialized with the first re-analysis is 1010 hPa at 03 UTC (not deep enough and too early), while the forecast initialized at 00 UTC with the second re-analysis indicates a minimum of 1008 hPa at 09 UTC.

Discussion and conclusion
The AROME-WMED model was initially developed to study and forecast heavy precipitating events over the western Mediterranean basin in the framework of the HyMeX programme. This model ran in real time during both SOPs of HyMeX, in autumn 2012 and winter 2013. Two re-analyses were run after the HyMeX autumn campaign. The first one was carried out just after the campaign to provide a single model configuration over the whole period, because an upgrade of the AROME-WMED version occurred during the period. A second re-analysis was performed a few years later and took into account as much data as possible from the experimental campaign (i.e. lidar and dropsonde humidity profiles) or from reprocessed data sets (such as GNSS ground station ZTD, wind profilers, high vertical resolution radiosondes and Spanish Doppler radars). It also benefited from a more recent version of the AROME code, including an orography change, and from improved background error statistics computed over a 15-day period of the first HyMeX special observation period. The analysis and forecast fields of these three AROME-WMED versions are available in the HyMeX database (http://mistrals.sedoo.fr/HyMex).
The characteristics and the quality of the three AROME-WMED versions are discussed in this paper. More observations are assimilated in both re-analyses. The first re-analysis included 9% additional data, and the second re-analysis assimilated 24% more data. In the case of REANA2, these data mainly came from GNSS ground stations, radiosondes and satellite radiances.
The use of background error statistics more representative of the studied period allows a better use of the observations in the second re-analysis. The root mean square differences between first-guess simulations and observations are the smallest for the second re-analysis. The root mean square differences between analyses and observations are adjusted according to the change in the background error statistics. The observation departure study showed that the quality of research data such as [...]
Concerning the forecast quality, the surface field forecast is better for the second re-analysis; the 2-m temperature diurnal bias is reduced up to the 54-h forecast range. The forecast error standard deviation is improved for the first 18 hours of forecast. This improvement is mainly due to the change of the orography in REANA2. A reduction of the 2-m relative humidity bias is also found.
Upper-level forecasts of the three AROME-WMED versions were compared to radiosonde observations: the forecast root mean square errors of temperature, relative humidity and wind are decreased in the mid- and upper troposphere for both re-analyses up to the 48-h forecast range. The comparison with the reprocessed version 3 of GNSS data shows that the second re-analysis IWV, in terms of analyses and forecasts, is better correlated with the observations than that of the first re-analysis and of the real-time version up to the 24-h forecast range. The standard deviation of the IWV differences is also lower. Moreover, a comparison with independent (i.e. not assimilated) GNSS zenith total delay data from the vessel Marfret-Niolon also shows this positive impact up to the 24-h range. This is an interesting result over a sensitive area, where no conventional measurements are available.
Larger values of accumulated precipitation over the 2-month period were obtained with the second re-analysis, and the comparison with observations suggests an overestimation of large precipitation amounts, mainly over relief. However, the frequency bias is decreased for smaller thresholds over the AROME-WMED domain. Concerning the 24-hour precipitation evaluation, this positive impact is less noticeable, but some improvement is diagnosed for the Iberian Peninsula and France for thresholds lower than 10 mm/24h. The gain brought by the second re-analysis is smaller over Italy. Finally, the positive impact of the second AROME-WMED re-analysis was detailed for the IOP8 heavy precipitation event, which occurred over Spain and southern France at the end of September 2012.
Preliminary data assimilation experiments including only the code version changes and the new background error statistics have shown that the gain in forecast scores brought by REANA2 is due both to the newly assimilated observations and to the new code version. Figure 22 illustrates this for the 36-h forecast range. A small reduction of the root mean square error is obtained with the assimilation of the new observations for temperature and wind in the troposphere. The improvement brought by the observations is less clear for humidity. Concerning the 24-h accumulated precipitation, REANA2 improves the scores for small thresholds (0.5 and 1 mm/24h) compared to the preliminary experiment, REANA1 and SOP1. It is clear that the improvement in the 2-m temperature and humidity forecast bias is related to the orography change. The improvement found in the REANA2 fields is therefore the result of all the changes made with respect to REANA1 and SOP1. Studies are currently being carried out to examine the respective impacts of the additional observations assimilated in the second re-analysis, such as reprocessed GNSS data, high-resolution radiosondes, radars and lidars.
Data availability. The source code of AROME-WMED, being derived from the operational AROME code, cannot be obtained, but the analysis and forecast fields are available in the HyMeX database (http://mistrals.sedoo.fr/HyMex).
[...] Maziejewski are warmly thanked for helping to improve the manuscript.