CrocO_v1.0 : a Particle Filter to assimilate snowpack observations in a spatialised framework

Monitoring the evolution of the snowpack properties in mountainous areas is crucial for avalanche hazard forecasting and water resources management. In-situ and remotely sensed observations provide precious information on the snowpack but usually offer a limited spatio-temporal coverage of bulk or surface variables only. In particular, visible-near infrared (VISNIR) reflectance observations can inform on the snowpack surface properties but are limited by terrain shading and clouds. Snowpack modelling enables to estimate any physical variable, virtually anywhere, but is affected by large errors and uncer5 tainties. Data assimilation offers a way to combine both sources of information, and to propagate information from observed areas to non observed areas. Here, we present CrocO, (Crocus-Observations) an ensemble data assimilation system able to ingest any snowpack observation (applied as a first step to the height of snow (HS) and VIS-NIR reflectances) in a spatialised geometry. CrocO uses an ensemble of snowpack simulations to represent modelling uncertainties, and a Particle Filter (PF) to reduce them. The PF is known to collapse when assimilating a too large number of observations. Two variants of the PF 10 were specifically implemented to ensure that observations information is propagated in space while tackling this issue. The global algorithm ingests all available observations with an iterative inflation of observation errors, while the klocal algorithm is a localised approach performing a selection of the observations to assimilate based on background correlation patterns. Experiments are carried out in a twin experiment setup, with synthetic observations of HS and VIS-NIR reflectances available in only a 1/6th of the simulation domain. Results show that compared against runs without assimilation, analyses exhibit an 15 average improvement of snow water equivalent Continuous Rank Probability Score (CRPS) of 60% when assimilating HS with a 40-member ensemble, and an average 20% CRPS improvement when assimilating reflectance with a 160-member ensemble. Significant improvements are also obtained outside the observation domain. These promising results open a way for the assimilation of real observations of reflectance, or of any snowpack observations in a spatialised context. 1 https://doi.org/10.5194/gmd-2020-130 Preprint. Discussion started: 10 July 2020 c © Author(s) 2020. CC BY 4.0 License.

Abstract. Monitoring the evolution of snowpack properties in mountainous areas is crucial for avalanche hazard forecasting and water resources management. In situ and remotely sensed observations provide precious information on the state of the snowpack but usually offer limited spatiotemporal coverage of bulk or surface variables only. In particular, visible-near-infrared (Vis-NIR) reflectance observations can provide information about the snowpack surface properties but are limited by terrain shading and clouds. Snowpack modelling enables the estimation of any physical variable virtually anywhere, but it is affected by large errors and uncertainties. Data assimilation offers a way to combine both sources of information and to propagate information from observed areas to non-observed areas. Here, we present CrocO (Crocus-Observations), an ensemble data assimilation system able to ingest any snowpack observation (applied as a first step to the height of snow (HS) and Vis-NIR reflectances) in a spatialised geometry. CrocO uses an ensemble of snowpack simulations to represent modelling uncertainties and a particle filter (PF) to reduce them. The PF is prone to collapse when assimilating too many observations. Two variants of the PF were specifically implemented to ensure that observational information is propagated in space while tackling this issue. The global algorithm ingests all available observations with an iterative inflation of observation errors, while the klocal algorithm is a localised approach performing a selection of the observations to assimilate based on background correlation patterns.
Feasibility testing experiments are carried out in an identical twin experiment setup, with synthetic observations of HS and Vis-NIR reflectances available in only one-sixth of the simulation domain. Results show that compared against runs without assimilation, analyses exhibit an average improvement of the snow water equivalent continuous rank probability score (CRPS) of 60 % when assimilating HS with a 40member ensemble and an average 20 % CRPS improvement when assimilating reflectance with a 160-member ensemble. Significant improvements are also obtained outside the observation domain. These promising results open a possibility for the assimilation of real observations of reflectance or of any snowpack observations in a spatialised context.

Introduction
Seasonal snowpack is an essential element of mountainous areas. Monitoring the evolution of its physical properties is essential to forecasting avalanche hazard  and rain-on-snow-related floods (Pomeroy et al., 2016;Würzer et al., 2016) as well as monitoring water resources (Mankin et al., 2015). Observations alone are too scarce to monitor snowpack conditions. In situ observations provide precise observations of several key variables, but they lack spatial representativeness and have poor spatial coverage. Remote sensing of snowpack variables such as the height of snow (HS; m), snow water equivalent (SWE; kg m −2 ), visible-near-infrared (Vis-NIR) reflectance and surface temperature provides comprehensive information over large areas but usually has a limited temporal resolution for a small set of variables. Furthermore, these observations are usually available in fractions of simulation domains only, even for spaceborne data (Davaze et al., 2018;Veyssière et al., 2019;Shaw et al., 2019). For instance, snowpack Vis-NIR reflectances from moderate-resolution (250-500 m) satellites such as MODIS and Sentinel-3 can help constrain snowpack surface properties such as microphysical properties (characterised by specific surface area -SSA; m 2 kg −1 ) and lightabsorbing particle content (LAP; g g −1 snow ) (Durand and Margulis, 2006;Dozier et al., 2009). However, in areas covered by clouds or forests and/or affected by high sub-pixel variability (ridges, roughness, fractional snow cover) and shadows, satellite retrievals are less accurate (Masson et al., 2018;Lamare et al., 2020), and data should be filtered out (Cluzet et al., 2020a). The higher resolution offered by products from Landsat and Sentinel-2 might be an avenue to address this issue (e.g. Masson et al., 2018;Aalstad et al., 2020), but at these resolutions, reflectance retrievals are quite noisy due to e.g. digital elevation model errors (Cluzet et al., 2020a). Finally, note that pixel fractional snow cover (snow cover fraction, SCF) can be accurately retrieved even from noisy reflectances (Sirguey et al., 2009;Aalstad et al., 2020), but it inherits spatio-temporal limitations. SCF informativeness is also limited in deep snowpack conditions (De Lannoy et al., 2012).
Snowpack models of different complexity offer exhaustive spatial and temporal coverage (Krinner et al., 2018). They are applied within several spatial configurations, including the collection of points on regular or irregular grids . In this paper, "spatialised" indistinctly refers to any of these configurations. Detailed snowpack models are the only ones able to assess avalanche hazard and monitor water resources , but these applications are limited by their considerable errors and uncertainties (Essery et al., 2013;Lafaysse et al., 2017). In that context, combining remote sensing observations with models through data assimilation is an appealing solution . Indeed, data assimilation combines the spatial and temporal coverage of snowpack models with the available information from observations in an optimal way. Assimilation of optical reflectance could reduce modelled SWE errors by up to a factor of 2 (Durand and Margulis, 2007;Charrois et al., 2016), and preliminary studies show its potential for spatialised assimilation (Cluzet et al., 2020a). Assimilation of HS is very efficient in reducing modelled SWE errors (Margulis et al., 2019). However, the limited spatial coverage of observations stresses the need for data assimilation algorithms able to propagate snowpack observational information into unobserved areas (Winstral et al., 2019;Cantet et al., 2019;Largeron et al., 2020).
A particle filter with sequential importance resampling (PF-SIR; Gordon et al., 1993;Van Leeuwen, 2009) is a Bayesian ensemble data assimilation technique well suited to snowpack modelling (Dechant and Moradkhani, 2011;Charrois et al., 2016;Magnusson et al., 2017;Piazzi et al., 2018;Larue et al., 2018). PF-SIR is a sequential algorithm relying on an ensemble of model runs (particles) which represents the forecast uncertainty. At each observation date, the prior (or background) composed of the particles is evaluated against the observations. The analysis of PF-SIR (later on PF) works in two steps. In a first step, so-called "importance sampling", the particles are weighted according to their distance to the observations (relative to the observation errors). Then, a resampling of the particles is performed in order to reduce the variance in the weights. The ensemble Kalman filter (EnKF Evensen, 2003) has also been widely used for snow cover data assimilation (e.g. Slater and Clark, 2006;De Lannoy et al., 2012;Magnusson et al., 2014). However, the PF is more adapted to models with a variable number of numerical layers such as detailed snowpack models (Charrois et al., 2016).
The PF could be used in a spatialised context to propagate information from observations as suggested by Largeron et al. (2020) and Winstral et al. (2019). Contrary to the EnKF, such applications are rare to date (e.g. Thirel et al., 2013;Baba et al., 2018;Cantet et al., 2019). Indeed, spatialised data assimilation with the PF is not straightforward because of the degeneracy issue, i.e. only a few particles are replicated in the analysis, often resulting in a poor representation of the forecast uncertainties. Degeneracy can be mitigated by increasing the number of particles, but the required population scales exponentially with the number of observations simultaneously assimilated (Snyder et al., 2008). Furthermore, an accurate representation of spatial error statistics by the ensemble is essential for the success of the assimilation system. To achieve that, the required ensemble size also scales exponentially with the system dimension, an issue known as the curse of dimensionality . These issues are severe drawbacks when considering applications of the PF to large domains (i.e. implying a large number of observations and/or simulation points) with a reasonable number of particles (Stigter et al., 2017).
Several solutions exist to tackle PF degeneracy. A first approach is to inflate the observation errors in the PF. The tolerance of the PF is increased, leading to more particles being replicated. This approach is based on the fact that observation error statistics (including sensor, retrieval and representativeness errors) are usually poorly known and underestimated. It can also be used as a safeguard to prevent the PF from degenerating on specific dates when observations are not compatible with the ensemble. PF inflation was successfully implemented in point-scale simulations of the snowpack (Larue et al., 2018). When dealing with a large number of observations, inflation might lead to degeneracy or null analysis (posterior equal to the prior). In this work, we generalise over space the inflation of Larue et al. (2018), trying to ingest all the observations into a single analysis over the domain in a so-called global approach.
PF localisation is a more widespread alternative, tackling degeneracy by reducing the number of observations that are simultaneously assimilated by the PF (Poterjoy, 2016;Poterjoy and Anderson, 2016;Penny and Miyoshi, 2016;Poterjoy et al., 2019; italic notations are taken from the review of Farchi and Bocquet, 2018). In this method, the simulation domain is divided into blocks whereby different PF analyses are performed considering a local subset of observations (domain) based on a localisation radius. This makes it possible to constrain the model in locations that are not directly observed but with nearby observations. Contrary to global approaches, localisation has the disadvantage of producing spatially discontinuous analyses (each point receives a different analysis). This issue can be mitigated in various ways (Poterjoy, 2016;Farchi and Bocquet, 2018;Van Leeuwen et al., 2019).
The underlying hypothesis of localisation is that model points are independent beyond a certain distance; i.e. constraining one point with the observation from a point that is too distant would be meaningless and likely degrade the analysis performance (Houtekamer and Mitchell, 1998). However, in the case of small simulation domains or modelled systems driven by large-scale coherent causalities, largescale correlations (relative to the domain size) may be physically sound, and defining a localisation radius may be a difficult task. In order to address this issue, we developed a new localisation approach called k localisation, whereby localisation domains are based on background correlation patterns.
These developments were implemented into CrocO (Crocus-Observations), an ensemble data assimilation system able to sequentially assimilate snowpack observations with a PF in a spatialised context. CrocO can be implemented in any geometry (e.g. within a distributed (gridded) framework or any irregular spatial discretisation). Here, we apply CrocO in a semi-distributed framework, which is a conceptual spatialised geometry used operationally by Météo-France for avalanche hazard forecasting Morin et al., 2020). This framework is similar to many topographically based discretisations in hydrological models (e.g. Clark et al., 2015). This setup enables us to account for the snowpack variability induced by the topography at the scale of a mountain range through meteorological conditions (elevation controls the air temperature and precipitation phase) and the snowpack radiative budget (also dependent on the aspect and slope angle) (Durand et al., 1993).
CrocO uses an ensemble of stochastic perturbations from the SAFRAN meteorological analysis (Durand et al., 1993;Charrois et al., 2016) to force ESCROC (Ensemble System CROCus; Lafaysse et al., 2017), the multi-physical version of the Crocus snowpack model (Vionnet et al., 2012). The ensemble setup accounts for the major sources of uncertainties in snowpack modelling (Raleigh et al., 2015) and was formerly described and evaluated in semi-distributed geometry by Cluzet et al. (2020a).
Inflation and k localisation were implemented into CrocO. Here, we present CrocO and evaluate how it addresses the issues of reflectance observation sparseness and PF degeneracy in the context of snowpack modelling. This problem is divided into two scientific questions. (1) Is CrocO PF able to efficiently spread the information from sparse observations in space without degenerating? (2) Is the spatial information content of reflectance observations valuable for snowpack models? We assess these questions by evaluating the performance of CrocO in modelling the SWE when assimilating synthetic observations of HS and reflectance covering only a portion of the domain.
Section 2 presents the CrocO system, i.e. the ensemble modelling system and the PF algorithms. Section 3 introduces the evaluation methodology. Subsequently, Sect. 4 assesses the performance of CrocO, and Sect. 5 discusses the results. Finally, Sect. 6 provides perspectives and research directions.

Modelling geometry
Simulations are performed in semi-distributed geometry. Mountain ranges such as the Alps are discretised into socalled massifs of about 1000 km 2 to account for regional variability of meteorological conditions. Within each massif, topographically induced variability is taken into account by running the model for a fixed set of topographic classes, e.g. by 300 m elevation bands, for 0, 20 and 40 • slopes and eight aspects (see Fig. 1). This set enables us to reproduce the main features of snowpack variability (e.g. Mary et al., 2013).
In this study, we focus on the Grandes Rousses, a single massif in the central French Alps. This area of about 500 km 2 is represented by N pts = 187 independent topographic classes (see Fig. 1). In the following, specific topographic classes are denoted as elevation_aspect_slope; e.g. Figure 2. Workflow of CrocO ensemble data assimilation system with four members. x 0 : initial state at time t 0 , F i : forcing, M i : ES-CROC member, X b : background state, x i b : background particles, X a : analysis, x i a : analysis particles, y: observation, t 1 and t 2 : observation dates.
1800_N_40 stands for a 40 • slope with a northern aspect at 1800 m a.s.l.

CrocO ensemble data assimilation setup
The ensemble data assimilation workflow of CrocO is represented in Fig. 2. In the following, only a short description of the system and its elements is provided. More details on the ensemble modelling setup are available in Cluzet et al. (2020a). Information about its implementation into the Météo-France high-performance computing (HPC) system can be found in Appendix B1.

Ensemble of snowpack models
Crocus is a detailed snowpack model coupled with the ground and atmosphere in the ISBA land surface model (Interaction Soil-Biosphere-Atmosphere). It is embedded within the SURFEX_v8.1 modelling platform (SURFace EXternalisée; Masson et al., 2013). The TARTES optical scheme (Libois et al., 2013(Libois et al., , 2015 represents Vis-NIR spectral radiative transfer within the snowpack, driven by snow metamorphism ) and lightabsorbing particle (LAP; g g −1 snow ) deposition fluxes (Tuzet et al., 2017). Moreover, TARTES computes the snowpack reflectance with a high spectral resolution, making the model directly comparable to observations. As such, TARTES is both a physical component of Crocus and an observation operator.
ESCROC (Ensemble System CROCus; Lafaysse et al., 2017), the multi-physical ensemble version of Crocus, is used to account for snowpack modelling uncertainties. A random draw among 1944 ESCROC multi-physics configurations was performed and used in all the simulations and denoted (M i ) 0<i≤N e , with N e being the ensemble size (e.g. 40 or 160 members; see Fig. 2). These configurations are considered equiprobable before any data assimilation.

Ensemble of meteorological forcings
Meteorological forcings are taken from the SAFRAN (Durand et al., 1993) reanalysis, wherein forecasts from the ARPEGE numerical weather prediction (NWP) model are downscaled and adjusted with surface observations within the massif area. They are combined with MOCAGE LAP fluxes (Josse et al., 2004) interpolated at Col du Lautaret (2058 m a.s.l., inside the Grandes Rousses) to constitute the reference forcing dataset. Before the beginning of the simulation, spatially homogeneous stochastic perturbations (e.g. on a given date, the same perturbation parameter is applied across the whole domain) with temporal autocorrelations are applied to this forcing to generate an ensemble of forcings (F i ) 0<i≤N e with the same procedure as described in Cluzet et al. (2020a). More details on the perturbation procedure can be found in Appendix A. At the beginning of the simulation, each forcing F i is associated with a random M i ESCROC configuration, and this relation is fixed during the whole simulation.

The particle filter in CrocO
The PF is applied sequentially for each observation date to the background state vectors (soil and snowpack state variables, denoted BG in Fig. 2). Its analysis is an ensemble of initial conditions used to propagate the model forward. The algorithm is implemented into SODA (SURFEX Offline Data Assimilation; Albergel et al., 2017), the data assimilation module of SURFEX_v8.1, enabling a continuous execution sequence between ensemble propagation and analysis, as depicted in Fig. 2.

The particle filter equations
On a given observation date, we consider a set of observed variables available at several locations, totalling N y different observations.
-Each member x i b of the background state X b is projected into the observation space using the observation operator h. In our case, h is just an orthogonal projection of the N y observations since HS and reflectance are diagnosed within Crocus (see Sect. 2.2.1). The projec- 0<k≤N y corresponds to the modelled values at each observed variable and/or point.
-These N y observations are collected in the vector y = (y k ) 0<k≤N y . The associated observation error covariance matrix R (Eq. 1) is diagonal (e.g. observation errors are assumed independent): where σ 2 k stands for the observation error variance of observation k and depends only on the type of observation y k (e.g. HS or reflectance). The PF analysis usually works in two steps. 1. Compute the particle weights w i as the normalised observation likelihood for each particle (Eq. 2): .
(2) 2. Resample the particles based on their weights to build the analysis vector X a . Here, we apply the PF resampling from Kitagawa (1996), which returns s = (s i ) 0<i≤N e (s i ∈ [1..N e ]), a sorted vector with duplications representing the particles to replicate.
A sample reordering step was added for numerical optimisation with no expected influence on the PF behaviour (see Appendix B2 for more details).
Two simple variants of this algorithm can be identified in a spatialised context: for the global approach, perform one analysis over the domain, putting all the available observations in y; for the rlocal approach, perform one analysis per model point, assimilating only local observations if any. This corresponds to a localised PF with a block and domain size of 1.

Particle filter degeneracy
Degeneracy occurs when only a small fraction of the particles have non-negligible weights, resulting in a sample s for which only a few different indices are present. It can be diagnosed from the weights using the effective sample size N eff (Liu and Chen, 1995): With a degenerate sample, N eff 1, and with innocuous analysis (all particles are replicated) N eff = N e .
A first approach to mitigate degeneracy is to use inflation. This heuristic method iteratively inflates R values until the effective sample size is large enough. Here, we develop a variant from the Larue et al. (2018) method, which does not explicitly rely on N eff (Eq. 3). Consider applying an inflation factor 1 α to R (0 < α ≤ 1, with α = 1 being the value for no inflation) and update N eff (Eqs. 2 and 3): N eff is naturally a decreasing function of α (the more we inflate R the more different particles will be replicated). The idea of our method is to ensure that N eff exceeds a target value, N * eff . If N eff < N * eff (degenerate case), we reduce α (inflate) until N eff = N * eff using Algorithm 1. In the following, inflation is used in the global and rlocal PF (see Sect. 2.2.3). The core of Algorithm 1 is an hybrid bisection-secant method to find the zero of f : It is inspired by the rtsafe algorithm (Press et al., 1992). The guess function computes a new guess α 2 to minimise f . Note that in the unlikely case in which Algorithm 1 does not converge, all the particles are replicated.

k localisation
In the k-localisation algorithm, degeneracy is mitigated by reducing the number of observations that are simultaneously assimilated. The PF analysis is applied to each simulation point sequentially. In order to build the analysis at point n, background correlations B v are computed for each variable v (e.g. HS or reflectance) between n and all the observed points. In a first step, all observations from points exhibiting substantial background correlations (see below for the select_k_biggest function) are used. If the PF degenerates, the number of observations is progressively decreased until degeneracy is mitigated. As earlier, degeneracy is considered mitigated when N eff ≥ N * eff . This way, we ensure that a maximal number of observations has been ingested by the PF without degenerating.
In the case of degeneracy, the observation point displaying the lowest correlation is ruled out. The PF weights are computed (Eq. 2), and a new effective sample size is derived (Eq. 3). While the target sample size is not exceeded, this selection proceeds iteratively. The notation k in k localisation refers to the number, k, of retained observations for each variable. This approach is similar to the EnKF localisation algorithm whereby the localisation domain is based on background correlations (Hamill et al., 2001).
The detailed k-localisation algorithm is described in Algorithm 2, for which the following points apply.
-For each variable, the select_k_biggest method returns the domain d v of up to k observed points (named p) that are the most correlated (in absolute value) with n and match the following criteria, which were adjusted in preliminary experiments.
-In x i v , at least 10 % of members are defined in both points. As reflectance is not defined when there is no snow, spuriously high correlation can be obtained when the computation of correlations is based on a very low number of pairs.
-|B v (n, p)| > 0.3. If the absolute correlation is too low, it is likely that there is poor potential for the distant observation to constrain the ensemble locally. In such a situation, it is better to reject the observation from the local analysis. Negative ensemble correlations can be physically sound, e.g. after a rain-on-snow event between the HS of two points separated by the rain-snow line. In such a situation, an HS observation of either point can hold information on precipitation rates at both locations. At the observed location, the PF will probably select the members with the most appropriate precipitation rates. This sample is likely to perform well at both locations, so it can be used to constrain the unobserved location.
-The notation d represents the collection of domains d v .
-The extract_points function extracts d from y,x i and R.

Evaluation strategy
Our strategy is to assess the performance of the analysis by means of twin experiments, i.e. using synthetic observations (e.g. Reichle and Koster, 2003). The assimilation run is compared to an identical run without assimilation (openloop run). Synthetic observations are extracted from a model run and assimilated without adding any noise. These observations mimic real observations with perfect knowledge of the true state. Analysis and open-loop experiments can therefore be compared with this true state anywhere for any variable. In a first step, this allows us to get rid of the error and bias issues inherent in real observations (Cluzet et al., 2020a), a reason why we did not add any noise to the synthetic observations as commonly done in twin experiments (Lahoz and Menard, 2010). This way, we can focus on the following two questions (see Sect. 1).
-Is CrocO PF able to efficiently spread the information from sparse observations into space without degenerating?
-Is the spatial information content of reflectance a valuable source of information for snowpack models?
In order to disentangle these questions, we run baseline experiments assimilating synthetic observations of HS, which is strongly linked to SWE (Margulis et al., 2019). These experiments are used to evaluate the PF algorithm efficiency and as a baseline for synthetic reflectance assimilation experiments evaluating the information content of reflectance.
Three different algorithms are evaluated: the global algorithm (with inflation), the rlocal algorithm (with inflation) and the k-localised algorithm klocal.

Twin experiment setup
In our twin experiment setup, an open-loop ensemble is used as a reference and to generate synthetic observations. Openloop simulations are carried out with CrocO for four consecutive winters (2013)(2014)(2015)(2016)(2017) in the Grandes Rousses (see Sect. 2.1) with 160 members. For each year, the average of SWE over time and space is computed from each member, and members corresponding to the 20th, 40th, 60th and 80th percentiles of the ensemble are extracted to be used as synthetic observations (denoted year_ppercentile, e.g. 2014_p80). This method enables us to evaluate the efficiency of data assimilation experiments under contrasting snow condition scenarios. Before any assimilation experiment, the open-loop member (F i − M i couple in Fig. 2) used as the true state is withdrawn and replaced by a new random member.
The spatial coverage of synthetic observations was reduced, mimicking a typical reflectance mask. Synthetic observations were only available above an assumed constant treeline at 1800 m (see Fig. 1) and not available for steep slopes (over 20 • ) and northern aspects (shadows, considering a daily satellite pass around 10:00-11:00 UTC) for the whole snow season. As a result, in this case, only 35 (over 187) topographic classes are observed. Observation dates were chosen corresponding to clear-sky days with a MODIS overpass, resulting in an approximately weekly frequency (e.g. Revuelto et al., 2018;Cluzet et al., 2020a).
Reflectance is sensitive to the surface SSA and LAP (see Sect. 1). A minimal set of two different bands is used, corresponding to MODIS sensor bands 4 (555 nm, sensitive to SSA and LAP) and 5 (1240 nm, usually only sensitive to SSA) (e.g. Fig. 2 of Cluzet et al., 2020a). Observation error variances are set to 1.0 × 10 −2 m 2 for HS and 5.6 × 10 −4 and 2.0 × 10 −3 for band 4 and band 5 reflectance, respectively (Wright et al., 2014). These values are only initial values for the inflation in the global and rlocal algorithms. Since the klocal algorithm only uses inflation if k drops to 1 (see Sect. 2.3.2), observation error variances are multiplied by a factor of 5 to enable the klocal algorithm to ingest observations from several points.
In order to study the ability of the global, klocal and rlocal algorithms to spread information in space, a first set of experiments is conducted assimilating HS with 40 members (see the setup in Table 1). In order to evaluate the algorithms' ability to assimilate reflectance (band 4 and band 5) a second set of experiments is conducted with other things being equal ( Table 2). The ensemble size is increased from 40 to 160 in a third set of experiments assimilating reflectance in order to analyse the influence of a larger ensemble on the algorithm performance (Table 3). Note in Tables 1-3 that N * eff is adjusted to the ensemble size in order to preserve N e /N * eff ≈ 5-7 following Larue et al. (2018).

Evaluation scores
The performance of the assimilation and open-loop run is evaluated against the synthetic truth using several scores. The absolute error of the ensemble mean (AEM) and ensemble spread σ are two common metrics of ensemble modelling.
Given an ensemble E m,c,t of N e members m in topographic class c at time t and the corresponding truth τ c,t , the ensemble mean is described by Eq. (4): from which we can compute the AEM (Eq. 5) and the spread (or dispersion) σ (Eq. 6): where N t is the number of evaluation time steps. The continuous ranked probability score (CRPS; Eq. 7; Matheson and Winkler, 1976) evaluates the reliability and resolution of an ensemble based on a verification dataset. An ensemble is reliable when events are forecast with the right probability and has a good resolution when it is able to discriminate distinct observed events. For a reliable system, the resolution is equivalent to the sharpness, which is the spread of the produced forecasts.
If we denote F c,t the cumulative distribution function (CDF) and T c,t the corresponding truth CDF (Heaviside function centred on the truth value), the CRPS is computed at (c, t) following In this work, the CRPS c,t value is averaged over time alone or time and space depending on the desired level of aggregation.
The CRPS can be decomposed into two terms following Candille et al. (2015): where Reli quantifies the reliability of the ensemble. The associated skill scores (CRPSS and ReliS) can be used to compare the performance of an ensemble E to a reference R, here the open-loop run: A skill score of 1 denotes a perfect score, 0 a neutral performance and −∞ the worst achievable skill score.

Impact of the inflation
The inflation algorithm was introduced by Larue et al. (2018) in point-scale simulations, but to the best of our knowledge, it has never been applied in a spatialised context. Here we evaluate its impact on the global algorithm by switching it on and off. As an example, Fig. 3 shows the impact of inflation on SWE when assimilating the HS of 2015_p80 (as defined in Sect. 3.1.1) member with the global algorithm in a topographic class which is not observed (1800_N_40, as defined in Sect. 2.1). This choice of member and topographic class is representative of the impact of inflation on the global algorithm.
In this case, both inflation (N * eff = 7) and no inflation (N * eff = 1) lead to a significant reduction of the ensemble spread compared with the open loop (Fig. 3b). From January 2015 until the peak of SWE in mid-April 2015 (Fig. 3c), the simulation with inflation has significantly lower errors than without inflation and the open loop (10-20 vs. 60-80 and 30-50 kg m −2 , respectively), leading to better agreement with the synthetic truth in the melting season (Fig. 3a). During the melting season (mid-April 2015 onwards), the AEM of the assimilation algorithms reaches a peak, coinciding with an absence of observations. In comparison, the open-loop AEM is smaller in the first part of the melting season, but the spread is 3 times larger, making it almost uninformative. For several analyses (21 November 2014 and 30 December 2014, for example) the ensemble spread without inflation drops to 0, while its AEM strongly increases compared to the open loop, suggesting that it is prone to degeneracy.

Correlation patterns
The klocal algorithm relies on background correlation patterns to define localisation domains. To illustrate the potential of using such information in the PF, Fig. 4 shows the correlation patterns of the 40-member open loop in an unobserved topographic class (1800_N_40, red dot) in the midwinter (20 February 2015), several months after the snow season onset. The assimilation variables exhibit strong but contrasting correlation patterns. Band 4 (Fig. 4a) correlations are generally high (0.6-1) and uniform. Many of the observed classes (black dots) are strongly correlated with the considered class. Similar results are obtained for HS (Fig. 4c). Band 5 (Fig. 4b) exhibits substantial correlations, in particular across slopes. However, they are more restricted to the northern aspects, and only a few observed classes in the eastern aspects are substantially correlated with the considered class. Note that negative correlations are evidenced with some lower-altitude south-oriented topographic classes (e.g. 1500_S_40 in Fig. 4b). Finally, these patterns vary with time but remain substantial along the whole season (not shown), and increasing the ensemble size up to 160 leads to identical patterns (not shown).

Assimilation of the height of snow
In a first step, assimilation of HS from the different synthetic observation scenarios was conducted to serve as a reference for reflectance assimilation. Figure 5 shows the CRPSS (Eq. 9, aggregated over time only) of the HS assimilation with the three PF algorithms considering the synthetic member 2013_q20 as a reference. Results for this specific synthetic member were chosen here as a representative example of the algorithm performance.
The rlocal performance compared with the open loop is high (0.7-1) but limited to the observed classes (black dots) since there is no spatial propagation in this algorithm. The global and klocal algorithms have similar overall good per- formance, managing to strongly reduce modelling uncertainties except at very low altitudes (600-900 m) (skills of −0.2) where snow does not usually last for more than a few weeks.
This behaviour may vary with the snow conditions, i.e. between the different assimilated synthetic observation scenarios and from one year to another. In order to generalise this result, Fig. 6 shows the CRPS and Reli (aggregated over time and space) of the different algorithms for the 16 synthetic observation scenarios and differentiated between observed and unobserved classes. CRPS and reliability are considerably reduced compared with the open loop (by a factor of 2-3 and 4-5, respectively) for all the algorithms in the observed classes. This suggests that the PF manages to reduce the spread of the ensemble while reducing its errors. In the unobserved classes, the gain is almost as good (CRPSS of 0.6) except for the rlocal algorithm, which is identical to the open loop as expected. No significant difference in skill is obtained between the global and klocal algorithms.

Assimilation of reflectance
Optical reflectance is a promising assimilation variable due to its extended availability in satellite observations, but assimilation of raw reflectance products is not expected to con- strain bulk variables like SWE or HS as much as HS assimilation. In order to assess this difference, we conduct assimilation of reflectance only in the same setup as in Sect. 4.2.1 with all other things being equal. Figure 7 shows the performance of the reflectance assimilation for the 16 synthetic observation scenarios with 40 members (filled boxes). The different algorithms only lead to moderate improvements in CRPS (median CRPSS of 0-0.2, median ReliS of 0.2-0.4). Moreover, the global and klocal algorithms frequently degrade the performance, suggesting that this configuration is not robust.
Suspecting that 40 members are insufficient to properly represent the multivariate probability density function of reflectance and other model variables, the ensemble size was increased to 160 (hatched boxes), leading to marked improvements in the performance and robustness of the algorithms (median CRPSS of 0.2, median Reli of 0.4-0.6). The reliability of the global algorithm is significantly improved compared to the klocal algorithm. Figure 8 shows the spatial performance of the different algorithms for member 2016_p60. Spatial patterns similar to the HS assimilation are found. The rlocal performance is limited to the observed classes, while global and klocal manage to improve the simulations across aspects and slopes. However, skill scores are lower than for HS (0.2-0.5), and the performance of all algorithms is poor in the classes that are farther away from the observations, i.e. at lower elevations (600-900 m) and in some of the high-altitude steep northern classes (e.g. 2100_N_40 in Fig. 8b and c). Finally, note that slight degradations of performance can sometimes be evidenced, even in the observed classes, for all the algorithms (e.g. in flat conditions at 3300 m in Fig. 8a for the rlocal, not evidenced by this example for the other algorithms).

Discussion
In this section, we discuss the performance of CrocO PF algorithms using the assimilation of HS and consider the potential of the assimilation of reflectance in view of assimilating real data.

Tackling particle filter degeneracy
Because they assimilate several observations at the same time, global and klocal approaches could be prone to PF degeneracy. However, they almost never degrade the performances when assimilating HS in a variety of years and synthetic observation scenario percentiles (Fig. 6). This suggests that inflating the observation errors (as demonstrated by Larue et al., 2018, a result we have generalised in space)  and exploiting background correlations to reduce the number of assimilated observations are two efficient approaches to tackle degeneracy.
In several cases though, strong degradation of the score occurs when assimilating reflectance (Fig. 7), which could either be attributed to an algorithmic failure in the PF or an intrinsic lack of informativeness of reflectance in some situations. Based on the good behaviour of the algorithm with HS, and because by construction, the global and klocal algorithms cannot lead to a degenerate PF sample, we consider this to come from the reflectance itself (this point will be further discussed in the following sections).
Beyond tackling degeneracy, the global and klocal algorithms also beat the rlocal approach on Reli and CRPSS (Figs. 7 and 8). This suggests that assimilating multiple observations increases the quality of the PF analysis, even locally. More precisely, most of the improvement is due to the Reli term of the CRPS. This property is crucial for ensemble modelling because it ensures that events are forecasted with the right frequency. However, this is not sufficient; e.g. the climatology has perfect reliability but is not informative at all. Successful assimilation manages to improve general metrics such as the CRPS while improving the reliability. For this aspect, the global and klocal algorithms have a satisfying performance.

Propagating the observation information
Having sparse observations is one of the most challenging issues for data assimilation systems of snowpack observations (Magnusson et al., 2014;Largeron et al., 2020). In our partially observed synthetic setup, the global and klocal PF variants developed here efficiently propagate the observational information to the unobserved classes with a generally better performance than the open-loop and rlocal approach in the unobserved classes when assimilating HS (Fig. 5).
The algorithms' performance is particularly good across aspects and slopes, with only a few steep northern aspect slopes exhibiting neutral to poor performances (Figs. 5 and 8). This suggests that southern aspect and flat classes are informative for the majority of the simulation domain. Conversely, considering that there are strong background correlations between the western and eastern sides of the domain, we can speculate that observing either side could yield overall good results.
In these figures, propagation of the information is limited towards lower elevation (600-1200 m). At such elevations, the snow cover is usually intermittent and a good discrimination of the precipitation phase is crucial. The PF does this indirectly through HS and reflectance observations because rain causes a decrease in HS through compaction and melting, while band 4 and band 5 reflectances also decrease because of quick isothermal metamorphism (i.e. the surface SSA decreases). However, in our setup, the lowest observed elevation is 1800 m; therefore, indirect observation of the rain-snow line positioning under this level is not possible, potentially explaining the moderate performance of the PF there. In that case, assimilation of snow cover fraction might be the best solution; since the snowpack is intermittent there, the informativeness of this variable is maximal (Aalstad et al., 2018).
The global and klocal algorithms exhibit strong performances when assimilating HS (Fig. 5). HS is closely linked to the SWE (by the bulk density), and the interest of this variable for data assimilation is clear (Margulis et al., 2019). Here, it should be kept in mind that HS assimilation is used as a baseline experiment to evaluate the algorithms and put reflectance assimilation into perspective. The prescribed HS observation errors (σ 0 = 0.1 m) are not necessarily realistic. They should be adapted to the nature of the HS sensor. For example, spaceborne HS observation errors are typically larger (e.g. Eberhard et al., 2020;Deschamps-Berger et al., 2020). The assimilation of such observations would probably yield lower improvements.
Though the performance is lower for reflectance than in our HS experiments, it remains considerable and in line with previous results on point simulations (Charrois et al., 2016), with an average score improvement of 20 %-40 %. This study quite surprisingly suggests that reflectance information can be spread from southern slopes to the northern ones, although in many situations, the snowpack evolves in different ways for these two aspects. For example, in sunny conditions, melt and wet metamorphism will cause a drop in reflectance on southern slopes, while reflectance will not evolve much on northern slopes. Such a phenomenon could explain why low background correlations between southern and northern aspects are exhibited in band 5 (Fig. 4), which is the most sensitive to surface metamorphism through SSA. This example shows that band 5 reflectance observations on southern slopes are not necessarily informative for band 5 reflectance values in the northern aspect per se on every date. On average, however, the positive impact of reflectance observations suggests that they enable the PF to reject the ensemble members with inadequate meteorological forcings (snowfall or cloud cover would lead to wrong reflectance values) or multi-physical parameterisations (influencing e.g. the surface metamorphism), thus correcting the ensemble in the whole domain. These insights are consistent with the study of Winstral et al. (2019), wherein in situ observations are used to correct meteorological forcing parameters across large simulation domains.
Regarding the observations, our study has some methodological limits, however. Observation errors are very roughly prescribed, and the assimilated observations are not corrupted as usually done in synthetic experiments (e.g. Durand and Margulis, 2006). These choices were motivated by the fact that very little is known about the spatial correlation of reflectance observation errors in a semi-distributed setting (e.g. Cluzet et al., 2020a). In a recently submitted paper, the impact of random and systematic errors in reflectance obser-vations on point-scale assimilation experiments is thoroughly investigated (Revuelto et al., 2021). Efforts to better characterise the spatial structure of these observation errors should be conducted in future work.

Towards the assimilation of real observations of reflectance
Reflectance is an appealing variable for snowpack modelling because of its sensitivity to snowpack surface properties (Dozier et al., 2009) and the abundance of moderateto high-resolution spaceborne sensors (MODIS, Sentinel-2-3, VIIRS, Landsat) providing us with a handful of observations to assimilate, contrary to HS. The potential for assimilation of SCF, which is retrieved from reflectances, is clear (Margulis et al., 2016;Aalstad et al., 2018;Alonso-Gónzalez et al., 2020). This study demonstrates the potential of the PF to spread information and assimilate raw reflectances with a positive impact (Sect. 5.2). Yet, assimilating real observations of reflectance is another challenge for two reasons. First, spaceborne reflectance observations are generally noisy and biased (e.g. Cluzet et al., 2020a). Satellite retrievals could be improved in the future (Kokhanovsky et al., 2019;Lamare et al., 2020), and Cluzet et al. (2020a) showed that assimilating ratios of reflectance could be a workaround to tackle this issue. In the near-infrared, the signal-to-noise ratio of reflectance observations might be sufficient to constrain the surface microphysical properties (Durand and Margulis, 2007;Mary et al., 2013), whereas the required accuracy for visible reflectance retrievals to remain informative on snowpack light-absorbing particles content is high (Warren, 2013), and it has yet to be proved whether either approach can achieve this requirement.
Second, in this twin experiment framework, spatial patterns of the synthetic observations are likely compatible with the ensemble since they come from the same modelling system. This may not be the case in reality, therefore making it more difficult to assimilate, and we refer to this issue as model or ensemble realism.
We must assess the strengths and weaknesses of the global and klocal approaches by addressing those two issues. The global algorithm assumes that a global optimum can be found across the whole domain; e.g. the information from the different observations is consistent and can be ingested in one block by the PF. With this strategy, the degeneracy due to the size of the observation vector is efficiently mitigated by the inflation algorithm as discussed in Sect. 5.1. The klocal approach considers only a fraction of the observation information to be relevant to constrain the model state at a given location. This algorithm tries to ingest as much information as possible while rejecting observations coming from snowpack conditions that are too statistically different. As a consequence, because we do not account for the real spatial patterns of observation errors and because we work in a twin experiment setup, a global optimum for the whole domain can exist and can be found by the global algorithm. This might be a reason why it beats the klocal approach (Figs. 6 and 7). In the real world, from the model point of view, there might be contradictory information among the observations that would be difficult to disentangle with a global strategy. The klocal algorithm could be more suited to this situation because it looks for local optima based on the assumption that background correlations are a realistic representation of modelling errors.
These background correlation structures could be overestimated by the ensemble, and tests with real observations are necessary. Strong band 4 correlations (Fig. 4a) might be due to the spatially homogeneous perturbations of LAP fluxes used to force the simulations (see Sect. 2.2.2), a key driver of this variable, and because the same snow model configuration is applied for a given member across the simulation domain. Several studies suggest that LAP fluxes vary with elevation and other topographic parameters (de Magalhães et al., 2019;Sabatier et al., 2020), but to date no reliable model of such processes has been developed for complex terrain. In such a context, assuming uniform LAP forcing seems a reasonable compromise. Strong and almost uniform HS correlations (Fig. 4b) might be caused by the spatial homogeneity of precipitation perturbations and because we do not account for e.g. wind drift, intra-massif heterogeneity of meteorological conditions and gravitational redistribution of snow (Wayand et al., 2018). Despite this semidistributed framework suffering from obvious limitations, the potential for high-resolution snowpack modelling Fiddes et al., 2019;Marsh et al., 2020) is hampered by large errors of NWP models in mountainous areas (e.g. Nousu et al., 2019).
In the future, improving the ability of ensemble correlations to represent modelling errors could make the spreading of information an even more challenging task with the klocal algorithm. But significant potential should remain for information propagation, as suggested by results at larger scales (Magnusson et al., 2014;Cantet et al., 2019). The potential decorrelation of topographic classes would also impact the global algorithm. In an unobserved class, constraining the state of the snowpack with information from areas that are not linked to it would likely degrade the forecasting skill, as suggested by the poor performance of the algorithms at low altitudes (Figs. 5 and 8). In contrast, applying CrocO over larger domains (e.g. distributed simulations or a collection of semi-distributed massifs) would probably see the klocal algorithm outperform the global. The increased domain size would make it less plausible to find a global optimum over the domain, whereas spatial flexibility would be an asset of the klocal algorithm. Finally, in the case of modelled coupling between simulation points (e.g. snow drift), which was not the case here, the spatial discontinuities of the klocal analyses (see Sect. 1) might be a drawback compared to the global approach. Spatial discontinuities may also be revealed as impractical for the interpretation of individual simulation outputs by snow forecasters. The klocal approach is likely to reduce these discontinuities compared to the rlocal because similar locations will be treated with similar analyses (i.e. based on similar sets of observations). This issue could be partly mitigated by e.g. state-block-domain approaches (Farchi and Bocquet, 2018).

Outlook for ensemble modelling and data assimilation
In the snowpack modelling community, ensemble modelling is a powerful tool to represent modelling uncertainties (Vernay et al., 2015;Richter et al., 2020) and for data assimilation (Essery et al., 2013;Lafaysse et al., 2017;Piazzi et al., 2018;Aalstad et al., 2018). This study offers a novel approach to extract valuable information on the snowpack spatial behaviour from spatial correlation patterns of the ensemble. These patterns could be used to diagnose links between locations, transfer information between areas or assess the representativeness of point simulations. More broadly, ensemble background correlations have long been exploited in the NWP and oceanographic communities to refine modelling error representation, which led to significant improvements in the data assimilation systems (Evensen, 2003;Buehner, 2005). Ensembles might open a possibility for the assimilation of point-scale observations or sparse remotely sensed observations into spatialised simulations of the snowpack, as suggested by Winstral et al. (2019) and the present work. For instance, there are numerous snow gauges and snow pit observations at ski resorts in the French Alps. These data could be assimilated to correct the ensemble in spatialised simulations (Winstral et al., 2019). The spatial pattern of assimilated observations in the experiments of Sect. 4 does not correspond to the real-life spatial coverage of these kinds of observations. To give insight into their potential, we also applied our methodology to assimilate only five synthetic HS observations with the global PF in the 1200 to 2400 m flat classes. The results are shown in Fig. 9. The assimilation improves the performance in all aspects and slopes. Naturally, this suf-fers from the same limitation as discussed in Sect. 5.3, not to mention the limited spatial representativeness of in situ observations, but it shows some potential for this idea.
In that way, a more rational use of the available observations could be implemented towards a new ensemble data assimilation system. In the present CrocO system, the SAFRAN reanalysis only assimilates weather station information (precipitation phase, temperature, wind) and makes no use of the numerous snow observations available. Here, snow observations are assimilated by the PF but are not used to correct meteorological forcings (only snow variables; see Fig. 2). In a new ensemble data assimilation system, within CrocO, the SAFRAN meteorological analysis could be bypassed, with the PF directly operating on both the meteorological and snowpack variables through a more comprehensive and coupled strategy.

Conclusions
In this study, we introduced CrocO, a new ensemble data assimilation system able to reduce the errors of a spatialised snowpack model in locations that are not observed. The ensemble is built by a combination of meteorological and multi-physical ensembles to represent modelling uncertainties. A particle filter assimilates observations of HS and reflectance. We developed two variants of the PF using inflation or k localisation in order to spread the information from partial observations of the system, without degeneracy of the PF. In the framework of synthetic experiments, we have shown in particular the following: 1. these variants are able to ingest numerous observations without degeneracy; 2. an efficient spreading of the observational information towards the unobserved areas is achieved with the global and klocal approaches; and 3. reflectance assimilation leads to an overall 20 % improvement in CRPS and 60 % in reliability.
We suggest that this approach could be used in any spatialised framework to assimilate sparse observations from e.g. networks of in situ snowpack observations. Beyond the snowpack modelling community, the inflation and klocalisation strategies could help address the problem of partially observed systems. This work is also a first step towards the operational assimilation of reflectance in a semidistributed context. To reach that goal, biases of reflectance retrievals should be studied and observation error structures duly quantified. Snow cover fraction would be a good companion variable to jointly assimilate with reflectances, requiring the use of an appropriate observation operator. Extending the simulation domain to several massifs would allow the exchange of information between neighbouring massifs with the klocal algorithm.

Appendix A: Stochastic perturbations of the forcings
The stochastic perturbation procedure of the forcings is introduced in Sect. 2.2.2 and is identical to Charrois et al. (2016) for the meteorological parameters and Cluzet et al. (2020a) for the light-absorbing particle (LAP) fluxes. For a given date and forcing variable, perturbation values are the same for all the points in space (no spatial autocorrelation is considered), as SAFRAN semi-distributed massifs have a limited spatial extent (about 1000 km 2 ). Precipitation, incoming radiation, wind speed and air temperature from SAFRAN are perturbed with temporally autocorrelated stochastic parameters. The precipitation, incoming shortwave radiation, and wind speed are perturbed with multiplicative noise. Longwave radiation and air temperature are perturbed with additive noise.
For meteorological variables, the perturbation vector V is built as follows: where φ = e −dt/τ , with dt the forcing time step, τ the decorrelation time (h) and ε a normal law of mean 0 and variance σ 2 (1 − φ 2 ). Parameter values for each variable are described in Table A1. The significantly high autocorrelation time of precipitation, 1500 h, was tuned to roughly adjust the ensemble spread to the observed intra-massif variability of yearly accumulated precipitation. Note that the precipitation phase is adjusted with the perturbed air temperature to ensure physical consistency. Further details on the procedure can be found in Charrois et al. (2016). Regarding LAP fluxes, dry and wet black carbon and mineral dust deposition fluxes from MOCAGE are perturbed with a random factor which is constant throughout the year. Each member has a single multiplicative factor following a log-normal law of mean µ and variance σ (see Table A2). The mean of black carbon random perturbations was adjusted based on comparisons between simulations and field observations at Col du Lautaret, a mountain pass within the considered SAFRAN massif.  Appendix B: Complements on the implementation

B1 Technical implementation and code performance
CrocO is implemented within the Météo-France HPC (highperformance computing) environment, enabling us to fully parallelise the ensemble (one core per member) and bridge the gap with operational applications Morin et al., 2020). This implementation is strongly parallel.
As an example, the execution time of a 1-year assimilation run of 187 model points with 160 members on four nodes of 40 cores each lasts only 2 h. The PF is a lightweight algorithm, and most of the computational burden is due to the propagation of the ensemble and input/output (I/O). Also note that no significant difference in execution time can be noted between the different PF algorithms.

B2 PF sample reordering
As mentioned in Sect. 2.3, a reordering step was implemented after the PF resampling from Kitagawa (1996) for practical reasons.
Indeed, I/O represents a bottleneck in the PF. When building the analysis X a , the background X b is already loaded in memory. Since X a is just a reordering of X b columns based on s, a reordering of s avoids building a copy of X b . This way, X a is built by an online modification of X b using two pointers. Reordering is a growing consideration in the PF community (Farchi and Bocquet, 2018).
Code availability. The Crocus snowpack model (including all physical options of the ESCROC system) and the particle filter algorithm are developed in the framework of the open-source SUR-FEX project. The source files of SURFEX code are provided at https://doi.org/10.5281/zenodo.3774861 (Cluzet et al., 2020b) to guarantee the permanent reproducibility of results. However, we recommend that potential future users and developers access the code from its Git repository (http://git.umr-cnrm.fr/git/Surfex_Git2. git, last access: 15 April 2020) to benefit from all tools of code management (history management, bug fixes, documentation, interface for technical support, etc.). This requires a quick registration, and the procedure is described at https://opensource.cnrm-game-meteo. fr/projects/snowtools/wiki/Procedure_for_new_users (last access: 4 January 2021). The version used in this work is tagged as CrocO_v1.0.
Python software called CrocO_toolbox was specifically developed in order to pre-process, post-process and launch CrocO experiments. It is available on GitHub (https://github.com/bertrandcz/ CrocO, release v1.0 of the master branch, last access: 4 May 2020) along with documentation.
The article version of CrocO_toolbox is archived at https://doi.org/10.5281/zenodo.3784980 (Cluzet, 2020). This software strongly relies on two external Python projects ensuring file management between the different steps of a simulation and the interface with the Météo-France HPC system (including parallelisation and data storage): snowtools and vortex. Their sources are available at https://doi.org/10.5281/zenodo.3774861 (Cluzet et al., 2020b) (same archive as SURFEX) to guarantee the permanent reproducibility of results. However, as for the SURFEX project and for the same reasons, it is recommended to access snowtools code from its Git repository (https://git.umr-cnrm.fr/git/snowtools_git.git, last access: 4 May 2020). The version used in this work is also tagged as CrocO_v1.0. The vortex project gathers all environment-specific codes of Météo-France modelling systems relative to its HPC system. For this project, only the sources specific to this article's simulations are provided. Common object inheritance is based on vortex version 1.6.1. The version used in this work is also tagged as CrocO_v1.0 in the vortex Git repository.
Because these software programmes could not be applied outside the Météo-France HPC environment, CrocO Python software offers the possibility to run CrocO simulations locally. This functionality was not used here due to the high numerical cost of our simulations, which required the use of the Météo-France HPC environment.
Data availability. Input and output data necessary to reproduce the simulations and figures in this paper are provided at https://doi.org/10.5281/zenodo.3775007 (Cluzet et al., 2020c). This archive includes the SAFRAN reanalyses (also available at https://doi.org/10.25326/37, Vernay et al., 2021), MOCAGE forcings, namelists, configuration files and spin-up files necessary to reproduce the simulations. Raw model outputs can be provided on request, but since they can be up to 500+ GB, only post-processed simulation outputs are provided in this archive, along with scores and scripts to reproduce the figures in the paper.
Author contributions. BC wrote the paper. BC, ML and MD designed the study, and BC developed the code with help from ML, LFM and CA. BC, MD, ML and EC designed the PF variants. All authors contributed to results analysis and discussion.
Competing interests. The authors declare that they have no conflict of interest.