A numerical scheme to perform data assimilation of concentration measurements in Lagrangian models is presented, along with its first implementation called Ocean Plastic Assimilator, which aims to improve predictions of the distributions of plastics over the oceans. This scheme uses an ensemble method over a set of particle dispersion simulations. At each step, concentration observations are assimilated across the ensemble members by switching back and forth between Eulerian and Lagrangian representations. We design two experiments to assess the scheme efficacy and efficiency when assimilating simulated data in a simple double-gyre model. Analysis convergence is observed with higher accuracy when lowering observation variance or using a circulation model closer to the real circulation. Results show that the distribution of the mass of plastics in an area can effectively be improved with this simple assimilation scheme. Direct application to a real ocean dispersion model of the Great Pacific Garbage Patch is presented with simulated observations, which gives similarly encouraging results. Thus, this method is considered a suitable candidate for creating a tool to assimilate plastic concentration observations in real-world applications to estimate and forecast plastic distributions in the oceans. Finally, several improvements that could further enhance the method efficiency are identified.

Plastic pollution has become an urgent matter if humans are to preserve the oceans. Previous publications such as

A modeling framework is currently under development at The Ocean Cleanup towards this goal, as the company has set out to clean 90 % of the oceans' floating macroplastics by 2040. It is used to assess and improve our ability to perform the largest cleanup in history.

This framework, the results of which are presented in

While the

Methods to assimilate plastic concentration observations over a Lagrangian dispersion model are in the early development stage

This paper introduces Ocean Plastic Assimilator v0.2, a numerical scheme developed to assimilate plastic concentration data into 2D Lagrangian dispersion models. Section 2 formulates the method, and Sect. 3 then describes its initial implementation and application. For this proof-of-concept paper, we use a dispersion simulation generated with the OpenDrift framework in a controlled environment based on a double-gyre analytical flow field. The assimilation results are presented in Sect. 4. Real-world application perspectives and future developments that could further improve the method are discussed in Sect. 5. Finally, in Sect. 6, we present a direct application of the method to a dispersion model of the actual Great Pacific Garbage Patch, with simulated observations sampled from another simulation.

This section formulates our methodology to perform data assimilation of plastic concentration (or density) observations in any 2D Lagrangian dispersion model using an ensemble Kalman filter (EnKF). It includes the two representations of data (Eulerian and Lagrangian) being used for this process, the transformation between Eulerian and Lagrangian space, the ensemble assimilation method itself, and model ensemble initialization.

The distribution of the mass of plastics in a Lagrangian dispersion model is represented by weighted particles drifting according to a flow field in a 2D domain. Each virtual particle represents a drifting concentration of plastics. In turn, virtual concentration measurements are collected at fixed locations (grid points) within the studied 2D domain, i.e., an Eulerian representation of the plastic mass distribution.
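As an illustration of the Eulerian representation, the following minimal sketch sums particle weights into grid cells with NumPy. The function name `particles_to_grid`, the cell edges, and the toy particles are hypothetical and chosen for the example; they are not taken from the Ocean Plastic Assimilator code.

```python
import numpy as np

def particles_to_grid(x, y, weights, x_edges, y_edges):
    """Sum the weights of all particles falling in each grid cell."""
    mass, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges], weights=weights)
    return mass

# Toy example: three weighted particles in a 2 x 1 domain split into two cells.
x = np.array([0.25, 0.75, 1.5])
y = np.array([0.5, 0.5, 0.5])
w = np.array([1.0, 2.0, 4.0])
x_edges = np.linspace(0.0, 2.0, 3)  # cell edges at x = 0, 1, 2
y_edges = np.linspace(0.0, 1.0, 2)  # a single row of cells in y
grid = particles_to_grid(x, y, w, x_edges, y_edges)
```

Dividing each cell mass by the cell area would turn this gridded mass into a concentration. The total mass is conserved by construction, since every particle inside the domain contributes its full weight to exactly one cell.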

Our method aims to assimilate concentration observations collected in the Eulerian representation and update the Lagrangian representation accordingly. One cycle of this process consists of projecting particle weights on the concentration grid, assimilating observation data into the concentration grid, projecting grid cell concentration updates on particle weights, and finally letting particles drift until the next assimilation time step. This procedure is summarized in Fig.

Schematic depiction of the four steps of our method.

The complete workflow requires the following:

an assimilation method,

a dispersion model along with the flow field used,

projection methods to go back and forth between Eulerian and Lagrangian representations, and

prior estimates for model parameters and uncertainties.

This section presents our procedure for a set of

At each step

Our assimilation step relies on the use of ensemble Kalman filtering, as described in

Standard Kalman filtering allows computing the analysis state using a single equation. In standard Kalman filtering, the forecast state vector

The Kalman gain matrix

Ensemble members are different instances of our simulation with different initializations.
For ensemble member
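For reference, a textbook stochastic ensemble Kalman filter analysis step (with perturbed observations) can be sketched as follows. This is a generic variant written under stated assumptions, not necessarily the exact formulation of this paper: the function name `enkf_analysis`, the linear observation operator `H`, the observation error covariance `R`, and the toy dimensions are all illustrative.

```python
import numpy as np

def enkf_analysis(X_f, y_obs, H, R, rng):
    """Generic stochastic EnKF update (illustrative, not the paper's code).

    X_f   : (n_state, n_members) forecast ensemble
    y_obs : (n_obs,) observation vector
    H     : (n_obs, n_state) linear observation operator (assumed)
    R     : (n_obs, n_obs) observation error covariance (assumed)
    """
    n_members = X_f.shape[1]
    A = X_f - X_f.mean(axis=1, keepdims=True)   # ensemble anomalies
    HA = H @ A                                  # anomalies in observation space
    P_HT = A @ HA.T / (n_members - 1)           # sample estimate of P^f H^T
    S = HA @ HA.T / (n_members - 1) + R         # H P^f H^T + R
    K = np.linalg.solve(S, P_HT.T).T            # Kalman gain (S is symmetric)
    # One perturbed copy of the observations per ensemble member.
    noise = rng.multivariate_normal(np.zeros(len(y_obs)), R, size=n_members).T
    Y = y_obs[:, None] + noise
    return X_f + K @ (Y - H @ X_f)

# Toy example: 4 grid cells, 50 members, one observation of cell 2.
rng = np.random.default_rng(0)
X_f = rng.normal(loc=1.0, scale=0.5, size=(4, 50))
H = np.zeros((1, 4))
H[0, 2] = 1.0
R = np.array([[0.01]])
y_obs = np.array([2.0])
X_a = enkf_analysis(X_f, y_obs, H, R, rng)
```

In this toy case the observation error variance is much smaller than the ensemble spread, so the analysis ensemble at the observed cell is pulled strongly towards the observed value and its spread shrinks.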

Several ways of projecting the density updates (step 3 in Fig.

This heuristic was chosen primarily for its simplicity and its computational efficiency. The multiplicative approach also tends to prevent computing negative weights if the density analysis is lower than the density forecast.
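A multiplicative projection of this kind can be sketched as below, assuming that each particle weight is scaled by the ratio of analysis to forecast density in the cell containing the particle. The function name `update_weights`, the flattened cell indexing, and the handling of empty cells are assumptions made for the example.

```python
import numpy as np

def update_weights(weights, cell_index, rho_forecast, rho_analysis):
    """Scale each particle weight by rho_analysis / rho_forecast of its cell.

    weights      : (n_particles,) particle weights
    cell_index   : (n_particles,) flattened index of the cell of each particle
    rho_forecast : (n_cells,) forecast density per cell
    rho_analysis : (n_cells,) analysis density per cell
    """
    ratio = np.ones_like(rho_forecast, dtype=float)
    nonzero = rho_forecast > 0          # leave the ratio at 1 where no mass was forecast
    ratio[nonzero] = rho_analysis[nonzero] / rho_forecast[nonzero]
    return weights * ratio[cell_index]

# Toy example: two particles in cell 0, one particle in cell 1.
weights = np.array([1.0, 2.0, 4.0])
cell_index = np.array([0, 0, 1])
rho_f = np.bincount(cell_index, weights=weights, minlength=2)  # [3.0, 4.0]
rho_a = np.array([1.5, 6.0])                                   # hypothetical analysis
new_weights = update_weights(weights, cell_index, rho_f, rho_a)
```

With such a ratio, the updated weights remain non-negative whenever the analysis density is non-negative, and re-gridding the updated weights recovers the analysis density in every cell that contains particles.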

Finally, for step 4 in Fig.

As stated by

This section presents the Python implementation of the aforementioned method, called Ocean Plastic Assimilator (v0.2). We then describe the Lagrangian dispersion model (OceanDrift) used to generate double-gyre dispersion simulations and the experiments created with it to observe how our method performs in a controlled environment.

This first implementation is coded in Python (see

Once loaded, the input weights are duplicated in

This implementation leverages arrays, and the fact that we use a single simulation for all ensemble members, to vectorize the computation of

This implementation allows our algorithm to perform each following test case with repeated assimilation of two observation points during 2000 time steps in a

Running the assimilator on a dispersion output and not inside a dispersion model allows it to work on outputs from different models, as long as the data are appropriately formatted. Future implementations could also offer the option of running online (i.e., embedded inside a dispersion model), which could allow more flexibility and possibilities, as discussed in Sect.

In order to create our test cases, we first need a dispersion model and a flow field. We chose the OceanDrift model from the Norwegian Lagrangian trajectory modeling framework OpenDrift (see

Generated particles

This field consists of two gyres that periodically move closer together and then farther apart in an enclosed area. It is a simple field, yet complex enough to stir and disperse particles, and it is regularly used as a standard case to study time-varying flows, for example in
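To make this kind of flow concrete, here is a sketch of one widely used analytical form of a periodically perturbed double gyre on the domain [0, 2] x [0, 1]. The equation and parameter values of this paper are not reproduced here, so the parameter names (`A`, `epsilon`, `omega`) and their default values are assumptions made for illustration only.

```python
import numpy as np

def double_gyre_velocity(x, y, t, A=0.1, epsilon=0.25, omega=2 * np.pi / 10):
    """Velocity (u, v) of a time-periodic double gyre on [0, 2] x [0, 1].

    A       : velocity amplitude (assumed value)
    epsilon : amplitude of the oscillation of the gyre boundary (assumed value)
    omega   : angular frequency of the oscillation (assumed value)
    """
    a = epsilon * np.sin(omega * t)
    b = 1.0 - 2.0 * a
    f = a * x**2 + b * x
    dfdx = 2.0 * a * x + b
    u = -np.pi * A * np.sin(np.pi * f) * np.cos(np.pi * y)
    v = np.pi * A * np.cos(np.pi * f) * np.sin(np.pi * y) * dfdx
    return u, v

# Evaluate the field on a coarse grid at one instant.
xs, ys = np.meshgrid(np.linspace(0.0, 2.0, 5), np.linspace(0.0, 1.0, 3))
u, v = double_gyre_velocity(xs, ys, t=2.5)
```

In this form the velocity normal to the domain boundary vanishes at all times, so particles stay inside the enclosed area while the line separating the two gyres oscillates.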

Parameter

Particles are then generated and advected using the OceanDrift Lagrangian model from the Norwegian trajectory modeling framework OpenDrift

Thus, we can generate different dispersion simulations by changing the initial particle position seed, which changes the distribution of particle trajectories and the initial masses of the particles. We can also change the flow field parameters

In the following section, we modify the flow field parameters and the particle position seeds to create assimilation test cases that use two simulations: a reference and a forecast. We then sample observations from the reference simulation and assimilate them inside the forecast simulation. By doing so, we mimic assimilating real concentration data into an uncertain flow field in the presence of model error.
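Sampling observations from the reference simulation can be sketched as follows, assuming that observations are read from the reference concentration grid at fixed cells and perturbed with independent Gaussian noise. The function name `sample_observations`, the observation cells, and the noise model are assumptions made for the example.

```python
import numpy as np

def sample_observations(reference_grid, obs_cells, obs_std, rng):
    """Read the reference grid at the observed cells and add Gaussian noise.

    reference_grid : (nx, ny) concentration field of the reference simulation
    obs_cells      : list of (i, j) indices of the observed grid cells
    obs_std        : standard deviation of the observation error (assumed Gaussian)
    """
    rows, cols = zip(*obs_cells)
    truth = reference_grid[list(rows), list(cols)]
    return truth + rng.normal(0.0, obs_std, size=truth.shape)

# Toy example: two observation points on a 3 x 4 reference grid.
rng = np.random.default_rng(1)
reference_grid = np.arange(12.0).reshape(3, 4)
obs_cells = [(0, 1), (2, 3)]
obs = sample_observations(reference_grid, obs_cells, obs_std=0.1, rng=rng)
```

Setting `obs_std` to zero returns the reference values exactly, which is a convenient check before studying the effect of larger observation errors.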

In order to assess and quantify the efficacy of the assimilator in different cases, we designed two experiments.

The first one aims to verify that, when the forecast flow field accurately reproduces the reference flow field, our scheme can correct an incorrect initial guess of the total mass. It also checks that the estimate improves as the observation error decreases, as one would generally expect.

The second experiment aims to assess the assimilator's behavior and efficacy when the forecast flow field is slightly different from the reference by changing the double-gyre parameters

In both experiments, we run several test cases to assimilate observations taken from a reference simulation into a forecast simulation using the assimilator. Then, we compute the total plastic mass estimation error and the concentration field root mean square error (RMSE) to assess how close the assimilated forecast gets to the reference situation. This procedure is depicted in Fig.

Schematic depiction of a test case using a reference and a forecast simulation.
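The two metrics used to compare a forecast with the reference can be sketched as follows. The function names and the exact normalization (a signed relative error for the total mass, an unnormalized RMSE over all grid cells) are assumptions made for the example.

```python
import numpy as np

def total_mass_relative_error(forecast_grid, reference_grid):
    """Signed relative error of the forecast total mass."""
    ref_mass = reference_grid.sum()
    return (forecast_grid.sum() - ref_mass) / ref_mass

def concentration_rmse(forecast_grid, reference_grid):
    """Root mean square error between two gridded fields."""
    return float(np.sqrt(np.mean((forecast_grid - reference_grid) ** 2)))

# Toy example on a 2 x 2 grid.
reference = np.array([[1.0, 2.0], [3.0, 4.0]])
forecast = np.array([[1.0, 2.0], [3.0, 2.0]])
mass_error = total_mass_relative_error(forecast, reference)  # (8 - 10) / 10
rmse = concentration_rmse(forecast, reference)               # sqrt(4 / 4)
```

The two metrics are complementary: the total mass error can be small even when mass is misplaced, whereas the RMSE penalizes errors cell by cell.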

In each test case, the Ocean Plastic Assimilator is executed over the course of 2000 time steps. The double-gyre size, which is

Over the double gyre, we define a gridded domain of size

For the

In this first experiment, we want to assess the ability of our newly implemented scheme to estimate the total mass of plastics in the reference simulation correctly.

First, we generate a reference situation using

We initiate five different forecasts with

Evolution of total mass over time for five different forecast simulations with five different initial total masses (Table

Figure

Final total mass (FTM) relative to

Another indicator of the correctness of a simulation can be computed from the concentration field at each step. For one of the forecasts (with

We also compute the concentration field root mean square error (

Overall, this points to an improvement in the forecast concentration field over time, thanks to data assimilation.

Evolution of the error field between the reference concentration field and the forecast concentration field, in percent, for

Finally, in order to assess the method accuracy depending on observation errors, we set

Parameters and metrics for assimilation simulations with different values of

We find that decreasing

In this second experiment, we change the parameters used to generate the currents of the reference simulation double gyre. For example, the impact of a modification of

We initiate the forecast with an erroneous initial total mass of

Flow fields at

The forecast simulation is generated using

We then generate different reference simulations with different values of

We find that data assimilation remains effective and that simulations run with values of

Parameters and metrics for simulations with different values of

Evolution of the total plastic mass in the forecast simulation for five different runs with varying values of double-gyre parameters

This result illustrates that the assimilation method can be robust to unknown model errors.

In this section, we present an application to real-world global dispersion models. As before, we sample observations from one simulation and assimilate them into another in order to mimic the assimilation of observations that could be collected daily by a pair of moorings deployed in the real ocean. The only difference is that we use an estimate of real ocean currents in place of the simplified double gyre defined in Eq. (

We generate two global dispersion simulations with the Lagrangian dispersion model presented by

We initialize plastic particle masses generated in the coastal-seeded model depending on their release year. If

The gridded domain has a resolution of 0.5

Our method is able to predict the total mass of floating plastics with a 17 % error and to divide the concentration field RMSE by 4 (Fig.

The updates to the concentration field are presented in Fig.

Concentration field updates at the end of the assimilation cycle, with the two observation locations in blue. This field is the difference between the forecast concentration field at the end of the year 2012 with assimilation and the same without assimilation.

Further experimentation will be required to assess the benefits of using this method in real-world use cases with real data. However, these results confirm the potential skill of our method, even in the presence of sizable model error.

In this proof-of-concept paper, we worked in a controlled environment to assess the efficacy of the method. Our eventual goal is to apply the method to real data by replacing the observations simulated from the reference situation with real-world observations, and the previous results can help in understanding what might happen when assimilating real-world data. The fact that replacing the analytic circulation field by a real-world one (in Sect.

In Fig.

Conveniently, we observed that the forecast total mass gets higher when the dispersion model is more accurate, thus acting, in a way, like a score. As a result, we might discriminate between dispersion models based on this method's output by selecting the ones that output the highest total mass.

Amongst the potential applications of the presented method, one might highlight the evaluation and design of real observational strategies. Here we considered one hypothetical, albeit plausible, scenario which might represent the deployment of a few relatively accurate moorings. In future studies it would be interesting to investigate how data coverage in space and time may affect forecast skill in more detail, for example, or use this data assimilation system as a benchmark for proposed field campaigns. Several directions to further develop the method and make it more accurate also seem worth considering, as outlined below.

Throughout the last 2 decades, the ensemble Kalman filter has been extensively developed and improved, with numerous variants published in the scientific literature. Using different ensemble sampling strategies or a square root algorithm was described as a way to improve accuracy in

The method presented here uses the same dispersion simulation as a base for the trajectories of the particles for all ensemble members. In all members, the particle positions through time are the same; the only variables that differ are the particle masses. This approach greatly reduces the storage cost and increases computation speed.
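The benefit of sharing trajectories can be illustrated with a short sketch: the cell index of each particle is computed once and reused for every ensemble member, and only the weight array carries a member dimension. The function name `ensemble_to_grid` and the array layout are assumptions made for the example, not the layout of the actual implementation.

```python
import numpy as np

def ensemble_to_grid(weights, cell_index, n_cells):
    """Grid every ensemble member using one shared set of particle positions.

    weights    : (n_members, n_particles) particle weights, one row per member
    cell_index : (n_particles,) flattened cell index of each particle (shared)
    n_cells    : total number of grid cells
    """
    return np.stack([
        np.bincount(cell_index, weights=w, minlength=n_cells) for w in weights
    ])

# Toy example: 2 members, 3 particles, 3 cells; positions are shared.
cell_index = np.array([0, 0, 1])
weights = np.array([[1.0, 2.0, 4.0],
                    [2.0, 2.0, 2.0]])
ensemble_grid = ensemble_to_grid(weights, cell_index, n_cells=3)
```

Only one set of positions needs to be stored, so memory grows with the number of members only through the weight array.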

However, it significantly lowers the diversity of the ensemble, so in future work one might want to decouple the ensemble member trajectories, i.e., have a unique set of trajectories for each member. We anticipate that extending the method to use an ensemble with diverse particle simulations should help the forecast converge towards a concentration field closer to the reference one. We regard this possibility as a leading candidate to make the method even more accurate.

In Sect.

Another alternative would be to generate new particles so that their weights sum up to the updated density, possibly fewer or more particles. This could be more technically challenging to implement and requires implementing the assimilation scheme directly inside the dispersion model loop. However, it could also have the advantage of conveniently increasing resolution where there are high concentrations of plastics.

This paper presents a simple yet effective method to assimilate observations of plastic concentration into a Lagrangian dispersion model and its first implementation, called the Ocean Plastic Assimilator (v0.2). We apply it in a controlled environment to assess its efficacy. We study the impact of observation errors on the prediction accuracy and change some of the dispersion parameters (

Thus, the Ocean Plastic Assimilator will be further developed at The Ocean Cleanup to assimilate plastic concentration data from the oceans and improve our cleanup operations in oceanic gyres. This method will undergo more research to develop its features and assess its efficacy when using real-world observations. We expect it to be used to assess the cleanup operations of The Ocean Cleanup in real time.

The simplicity of the developed data assimilation method means that it should be easy to generalize to various other popular open-source Lagrangian frameworks such as OceanParcels

The current version of Ocean Plastic Assimilator is available from the GitHub repository:

BSR conceived and presented to AP the idea of applying data assimilation to a dispersal model. AP studied the data assimilation literature and suggested using an ensemble Kalman filter. AP wrote and maintained the code and applied it initially to real oceanic data. JMC introduced AP to GF, and GF with BSR recommended applying the method to an analytical flow field to assess its performance. AP structured the paper, wrote the initial draft and the next versions, and prepared figures. GF and AP met every 2 weeks or so to discuss the paper as AP was writing it. GF provided advice on tweaking the method and improving the paper. BSR and JMC were sometimes also present to provide advice during these meetings.

The authors declare that they have no conflict of interest.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Axel Peytavin and Bruno Sainte-Rose would like to thank The Ocean Cleanup and all its funders for supporting them. The authors acknowledge the reviewers for their careful reading of our paper and their comments.

This research has been supported by The Ocean Cleanup, NASA-IDS (award no. 80NSSC20K0796), NASA-PO (award no. 80NSSC17K0561), and the Simons Foundation (award no. 549931).

This paper was edited by Volker Grewe and reviewed by Erik van Sebille and one anonymous referee.