WAP-1D-VAR v1.0: development and evaluation of a one-dimensional variational data assimilation model for the marine ecosystem along the West Antarctic Peninsula

Kim, Hyewon Heather; Luo, Ya-Wei; Ducklow, Hugh W.; Schofield, Oscar M.; Steinberg, Deborah K.; Doney, Scott C.

doi:https://doi.org/10.5194/gmd-14-4939-2021

Articles | Volume 14, issue 8

© Author(s) 2021. This work is distributed under
the Creative Commons Attribution 4.0 License.

Development and technical paper | 12 Aug 2021

Download

Final revised paper (published on 12 Aug 2021)
Supplement to the final revised paper
Preprint (discussion started on 04 Feb 2021)
Supplement to the preprint

Interactive discussion

Status: closed

RC1: 'manuscript review', Anonymous Referee #1, 17 Mar 2021

The authors present a parameter estimation and sensitivity study, applying a variational optimization technique to a 1-dimensional ecosystem model for a station off the coast of Antarctica. The manuscript is concise (sometimes too much so) and presents a sophisticated optimization framework with an interesting application. It could be improved by showing if and to what extent the results generalize to different initial parameter values.

general comments

The manuscript is generally well written but in places it overestimates the project-specific knowledge of the reader. Some implementation details are hinted at, but not described fully or until later in the manuscript. I have pointed out several of those instances in the specific comments; one example is the "removal" of parameters from the optimization, which is a key aspect of this study but not described in much detail. According to the description, an optimized parameter with sigma_f larger than 50% is updated but then removed from the optimization procedure, but according to Table 2 most of the parameters are never updated and somehow removed before the update. It is unclear how this happens.

It is nice to see a large Monte Carlo experiment with 1000 simulations to examine the model solution in the parameter space close to the optimized parameter values. However, I would argue that a more important aspect of an uncertainty analysis was left out of the manuscript. How sensitive are the results presented here to the initial choice of the parameters? If starting in another place in parameter space, does the algorithm end up in a different local minimum? Is the same subset of 14 parameters selected consistently? Do the results of the Hessian matrix analysis differ significantly in a different local minimum? Is an 80% reduction in the model-observation misfit typical? These are questions that may be more important to a modeler who would like to perform a similar parameter estimation experiment. I would suggest that the authors perform a second (but perhaps smaller) Monte Carlo experiment and assess the sensitivity of their results to the initial parameter values. This would not need to include every aspect of their original analysis and the results for their "reference" experiment could stay and be augmented with the new results.
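As an illustration of the kind of repeat experiment suggested above, here is a minimal multi-start sketch. It is not the WAP-1D-VAR adjoint scheme: the one-parameter `cost` function, the step size, and the sampling range are hypothetical and chosen only so that two local minima exist. The point is that the final parameter value and the fractional misfit reduction both depend on where the optimization starts.

```python
import random

def cost(p):
    # Hypothetical toy misfit with two local minima (near p = -1 and p = 2).
    # It stands in for a model-observation cost function; it is not the
    # cost function used in the manuscript.
    return (p + 1.0) ** 2 * (p - 2.0) ** 2 + 0.5 * p + 1.0

def grad(p, h=1e-6):
    # Central finite-difference gradient (an adjoint model would supply this).
    return (cost(p + h) - cost(p - h)) / (2.0 * h)

def descend(p0, step=0.01, n_iter=5000):
    # Plain gradient descent from the initial parameter value p0.
    p = p0
    for _ in range(n_iter):
        p -= step * grad(p)
    return p

def multi_start(n_starts=16, lo=-3.0, hi=4.0, seed=0):
    # Repeat the optimization from randomly drawn initial values and record
    # (initial value, optimized value, fractional reduction in the cost).
    rng = random.Random(seed)
    results = []
    for _ in range(n_starts):
        p0 = rng.uniform(lo, hi)
        pf = descend(p0)
        results.append((p0, pf, 1.0 - cost(pf) / cost(p0)))
    return results

if __name__ == "__main__":
    for p0, pf, reduction in multi_start():
        print(f"start {p0:6.2f} -> optimum {pf:6.3f}, misfit reduction {reduction:5.1%}")
```

In this toy case every run converges, but to one of two different minima with different final cost values, which is the situation the questions above are asking about.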

When comparing the results of the optimized model with the initial one (Fig. 5 vs Fig B3), the most striking difference to me was the presence of much higher frequency variation in the optimized solution. What is causing this variation and is it realistic? If so, were the initial parameters based on particularly bad guesses, since they do not show the variation? Where possible, it would be useful to overlay these plots with the data used in the optimization procedure. Also, I think lots of readers may be interested in the before-and-after optimization comparison and I would suggest to add Fig. B3 into the manuscript body for comparison, and perhaps even show corresponding panels side-by-side.

Along the same lines, why is a comparison to the initial solution avoided in so many places, for example why not include its results in the Taylor diagram in Figure 4? But maybe I am misinterpreting the initial solution a little: A reader may assume that the initial parameter values were the best values one could come up with, without going into a rigorous parameter optimization exercise such as the one presented in this manuscript. But here it appears more like the initial parameter values are based on a first (informed) guess without yet checking the effect of the parameter values on the model state. Is there any indication how "good" the initial values are, are the initial state estimates somewhat reasonable?

specific comments

l 18: "Here we developed a one-dimensional, data assimilation planktonic ecosystem model [...] the pre-existing food-web and biogeochemical components of the WAP-1D-VAR model": It is not clear here if the model was developed from scratch, as the first sentence states, or that parts of the model existed before and were used here.

l 24: What are "intercompartmental flows"? Later in the abstract, "model state variables" is used and it would be good to use one expression consistently.

l 27: "... and comparable values of the assimilated and non-assimilated model state variables and flows to other studies": This is a run-on sentence and difficult to understand, I would suggest to break it up and rephrase the second part.

l 188: At this point it is not clear what the water depth at the model site is, how deep the model extends and how it reacts to a growing (deepening) ice cover or if this is even a concern at the site. If the authors want to keep this implementation aspect general, it would be good to make this explicit and mention under which circumstances the modeling framework can be used.

l 232: "using the new set of modified parameters": After reading a bit ahead, it is unclear if these modified parameters refer to modified parameter values or new parameters, as some have been removed from the optimization. In general, I would suggest to use the term "parameter value" instead of "parameter" whenever the text refers to the values rather than the parameter itself.

l 240: The removal should be mentioned and explained in the description above.

l 262: "for growth seasons' relatively complete data coverage for modelling purposes": I am not sure what this means, please rephrase. Listing the years without any further information is not very helpful to the reader, does it imply that a new CV is estimated for every one of those time periods?

l 289: "optimized model simulation": Does this imply parameter estimation?

l 291: It would be beneficial to the reader to rephrase "2002-2003, 2003-2004, 2004-2005, 2005-2006, 2006-2007, 2008-2009, 2009-2010, 2010-2011, and 2011-2012 period" to something like "9 growth periods between 2002 and 2012 (the 2007-2008 growth period is missing because of X)"

l 293: "averaging across all these 9 years did not reflect distinct seasonal phytoplankton peaks": What is meant by "reflect" here? Am I understanding this correctly, that phytoplankton blooms occurred at different times in the years that were examined, and so the "simple" climatology did not show any bloom? Then a more complex climatology was constructed using the time shifts to align all blooms. Was the model then optimized using parameter estimation? Were the optimized parameters used to inform boundary conditions or was the climatology used for that? More information would be useful here.

l 308: "but given cryptophytes being the second dominant species in the water samples they are considered to represent non-diatom species": I assume I know what is meant here but it would be good to clarify this point a bit more: The dataset contains data for diatoms and "non-diatoms", and here in this modeling study cryptophytes are assumed to represent all non-diatoms?

l 310: "POC(N)": I presume this is meant to mean "POC and PON" and not "POC and N" in this context. Here the "()" are used a little differently than they would normally be used and I fear it could be confusing to some readers. I would suggest to use "POC (PON)" or make the sentences a bit longer use "POC and PON" and avoid the "()".

l 349: "Some of these model biases cases were evidently shown on a point-to-point basis": It is not clear what this means, please rephrase.

l 353: "The data types with relatively high correlation coefficients tended to have relatively low centred RMSD and vice versa.": In Figure 4 it looks like "BP" with a "relatively high" correlation coefficient (according to the definition from the preceding sentence) has maybe the highest centered RMSD value.
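The relationship questioned here can be checked directly from the statistics that a Taylor diagram displays. The sketch below is a generic illustration, not code from the manuscript; the function name `taylor_stats` and the sample numbers are made up. It uses the standard identity E'^2 = sigma_m^2 + sigma_o^2 - 2 sigma_m sigma_o R, which shows that a high correlation R only yields a low centred RMSD E' when the model and observed standard deviations are also similar.

```python
import math

def taylor_stats(model, obs):
    # Returns (correlation, model std, observed std, centred RMSD),
    # using population (divide-by-n) statistics throughout.
    n = len(obs)
    m_mean = sum(model) / n
    o_mean = sum(obs) / n
    m_anom = [m - m_mean for m in model]
    o_anom = [o - o_mean for o in obs]
    sd_m = math.sqrt(sum(a * a for a in m_anom) / n)
    sd_o = math.sqrt(sum(a * a for a in o_anom) / n)
    r = sum(a * b for a, b in zip(m_anom, o_anom)) / (n * sd_m * sd_o)
    crmsd = math.sqrt(sum((a - b) ** 2 for a, b in zip(m_anom, o_anom)) / n)
    return r, sd_m, sd_o, crmsd

if __name__ == "__main__":
    obs = [1.0, 2.0, 3.0, 4.0, 5.0]          # made-up observations
    too_variable = [2.0 * x for x in obs]    # perfectly correlated, doubled amplitude
    noisy = [1.5, 1.5, 3.5, 3.5, 5.0]        # lower correlation, similar amplitude
    for name, model in [("too_variable", too_variable), ("noisy", noisy)]:
        r, sd_m, sd_o, crmsd = taylor_stats(model, obs)
        print(f"{name}: R = {r:.3f}, centred RMSD = {crmsd:.3f}")
```

Here the perfectly correlated series has the larger centred RMSD because its amplitude is wrong, which is consistent with a data type such as BP combining a relatively high correlation with a high centred RMSD.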

l 354: "the model fitted average observations slightly better": It would be nice to add different colors for those in Figure 4. Otherwise, the reader may have to go back to a few sections to review which observations were averaged.

l 359: "Among the total of 72 optimizable model parameters, subsets of 14 [...] parameters changed": It is still not entirely clear to me how this is done in the algorithm, are the remaining parameters dropped from the optimization before the first iteration? Also, "subsets" appears to imply multiple experiments with different results, yet Table 2 only shows one subset, i.e. 14 parameters with a value for p_f.

l 369: "The optimized model results at each model time step and grid were associated with generally small errors derived from the Monte Carlo experiments (Figure B2).": This sentence is difficult to understand. Is it meant to say that the ensemble of state estimates obtained from the Monte Carlo experiment, which was conducted following the optimization, has a low standard deviation? As an aside, I would suggest not mixing the terms error and uncertainty here: what is shown is the uncertainty in the state estimate, not an error that needs to be corrected.

Figure 1: In the figure description: "oragnic"

Citation: https://doi.org/10.5194/gmd-2020-375-RC1

RC2: 'Comment on gmd-2020-375', Anonymous Referee #2, 17 Mar 2021

General comments

This work develops a one-dimensional data assimilation model for the West Antarctic Peninsula and compares the model output with data from the Palmer Long-Term Ecological Research site. Overall, this paper provides a useful model to aid in data assimilation techniques and modeling efforts for the WAP. One general comment is that it would be beneficial to describe how well the optimized and updated parameters compare to measured parameter values from the field, lab, or both. That is, are the optimized and updated parameters realistic or is there no baseline?

Specific and technical comments are described below as further suggestions to improve the manuscript.

Specific comments

Lines 72-73--Can you please provide support for the appropriateness of including just two phytoplankton groups (diatoms and cryptophytes) in this region?

Lines 130-135—The authors mention microzooplankton had a limit on the amount of diatoms they grazed and instead grazed cryptophytes to be able to simulate elevated diatom Chl. Can you comment on the ecological appropriateness of these grazing dynamics? That is, is there evidence to support that this prey switching occurs? What other mechanisms might lead to elevated diatom chlorophyll beyond microzooplankton changing their food preferences?

Lines 159-160 state that microzooplankton growth is based on grazing on cryptophytes and bacteria, while krill growth is based on grazing on diatoms and microzooplankton. However, in lines 130-135, the authors mention there is a limited amount of microzooplankton grazing on diatoms. Please reconcile this information. Does it mean the limited grazing mentioned in lines 130-135 is no grazing on diatoms by microzooplankton? That would contradict lines 74-75. Line 317 also mentions microzooplankton only grazing on bacteria and cryptophytes.

Relatedly, in Fig. 1, there are two grazing arrows going from diatoms to microzooplankton and no grazing arrows for cryptophytes. Please reconcile this figure with information in the text.

Line 224 and Table 2 – Consider indicating the literature sources from which the model parameters were taken.

Fig. B1 – can you please clarify what you mean by “Errors represent how much larger model output is compared to observations”?

Fig. 5—Can you please clarify what is leading to the oscillating patterns seen in the model state variables such as diatoms and cryptophytes?

Technical corrections

Line 153 – should the sentence read “…by remineralizing NH4 and PO4 if C is in short” rather than “if C in short”?

Line 428—should this read “There could be several additional…” instead of “There would be several additional…”?

Lines 467-468, 473—the citation formatting changed. Please fix it to be consistent

Fig. 3, 5, 6 – consider putting the year on the x-axis

Fig. 5 – the caption should be updated with the correct figure (presumably not “Figure SX”)

Fig. 7 – The arrows in this figure are presumably for the same processes in Fig. 1. Consider labeling them and/or referencing the reader to the arrows in Fig. 1, assuming that is appropriate.

Citation: https://doi.org/10.5194/gmd-2020-375-RC2

AC1: 'Final Author Comments for gmd-2020-375', Hyewon Kim, 28 Apr 2021

The comment was uploaded in the form of a supplement: https://gmd.copernicus.org/preprints/gmd-2020-375/gmd-2020-375-AC1-supplement.pdf

Citation: https://doi.org/10.5194/gmd-2020-375-AC1

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

AR by Heather Hyewon Kim on behalf of the Authors (28 Apr 2021) Author's response Manuscript

ED: Referee Nomination & Report Request started (10 May 2021) by Heiko Goelzer

RR by Anonymous Referee #1 (21 May 2021)

Suggestions for revision or reasons for rejection

The authors followed the suggestion of including multiple parameter estimation experiments (started with different parameter values). I welcome this addition, yet I would like to see more results included from these new experiments.

# general comments

It is great to see that the authors repeated the parameter estimation experiment with different initial values. However, the results could be presented better and should probably be given more space. First off, the repeat experiments could already be mentioned and motivated in Section 2.4. Currently they are only presented in a relatively short passage in Section 4.1. Here, some statements create more questions than answers: "this was achieved by similar subsets of the optimized parameters" (line 397). Which subset of parameters remained the same, and which appeared in only a few experiments?
Apparently, the different experiments had similar results according to this passage "[...] and there was up to 76% of the reduction in the model-observational misfit (vs. 58% of the reduction in the reference case; Table B1) These results suggest that no matter where in parameter space the optimization started from, the adjoint/optimization scheme took the model cost function to similar local minima. (line 395)" Yet, a similar reduction in the cost function value does not imply similar parameter values.
In the current version, Table 1 paints a nice picture with uncertainty intervals for some parameters that are locally derived from the Hessian matrix. Having multiple cost function minima with different uncertainty intervals (and different parameters that are optimized/constrained) may distort that picture a bit and the presented uncertainty intervals may just be representative for 1 out of 16 experiments. Because this information is not available to the reader, it is unclear how generalizable the results in Table 1 are with respect to the other 15 experiments. A few more details/results to clarify, and maybe some discussion would be very helpful here.
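The remark that a similar reduction in the cost function does not imply similar parameter values can be illustrated with a deliberately simple, hypothetical example. The sketch below is not the WAP-1D-VAR model: the `biomass` function, the parameter names `mu` (growth rate) and `g` (grazing rate), and all numbers are assumptions made for illustration. Because only the difference `mu - g` affects the output, two quite different parameter pairs give the same misfit reduction.

```python
import math

def biomass(mu, g, t, p0=1.0):
    # Hypothetical toy model: exponential change at net rate (mu - g).
    return p0 * math.exp((mu - g) * t)

def cost(mu, g, times, obs):
    # Mean squared model-observation misfit.
    return sum((biomass(mu, g, t) - y) ** 2 for t, y in zip(times, obs)) / len(obs)

# Synthetic "observations" generated with mu = 0.8, g = 0.5 (net rate 0.3).
times = [0.0, 1.0, 2.0, 3.0, 4.0]
obs = [biomass(0.8, 0.5, t) for t in times]

def misfit_reduction(prior, posterior):
    # Fractional reduction in cost from a prior to a posterior parameter pair.
    j_prior = cost(prior[0], prior[1], times, obs)
    j_post = cost(posterior[0], posterior[1], times, obs)
    return 1.0 - j_post / j_prior

if __name__ == "__main__":
    prior = (0.5, 0.5)  # first guess with zero net growth
    for posterior in [(0.8, 0.5), (1.2, 0.9)]:
        print(posterior, f"misfit reduction = {misfit_reduction(prior, posterior):.1%}")
```

Both posterior pairs remove essentially all of the misfit, yet `mu` differs by 50% between them. In such a case, uncertainty intervals derived locally at one minimum would say little about the others, which is why more detail on the other 15 experiments would help.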

In the updated manuscript, it is now much clearer how parameters are "removed" from the optimization. I am a bit surprised that removed parameters are abruptly reset to their initial values. Given the strong correlations that may be present between the parameters, a sudden reset of parameter values after several iterations could lead to "shocks" in other parameters. Or is the estimation restarted with all parameters starting from their initial values and fewer included in the optimization?

Section 4.3 is currently divided into two paragraphs. One is listing the changes brought to ecosystem indices by the optimization, without indicating if these changes are an improvement. The second paragraph then compares the model output to data and results from other studies and only briefly touches on the changes brought by assimilation. Here, it would be really useful to mix the two paragraphs and report the changes with reference to the data that is available.

# specific comments

l 29 "we discuss fully potential underlying reasons": Sounds a bit convoluted, maybe delete the "fully", also because it is difficult to claim an exhaustive discussion.

l 53: "its strength": I would suggest that there are multiple, changing it to "its strengths".

l 75: I think the first "dominated" in this sentence refers to large phytoplankton only, and the second one to smaller ones but I still think that two "dominated" are a bit confusing here.

l 142: "In principle, optimization should be able to capture the elevated diatom Chl by adjusting free parameters unless: 1) the right parameters are not adjusted and/or the baseline (non-optimized) parameters need significant adjusting, and/or 2) the model equations are not adequate even with the optimized parameters." What if the nutrient initial values are too low, would errors in the state estimates be a third option?

l 145: I am not sure if there has been much evidence for it in the WAP region but could their thick shells be a reason for less preferential grazing on diatoms?

l 250: "or estimated using a subset of the observations, without examining the effects of the initial parameter values on the model results prior to optimization": It's not clear how this should work. A subset of observations is used but the effect of the parameter values is never examined? What is the subset of observations used for then?

l 258: "with one parameter per each state variable, the change of which yields the largest decrease in the total cost function": How was this determined? Was the one parameter per state variable put in place first or did it turn out that the parameters yielding the largest effect were one for each state variable?

l 271: "If parameters are optimized to ecologically unrealistic values, they are kept back to the initial parameter value": Even if they have undergone some changes in the previous steps, they are reset to the initial values? If so, could this have an effect on the other parameters which may be correlated?

Eq 6: It would be good to clearly state the difference between the mean that is used in the computation of CV and the climatological mean that is multiplied with it.

l 300: "J equivalent to J/M hereafter": Why not introduce it immediately?

l 317: What about increased wind-driven turbulence as the ice disappears, is this a concern?

l 319: "Also, because our model simulates only the spring-summer growth season, winter sea-ice growth is less of a concern.": Use a different term for sea-ice growth or change first instance of growth to something like "phytoplankton growth", so that phytoplankton growth won't be confused with sea-ice growth in this sentence.

l 331: I know that I had a question about this in my last review and I still think it should be explained better or just made more explicit. "Initial conditions are prepared by first optimizing the full growth seasonal cycle forced by climatological physics and assimilated with climatological observations and with the same bottom boundary conditions used in the optimization of the 2002-2003 growth season"
I think it should be pointed out here what kind of optimization is performed. Talking about the initial conditions, one could assume that optimization implies state estimation here, i.e. adjusting the initial conditions directly. However, based on the comments to my question, it appears that parameters were estimated for a climatological simulation which was then used to create the initial conditions. But where do the initial conditions for that climatological simulation come from? I think my problem is that I still don't understand what exactly "first optimizing the full growth seasonal cycle" really means.

l 390: "presented in the manuscript": Change to "presented above".

l 471: Is the decreased or increased (for NCP and POC) correlation realistic?

Fig. 4: Use the same coordinate system in both plots. Preferably, combine both plots into one, with different symbols for prior and posterior solution and different colors for the different observation types.

Fig. 6A/B: Join into the same figure, just like Fig 5B.

ED: Publish subject to minor revisions (review by editor) (02 Jun 2021) by Heiko Goelzer

Dear authors,

there are still some revisions required before the manuscript can be accepted for publication. Please address the reviewer comments in your next version.

Kind regards

Heiko Goelzer

AR by Heather Hyewon Kim on behalf of the Authors (09 Jun 2021) Author's response Manuscript

ED: Publish subject to minor revisions (review by editor) (10 Jun 2021) by Heiko Goelzer

Dear authors,

thank you very much for updating your manuscript according to the review comments.

Looking through your responses to the questions and the manuscript, I didn't see any changes in relation to the two statements below. It seems to me that in both cases clarifications should be added to the text to avoid confusion of the reader on those points that the reviewer remarked.

###
l 250: "or estimated using a subset of the observations, without examining the effects of the initial parameter values on the model results prior to optimization": It's not clear how this should work.
A subset of observations is used but the effect of the parameter values is never examined? What is the subset of observations used for then?
Apologies for the confusion, we followed the literature values listed instead of using a subset of the observations.
###

If you were not "using a subset of the observations" maybe this part of the sentence should be removed?

###
l 258: "with one parameter per each state variable, the change of which yields the largest decrease in the total cost function": How was this determined? Was the one parameter per state variable put in place first or did it turn out that the parameters yielding the largest effect, were one for each state variable?
It was the latter case. The parameters yielding the largest change in cost function were the ones we selected for the initial parameter subset, which also happened to be usually one per each state variable.
###

Could you add this explanation also to the text?

For the figures, I want to follow up on the points below.

###
Fig. 4: Use the same coordinate system in both plots. Preferably, combine both plots into one, with different symbols for prior and posterior solution and different colors for the different observation types.
Please note that we had tried the way you suggested but as a result a significant number of data points overlapped with each other and compromised legibility.
###

I understand and accept your reason for keeping two separate figures. But I agree with the reviewer that the coordinate system should be identical in both panels (a and b) to make comparison possible.

###
Fig. 6A/B: Join into the same figure, just like Fig 5B.
Thanks for your suggestion, but we intend to show A for “initial/unoptimized results” and B for “optimized results” for consistency.
###

I agree with the reviewer that direct comparison between “initial/unoptimized results” and “optimized results” would be largely facilitated in a left/right comparison. For consistency, you could do the same in Figure 5, that would then span over two pages. I leave the decision on panel arrangement to your discretion, but strongly suggest to use matching scaling/colour ranges for corresponding panels in either case.

Kind regards

Heiko Goelzer

AR by Heather Hyewon Kim on behalf of the Authors (14 Jun 2021) Author's response Manuscript

ED: Publish as is (21 Jun 2021) by Heiko Goelzer

AR by Heather Hyewon Kim on behalf of the Authors (29 Jun 2021) Manuscript

Post-review adjustments

AA: Author's adjustment | EA: Editor approval

AA by Heather Hyewon Kim on behalf of the Authors (03 Aug 2021) Author's adjustment Manuscript

EA: Adjustments approved (09 Aug 2021) by Heiko Goelzer

Short summary

The West Antarctic Peninsula (WAP) is a rapidly warming region, as revealed by multi-decadal observations. Despite the region being data rich, there is a lack of focus on ecosystem model development. Here, we introduce a data assimilation ecosystem model for the WAP region. Experiments assimilating data from an example growth season capture key WAP features. This study enables us to glue the snapshots from available data sets together to explain the observations in the WAP.