Advancing crop modeling and data assimilation using AquaCrop v7.2 in NASA's Land Information System Framework v7.5

De Lannoy, Gabriëlle J. M.; Busschaert, Louise; Bechtold, Michel; Lanfranco, Niccolò; de Roos, Shannon; Heyvaert, Zdenko; Bielinis, Martynas; Mortelmans, Jonas; Scherrer, Samuel A.; Van den Bossche, Maxime; Kumar, Sujay; Mocko, David M.; Kemp, Eric; Heng, Lee; Steduto, Pasquale; Raes, Dirk

doi:10.5194/gmd-19-2551-2026

Articles | Volume 19, issue 6

https://doi.org/10.5194/gmd-19-2551-2026


Model description paper

31 Mar 2026



Abstract

This paper introduces the open-source AquaCrop v7.2 model as a new process-based crop model within NASA's Land Information System Framework (LISF) v7.5. The LISF enables high-performance crop modeling with efficient geospatial data handling, and paves the way for scalable satellite data assimilation into AquaCrop. Through three exploratory showcases, we demonstrate the current capabilities of AquaCrop in the LISF, along with topics for future development. First, coarse-scale (0.1°) generic crop growth simulations with various crop parameterizations are performed over Europe. Satellite-based estimates of land surface phenology are used to inform spatially variable crop parameters. These parameters improve canopy cover simulations in growing degree days compared to using uniform crop parameters in calendar days. Second, ensembles of coarse-scale simulations over Europe are created by perturbing meteorological forcings and soil moisture. The resulting uncertainties in root-zone soil moisture and biomass are often greater in water-limited regions than elsewhere. The third showcase aims to improve fine-scale ($1/112°$) winter wheat simulations through satellite data assimilation. Fine-scale canopy cover observations are assimilated with an ensemble Kalman filter to update the crop state in the Piedmont region of Italy. The state updating is beneficial for the intermediary biomass estimates, but leads to only small improvements in yield estimates relative to reference data. The spatiotemporal variability of simulated yield is poor due to strong model (parameter) constraints, and the assimilated satellite-based canopy cover observations are not sufficiently informative of yield. Furthermore, the exact fields of the reference yield data are unknown and thus hard to compare to simulations. The showcases highlight pathways to advance future crop estimates, e.g. through crop parameter updating and multi-sensor and multi-variate data assimilation.


How to cite.

De Lannoy, G. J. M., Busschaert, L., Bechtold, M., Lanfranco, N., de Roos, S., Heyvaert, Z., Bielinis, M., Mortelmans, J., Scherrer, S. A., Van den Bossche, M., Kumar, S., Mocko, D. M., Kemp, E., Heng, L., Steduto, P., and Raes, D.: Advancing crop modeling and data assimilation using AquaCrop v7.2 in NASA's Land Information System Framework v7.5 , Geosci. Model Dev., 19, 2551–2575, https://doi.org/10.5194/gmd-19-2551-2026, 2026.

Received: 13 Sep 2025 – Discussion started: 06 Oct 2025 – Revised: 25 Feb 2026 – Accepted: 12 Mar 2026 – Published: 31 Mar 2026

1 Introduction

Food production, agricultural land and water management, and their interaction with socio-economic demands are facing increasing challenges in our changing world. Tapping into advanced computational models and an abundance of satellite data on crop-water systems, we can enhance our ability to understand, monitor, and manage these challenges from the level of single fields all the way to the global scale.

Crop models allow us to dynamically simulate crop phenology and derive yields and irrigation needs, given meteorological and soil input along with crop information. These models can be grouped into (i) biophysical mechanistic or process-based crop models, and (ii) empirical or statistical crop yield models, including machine learning models. Although the latter data-driven models are rapidly gaining attention and precision (van Klompenburg et al., 2020; Paudel et al., 2022; Gaso et al., 2024), the former remain indispensable as a reference to understand interactions between crop phenology and the environment, and to estimate poorly observed variables such as root-zone soil moisture. Most process-based crop models were originally developed for field (< 1 ha) applications, and are now increasingly used for regional to global applications. This is stimulated by the growing availability of satellite-based data to serve as input, to calibrate parameters, or to update the model during simulation via data assimilation (DA) for larger regions. Advancing large-scale satellite-based crop DA requires computationally efficient and scalable frameworks. This includes flexible ways (i) to drive crop models with various types of parameters and meteorological forcings, (ii) to produce uncertainty estimates via ensemble simulations, and (iii) to interface models with various types of satellite data using a range of DA techniques. The NASA Land Information System Framework (LISF; Kumar et al., 2006, 2008) offers such a framework, but so far it has been limited to hosting land surface and hydrology models. Some of these models allow simulations of crop growth and agricultural practices, such as irrigation (Busschaert et al., 2025b). However, land surface models are designed to ultimately estimate water, energy and carbon interactions between the land and the atmosphere, and thus typically serve, e.g., to initialize weather forecasts, whereas hydrology models aim at simulating the water budget, floods and droughts. By contrast, crop models aim at simulating crop yield, and the yield response to water availability, with more land management options, a more detailed description of crop phenology stages, and a refined stage-specific sensitivity to various stresses. This paper builds on the long-standing development of LISF to explore the potential of crop DA, by integrating a crop model into the LISF. Specifically, AquaCrop v7.2 is integrated within LISF v7.5, and various capabilities of this new system are demonstrated through three exploratory showcases.

In the realm of process-based crop models, AquaCrop (Raes et al., 2009; Steduto et al., 2009) is a water-driven crop model with relatively limited input requirements. Launched in 2009 to simulate the yield response of herbaceous crops to water availability at the field scale, the model now includes, for example, the simulation of perennial herbaceous forage crops (e.g. alfalfa), salinity and improved biomass-stress relationships. AquaCrop comes with a database of calibrated crop parameters for a range of crops, but specific crop calibration is often recommended (Wellens et al., 2022). The sensitivity of yield simulations to crop (and soil) parameters has been well studied for AquaCrop at the field scale, both in humid and dry regions (Vanuytrecht et al., 2014; Lu et al., 2021a). More recently, AquaCrop has been used for coarse-scale regional to global simulations of soil moisture and biomass (de Roos et al., 2021), irrigation (Busschaert et al., 2025b), water footprints (Mialyk et al., 2024), food-energy-water nexus analyses (Akbari Variani et al., 2023), and to quantify the impact of future climate on irrigation requirements and yield (Busschaert et al., 2022, 2026). While many of these applications are interested in the yield of specific crops within a coarse-scale grid cell, other coarse-scale applications require generic crop parameters to represent the patchwork of fields within a grid cell. In a first showcase using AquaCrop in the LISF, we will demonstrate that such generic crop parameters for coarse-scale AquaCrop simulations can be derived from the satellite-based Global Land Surface Phenology product (GLSP; Zhang et al., 2018) over Europe.

The performance of the crop model depends not only on parameter choices but also on other input, and the model structure. Multi-model ensembles have been used to quantify the modeled yield response to temperature, CO₂, fertility and management, for within-season agricultural forecasting and climate simulations (Müller et al., 2021). For yield or irrigation forecasts with a single model, weather input in particular strongly influences the quality of simulations (Challinor et al., 2005; Busschaert et al., 2024; Zare et al., 2024). The initial crop and soil moisture state are also crucial in determining the trajectory of crop and water balance forecasts. By perturbing model input, parameters, and/or the state, an ensemble of forecasts can be created to quantify the forecast uncertainty or to analyze the sensitivity of particular crop variables to input uncertainty. This is useful in itself, or to facilitate DA in a next step. In a second showcase of this paper, we will illustrate how the uncertainty in coarse-scale AquaCrop biomass simulations is related to uncertainty in soil moisture and weather conditions over Europe.

In the context of crop modeling, the aim of DA is to optimally integrate observed data through parameter, forcing, or state estimation (Evensen et al., 2022) to improve crop estimates. DA also adds value to observations by inferring unobserved variables of the crop system, e.g. by estimating root-zone soil moisture from surface soil moisture or vegetation observations. Field-scale crop DA has been explored using in-field, satellite, or synthetic observations, both for parameter and state estimation (Pauwels et al., 2007; Linker and Ioslovich, 2017; Lu et al., 2021b, 2022; Zare et al., 2024; Yang and Lei, 2024). For larger domains, crop DA has mainly used satellite observations of soil wetness and crop biophysical variables (Jin et al., 2018) to estimate soil or crop parameters (Dente et al., 2008; Huang et al., 2023). However, regional to global sequential satellite DA to improve crop yield forecasts via soil moisture or crop state updating (de Wit and van Diepen, 2007; de Roos et al., 2024) is still in its infancy compared to current operational or state-of-the-art assimilation systems for land surface models (Reichle et al., 2019; De Lannoy et al., 2022). In a third showcase, we will explore the assimilation of fine-scale satellite observations of fractional vegetation cover (FCOVER) to refine biomass and yield estimates for the Piedmont area in Italy, Europe.

The objective of this paper is to introduce the open-source AquaCrop v7.2 model as a dynamical state propagation model within NASA's LISF v7.5 (Sect. 2), and to pave the way for large-scale satellite DA into a crop model. The latter is achieved in three showcases, demonstrating the current potential and limitations of (i) coarse-scale crop growth simulation with various crop parameterizations, (ii) coarse-scale ensemble simulations, and (iii) ensemble Kalman filtering of fine-scale satellite data into AquaCrop. The background and methods for the three showcases are given in Sect. 3, the results and discussion in Sect. 4, and the paper ends with conclusions in Sect. 5.

2 Modeling and Data Assimilation Framework

2.1 AquaCrop v7.2 Assets

Originally, AquaCrop of the Food and Agriculture Organization (FAO) was developed in the Delphi/Pascal programming language for use in a graphical user interface (GUI). In 2022, the core engine of AquaCrop version 7.0 (v7.0) was converted to Fortran 2003 and optimized, to promote wider use, transparent version control, and long-term community-based maintenance on GitHub. This conversion was done in a semi-automatic way by KU Leuven researchers with guidance from KU Leuven high-performance computing (HPC) support, and extensively verified via a range of test cases covering all of Europe and the full range of crop and management scenarios. Fortran is a green programming language, widely used in the earth and climate sciences community, and allows easy interfaces with other languages. Unlike older open-source versions in other languages (Kelly and Foster, 2021), the Fortran code includes all AquaCrop functionalities (e.g. salinity, fertility, perennial crops, etc.). This paper uses AquaCrop v7.2 (https://github.com/KUL-RSDA/AquaCrop, last access: 21 March 2026) plus bug fixes that are released in v7.3. Documentation can be found at https://ees.kuleuven.be/en/aquacrop (last access: 21 March 2026).

The five key AquaCrop assets are summarized in Fig. 1. They include (i) the GUI application, (ii) the open-source version-controlled Fortran code available on GitHub, (iii) the stand-alone programs for Windows, MacOS, and Linux, compiled from the Fortran code, (iv) a simple Python wrapper around the stand-alone program to run AquaCrop in parallel across grid cells (https://github.com/KUL-RSDA/RegionalAC_Py, last access: 21 March 2026), and (v) the integration of the Fortran code into NASA’s LISF (https://github.com/NASA-LIS/LISF, last access: 21 March 2026, Sect. 2.3). The first three assets are made available with each version release on GitHub (https://github.com/KUL-RSDA/AquaCrop/releases, last access: 21 March 2026) and offer three forms of the AquaCrop core engine. The GUI comes with an FAO copyright. The last two assets employ the stand-alone executable and plain Fortran code, respectively, to allow efficient regional- to global-scale modeling, climate impact simulations, and satellite DA (de Roos et al., 2021; Busschaert et al., 2022, 2025b, 2026; de Roos et al., 2024).

https://gmd.copernicus.org/articles/19/2551/2026/gmd-19-2551-2026-f01

Figure 1. The five main AquaCrop v7.2 assets: (top) three forms of the AquaCrop core engine, (i) embedded within a GUI, (ii) compiled for Windows, MacOS and Linux, and (iii) open source on GitHub, and (bottom) two derived assets to support regional simulations, within a (iv) Python wrapper and (v) NASA LIS. Logos for illustrative purposes, credit: FAO, Microsoft, Apple, Linux, GitHub, Python, NASA LIS.

2.2 AquaCrop v7.2 Processes

AquaCrop computes, for each daily timestep i, the (1) canopy development, (2) water balance, (3) biomass, and (4) yield, as shown in the flowchart of Fig. 2. We summarize the model here with a focus on the crop and soil water state propagation in time, and with special attention to the system's time-variant prognostic and diagnostic variables (with subscript i), and time-invariant parameters (without subscript i). Prognostic or state variables depend on state estimates of a previous time step, whereas diagnostic variables are derived from the current state variables. Like most other crop models, AquaCrop can use either calendar time or thermal time (also known as growing degree days, GDD) to track crop development.

For each day, the prognostic canopy cover CC_i [m² m⁻²] describes the crop phenology development, either based on calendar days or GDD [°C d]. First, we focus on the potential crop canopy cover evolution, CC_pot,i [m² m⁻²] without crop stress. The gray area in the first part of Fig. 2 reflects CC_pot,i and is determined by a stage-dependent piecewise function f_cc(i,α_cc), described in Appendix A. This function depends on time i and several parameters, collectively referred to as the parameter vector $α_{cc} = [{CC}_{o}, {CC}_{x}, CGC, CDC, \dots]$ . CC_o [m² m⁻²] is the initial CC_pot,i at emergence. Thereafter, the growth sets in to reach CC_x [m² m⁻²] or maximal CC_pot,i, with growth defined by the canopy growth coefficient CGC in [d⁻¹] or [°C⁻¹ d⁻¹]. At the end of the season, the canopy decline coefficient CDC in [d⁻¹] or [°C⁻¹ d⁻¹] describes the rate of green canopy decline to the final stage of crop maturity, which marks the end of the crop growth season. Again, CC_pot,i solely depends on time i and α_cc, and serves as a parameterized upper limit to the actual CC_i. By contrast, the actual CC_i computation accounts for soil water, soil fertility and soil salinity stresses, by substituting α_cc with time-variant rescaled equivalents α_cc,i, i.e. the parameters are dynamically adjusted for CC_i−1 and water, fertility and salinity stresses:

\begin{align}
{CC}_{pot, i} &= f_{cc} (i, α_{cc}) \tag{1} \\
{CC}_{i} &= f_{cc} (i, α_{cc, i} ({CC}_{i - 1}, θ_{i})) \tag{2}
\end{align}

where θ_i here refers to the vector of prognostic soil water estimates in all soil compartments for simplicity, but to be more precise, it also includes salinity and soil fertility. In the equations of this paper, we specifically choose to highlight the dependencies on the soil water (∈θ_i) and CC_i state variables, and we refer to Raes et al. (2026) for a full model description.

The piecewise f_cc(.) is thus a set of different exponential growth and decay functions (Steduto et al., 2009; Raes et al., 2009), constrained by time (or GDD) and crop parameters α_cc (incl. crop stages), in such a way that ${CC}_{i} \leq {CC}_{pot, i}$. This means that for a model trajectory with a given parameter CC_x, CC_i should never (and, within the model, can never) be perturbed or updated above the corresponding CC_pot,i. Furthermore, if soil fertility stress is set, CC_x is modulated by the fertility budget to ${CC}_{x, sf, i}$ assuming no water stress, and the attainable potential canopy cover is ${CC}_{pot, sf, i}$, i.e. ${CC}_{i} \leq {CC}_{pot, sf, i} \leq {CC}_{pot, i}$ (see Sect. 3.3, and Appendix A). Soil fertility lumps the effect of field management (pest control, nutrients, post-harvest loss, …) on crop production.
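To make the shape of f_cc(.) in Eq. (1) concrete, the following minimal Python sketch evaluates a potential canopy cover curve in calendar days. It assumes the exponential growth and decline forms commonly documented for AquaCrop rather than reproducing Appendix A, and it ignores the period before emergence. The function name, the argument t_senescence and any parameter values used with it are hypothetical and serve illustration only.

```python
import math


def cc_potential(t, cc_o, cc_x, cgc, cdc, t_senescence):
    """Potential canopy cover [m2 m-2] at t days after emergence (no stress).

    Assumed functional forms (illustrative, not copied from Appendix A):
    exponential growth up to CC_x / 2, exponential approach to CC_x afterwards,
    and an accelerating decline once senescence has started.
    """
    def growth(tt):
        cc = cc_o * math.exp(cgc * tt)
        if cc > cc_x / 2.0:
            # Second growth branch; continuous with the first at CC_x / 2
            cc = cc_x - 0.25 * (cc_x ** 2 / cc_o) * math.exp(-cgc * tt)
        return min(cc, cc_x)

    if t <= t_senescence:
        return growth(t)

    # Decline phase, starting from the canopy cover reached at senescence
    cc_start = growth(t_senescence)
    dt = t - t_senescence
    decline = 1.0 - 0.05 * (math.exp(3.33 * cdc / (cc_start + 2.29) * dt) - 1.0)
    return max(0.0, cc_start * decline)


# Hypothetical parameter values for a generic crop
curve = [cc_potential(t, cc_o=0.05, cc_x=0.9, cgc=0.1, cdc=0.08, t_senescence=80)
         for t in range(0, 140)]
```

Under these assumptions the curve starts at CC_o, approaches CC_x asymptotically and drops to zero after senescence. The actual CC_i of Eq. (2) would remain at or below this curve.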

After rescaling for micro-advective effects, the variable ${CC}_{i}^{*}$ is used in the second step to diagnose transpiration Tr_i [mm d⁻¹]:

\begin{equation}
{Tr}_{i} = K_{s, i} (θ_{i}) \times [{CC}_{i}^{*} (θ_{i}) \times K_{c, Tr, x, i}] \times {ET}_{o, i} \tag{3}
\end{equation}

with ET_o,i [mm d⁻¹] the input reference grass evapotranspiration, K_s,i [–] ($0 \leq K_{s, i} \leq 1$, with 1 being no stress) a stress coefficient accounting for cold and soil water stress (diagnosed from θ_i, i.e. water shortage and waterlogging, and also salinity), and $K_{c, Tr, x, i}$ [–] the maximum crop transpiration coefficient, which depends on the specific crop and is adjusted for aging, senescence and elevated CO₂. Tr_i is then normalized for the climatic conditions using ET_o,i [mm d⁻¹] and used to update the prognostic dry aboveground biomass production B_i [t ha⁻¹] via a crop-specific water productivity factor ${WP}_{i}^{*}$ [t ha⁻¹ d⁻¹], which is normalized for the effect of climatic conditions (meteorology and CO₂ concentration), adjusted for the crop stage, and rescaled by a soil fertility stress coefficient K_sf,i [–] ($0 \leq K_{sf, i} \leq 1$, with 1 being no stress):

\begin{align}
B_{i} &= B_{i - 1} + Δ B_{i} Δ t \tag{4} \\
Δ B_{i} &= K_{sf, i} \times {WP}_{i}^{*} \times \frac{{Tr}_{i} (θ_{i}, {CC}_{i})}{{ET}_{o, i}} \tag{5}
\end{align}

ΔB_i is thus the biomass production per model time step [t ha⁻¹ d⁻¹], and Δt refers to the daily model time step. Finally, dry yield Y_i [t ha⁻¹] is diagnosed from biomass by multiplication with a crop-specific harvest index HI_o [–] and an adjustment factor K_HI,i [–] that accounts for time, a crop-specific growth coefficient, and water, heat and cold stresses at different stages in the growing cycle:

\begin{equation}
Y_{i} = K_{HI, i} (θ_{i}) \times {HI}_{o} \times B_{i} (θ_{i}, {CC}_{i}) \tag{6}
\end{equation}

The harvested Y_i is obtained at the time of crop maturity.
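The chain from canopy cover to yield in Eqs. (3) to (6) can be illustrated with a single daily step. The sketch below is a simplified reading of those equations in which all stress and adjustment coefficients are passed in as plain numbers instead of being diagnosed from θ_i. The function name and the example values are hypothetical.

```python
def daily_crop_step(b_prev, cc_star, et_o, ks, kc_tr_x, wp_star, ksf,
                    hi_o, k_hi, dt=1.0):
    """Advance biomass by one day and diagnose transpiration and yield.

    All coefficients are supplied by the caller (illustrative assumption);
    in the model they depend on the soil water state and the crop stage.
    """
    # Eq. (3): transpiration [mm d-1]
    tr = ks * (cc_star * kc_tr_x) * et_o
    # Eq. (5): biomass production [t ha-1 d-1]; no production if ET_o is zero
    delta_b = ksf * wp_star * tr / et_o if et_o > 0.0 else 0.0
    # Eq. (4): prognostic biomass update [t ha-1]
    b = b_prev + delta_b * dt
    # Eq. (6): diagnostic dry yield [t ha-1]
    y = k_hi * hi_o * b
    return tr, b, y


# Hypothetical values for one unstressed day
tr, b, y = daily_crop_step(b_prev=2.0, cc_star=0.8, et_o=5.0, ks=1.0,
                           kc_tr_x=1.1, wp_star=0.15, ksf=0.7,
                           hi_o=0.48, k_hi=1.0)
```

Because ET_o,i cancels in the product of Eqs. (3) and (5), the daily biomass increment in this sketch depends on the climate only through the coefficients, which mirrors the normalization described in the text.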

The equations above are deliberately written in a way that highlights their dependence on θ_i. The prognostic soil water, salinity and fertility budgets are computed at each time step. These three budgets are driven by meteorological input of daily temperature, precipitation (rainfall) and ET_o, and in turn these budgets drive the stresses in AquaCrop. Here, we focus only on the soil water budget. A soil reservoir with 12 compartments (prognostic state variables ∈θ_i) receives water through infiltration of precipitation and possibly irrigation P_i, draws water from shallow groundwater through capillary rise C_i, and releases water through deep percolation D_i, soil evaporation E_i and crop transpiration Tr_i. These water fluxes [mm d⁻¹] are cumulated over 1 d (from i−1 to i) and move through the profile following soil physical laws represented by functions f_θ(.), that result in a change in soil moisture Δθ_i, as follows:

\begin{align}
θ_{i} &= θ_{i - 1} + Δ θ_{i} Δ t \tag{7} \\
Δ θ_{i} &= f_{θ} (P_{i}, C_{i}, D_{i}, E_{i}, {Tr}_{i}) \tag{8}
\end{align}
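As a deliberately simplified illustration of Eqs. (7) and (8), the sketch below collapses the 12 compartments into a single bucket and expresses the state as water storage in mm instead of volumetric soil moisture. Both simplifications, the function name and the storage limit w_max are assumptions made for this example; the actual f_θ(.) redistributes water between compartments following soil physical laws.

```python
def soil_water_step(w_prev, p, c, d, e, tr, w_max, dt=1.0):
    """One daily update of a single-bucket soil water storage [mm].

    Fluxes are in mm d-1: infiltration p and capillary rise c add water,
    deep percolation d, soil evaporation e and transpiration tr remove it.
    Returns the new storage and any excess above w_max.
    """
    # Single-bucket stand-in for f_theta in Eq. (8)
    delta_w = p + c - d - e - tr
    # Eq. (7): prognostic update
    w = w_prev + delta_w * dt
    # Keep the storage physically plausible; excess would drain or run off
    excess = max(0.0, w - w_max)
    w = min(max(w, 0.0), w_max)
    return w, excess
```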

The water stored in the dynamic root zone determines some of the crop growth stresses mentioned above. The prognostic root-zone depth Z_i [m] is a parameterized function of time (calendar days or GDD), modulated by soil water availability as follows:

\begin{align}
Z_{i} &= Z_{i - 1} + Δ Z_{i} Δ t \tag{9} \\
Δ Z_{i} &= K_{z, i} (θ_{i}) [f_{z} (i, α_{z}) - f_{z} (i - 1, α_{z})] \tag{10}
\end{align}

where f_z(.) is a set of functions that describe the potential rooting depth solely as a function of time (or GDD) and time-invariant parameters α_z, containing the minimal and maximal potential rooting depth parameters, Z_n and Z_x [cm] respectively, illustrated in Fig. 3. K_z,i(θ_i) [–] reduces the maximal potential expansion of the rooting depth, based on a diagnosis of stomatal stress (determined by root-zone soil moisture) or dry subsoil at the front of the root-zone expansion. Just as CC_i is double-bounded by f_cc(.), Z_i is double-bounded by f_z(.) as a strong model constraint determined by parameters.
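The incremental form of Eqs. (9) and (10) can be sketched as follows. The potential rooting depth f_z(.) is replaced here by a hypothetical linear ramp between Z_n and Z_x, which is an assumption for illustration and not the shape used in AquaCrop. The stress factor K_z,i is passed in as a number.

```python
def potential_root_depth(t, z_n, z_x, t_max_root):
    """Hypothetical linear stand-in for f_z: depth [m] at day t."""
    if t <= 0:
        return z_n
    if t >= t_max_root:
        return z_x
    return z_n + (z_x - z_n) * t / t_max_root


def root_depth_step(z_prev, i, k_z, z_n, z_x, t_max_root, dt=1.0):
    """Eqs. (9)-(10): expand the root zone by the stress-reduced potential increment."""
    delta_z = k_z * (potential_root_depth(i, z_n, z_x, t_max_root)
                     - potential_root_depth(i - 1, z_n, z_x, t_max_root))
    return z_prev + delta_z * dt


# With a constant stress factor of 0.5 the root zone expands half as fast
z = 0.3
for day in range(1, 71):
    z = root_depth_step(z, day, k_z=0.5, z_n=0.3, z_x=1.0, t_max_root=70)
```

Because only increments of f_z(.) are accumulated, expansion that is lost on a stressed day is not recovered later, so Z_i can never exceed the potential rooting depth.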

Equations (2), (3), (6) and (10) in particular contain piecewise functions and thus discontinuities that are determined by crop stages, i.e., the crop-water system propagates through different crop development regimes. AquaCrop considers the following phenological stages, which are parameterized in either calendar days or GDDs, and marked on Fig. 3: (I) time from planting/sowing to emergence, i.e. the onset of greening, (II) time to maximum rooting depth, (III) time to flowering, (IV) time to senescence, and (V) time to maturity, i.e. the end of the crop cycle. A particularity of AquaCrop (and many other crop models) is that the timing of the crop stages is assumed to be known at the beginning of the crop simulation, i.e. it is parameterized. Consequently, when a simulation is run in GDD mode, the entire temperature record of a simulation year is used to precompute the timing of the stages at the beginning of that simulation year. This is theoretically not needed when AquaCrop is run in calendar days, but in practice, the current model still expects the full temperature record at the start of the simulation. This setup results from the original purpose of AquaCrop, which was to use historical meteorological data to test strategies to improve sustainable crop production in the future. This limitation will be addressed in future model development.
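The precomputation of stage timing in GDD mode can be pictured as a cumulative sum over the temperature record of the simulation year, followed by a lookup of the first day on which each thermal-time threshold is reached. The sketch below assumes a simple GDD definition in which the daily mean temperature is clipped between a base and an upper temperature; AquaCrop offers its own GDD calculation options, which are not reproduced here. Function names, stage names and thresholds are hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate


def daily_gdd(t_min, t_max, t_base, t_upper):
    """Growing degree days [degC d] for one day (assumed simple clipping method)."""
    t_avg = 0.5 * (t_min + t_max)
    t_avg = min(max(t_avg, t_base), t_upper)
    return t_avg - t_base


def stage_days_from_gdd(t_min_series, t_max_series, stage_gdd, t_base, t_upper):
    """Map GDD thresholds per stage to the first day (1-based) that reaches them.

    Returns None for stages that are not reached within the temperature record.
    """
    cumulative = list(accumulate(
        daily_gdd(lo, hi, t_base, t_upper)
        for lo, hi in zip(t_min_series, t_max_series)))
    days = {}
    for stage, threshold in stage_gdd.items():
        idx = bisect_left(cumulative, threshold)
        days[stage] = idx + 1 if idx < len(cumulative) else None
    return days
```

The lookup needs the cumulative sum up to the last stage, which is why the full temperature record of the simulation year must be available before the first time step.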

https://gmd.copernicus.org/articles/19/2551/2026/gmd-19-2551-2026-f02

Figure 2. Schematic diagram of the AquaCrop model with the four main computational steps from crop development to yield, adapted from Raes et al. (2025). Symbols are explained in the text.

https://gmd.copernicus.org/articles/19/2551/2026/gmd-19-2551-2026-f03

Figure 3. Crop stages assumed in (black I–V) AquaCrop, with an indication of some other (blue) essential AquaCrop crop parameters that define the CC_pot and rooting depth evolution: CC_o is the CC_pot at emergence, CC_x is the maximal CC_pot, the canopy growth coefficient CGC defines the growth, the canopy decline coefficient CDC defines the canopy decline, Z_n and Z_x define the minimal and maximal potential rooting depth. The timing of the crop stages can be inferred from (green A–F) the time of the various satellite-based phenology stages in the GLSP product. Note the difference in the assumed shape of the curve for (blue) AquaCrop potential vegetation and (dashed green) GLSP actual vegetation during senescence.


2.3 NASA LISF v7.5

NASA's LISF is a scalable software framework with three components: (i) the Land surface Data Toolkit (LDT) that handles the data-related requirements to prepare for model simulations, (ii) the Land Information System (LIS) that integrates multiple land surface models (LSMs), satellite data readers, and DA techniques (Kumar et al., 2006, 2008), and (iii) a Land surface Verification Toolkit, not yet used in this paper. It provides a portable infrastructure to transition Earth science research into operations (Kumar et al., 2019), can be coupled to Earth system models (Heyvaert et al., 2025), and it offers a digital twin environment for land surface hydrology (De Lannoy et al., 2024), and from now on also for agricultural cropping systems. LISF utilizes high-performance computing along with data management technologies to address the computational difficulties associated with high-resolution land surface modeling.

So far, LIS has hosted a range of LSMs, but no dedicated crop models yet. AquaCrop v7.2 is coupled into LIS v7.5 using shared plugins and standard interfaces. Because the model was not originally designed to serve as a state propagation model within an assimilation system, and because it behaves differently from the other models already embedded in LIS, some technical aspects required attention. First, AquaCrop cannot be run at subdaily resolution, whereas most LSMs are typically run at subhourly resolutions. This means that meteorological forcings, typically provided at an hourly resolution, must be aggregated to daily values, and the maximum and minimum air temperatures of the day must be extracted to be used directly as input to the crop model and in the calculation of ET_o. Second, AquaCrop relies on numerous global variables, which include state variables (e.g. soil moisture, biomass) as for the LSMs, but also many other variables that are diagnosed from – or not at all related to – state variables, and are necessary to describe the current crop system conditions. These include flags that are turned on and off to mark a certain process (e.g. early senescence). These global variables are passed on from one timestep (day) to another, and therefore need to be saved to restart the simulation later on. Third, the time-stepping mechanism and model progression in AquaCrop were originally integrated into AquaCrop's own file management system. To enable coupling with LIS, a new routine was developed that enables the model to advance by a single time step when called externally, after the initialization of all required global variables.
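The first of these technical aspects, the aggregation of hourly forcings to daily input, can be sketched in a few lines. The example assumes that the hourly record starts at the beginning of a day, that precipitation is given in mm per hour, and that each day contains exactly 24 values; the function name and the dictionary keys are hypothetical and unrelated to the LIS implementation.

```python
def hourly_to_daily(t_air_hourly, precip_hourly):
    """Aggregate hourly air temperature and precipitation to daily crop model input.

    Returns one dictionary per day with the daily minimum and maximum
    air temperature and the accumulated precipitation [mm d-1].
    """
    if len(t_air_hourly) != len(precip_hourly) or len(t_air_hourly) % 24 != 0:
        raise ValueError("expected matching series with a whole number of days")
    days = []
    for start in range(0, len(t_air_hourly), 24):
        t_day = t_air_hourly[start:start + 24]
        p_day = precip_hourly[start:start + 24]
        days.append({
            "t_min": min(t_day),      # daily minimum air temperature
            "t_max": max(t_day),      # daily maximum air temperature
            "precip": sum(p_day),     # daily precipitation total
        })
    return days
```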

AquaCrop requires input on meteorology and CO₂ (collectively called “climate input” in AquaCrop literature), soil and crop parameters, initial soil moisture conditions, and if desired, management, irrigation and groundwater options. Except for climate and soil input, AquaCrop in LIS still uses its original text files as input for crops, management, etc. In LIS, the meteorology is typically read from global NetCDF files of re-analysis products. The atmospheric CO₂ concentrations are taken from measurements at the Mauna Loa Observatory in Hawaii (Thoning et al., 2025). Spatially distributed soil and crop type classes are entered via geospatial NetCDF files that are created via the LDT as front-end processor for LIS, and then mapped to the associated soil hydraulic or crop parameters via an AquaCrop lookup table and original crop parameter files, respectively. The latter are provided in the configuration file of LIS. Likewise, management options, such as those related to fertility and irrigation, are provided in the LIS configuration file via an original AquaCrop parameter file.

Currently, the configuration file for LIS is limited to enabling just a few of the many available AquaCrop options. Inactive features that might be introduced in the future include soil profiles with varying textures at different depths, varying groundwater levels, crop rotation, perennial crops, and salinity stress. Furthermore, except for meteorological input, only the soil texture and crops are currently spatially variable input. The specification of initial conditions, groundwater, irrigation and management parameters (provided via original AquaCrop files) and planting/sowing criteria are uniform, but this can easily be addressed in future versions.

Through three exploratory showcases (Sect. 3) at different spatial scales, we demonstrate some AquaCrop applications that are facilitated by LIS. First, LIS enables us to efficiently handle spatially distributed input forcings and parameters at any resolution. In Sect. 3.1, we constrain coarse-scale spatially distributed generic crop parameters with satellite data, and then use them in AquaCrop simulations over Europe. Second, LIS facilitates ensemble perturbations of the forcings, parameters and state variables, possibly with spatial and temporal correlations, and ensemble perturbation bias correction. In Sect. 3.2, forcings and soil moisture state variables are perturbed to demonstrate their impact on biomass simulations for Europe. Finally, LIS hosts different DA algorithms and interfaces with a range of satellite data. In Sect. 3.3, the potential and current limitations of high-resolution FCOVER satellite DA are illustrated for the Piedmont region of Italy. Note that the time index i is omitted from the time-varying variables from here on for simplicity, unless needed for clarity (in Sect. 3.3 only).

3 Methods

3.1 Showcase 1: Crop Parameterization

3.1.1 Background

AquaCrop was originally designed to simulate a generic annual herbaceous crop. With time, crop parameters were adjusted via calibration to simulate various specific herbaceous species, and perennial forage crops were included. The AquaCrop crop parameters consist of so-called conservative parameters that are independent of climate, time, location, management, or cultivar, and other parameters that depend on, e.g., the cultivar or planting mode and should be calibrated. The most important crop parameters are related to the length of the different stages of phenological development, CC_x, CGC, CDC, and Z_x marked in Fig. 3, and HI_o. The relative importance of these parameters depends on the crop type and the environment (Vanuytrecht et al., 2014; Lu et al., 2021a).

Specific crop parameters (e.g. for maize, wheat, …) are necessary for applications that aim at crop-specific yield estimates (Busschaert et al., 2026) or field management. Such parameters can be derived from field data or high-resolution (10–100 m) satellite data (Franch et al., 2022; Gobin et al., 2023). Including satellite-based specific crop phenology information has proven to improve crop model simulations (Bregaglio et al., 2023). In contrast, generic crop parameterizations are needed for coarse-scale simulations of crop growth, irrigation, and overall carbon and water budgets in mixed crop grid cells (de Roos et al., 2021; Busschaert et al., 2025b), and cannot be used to estimate specific crop yield. Generic crop parameters for mosaics of crops with different crop stages in rotating field locations are hard to define, but effective spatial climatological patterns of phenological metrics can be derived from lower-resolution (> 100 m) satellite data (Hmimina et al., 2013; Zhang et al., 2018). To our knowledge, no study has reported whether satellite-based parameterization of a generic crop can improve coarse-scale AquaCrop simulations.

3.1.2 Experiment

We compare coarse-scale AquaCrop simulations that use two different generic crop parameterizations. The crop parameters are summarized in Appendix C. The first crop parameterization defines the development stages in spatially uniform calendar days, as in de Roos et al. (2021) for C3 crops. It is solely based on agronomist expertise and tailored to obtain good coarse-scale biomass estimates. The second approach uses spatially variable parameters in GDD mode. These parameters are derived from maps of day of the year (DOY) for various crop stages provided by the 0.05° Global Land Surface Phenology product of the Visible Infrared Imaging Radiometer Suite (VIIRS GLSP; Zhang et al., 2018) for the period 2013–2023 (11 years). The mapping between GLSP stages and AquaCrop parameters is summarized in Table 1, illustrated in Fig. 3 and details are provided in Appendix B. In short, for each year, the first growing season that has its start and end in the same year is extracted from the GLSP product. Next, the DOY for each particular stage in each 0.05° grid cell is converted into a GDD value using re-analysis temperature, and the CGC and CDC parameters are estimated by inverting the f_cc(.) functions (Eq. 2 and Appendix A). Finally, the median 0.05° crop parameters are computed across the 11 years.
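The per-year conversion from DOY to GDD, the inversion of the canopy growth function, and the multi-year median can be sketched for a single grid cell as follows. This is an illustrative sketch and not the procedure of Appendix B: it assumes that the second growth branch CC(t) = CC_x − 0.25 (CC_x²/CC_o) exp(−CGC t) applies and that maximum canopy corresponds to reaching a fixed fraction of CC_x. The function names, the chosen stages and the fraction of 0.98 are hypothetical.

```python
import math
from statistics import median


def gdd_between(daily_gdd, doy_start, doy_end):
    """Thermal time [degC d] accumulated after doy_start up to and including doy_end.

    daily_gdd[0] holds the value for DOY 1.
    """
    return sum(daily_gdd[doy_start:doy_end])


def cgc_from_time_to_max(cc_o, cc_x, t_to_max, fraction=0.98):
    """Invert the assumed growth function for CGC, given the time to reach fraction * CC_x."""
    return math.log(0.25 * cc_x / (cc_o * (1.0 - fraction))) / t_to_max


def median_cgc(daily_gdd_by_year, doy_emergence, doy_max_canopy, cc_o, cc_x):
    """Median CGC [degC-1 d-1] across years for one grid cell."""
    cgc_values = []
    for year, gdd in daily_gdd_by_year.items():
        t = gdd_between(gdd, doy_emergence[year], doy_max_canopy[year])
        cgc_values.append(cgc_from_time_to_max(cc_o, cc_x, t))
    return median(cgc_values)
```

Taking the median of the yearly parameter values, as opposed to the mean, limits the influence of single years with anomalous satellite-derived stage dates.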

Table 1. Link between AquaCrop parameters and VIIRS-GLSP crop stages, marked in Fig. 3, and explained in Appendix B. For AquaCrop experiments in GDD mode, the times for the crop stages (A, D, F) are first converted from DOY to GDD before deriving the time to maximal rooting depth, CGC or CDC. n/a – not applicable.


Each of the two generic crop parameterizations is used in an AquaCrop simulation experiment for the period 2015 through 2020. The meteorology is taken from the lowest model level (∼ 10 m) forecasts of the fifth-generation European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis (ERA5; Hersbach et al., 2020), bilinearly interpolated to the model grid, and the air temperature is corrected to 2 m above the elevation of the land grid cell through a lapse-rate correction. The simulations are performed at 0.1° resolution for all grid cells with dominant cropland in Europe according to the 1 km global land cover data set from the University of Maryland (Hansen et al., 1998). The satellite-constrained parameters in GDD are aggregated from the 0.05° to the 0.1° resolution, by taking the average of all 0.05° grid cells with dominant cropland. The soil texture is taken from the Harmonized World Soil Database 1.21 (FAO/IIASA/ISRIC/ISSCAS/JRC, 2012) as a weighted combination of surface and subsurface texture, following De Lannoy et al. (2014), and the texture is mapped to default soil hydraulic parameters for AquaCrop v7.2 (Raes et al., 2026). To parameterize field management, a uniform soil fertility stress of 30 % is assumed for both experiments as in de Roos et al. (2021). All other boundary conditions are also set uniformly and in a simplistic way, i.e. no irrigation and no shallow groundwater table (no capillary rise). A deterministic run for one simulation year and 33 670 crop grid cells takes about 45 min, when run on 36 central processing units (cpus). The compute times reported in this paper are conservative and measured on the Tier-1 Hortense and Tier-2 wICE clusters of the Vlaams Supercomputer Centrum (VSC).
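Two of the preprocessing steps above lend themselves to a short sketch: the lapse-rate correction of air temperature and the cropland-only aggregation from 0.05° to 0.1°. The constant lapse rate of 6.5 K km⁻¹, the nested-list grid layout and the function names are assumptions for illustration; the paper does not state the lapse rate or the implementation.

```python
def lapse_rate_correct(t_air, z_source, z_target, lapse_rate=0.0065):
    """Shift air temperature from elevation z_source to z_target (both in m).

    lapse_rate (K per m) is an assumed constant, not a value from the paper.
    """
    return t_air - lapse_rate * (z_target - z_source)


def aggregate_2x2(param, is_cropland):
    """Average a fine grid onto a grid twice as coarse, using cropland cells only.

    param, is_cropland: nested lists with identical (even) dimensions.
    Coarse cells without any cropland cell are returned as None.
    """
    n_rows, n_cols = len(param) // 2, len(param[0]) // 2
    coarse = []
    for i in range(n_rows):
        row = []
        for j in range(n_cols):
            vals = [
                param[2 * i + di][2 * j + dj]
                for di in (0, 1)
                for dj in (0, 1)
                if is_cropland[2 * i + di][2 * j + dj]
            ]
            row.append(sum(vals) / len(vals) if vals else None)
        coarse.append(row)
    return coarse
```

Restricting the average to cropland cells keeps phenology from forest or grassland pixels out of the generic crop parameters.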

We aggregate the daily simulated CC and ΔB of both simulations to 10 d values and evaluate them against independent satellite data, i.e. the Copernicus Global Land Service (CGLS) Fraction of Vegetation Cover (FCOVER) and Dry Matter Productivity (DMP) products, respectively. The performance is quantified for the period between the first and last timestep for which either of the two AquaCrop simulations or the CGLS DMP reference data exceed 5 % of their maximum ΔB value in the year. This period is referred to as the “maximal growing season” below.
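A minimal sketch of the 10 d aggregation and of the "maximal growing season" definition is given below. It assumes consecutive 10-day blocks (the CGLS products may follow a different dekad convention) and series of equal length for one year; the function names are illustrative.

```python
def ten_day_means(daily):
    """Average a daily series into consecutive 10-day blocks (last block may be shorter)."""
    blocks = [daily[i:i + 10] for i in range(0, len(daily), 10)]
    return [sum(b) / len(b) for b in blocks]


def maximal_growing_season(*series, fraction=0.05):
    """First and last index at which any series exceeds `fraction` of its own yearly maximum.

    series: equally long delta-B series for one year, e.g. two simulations and DMP.
    Returns None if no series ever exceeds its threshold.
    """
    thresholds = [fraction * max(s) for s in series]
    active = [
        t for t in range(len(series[0]))
        if any(s[t] > thr for s, thr in zip(series, thresholds))
    ]
    return (active[0], active[-1]) if active else None
```

Because the period is the union over all series, a simulation whose season starts too early or ends too late is penalized within the evaluation window rather than being masked out.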

3.2 Showcase 2: Ensemble Simulations

3.2.1 Background

Crop model simulations are never perfect, due to uncertainties in crop, soil and management parameters, model structure, meteorological input, and initial conditions. Sampling a range of possibilities for these aspects makes it possible to create an ensemble of crop model trajectories and to determine the model sensitivity to each aspect individually. However, ensembles are also used to quantify (i) the total time-varying uncertainty of the simulation output, or forecast error, and (ii) the correlation of the forecast errors between the various simulated variables. These dynamic ensemble uncertainty estimates are particularly important for DA in a next step. With the exception of a few studies geared towards DA for state updating (de Roos et al., 2024; Lu et al., 2022), most ensemble simulations with AquaCrop have been performed to study the model sensitivity to crop parameters, and not to estimate the total model uncertainty in response to errors in the state or meteorological estimates.

3.2.2 Experiment

Because AquaCrop is a water-driven crop model, CC and B simulations and their uncertainty depend on estimates of the root-zone soil moisture and their uncertainty. To quantify this dependency, we utilize the same setup as in showcase 1 above for the generic crop with GDD parameterization over Europe, but we now perturb the ERA5 meteorological forcings and the soil moisture state variables in the top 10 compartments to generate an ensemble of 24 members. Each member corresponds to a model trajectory, resulting from adding small random values with zero mean to some variables, and multiplying other variables by a small random number around 1. One member is left unperturbed (see below). The random number distribution is determined by perturbation parameters (mean, standard deviation, correlation), which are spatially and temporally constant, and summarized in Table 2. The setup is inspired by state-of-the-art land surface DA studies (Kumar et al., 2008; Heyvaert et al., 2023). The resulting perturbations are applied hourly to shortwave radiation and precipitation, and daily to soil moisture variables, and all values are kept within physically possible bounds through resampling. The hourly perturbed ERA5 data are converted to daily AquaCrop forcing input of ET_o and P as in Busschaert et al. (2022). Note that unlike de Roos et al. (2024) (who used a calendar day crop parameterization), temperature is not perturbed, because this can create inconsistencies with the precomputed crop growth stages in GDD (see Sect. 2.2).
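The perturbation scheme can be sketched as below: temporally correlated standard-normal noise from a first-order autoregressive model, turned into either additive perturbations (mean 0) or lognormal multiplicative factors (mean 1). The e-folding parameterization of the autocorrelation, the example magnitudes and the function names are assumptions for illustration; the actual settings are those of Table 2 and the LISF implementation.

```python
import math
import random


def ar1_noise(n_steps, tcorr_steps, rng):
    """Zero-mean, unit-variance AR(1) noise with e-folding time tcorr_steps."""
    rho = math.exp(-1.0 / tcorr_steps)
    x = rng.gauss(0.0, 1.0)
    out = [x]
    for _ in range(n_steps - 1):
        x = rho * x + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def additive(noise, std):
    """Additive perturbations with mean 0 and standard deviation std."""
    return [std * z for z in noise]


def multiplicative(noise, std):
    """Lognormal factors with mean 1 and standard deviation std."""
    sigma2 = math.log(1.0 + std ** 2)
    mu = -0.5 * sigma2  # ensures that the expected factor equals 1
    return [math.exp(mu + math.sqrt(sigma2) * z) for z in noise]


# Example with hypothetical settings: hourly precipitation factors for 10 days.
rng = random.Random(42)
precip_factors = multiplicative(ar1_noise(240, 24, rng), std=0.5)
```

Multiplicative lognormal factors keep precipitation and radiation non-negative, which is why they are the usual choice for such forcings, whereas bounded state variables such as soil moisture still need a check against physical limits.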

A perturbation bias correction (Ruy et al., 2008) is applied at each time step during the ensemble simulation to keep the soil moisture ensembles centered around the unperturbed deterministic simulation (one member is left unperturbed): even if all perturbations should have a zero mean, nonlinear effects could introduce a bias in the resulting ensemble mean output. Perturbation bias correction prevents unintended biases in ensemble soil moisture from propagating into the ensemble uncertainty estimates of CC and B. Because the structure and parameters of the model are kept fixed, the model here acts as a strong constraint. An ensemble run for 33 670 crop grid cells with 24 members for one simulation year takes about 90 min, when run on 128 cpus.
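The idea behind the perturbation bias correction can be illustrated with a simplified sketch: the offset between the mean of the perturbed members and the unperturbed member is removed from every perturbed member. This is an assumed, reduced form of the method, written for one state variable in one grid cell at one time step; bound checks and the exact treatment in the LISF are omitted, and the function name is hypothetical.

```python
def perturbation_bias_correct(perturbed, unperturbed):
    """Re-center perturbed ensemble members on the unperturbed member.

    perturbed: values of one state variable for all perturbed members.
    unperturbed: value of the same variable in the unperturbed member.
    """
    shift = sum(perturbed) / len(perturbed) - unperturbed
    return [x - shift for x in perturbed]
```

The correction changes the ensemble mean but leaves the deviations of the members from their mean, and hence the ensemble spread, untouched.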

Table 2. Ensemble perturbation parameters for showcases (SC) 2 and 3. For the forcings, downward shortwave radiation (SW) and precipitation (P) are perturbed every hour, with slightly different standard deviations (std) for SC 2 and 3. For the prognostic state variables, either soil moisture in the 10 soil compartments (θ_k, $k = 1, \dots, 10$ ) or CC and B are perturbed every day, in SC 2 and 3, respectively. Additive (+) perturbations have a mean of 0 and are drawn from a normal distribution, whereas multiplicative (×) perturbations have a mean of 1 and are drawn from a lognormal distribution. The std is shown relative to the mean (1 or 0) multiplicative or additive perturbation value. Temporal autocorrelations (tcorr) are applied through a first-order autoregressive model. Since the forcings are perturbed every hour, the tcorr needs to be at least 24 h to obtain meaningful daily aggregated perturbed forcings as input to AquaCrop. Perturbations to prognostic state variables are not cross-correlated with forcing perturbations. SC 2 and SC 3 use different state perturbations.


Simulations for 3 years, from 2015 through 2017, are used to compute the multi-year average ensemble standard deviation (also called “spread” below) in root-zone soil moisture and B for all croplands in Europe. This period covers contrasting hydroclimatic conditions across Europe, with considerable interannual and spatial variability in precipitation and soil moisture regimes, resulting in both water-limited and energy-limited conditions depending on the region. This variability is sufficient for the purpose of this showcase, namely to demonstrate the feasibility of large-scale ensemble simulations with AquaCrop, and to quantify spatial patterns in ensemble spread of root-zone soil moisture and B. The spread in CC is not further discussed, because it is primarily limited by the parameterized crop stages, which are identical for all members because the temperature (GDD) is not perturbed. For simplicity, the root-zone soil moisture spread for the maximal rooting depth (0–100 cm, 10 compartments) is computed and averaged over 3 years without accounting for the varying rooting depth and without masking for the growing season. The spread in B results from the perturbation of soil moisture (and precipitation), but is also directly influenced by the perturbation in radiation and thus ET_o (Eqs. 3 and 5), and multiple threshold-driven stress functions and feedbacks among CC, transpiration, and B. For this showcase, we only relate the ensemble spreads of soil moisture and B to each other, and to environmental conditions, more specifically to the relative soil water content (RSW). RSW is defined here as the ratio (root-zone soil water content – wilting point) $/$ (field capacity – wilting point), where the wilting point and field capacity are a function of the spatially varying soil texture.
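The RSW definition and the ensemble spread used in this showcase translate directly into a few lines. The use of the sample standard deviation (n − 1 in the denominator) and the function names are assumptions; the paper does not specify the estimator.

```python
from statistics import stdev


def relative_soil_water(theta_rz, theta_wp, theta_fc):
    """RSW = (root-zone soil water content - wilting point) / (field capacity - wilting point)."""
    return (theta_rz - theta_wp) / (theta_fc - theta_wp)


def ensemble_spread(members):
    """Ensemble standard deviation of one variable at one time and location."""
    return stdev(members)
```

An RSW near 0 means the root zone is close to the wilting point and an RSW near 1 means it is close to field capacity, which makes the metric comparable across soil textures.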

3.3 Showcase 3: Satellite-based Data Assimilation

3.3.1 Background

Most crop DA research aims at parameter estimation, as crop parameters dictate the crop simulation dynamics (Sect. 2.2). But initial and prior state conditions at each time step also determine crop forecasts, which is the focus of this showcase. The ensemble Kalman filter (EnKF) is the most widely applied method for periodical DA into crop growth models for state updating (Ines et al., 2013) without altering the model structure. The particle filter is also emerging to deal with non-Gaussian error distributions, and to handle multi-model ensembles (Zare et al., 2024), and by extension ensemble trajectories with different (e.g. perturbed) parameters.

Satellite-based crop DA studies typically use retrievals of leaf area index (LAI), surface soil moisture, the ratio of actual evaporation to potential evapotranspiration, or a combination thereof (Vazifedoust et al., 2009). Only a few studies have used raw satellite signals (de Roos et al., 2024). In any case, the goal is to improve either or both the spatial and temporal variability of (unobserved) biomass, irrigation, or yield estimates over what a model-only simulation can achieve. The hope is that state updating will compensate for spatiotemporal errors in meteorological and parameter input that accumulate in the state memory. The success of crop DA varies widely in the literature (Pauwels et al., 2007; de Wit and van Diepen, 2007; Nearing et al., 2012; Lu et al., 2021 b, 2022) and depends on (i) the coupling strength between observed and unobserved variables, and how accurately the coupling is represented in the model, (ii) the timing of DA, and (iii) the quality of observations and the DA system. For example, some crop models are unable to adequately propagate surface soil moisture observations to root-zone soil moisture that is essential for crop growth; soil moisture DA is critical in water-limited situations but might have little impact otherwise; and LAI is more directly related to yield during some crop stages (e.g., grain filling) than others.

The AquaCrop state variables are CC, B, and soil moisture in all compartments ($\theta_k \in \boldsymbol{\theta}$). In addition, salinity and fertility are part of the system state. These state variables retain information about past interactions and events such as water stress in the crop system. Depending on the assimilated observation type, a subset of (observable) state variables can be updated through DA, and the other variables will follow the update through model propagation. The cumulative B is best considered as a separate state variable, even if an update to CC propagates to an update in ΔB via model propagation, that is, the second term in Eq. (5). However, those updates to B through ΔB are limited by the assimilation frequency. Updating the cumulative B in AquaCrop is essential to improve yield, because B is more directly related to yield than CC (Jin et al., 2020). A direct update of B will therefore also be included in our experiment below.

Unlike many other models, AquaCrop thus computes its canopy development using CC as a state variable rather than LAI. This means that FCOVER DA is more straightforward than LAI DA, because the latter would require an additional function to connect CC to LAI, which could add uncertainty. Whereas CC is strictly limited to [0,1], the upper value of LAI depends on the crop. Therefore, LAI DA in other crop models can possibly introduce larger updates in LAI (Ines et al., 2013) than what is possible with CC for AquaCrop. The consequences thereof for the estimation of biomass and yield need to be further analyzed, because the model pathways from LAI or CC to yield differ. Some studies (Dalla Marta et al., 2019; Lu et al., 2021 b; Abi Saab et al., 2021; Lu et al., 2022) have assimilated in situ or satellite-based FCOVER data at the field scale, but the potential of satellite-based FCOVER retrievals for state updating in AquaCrop across many fields at the regional scale is understudied.

3.3.2 Experiment

In this showcase, we investigate the potential of high-resolution satellite-based FCOVER DA for the estimation of winter wheat biomass and yield, over the Piedmont region in Italy, Europe. Three simulations are performed: deterministic model-only, ensemble model-only (also called open loop, OL), and DA.

AquaCrop is run at $1 / 112 °$ (∼ 900 m) resolution with crop-specific parameters for winter wheat in GDD mode, using the default conservative parameters offered by the AquaCrop database, and cultivar-specific parameters and fertility stress parameters (management) that are manually calibrated in Lanfranco (2025). The crop parameters are summarized in Appendix C. The calibration is done by minimizing the mean difference between simulated and observed yields from the Global Yield Gap Atlas (GYGA, 2021). This way, the sowing date is set at 20 October, CC_x=0.96 [m² m⁻²], and the soil fertility stress is 40 % on B, with a 5 % effect on CC_x. High-resolution spatially distributed soil texture information is taken from Geoportale Piemonte (2025 d) (1:50 000), and the topsoil texture is used for the entire profile. The latter source has been used to set the domain boundaries. The meteorological forcings are bilinearly interpolated from the forecasts of ERA5. A deterministic run with this setup for one simulation year over 19 301 crop grid cells takes about 6 min, when run on 12 cpus.

For the ensemble OL, 24 ensemble members are generated by perturbing the forcings similarly as in showcase 2, but unlike showcase 2, the soil moisture state variables are not perturbed, and the state variables CC and B are perturbed instead, as shown in Table 2. Our primary goal is to update vegetation next, and for simplicity soil moisture will be excluded from the control vector, because there is a feedback lag between soil and crop responses (see Sect. 4.4). In addition, each ensemble member j is assigned a different CC_x,j parameter (time-invariant), evenly spaced between 0.92 and 1 around the calibrated center value of 0.96 [m² m⁻²], and the sowing date is varied in response to the perturbed precipitation. The latter ensures enough forecast spread and weakens the model constraints, but none of the crop parameters is updated. Sowing can happen from 4 October onward when at least 15 mm rain falls in 4 d or less, and this must happen twice. The design of the ensembles has a strong impact on both the skill of the OL and DA, and is further discussed in Lanfranco (2025). An OL run for one simulation year over 19 301 crop grid cells takes about 30 min, when run on 36 cpus.
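The member-specific CC_x values and the rainfall-dependent sowing rule can be sketched as follows. The sketch rests on several assumptions that the text does not settle: the CC_x range includes both endpoints, the rainfall sum is taken over a trailing window of at most 4 days, and the two qualifying rain events may not share rainy days. Function names are illustrative.

```python
def ccx_members(n_members=24, low=0.92, high=1.0):
    """Evenly spaced CC_x values, one per ensemble member (endpoints included by assumption)."""
    step = (high - low) / (n_members - 1)
    return [low + k * step for k in range(n_members)]


def sowing_day(precip, window=4, threshold=15.0, occurrences=2):
    """Index of the day on which the rainfall criterion is met for the required number of times.

    precip: daily (perturbed) precipitation in mm, index 0 = first allowed day (4 October).
    Returns None if the criterion is never met often enough.
    """
    count = 0
    window_start = 0
    for day in range(len(precip)):
        lo = max(window_start, day - window + 1)
        if sum(precip[lo:day + 1]) >= threshold:
            count += 1
            if count == occurrences:
                return day
            window_start = day + 1  # assumed: the next event needs new rainfall
    return None
```

Because each member receives its own perturbed precipitation, the rule yields a member-specific sowing date, which is what spreads the early-season CC and B forecasts.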

For the DA, we assimilate 10 d $1 / 336 °$ (∼ 300 m) CGLS FCOVER observations into the $1 / 112 °$ (∼ 900 m) gridded ensemble AquaCrop simulations for the years 2017 through 2023. The FCOVER observations are masked using yearly high-resolution crop maps, i.e. the 10 m High Resolution Layer Croplands product of Copernicus for 2017 through 2020 (CLSM HRL Crops, 2025) and detailed parcel-level regional crop maps of the Piedmont region (1:2000) for 2021 through 2023 (Geoportale Piemonte, 2025 a, b, c). If at least 70 % of the area within a ∼ 300 m FCOVER pixel is covered with winter wheat according to the crop map, then the FCOVER observation is kept. These masked ∼ 300 m FCOVER observations are then averaged to ∼ 900 m for assimilation. The FCOVER observation error standard deviation is set to 0.087 [m² m⁻²], representing the mean error documented in the CGLS product across the wheat fields and time periods considered for assimilation. When FCOVER observations are available (every 10 d) at day i, the EnKF jointly updates the forecasted AquaCrop CC and B state variables for each ensemble member j:

$$
\widehat{\begin{bmatrix} CC \\ B \end{bmatrix}}_{j, i}^{+} = \widehat{\begin{bmatrix} CC \\ B \end{bmatrix}}_{j, i}^{-} + K_{i} \left(FCOVER - {\hat{CC}}^{-}\right)_{j, i} \tag{11}
$$

where ${\hat{[.]}}^{-}$ refers to the forecast or prior state estimate, and ${\hat{[.]}}^{+}$ to the analysis or posterior estimate. The difference $(FCOVER - {\hat{CC}}^{-})_{j, i}$ is called the “innovation” and is not a priori bias corrected, in line with earlier LAI DA studies (Albergel et al., 2020; Scherrer et al., 2023). The ensemble innovations are mapped to analysis increments for CC_j,i and B_j,i through the Kalman gain K_i. The K_i is derived from ensemble forecast error (co-)variance matrices and the set observation error variance. The increments are limited to what is physically possible for each ensemble member given its own parameter constraints. Specifically, ${\hat{CC}}_{j, i}^{+}$ cannot exceed $CC_{pot, sf, j, i}$ (which depends on the perturbed CC_x,j parameter). Note that soil moisture is also part of the state, but it is not perturbed or directly updated here. A DA run for one simulation year over 19 301 crop grid cells takes about 45 min, when run on 36 cpus.
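For a single grid cell and a single FCOVER observation, the update of Eq. (11) reduces to a few scalar operations, sketched below. The sketch assumes a linear observation operator that selects CC, sample (co-)variances with n − 1 in the denominator, optional observation perturbations, and simple clipping of CC to [0, CC_pot,sf,j] and of B to non-negative values. It is an illustration under these assumptions, not the LISF implementation.

```python
from statistics import mean


def enkf_update(cc, b, fcover, obs_std=0.087, cc_pot=None, rng=None):
    """Jointly update CC and B for all ensemble members from one FCOVER observation.

    cc, b: forecast CC and B per member.
    cc_pot: optional per-member upper bound on CC (defaults to 1).
    rng: optional random.Random instance used to perturb the observation per member.
    """
    n = len(cc)
    cc_mean, b_mean = mean(cc), mean(b)
    var_cc = sum((c - cc_mean) ** 2 for c in cc) / (n - 1)
    cov_b_cc = sum((x - b_mean) * (c - cc_mean) for x, c in zip(b, cc)) / (n - 1)
    denom = var_cc + obs_std ** 2
    k_cc, k_b = var_cc / denom, cov_b_cc / denom  # Kalman gain for CC and B

    cc_new, b_new = [], []
    for j in range(n):
        obs_j = fcover + (rng.gauss(0.0, obs_std) if rng else 0.0)
        innovation = obs_j - cc[j]
        upper = cc_pot[j] if cc_pot else 1.0
        cc_new.append(min(max(cc[j] + k_cc * innovation, 0.0), upper))
        b_new.append(max(b[j] + k_b * innovation, 0.0))
    return cc_new, b_new
```

The gain for B depends entirely on the forecast error covariance between B and CC, so an ensemble in which CC and B are weakly correlated passes little of the FCOVER information on to biomass.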

The deterministic, OL, and DA runs are evaluated for the times and grid cells with wheat during the years 2017–2023. For the OL and DA simulations, the ensemble mean output is evaluated. The 10 d averaged CC estimates are compared with the assimilated CGLS FCOVER to check internal consistency, and the 10 d averaged ΔB estimates are derived from the B time series and evaluated against CGLS DMP retrievals, both only during the growing season. Only positive ΔB values are retained. Furthermore, yield estimates are evaluated against in situ data of end-of-season yield, using regional survey data from RICA (2025) after aggregation to the municipality level. These yield data are based on individual fields and cover only a small fraction of all wheat fields that are modeled (and for which FCOVER data are extracted).

4 Results and Discussion

4.1 Showcase 1: Crop Parameterization

Figure 4 shows an example of AquaCrop CC and ΔB time series, obtained with a generic crop parameterization either in calendar or in GDD mode, along with independent CGLS reference data for a location in Eastern Europe. The simulations in calendar mode use uniform parameters, whereas those in GDD mode use parameters constrained by satellite information (VIIRS-GLSP). At this location, the 10 d averaged AquaCrop ΔB compares well with the CGLS DMP for both crop parameterizations, but the parameterization in GDD outperforms the parameterization in calendar days in terms of CC compared to CGLS FCOVER data: the seasonal variability of CC in GDD mode is more in line with the reference FCOVER data, mainly due to an improved timing of the crop stages. At this location, the time series correlation over the maximal growing period (defined in Sect. 3.1) increases from 0.51 to 0.69 for CC, and from 0.89 to 0.92 for ΔB. Since the CC_x parameter is not calibrated in either approach (Appendix B), there is a trivial absolute bias that is further amplified for the crop in GDD mode due to its more realistic, but lower, early- and late-season CC values. This bias also propagates into the root-mean-square difference (RMSD) metrics.

The four left panels of Fig. 5 show multi-year averaged CC and ΔB for both crop parameterizations, for the maximal growing period. The averaged CC is higher for the crop in calendar days (fixed growing season) than in GDD mode, and it has a latitudinal pattern that responds to the temperature pattern, discussed in Appendix C. The CC averages are much lower for the crop in GDD mode, because the growing season is generally shorter (Fig. C1). The four right panels of Fig. 5 show the time series correlation between the AquaCrop simulations and independent FCOVER and DMP satellite data for all cropland grid cells in Europe. The simulations with GDD crop parameterization improve CC over almost the entire European continent, as can be seen in Fig. 5d, except in Norway. In cold regions, the growing season length computed in GDD can vary substantially between years (Appendix B) and a median parameter estimate as input may lead to high errors. In addition, the VIIRS-GLSP product may experience errors due to early or late snow. A more realistic CC and onset of B accumulation improves ΔB for much, but not all of Europe (Fig. 5h), due to spatial variability in several factors – related to both the modeling framework and the reference data. A first factor is inaccurate simulation of transpiration, water stress, and other stresses. In this study, we applied vertically uniform soil profiles, and a spatially uniform parameterization of the generic crops (only the crop stages, CGC and CDC vary spatially), introducing spatial modeling errors. Second, the reference DMP dataset is retrieved using many simplifying assumptions, and it represents a mixture of different crop types after aggregation.

https://gmd.copernicus.org/articles/19/2551/2026/gmd-19-2551-2026-f04

Figure 4. Example time series of AquaCrop (a) CC and (b) ΔB for generic crops in calendar and GDD mode, together with reference CGLS FCOVER and DMP, respectively. The GDD mode uses satellite-based crop parameter constraints. The location is marked on Fig. 5a.


https://gmd.copernicus.org/articles/19/2551/2026/gmd-19-2551-2026-f05

Figure 5. (a, b) Maps of multi-year averaged AquaCrop CC for a generic crop (a) with uniform parameters in calendar day mode and (b) with satellite-based crop parameter constraints, in GDD mode. (e, f) same as (a, b) but for ΔB. (c, d) Performance of 10 d AquaCrop CC in terms of time series correlation (R) between simulations and CGLS FCOVER for a generic crop in (c) calendar mode, and (d) difference in R for simulations with crop parameters in GDD and calendar day mode. (g, h) same as (c, d), but for ΔB evaluated against CGLS DMP. All panels are computed across the maximal growing period (across both simulations and CGLS data, defined in Sect. 3.1) of 2015–2020. The center of the circle in panel (a) marks the location for Fig. 4.

4.2 Showcase 2: Ensemble Simulations

Figure 6a shows the 3-year average relative soil water content (RSW, defined in Sect. 3.2) and Fig. 6b the ensemble standard deviation (spread, uncertainty) in root-zone soil moisture. The ensemble spread in soil moisture is often higher in regions with low average RSW that indicate water-limited conditions (e.g. Spain). The associated spread in B is shown in Fig. 6c and computed at the end of the growing season when the ensemble mean B is maximal. More specifically, the ensemble standard deviation is computed for the percentage deviations from the end-of-season ensemble mean B for each year, and the average spread across the 3 years is shown.
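The relative B spread described above can be written compactly. The sketch assumes the sample standard deviation and takes, for each year, the end-of-season B of every ensemble member as input; the function name is illustrative.

```python
from statistics import mean, stdev


def relative_b_spread(b_end_by_year):
    """Multi-year average of the ensemble std of B, in % of the yearly ensemble mean.

    b_end_by_year: one list per year with the end-of-season B of each member.
    """
    yearly = []
    for members in b_end_by_year:
        b_mean = mean(members)
        deviations = [100.0 * (b - b_mean) / b_mean for b in members]
        yearly.append(stdev(deviations))
    return mean(yearly)
```

Expressing the spread relative to the ensemble mean removes the direct effect of the biomass level, so that regions with low and high productivity can be compared.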

Regions with a high average relative B spread are often, but not exclusively, associated with water-limited conditions (in the absence of simulated irrigation) and high soil moisture spread. AquaCrop’s stress functions translate soil moisture into nonlinear physiological responses. This implies that the relationship between soil moisture and B spread is nontrivial and varies across both time and space. Furthermore, the ensemble spread in B is directly influenced by the ensemble perturbation of radiation (and thus ET_o) and the magnitude of maximum biomass, which is highest at intermediate latitudes where the balance between temperature, radiation and water availability is optimal (Fig. 5e).

Figure 7a and b further illustrate the sensitivity of B to soil moisture for two locations with different water stress levels, but both on silt loam soils (identical wilting point and field capacity). Here, both the ensemble and temporal sensitivity of B to soil moisture can be seen. The site in Fig. 7a experiences minimal water stress, supports yearly high B production, and shows little ensemble sensitivity of B to soil moisture. In contrast, Fig. 7b depicts a more southern location where the relative ensemble B spread is higher, as more soil moisture ensemble members approach the wilting point more frequently. Under these conditions, interannual variations in soil moisture also directly translate into interannual B variations.

https://gmd.copernicus.org/articles/19/2551/2026/gmd-19-2551-2026-f06

Figure 6. Multi-year averages of (a) relative root-zone soil water content (RSW), (b) ensemble spread in root-zone soil moisture (SM), and (c) ensemble spread in end-of-season (max) B, for 2015 through 2017. The centers of the circles in panel (b) mark the locations for Fig. 7.

https://gmd.copernicus.org/articles/19/2551/2026/gmd-19-2551-2026-f07

Figure 7. Time series of (line) ensemble mean root-zone soil moisture (SM) and biomass, along with (shading) their ensemble spread, for a location (a) without water limitations and (b) with water limitations, both marked on Fig. 6b. The lines for FC and WP refer to the location's water contents at field capacity and wilting point.


4.3 Showcase 3: Satellite-based Data Assimilation

The average 2017–2023 yield values observed for winter wheat per municipality in the Piedmont area are shown in Fig. 8a, along with the number of samples for each value. The samples come from different fields (locations and sizes not disclosed) within the municipality each year, and are not collected every year (Fig. 8c). The goal of this showcase is to steer AquaCrop yield simulations to the reference yield values by assimilating completely independent FCOVER satellite data, when and where wheat is present. The number of assimilated FCOVER observations per grid cell is shown in Fig. 8b. Since the availability of satellite observations is rather uniform, the number of assimilated observations here reflects how often a ∼ 900 m grid cell contains wheat fields during the 7-year study period.

https://gmd.copernicus.org/articles/19/2551/2026/gmd-19-2551-2026-f08

Figure 8. (a) Average yield reference values for winter wheat per municipality, for 2017–2023. The dot size represents the number of samples in time and space within one municipality. (b) Total number of assimilated FCOVER retrievals for each ∼ 900 m model grid cell. The central outline in both plots delineates the Piedmont study area. Panel (a) also shows municipalities, and the blue arrow marks the municipality overlying the grid cell in Fig. 9, and panel (b) shows the borders of Italy. (c) Sketch of the various spatial supports of gridded modeling, satellite observations, and reference yield data at the municipality level. The green polygons are wheat fields according to crop maps. The number of fields contributing to each yearly yield estimate is known, but locations are not.

Figure 9a shows time series of CC for the deterministic, ensemble OL, and DA simulation, along with the assimilated FCOVER observations, for a single model grid cell within the municipality of Tortona, Province of Alessandria (marked on Fig. 8). Note that the location of the wheat fields within this grid cell differs every year and the FCOVER data are thus extracted over different parts of the model grid cell. For the deterministic run, the sowing date is fixed on 20 October, and the yearly maximal CC is close to CC_x,sf, i.e. CC_x (0.96 m² m⁻²) modulated by a fertility stress effect of 5 %, with little interannual variability. This means that there is little variation in water stress across the years (winter-spring) at this location. The growing season length varies with the number of heat units (GDD). The OL CC simulations are often lower than the deterministic ones, because (i) the dynamic sowing criterion delays the start of the growing season compared to the deterministic run, which assumes a (too early, homogeneous) fixed sowing date on 20 October, and (ii) the perturbations to CC cannot exceed CC_pot,sf, leading to a perturbation bias. (In some locations, the OL can advance the sowing date, which leads to a slightly higher CC at the beginning of the growing season.) Both the deterministic and OL simulations often track the independent FCOVER observations already very well. At this location, the model overestimates CC in 2020, 2021 and 2022. The DA pulls the simulations towards the FCOVER observations, as intended. However, from senescence onward, the CC estimates no longer depend on values of the previous time step (Appendix A), and DA updates to CC no longer propagate in time.

The ΔB estimates are derived from daily differences of updated B time series and are shown for the various simulations in Fig. 9b. Again, the precipitation-dependent sowing criterion in the OL corrects the early erroneous ΔB that are simulated by the deterministic run with a fixed sowing date at this location. In the DA, the ΔB nicely responds to CC updates, e.g. by pushing the high productivity period to a later time in 2020. In doing so, the DA output corresponds better with the independent DMP reference data (not assimilated).

Figure 9c shows the simulated yield time series for the single example model grid cell, and the observed values for the entire overlying municipality. The deterministic simulation shows very little variation across the years (discussed below) and the OL yield is consistently lower than the deterministic simulation at this location. FCOVER DA only has a small impact on yield estimates, and the variation introduced through DA does not better align with that of the reference yield data. However, the reference yield data pertain to varying fields over the years, and can thus not be directly compared to the yield simulations nor to the assimilated FCOVER observations for a single model grid cell in Fig. 9a (to do so, multiple model grid cells need to be aggregated).

https://gmd.copernicus.org/articles/19/2551/2026/gmd-19-2551-2026-f09

Figure 9. Time series of (a) CC, (b) ΔB, (c) yield and (d) root-zone soil moisture (SM, smoothed with a 7 d window), for a grid cell within the municipality of Tortona, Province of Alessandria (centered at $44 ° 54^{'} {16.6}^{''}$ N, $8 ° 47^{'} {39.8}^{''}$ E). Note that the wheat fields within this grid cell rotate every year. The lines refer to deterministic (Det), ensemble mean OL and DA simulations, with the shading reflecting the OL and DA ensemble spread. The dots refer to (a) assimilated 10 d FCOVER retrievals extracted over the wheat field(s) of each year, (b) independent 10 d DMP retrievals, and (c) in situ yield data for the entire municipality (marked by a blue arrow in Fig. 8a, covering many other model pixels in addition to the single one selected to plot the model results).


A summary of the spatiotemporal performance metrics for the three simulations is given in Table 3. For CC, the ensemble OL outperforms the deterministic run in terms of correlation (R), RMSD and bias, across all grid cells and times with wheat. This is because the precipitation-dependent varying sowing date is likely more in line with field practices. By design, FCOVER DA further improves CC in all metrics. Following CC, the ensemble OL and DA improve ΔB over the deterministic run, with the R increasing from 0.52 (and 0.64) for the deterministic (and OL) output to 0.75 for the DA output. For the yield, the model evaluation is done after spatial aggregation to the municipality level. The deterministic AquaCrop yield estimates only deviate from the in situ observations with a small bias of −0.08 t ha⁻¹, which is a natural result of the calibration against the GYGA data. However, the simulated variation in time and space is very low, leading to a very low R of 0.09 for the deterministic run. This is even further deteriorated in the OL simulations (R=0.07) and the DA can only recover some variation (R=0.12), but at the expense of more bias (−0.78 t ha⁻¹).
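The three skill metrics can be computed as sketched below for paired simulated and reference values. The sign convention for the bias (simulation minus reference) is an assumption, chosen so that a negative bias means the simulation is lower than the reference; the function name is illustrative.

```python
import math


def skill_metrics(sim, ref):
    """Pearson correlation R, RMSD and bias (sim minus ref) for paired values."""
    n = len(sim)
    sim_mean, ref_mean = sum(sim) / n, sum(ref) / n
    cov = sum((s - sim_mean) * (o - ref_mean) for s, o in zip(sim, ref))
    ss_sim = sum((s - sim_mean) ** 2 for s in sim)
    ss_ref = sum((o - ref_mean) ** 2 for o in ref)
    r = cov / math.sqrt(ss_sim * ss_ref)
    rmsd = math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, ref)) / n)
    bias = sim_mean - ref_mean
    return r, rmsd, bias
```

Since R is insensitive to the mean offset whereas RMSD includes it, a simulation can match the reference on average (small bias) and still have a low R when its variation in time and space is weak, as for the yield estimates discussed above.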

Table 3. Spatiotemporal performance of the deterministic (Det), OL and DA simulations in terms of R, RMSD and bias, for CC and ΔB against satellite data, and end-of-season yield against in situ data, across all grid cells (or municipalities for yield) and times with winter wheat during 2017–2023.


Figure 10a shows that the simulated variation in the deterministic yield (spatially aggregated over the municipality) across all years and municipalities is low and does not agree with the reference data (R=0.09). The OL in Fig. 10b has a stronger negative bias (−0.43 t ha⁻¹). DA introduces variability in Fig. 10c, leading to a slight improvement in the spatiotemporal yield pattern (R=0.12). However, the RMSD and bias increase, because the DA increments are biased negative in much (not all, see Lanfranco, 2025) of the study domain, due to model constraints: CC_i cannot be updated above the CC_pot,sf,i parameter (Sect. 2.2, Appendix A), which is set by uniform crop and fertility parameters, and the yield range is limited by the CC_pot,sf,i and HI_o parameters.

The very low spatiotemporal variation in simulated yield estimates is thus due to parameter constraints, and also a too low sensitivity of yield to environmental conditions (weather, soil). There might also be a model coupling bias (Crow et al., 2024) due to crop (or other) parameter choices that prevent an adequate propagation of CC and B updates to yield updates. Furthermore, the FCOVER observations might not be sufficiently informative about wheat yield, due to some mixing of other crops in the 300 m data, retrieval assumptions, or because FCOVER retrievals cannot capture some stresses (e.g. stomatal closure due to water stress). Finally, the yield reference data are not representative of the entire municipality: the dot sizes of the scatter plots in Fig. 10 highlight that many yield data points are based on just one or a few fields per year in the entire municipality (Fig. 8c).

https://gmd.copernicus.org/articles/19/2551/2026/gmd-19-2551-2026-f10

Figure 10. Spatiotemporal performance of the deterministic, OL, and DA simulations for end-of-season yield against in situ data, across all municipalities and years with winter wheat in 2017–2023. The dot size refers to the number of individual fields that contribute to the municipality average, for each observed yearly yield estimate.


4.4 Pathways Forward

The AquaCrop model alone already offers good simulations of seasonal crop development for generic (showcase 1) and specific (showcase 3) crops, at both coarse and fine resolutions, when adequate crop parameters are given. However, when parameters related to the sowing date, crop stages, fertility, and maximum CC_x are fixed, the interannual and spatial variation in crop simulations is limited. Satellite observations can be used to determine spatially varying crop parameters a priori (showcase 1) or to sequentially update the in-season crop state (showcase 3) and parameters, and thereby improve spatiotemporal patterns in crop simulations. However, future research is needed to optimize crop modeling and DA.

First, showcase 3 shows that crop state updates through satellite DA are beneficial but limited by parameter constraints. For example, crop growth can be advanced or delayed through DA, but the dates to switch to another crop stage remain unaltered, and the updated CC cannot exceed CC_pot,sf. Future work should focus on adapting the timing of the stages to updated CC levels, i.e. to make the crop stages truly state-dependent, rather than precomputing them based on knowledge of the temperature record. Another option is to update (a subset of) the crop and fertility parameters along with the state during the DA: more variation in parameters could result in improved spatiotemporal variation of crop simulations, and could support more state update flexibility. Furthermore, parameter updating could possibly improve the model coupling, i.e. the propagation of updates from observable model variables (e.g. CC, θ₁, …) to (unobserved) variables, such as yield. To keep each model trajectory self-consistent in the presence of the strong model structural constraints set by AquaCrop, particle filtering might be recommended over EnKF for joint parameter and state updating. This option is available in LISF, but not yet tested with AquaCrop.
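The effect of the upper bound on the state update can be illustrated with a scalar, perturbed-observation EnKF update of CC that is clipped to a cap playing the role of CC_pot,sf. This is a minimal sketch under simplifying assumptions (a single state variable, an identity observation operator, hypothetical numbers); the function `enkf_update_bounded` is introduced here for illustration and is not the LISF implementation.

```python
import random
import statistics


def enkf_update_bounded(cc_forecast, obs, obs_err_std, cc_upper, rng):
    """Scalar perturbed-observation EnKF update, clipped to [0, cc_upper].

    Assumptions: CC is the only state variable and is observed directly.
    """
    forecast_var = statistics.variance(cc_forecast)
    gain = forecast_var / (forecast_var + obs_err_std ** 2)

    analysis = []
    for cc in cc_forecast:
        # Each member assimilates its own perturbed copy of the observation
        perturbed_obs = obs + rng.gauss(0.0, obs_err_std)
        updated = cc + gain * (perturbed_obs - cc)
        # Model constraint: the updated CC cannot exceed the cap
        analysis.append(min(max(updated, 0.0), cc_upper))
    return analysis


rng = random.Random(42)
cc_forecast = [0.55, 0.60, 0.62, 0.58, 0.65]  # hypothetical ensemble forecast
cc_analysis = enkf_update_bounded(
    cc_forecast, obs=0.80, obs_err_std=0.05, cc_upper=0.70, rng=rng
)
print("forecast mean:", statistics.mean(cc_forecast))
print("analysis mean:", statistics.mean(cc_analysis))
```

When the observation lies above the cap, as in this hypothetical case, members that are pushed upward are truncated at the cap, so the analysis mean can stay below what an unconstrained update would give.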

Second, a crucial aspect of satellite DA is the characterization of forecast and observation errors. In showcases 2 and 3, we introduced perturbations to create ensemble forecasts, guided by iterative study, expert insights into the AquaCrop model, and land surface data assimilation studies (Kumar et al., 2008; Heyvaert et al., 2023). However, the forecast perturbations and observation errors should be further optimized (e.g. varying in space and time) to enhance the effectiveness of DA. Lanfranco (2025) shows how removing perturbation bias correction and assigning distinct parameters to individual ensemble members, as in showcase 3, causes ensemble OL simulations to deviate more from the deterministic ones. Furthermore, given the bounded nature of key variables such as CC, an analysis of ensemble medians rather than means as simulation output can be considered.
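The difference between ensemble means and medians for a bounded variable can be seen in a small, hypothetical example in which most members sit at an upper bound of CC and one member is stressed. The numbers and the helper `summarize_ensemble` are illustrative assumptions, not output of the showcases.

```python
import statistics


def summarize_ensemble(members):
    """Return (mean, median) of an ensemble of a bounded variable."""
    return statistics.mean(members), statistics.median(members)


# Four members at a hypothetical upper bound of 0.85, one stressed member
cc_members = [0.85, 0.85, 0.85, 0.85, 0.45]
ensemble_mean, ensemble_median = summarize_ensemble(cc_members)
print("mean:", ensemble_mean, "median:", ensemble_median)
```

The mean is pulled away from the bound by the single low member, whereas the median stays at the value shared by most members.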

Third, showcase 3 only updates crop state variables directly, and soil moisture follows through model propagation (Fig. 9). Soil moisture is excluded from the update vector due to the lag between soil and crop responses. However, soil moisture updating can correct for water stresses and biomass will respond to this (as suggested by showcase 2). A joint update of soil moisture and crop variables through multivariate and multi-sensor DA (Heyvaert et al., 2024) might further improve estimates of the entire crop-water system. Additionally, satellite DA can aid in estimating other variables, such as irrigation (Massari et al., 2021; Busschaert et al., 2024; Corbari et al., 2025).

Fourth, our study uses satellite data, whereas most previous studies have relied on field observations to improve crop simulations. The satellite retrievals may not accurately capture the spatiotemporal variation in crop conditions, due to e.g. assumptions in the retrieval algorithm, crop classification errors, or insensitivity to some stresses (e.g. stomatal closure due to water stress). Future work can explore other or higher resolution data, e.g. from Sentinel-2.

Finally, the evaluation data are prone to errors. As noted, satellite retrievals are not perfect, but neither are yield or other in-field reference data. Figures 8 and 10 highlight that some yield data are based on very few samples from varying fields of different sizes, and these might not be representative for grid cell estimates that are aggregated to the municipality level. A systematic collection and distribution of field data would help the development and tuning of future crop DA systems.

5 Conclusions

Crop modeling and data assimilation can be boosted by exploiting the increasing availability of compute power and satellite data. By incorporating AquaCrop v7.2 into the NASA LISF v7.5, it is possible to run AquaCrop at any spatial resolution, with a range of different input sources for meteorology, soil and crop parameters. Furthermore, the system facilitates producing ensembles to obtain uncertainty estimates of crop variables, and it gives access to tools for satellite data assimilation. This opens unprecedented opportunities for regional biomass, yield and irrigation estimation. However, crop models are often structurally less flexible than the land surface models that are typically used within the LISF, and therefore a community-based effort is needed to advance crop data assimilation.

In this paper, we present three exploratory applications of AquaCrop v7.2 within LISF v7.5, employing different study domains, spatial resolutions, perturbation settings and other input. First, the potential of using satellite information to constrain spatially distributed generic crop parameters in GDD mode is illustrated for coarse-scale simulations of canopy cover and biomass over Europe. Compared to a uniform calendar-day parameterization, spatially variable GDD growth stage parameters consistently improve simulations of canopy cover for a generic crop over all of Europe, and biomass simulations improve for much of the southern and eastern parts of Europe.

Second, ensemble simulations are created by perturbing meteorological input and soil moisture state variables for coarse-scale simulations over Europe, using the satellite-constrained generic crop parameterization in GDD. These ensembles confirm that biomass uncertainty is most sensitive to uncertainties in root-zone soil moisture in water-limited regions. This is in line with the expectation (by design) that for these regions, the interannual variation in soil moisture determines the temporal variation in biomass estimates. These findings make it possible to anticipate what soil moisture data assimilation may (or may not) offer to improve biomass in upcoming studies.

Third, high-resolution satellite-based FCOVER data assimilation is performed over the Piedmont region in Italy, focusing on winter wheat fields. The goal is to improve spatiotemporal estimates of canopy cover, and the unobserved biomass and yield. Through FCOVER assimilation, the canopy cover is improved by design, and the intermediary biomass is also improved. FCOVER assimilation also slightly improves the yield estimates, but the impact is very small. The model on its own already performs well in terms of mean absolute yield values (root mean square difference RMSD < 1.5 t ha⁻¹), but performs poorly in spatiotemporal variability (correlation R=0.07 for ensemble open loop). Data assimilation cannot yet substantially improve the yield estimates (R=0.12, RMSD=1.67 t ha⁻¹) because of strong model constraints related to the timing of the crop stages, the maximum canopy cover, fertility and harvest index as parameters. The assimilated satellite-based canopy cover observations are likely also not sufficiently informative of yield. Furthermore, the yield reference data pertain to (at most a few) individual fields which vary in time and space and are hard to directly compare to gridded model or data assimilation estimates. More assimilation impact can be expected by making the model crop stages state-dependent, by joint state and parameter updating, or by assimilating higher resolution satellite data.

Despite their continental coverage (showcases 1 and 2), or high resolution (showcase 3), all showcase experiments are completed in a few hours of walltime, when parallelized on a high-performance Linux cluster. The computational efficiency of NASA LISF and the open-source AquaCrop model are encouraging to advance regional-scale and high-resolution crop modeling and data assimilation. This efficiency enables future multi-sensor and multi-variate assimilation to effectively constrain the soil water, fertility, and vegetation components of crop models.

Appendix A: AquaCrop Canopy Development

The canopy development in AquaCrop is determined by piecewise functions, described below. The same functions are used to calculate the potential CC_pot,i, the potential CC_pot,sf,i with fertility stress only (no water stress), and the actual CC_i [m² m⁻²] which accounts for all stresses. However, CC_pot,i is computed with time-invariant parameters α_cc = [CC_o, CC_x, CGC, CDC, times to crop stages], whereas the potential CC_pot,sf,i uses time-variant parameters α_cc,i(CC_i−1, θ_i) that are adjusted for fertility stress only, and the actual CC_i uses time-variant parameters α_cc,i(CC_i−1, θ_i) that are adjusted for water, fertility and salinity stresses and CC_i−1 until senescence. If fertility stress is set, then CC_pot,i without any water stress is effectively constrained to CC_pot,sf,i, which is determined by (among others) the time-variant CC_x,sf,i parameter. Specifically, fertility stress is used (i) to rescale (increase or decrease) the maximum attainable CC_x parameter to CC_x,sf,i, (ii) to determine the decrease in CC_x,sf,i during the mid-season stage, and (iii) to adjust WP_i (Eq. 5).

The stage-dependent piecewise function for actual CC_i (determined by e.g. CC_x,i ∈ α_cc,i) is defined as follows:

$$
CC_{i} = f_{cc}\left(i, \alpha_{cc,i}(CC_{i-1}, \theta_{i})\right) =
\begin{cases}
CC_{o,i}\, e^{CGC_{i}\, \Delta t_{0 \to i}} & CC_{i} \leq CC_{x,i}/2 \text{ and } t_{0} \leq i \\
CC_{x,i} - 0.25\, \dfrac{CC_{x,i}^{2}}{CC_{o,i}}\, e^{-CGC_{i}\, \Delta t_{0 \to i}} & CC_{x,i}/2 < CC_{i} \leq CC_{x,i} \\
CC_{x,i} & t_{ccx} \leq i < t_{se} \\
CC_{x,i} \left[1 - 0.05 \left(e^{\frac{3.33\, CDC_{i}}{CC_{x,i} + 2.29}\, \Delta t_{se \to i}} - 1\right)\right] & t_{se} \leq i
\end{cases}
\tag{A1}
$$

with t₀ indicating the day of crop emergence (defined as input parameter ∈ α_cc), t_ccx representing the first day on which CC_x,i is achieved (diagnosed during the simulation), and t_se the day when senescence sets in (defined as input parameter ∈ α_cc). The expression Δt_·→i refers to the number of days from a specific crop stage to the current time i.

The first two function pieces describe the crop growth from emergence up to maximum canopy, here marked as CC_x,i. However, depending on whether we compute CC_pot,i, CC_pot,sf,i, or actual CC_i, the maximum canopy refers to either the given parameter CC_x, its equivalent CC_x,sf,i modulated by fertility stress alone, or its equivalent modulated by water, fertility and salinity stresses CC_x,i. The same holds for the CGC_i (and later the CDC_i) parameter. The third piece covers the period between reaching CC_x,i until the onset of canopy senescence. The fourth piece represents the canopy decline after senescence until maturity and the end of the season.

For the first three pieces, the parameters α_cc,i are scaled by CC_i−1, so that CC_i is effectively based on memory of the previous vegetation state. However, for the last piece, the parameters are only rescaled by CC_se, i.e. CC_i at the time of senescence, and CC_i forecasts no longer depend on CC_i−1 values, soil moisture or fertility conditions. Again, for CC_pot,i, the canopy development is described by the same functions as Eq. (A1), but the parameters α_cc are independent of water or fertility conditions, i.e. CC_pot,i = f_cc(i, α_cc) (Eq. 2).
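For the potential canopy cover with time-invariant parameters, the four pieces of Eq. (A1) can be sketched as a single function of time. This is a simplified illustration, not the AquaCrop source code: the function name and arguments are hypothetical, t_ccx is not diagnosed explicitly (the growth curve simply saturates towards CC_x), and the canopy is assumed to have effectively reached CC_x before senescence. Time can be in days or GDD, as long as CGC and CDC use the matching unit.

```python
import math


def canopy_cover_potential(t, cc_o, cc_x, cgc, cdc, t_emergence, t_senescence):
    """Potential canopy cover [m2 m-2] at time t, with fixed parameters."""
    if t < t_emergence:
        return 0.0

    if t < t_senescence:
        dt = t - t_emergence
        # Piece 1: exponential growth up to half of the maximum canopy
        cc = cc_o * math.exp(cgc * dt)
        if cc > cc_x / 2.0:
            # Piece 2: negative exponent, so the curve is continuous at
            # cc_x / 2 and saturates towards cc_x
            cc = cc_x - 0.25 * cc_x ** 2 / cc_o * math.exp(-cgc * dt)
        # Piece 3: canopy stays at (or just below) cc_x until senescence
        return min(cc, cc_x)

    # Piece 4: canopy decline after the onset of senescence
    dt = t - t_senescence
    decline = 0.05 * (math.exp(3.33 * cdc / (cc_x + 2.29) * dt) - 1.0)
    return max(cc_x * (1.0 - decline), 0.0)


# Hypothetical parameters in GDD mode (not calibrated values)
params = dict(cc_o=0.1, cc_x=0.85, cgc=0.005, cdc=0.004,
              t_emergence=100.0, t_senescence=1500.0)
for t in (50.0, 100.0, 400.0, 1000.0, 1500.0, 2000.0):
    print(t, round(canopy_cover_potential(t, **params), 3))
```

The actual CC_i would additionally require the stress-adjusted, time-variant parameters α_cc,i, which are beyond this sketch.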

Appendix B: Crop Parameterization Using GLSP Satellite Data

Showcase 1 uses satellite data to estimate crop parameters related to crop stages, growth and decline. Only one crop season per calendar year is considered. The first full crop season within the calendar year is extracted from the GLSP product (Zhang et al., 2022) for each year in the period 2013–2023. The GLSP dataset provides estimates of crop stages (A–F in Fig. 3) in day of the year (DOY) that need to be converted to growing degree days (GDDs), if AquaCrop is run in GDD mode.

Due to a lack of information on sowing or planting, the simulations are initialized on the first of January each year, and the GDD value for each crop stage is computed relative to this start date for simplicity. The time in GDD at CC_o is used to define the time from transplanting to recovered transplant, where the CC_o immediately starts at a value of 0.1 [m² m⁻²] as a recovered transplant. The GDDs [°C d] for each crop stage are calculated following the default AquaCrop method (Raes et al., 2026) by cumulating only positive residual average temperatures after subtracting a crop-specific base temperature T_base [°C], below which crop development does not occur, i.e.

$$
GDD = T_{avg} - T_{base}
\tag{B1}
$$

The average day temperature T_avg is the average of that day's maximum and minimum temperature, after limiting the maximum temperature to the range set by T_base and T_upper. The latter is the upper temperature at which crop development no longer occurs and it is set to an estimated value of 30 °C. T_base is set to a universal value of 5 °C, in line with e.g. Zaks et al. (2007).

Specifically, the DOY values are extracted from the GLSP dataset for each 0.05° grid cell and for the different phenology stages (A, B, C, D, E) shown in Fig. 3. The daily maximum and minimum 2 m air temperature from ERA5 is then used to calculate the cumulative GDD for each AquaCrop crop stage (I to V, see Table 1), starting from the day of transplanting. This is done for each individual year, with the ERA5 meteorology bilinearly interpolated to the 0.05° resolution. The median GDD value across the 11 years is then taken to determine the crop stages I through V as parameters in the crop file.
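The conversion from DOY to GDD stages described above can be sketched as follows for a single grid cell. The function names and the input layout (one list of daily (T_max, T_min) pairs per year, starting on 1 January) are assumptions for illustration. The sketch follows the description in the text by limiting only the maximum temperature and leaving the minimum temperature unadjusted, which may differ in detail from the AquaCrop implementation.

```python
import statistics


def daily_gdd(t_max, t_min, t_base=5.0, t_upper=30.0):
    """Daily GDD [degC d]: positive part of T_avg - T_base (Eq. B1)."""
    t_max_limited = min(max(t_max, t_base), t_upper)
    t_avg = 0.5 * (t_max_limited + t_min)
    return max(t_avg - t_base, 0.0)


def cumulative_gdd(daily_temps, t_base=5.0, t_upper=30.0):
    """Running GDD sum for a list of (t_max, t_min) pairs from 1 January."""
    total = 0.0
    cumulative = []
    for t_max, t_min in daily_temps:
        total += daily_gdd(t_max, t_min, t_base, t_upper)
        cumulative.append(total)
    return cumulative


def median_stage_gdd(temps_by_year, stage_doy_by_year):
    """Median over years of the cumulative GDD reached at the stage DOY."""
    values = []
    for year, doy in stage_doy_by_year.items():
        cumulative = cumulative_gdd(temps_by_year[year])
        values.append(cumulative[doy - 1])  # DOY 1 is the first list entry
    return statistics.median(values)


# Hypothetical three-day records for three years (not ERA5 data)
temps_by_year = {
    2013: [(20.0, 10.0)] * 3,
    2014: [(15.0, 5.0)] * 3,
    2015: [(25.0, 15.0)] * 3,
}
stage_doy_by_year = {2013: 2, 2014: 2, 2015: 2}
print(median_stage_gdd(temps_by_year, stage_doy_by_year))
```

Taking the median rather than the mean across years limits the influence of individual years with unusual phenology or temperature records.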

Next, the CGC and CDC parameters are estimated by inverting the exponential growth function of the first piece of f_cc(.) (Eq. A1), i.e. from emergence to CC_x/2, and the exponential decay function of f_cc(.) for the period between onset of senescence (point D in Fig. 3) and 50 % senescence (point E in Fig. 3). It is assumed that the potential CC_o=0.1 [m² m⁻²] and the maximal attainable CC_x=0.85 [m² m⁻²] are given, and that the observed actual timing of the median GLSP stages coincides with the timing of the unobserved potential timing of the required AquaCrop stages (Fig. 3). The potential CC_o and CC_x could be further calibrated in the future. CGC [°C⁻¹ d⁻¹] is found using the AquaCrop function for CC_i ≤ CC_x/2, with Δt_0→i the time between CC_o (at time A in Fig. 3) and CC_i = CC_x/2 (at time B in Fig. 3) in GDDs:

$$
CC_{i} = CC_{o}\, e^{CGC\, \Delta t_{0 \to i}} \;\to\; CGC = \frac{\ln \frac{CC_{x}/2}{CC_{o}}}{\Delta t_{A \to B}}
\tag{B2}
$$

where the time (in GDDs) to reach CC_x/2 is taken from the GLSP dataset (GDDs between point A and B, Δt_A→B).
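As a worked example of Eq. (B2), the assumed CC_o = 0.1 and CC_x = 0.85 can be combined with a purely hypothetical Δt_A→B = 500 °C d (an illustrative number, not a value from this study):

```latex
\begin{aligned}
CGC &= \frac{\ln\left(\dfrac{0.85/2}{0.1}\right)}{\Delta t_{A \to B}}
     = \frac{\ln 4.25}{500}
     \approx \frac{1.447}{500}
     \approx 2.9 \times 10^{-3}\ {}^{\circ}\mathrm{C}^{-1}\,\mathrm{d}^{-1}
\end{aligned}
```

Since the numerator is fixed by the choice of CC_o and CC_x, the spatial variation in CGC follows entirely from the satellite-based Δt_A→B.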

Similarly, for CDC [°C⁻¹ d⁻¹], the canopy decline function is used, with Δt_se→i the time between start of senescence (CC_x at time D in Fig. 3) and CC_i = CC_x/2 (at time E in Fig. 3) in GDDs:

$$
\begin{aligned}
CC_{i} &= CC_{x} \left[1 - 0.05 \left(e^{\frac{3.33\, CDC}{CC_{x} + 2.29}\, \Delta t_{se \to i}} - 1\right)\right] \\
\to\; CDC &= \frac{2.40\, (CC_{x} + 2.29)}{3.33\, \Delta t_{D \to E}}
\end{aligned}
\tag{B3}
$$

where the time in GDDs from the start of senescence (CC_x) to reach 50 % of canopy decline (CC_x/2) is used, indicated by the time in GDDs between point D and E (Δt_D→E).
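The factor 2.40 in Eq. (B3) follows from setting CC_i = CC_x/2 in the decline function: this requires 0.05 (e^x − 1) = 0.5, hence e^x = 11 and x = ln 11 ≈ 2.40. The sketch below inverts the decline function in this way and checks the result by evaluating the decline function at Δt_D→E. The function names and the value of Δt_D→E are hypothetical.

```python
import math


def cdc_from_half_decline(cc_x, dt_d_to_e):
    """CDC for which the canopy drops from cc_x to cc_x / 2 in dt_d_to_e."""
    return math.log(11.0) * (cc_x + 2.29) / (3.33 * dt_d_to_e)


def canopy_decline(cc_x, cdc, dt):
    """Canopy cover a time dt after the onset of senescence."""
    exponent = 3.33 * cdc / (cc_x + 2.29) * dt
    return cc_x * (1.0 - 0.05 * (math.exp(exponent) - 1.0))


cc_x = 0.85        # assumed maximum canopy cover [m2 m-2]
dt_d_to_e = 400.0  # hypothetical GDDs between points D and E [degC d]
cdc = cdc_from_half_decline(cc_x, dt_d_to_e)

print("CDC:", cdc)
print("CC at point E:", canopy_decline(cc_x, cdc, dt_d_to_e))
```

The second printed value should equal CC_x/2 up to rounding, confirming that the derived CDC reproduces the 50 % decline at point E.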

Finally, the time to maximum rooting depth is set to 0.7⋅D based on an expert guess. This is based on the assumption that (i) this stage will not be reached before mid-season or the time of CC_x, (ii) it should occur before senescence, and (iii) roots grow about 1 cm d⁻¹ in warm soils to a general maximum rooting depth of about 1 m (Allen et al., 1998).

Note again that AquaCrop requires estimates of the crop stages for the potential, i.e. unstressed, CC_pot,i curve. Here, it is assumed that these potential stages have the same timing as the actual stages observed by the satellite data. The CC_pot,i curve is defined by the slopes of CGC and CDC, derived above, and the lower and upper thresholds of CC_o and CC_x, which are guessed. As the CGC and CDC are derived from satellite-based and thermal data, stresses that could affect growth speed or senescence are inevitably included in the parameter estimates. By taking the median value for the GDD stages, from which CDC and CGC are derived, it is assumed that the most representative stages are used for each grid cell. However, the values for CC_o and CC_x are guessed and spatially constant in this study. Better estimates could potentially be derived from the VIIRS-GLSP dataset: CC_o could be estimated by taking the maximum CC_i over all years at the DOY of crop stage I, and CC_x can be approximated by taking the maximum CC_i over the entire growing season. However, these estimates would be strongly affected by stresses, including fertility stress, and future research should attempt to optimize effective potential (unstressed) parameter estimates.

Appendix C: AquaCrop Crop Parameters

Crop parameters are an essential input to AquaCrop. Showcase 1 compares two generic crops, one parameterized with spatially uniform stages in calendar days, and one parameterized with spatially varying satellite-based stages in GDD mode. Showcase 2 continues with the generic crop in GDD mode. The generic crops do not produce yield and are not calibrated. Showcase 3 uses a specific winter wheat crop in GDD mode. This crop is calibrated for the Piedmont area in Italy, using yield reference data, while constrained by ERA5 forcings, and HWSD soil texture mapped to soil hydraulic parameters using the internal AquaCrop v7.2 lookup table. The crop parameters are summarized in Table C1.

The effective growing season length for the two generic crops is compared in Figure C1, with the growing season defined from the first time CC exceeds the initial CC_o until CC drops to 0 m² m⁻². Even if the crop in calendar days is intentionally parameterized to potentially span the entire year (de Roos et al., 2021), cold stresses typically prevent growth in the first month(s), effectively limiting the growing season length. The date of senescence is fixed. This results in a latitude-dependent growing season length, both in terms of days and cumulated GDD (Fig. C1a, c). This pattern is not seen for the generic crop in GDD mode, which uses satellite-based phenology to parameterize the stages. For the crop in GDD mode, the relative coefficient of variation is higher, i.e. the growing season length is shorter, and the variation across Europe is similar to that for the crop in calendar day mode (Fig. C1b, d).

Table C1. AquaCrop crop parameters for (i) the generic crop in calendar days (showcase 1), (ii) generic crop with satellite-based constraints in GDDs (showcases 1 and 2), and (iii) winter wheat in the Piedmont area (showcase 3). * refers to the VIIRS-GLSP crop stages in GDD, as in Table 1.

¹ Some entries are still listed in the crop file, but not used in AquaCrop v7.2 and therefore set to −9 in all cases. ² More generally, yield formation period (identical to growing cycle for vegetative plants). ³ The impact or effect directly applies to biomass, not directly to HI (but indirectly it does).


https://gmd.copernicus.org/articles/19/2551/2026/gmd-19-2551-2026-f11

Figure C1. Multi-year (2015–2020, as in showcase 1) averaged growing season expressed in (a, b) calendar days and (c, d) GDD, for generic crops parameterized with (left) spatially uniform stages in calendar day mode, and (right) satellite-based stages in GDD mode.

Code and data availability

The AquaCrop code is available at https://github.com/KUL-RSDA/AquaCrop (last access: 21 March 2026) under a BSD-3 license. Version 7.2 of the model, as used for this paper, is archived on Zenodo at https://doi.org/10.5281/zenodo.17140665 (Busschaert et al., 2025a). The implementation within NASA's LIS v7.5 is available at https://github.com/NASA-LIS/LISF (last access: 21 March 2026) under an Apache-2.0 license. The specific LIS code that includes AquaCrop v7.2 plus some bug fixes is available at https://github.com/KUL-RSDA/LISF/tree/working/ac72_GMD (last access: 21 March 2026) and is also archived on Zenodo, along with the corresponding LIS input for the three showcase experiments, at https://doi.org/10.5281/zenodo.18458265 (Bechtold et al., 2026).

Author contributions

GDL, MB, LB, SdR, ZH, JM, MB, SAS, and MVdB contributed to the open source Fortran code of AquaCrop v7.0 and higher versions. LB and MB led the implementation of AquaCrop inside NASA LIS with the support of SK, DM, and EK. MB performed the research for showcase 1 and 2, with help from SdR for showcase 1. NL performed the research for showcase 3 with support from LB. DR is the main AquaCrop developer in Delphi and MG, PS, LH contributed to the initial AquaCrop code. GDL prioritized the research, provided supervision, managed the funding, and wrote the paper. All authors contributed to the editing of the paper.

Competing interests

The contact author has declared that none of the authors has any competing interests.

Disclaimer

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. The authors bear the ultimate responsibility for providing appropriate place names. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Acknowledgements

The authors acknowledge the constructive review comments by Matthew McCabe, an anonymous reviewer and the editor. The implementation of AquaCrop in LIS benefited from help of James Geiger, the original Delphi AquaCrop source code was co-developed by Elias Fereres, and some scientific tests were supported by Vincent Deketelaere. Other technical support was offered by Johan Boon and Samuel Corveleyn. The computer resources and services were provided by the High Performance Computing system of the Vlaams Supercomputer Center, funded by FWO and the Flemish Government.

Financial support

This research has been supported by the Fonds Wetenschappelijk Onderzoek (grant nos. 1158423N and Storage4Climate), the Horizon 2020 (grant no. 773903), the KU Leuven (grant no. C14/21/057), and the Belgian Federal Science Policy Office (grant no. SR/00/412).

Review statement

This paper was edited by Yuanchao Fan and reviewed by Matthew McCabe and one anonymous referee.

References

Abi Saab, M. T., El Alam, R., Jomaa, I., Skaf, S., Fahed, S., Albrizio, R., and Todorovic, M.: Coupling Remote Sensing Data and AquaCrop Model for Simulation of Winter Wheat Growth under Rainfed and Irrigated Conditions in a Mediterranean Environment, Agronomy, 11, https://doi.org/10.3390/agronomy11112265, 2021. a

Akbari Variani, H., Afshar, A., Vahabzadeh, M., Molajou, A., and Akbari Varyani, M. M.: Development of a novel framework for agriculture simulation model for food-energy-water nexus analysis in watershed-scale, J. Clean. Product., 429, 139492, https://doi.org/10.1016/j.jclepro.2023.139492, 2023. a

Albergel, C., Zheng, Y., Bonan, B., Dutra, E., Rodríguez-Fernández, N., Munier, S., Draper, C., de Rosnay, P., Muñoz-Sabater, J., Balsamo, G., Fairbairn, D., Meurey, C., and Calvet, J.-C.: Data assimilation for continuous global assessment of severe conditions over terrestrial surfaces, Hydrol. Earth Syst. Sci., 24, 4291–4316, https://doi.org/10.5194/hess-24-4291-2020, 2020. a

Allen, R. G., Pereira, L. S., Raes, D., and Smith, M.: Crop evapotranspiration: Guidelines for computing crop water requirements (FAO Irrigation and Drainage Paper No. 56), Tech. rep., FAO, Italy, https://www.fao.org/4/x0490e/x0490e0e.htm#soil%20water%20availability (last access: 21 March 2026), 1998. a

Bechtold, M., Busschaert, L., Lanfranco, N., de Roos, S., and De Lannoy, G.: LIS source code and input for showcase simulations with AquaCrop v7.2 (De Lannoy et al., 2026, GMD), Zenodo [code and data set], https://doi.org/10.5281/zenodo.18458265, 2026. a

Bregaglio, S., Ginaldi, F., Raparelli, E., Fila, G., and Bajocco, S.: Improving crop yield prediction accuracy by embedding phenological heterogeneity into model parameter sets, Agr. Syst., 209, 103666, https://doi.org/10.1016/j.agsy.2023.103666, 2023. a

Busschaert, L., de Roos, S., Thiery, W., Raes, D., and De Lannoy, G. J. M.: Net irrigation requirement under different climate scenarios using AquaCrop over Europe, Hydrol. Earth Syst. Sci., 26, 3731–3752, https://doi.org/10.5194/hess-26-3731-2022, 2022. a, b, c

Busschaert, L., Bechtold, M., Modanesi, S., Massari, C., Brocca, L., and De Lannoy, G. J. M.: Irrigation Quantification Through Backscatter Data Assimilation With a Buddy Check Approach, J. Adv. Model. Earth Sy., 16, e2023MS003661, https://doi.org/10.1029/2023MS003661, 2024. a, b

Busschaert, L., Bechtold, M., De Lannoy, G., de Roos, S., Heyvaert, Z., Mortelmans, J., Scherrer, S., Van den Bossche, M., Fereres, E., Heng, L., Steduto, P., and Raes, D.: AquaCrop v7.2, Zenodo [code], https://doi.org/10.5281/zenodo.17140665, 2025a. a

Busschaert, L., Bechtold, M., Modanesi, S., Massari, C., Raes, D., Kumar, S. V., and De Lannoy, G. J. M.: On the gap between crop and land surface models: comparing irrigation and other land surface estimates from AquaCrop and Noah-MP over the Po Valley, EGUsphere [preprint], https://doi.org/10.5194/egusphere-2025-2550, 2025b. a, b, c, d

Busschaert, L., Deketelaere, V., Thiery, W., Raes, D., and De Lannoy, G.: Future Projections of European Maize Yields Using Aquacrop with an Adaptive Growing Season, Eur. J. Agron., 173, 127920, https://doi.org/10.1016/j.eja.2025.127920, 2026. a, b, c

Challinor, A., Slingo, J., Wheeler, T., and Doblas-Reyes, F.: Probabilistic simulations of crop yield over western India using the DEMETER seasonal hindcast ensembles, Tellus A, 57, 498–512, https://doi.org/10.3402/tellusa.v57i3.14670, 2005. a

CLSM HRL Crops: CLMS High Resolution Layer Croplands, https://land.copernicus.eu/en/products/high-resolution-layer-croplands?tab=overview (last access: 21 March 2026), 2025. a

Corbari, C., Paciolla, N., Sheffield, J., Labbassi, K., Dos Santos Araujo, D. C., Berendsen, S., and Szantoi, Z.: Estimates of Irrigation Water Volume by Assimilation of Satellite Land Surface Temperature or Soil Moisture Into a Water-Energy Balance Model in Morocco, Water Resour. Res., 61, e2024WR038926, https://doi.org/10.1029/2024WR038926, 2025. a

Crow, W. T., Kim, H., and Kumar, S.: Systematic Modeling Errors Undermine the Application of Land Data Assimilation Systems for Hydrological and Weather Forecasting, J. Hydrometeorol., 25, 3–26, https://doi.org/10.1175/JHM-D-23-0069.1, 2024. a

Dalla Marta, A., Chirico, G. B., Falanga Bolognesi, S., Mancini, M., D’Urso, G., Orlandini, S., De Michele, C., and Altobelli, F.: Integrating Sentinel-2 Imagery with AquaCrop for Dynamic Assessment of Tomato Water Requirements in Southern Italy, Agronomy, 9, https://doi.org/10.3390/agronomy9070404, 2019. a

De Lannoy, G. J. M., Koster, R. D., Reichle, R. H., Mahanama, S. P. P., and Liu, Q.: An updated treatment of soil texture and associated hydraulic properties in a global land modeling system, J. Adv. Model. Earth Sy., 6, 957–979, https://doi.org/10.1002/2014MS000330, 2014. a

De Lannoy, G. J. M., Bechtold, M., Albergel, C., Brocca, L., Calvet, J.-C., Carrassi, A., Crow, W., de Rosnay, P., Durand, M., Forman, B., Geppert, G., Girotto, M., Hendricks Franssen, H.-J., Jonas, T., Kumar, S., Lievens, H., Lu, Y., Massari, C., Pauwels, V., Reichle, R., and Steele-Dunne, S.: Perspective on satellite-based land data assimilation to estimate water cycle components in an era of advanced data availability and model sophistication, Front. Water, 4, 981745, https://doi.org/10.3389/frwa.2022.981745, 2022. a

De Lannoy, G. J. M., Bechtold, M., Busschaert, L., Heyvaert, Z., Modanesi, S., Dunmire, D., Lievens, H., Getirana, A., and Massari, C.: Contributions of irrigation modeling, soil moisture and snow data assimilation to the skill of high-resolution digital replicas of the Po basin water budget, ESS Open Archive [preprint], https://doi.org/10.22541/essoar.171535793.33881670/v1, 2024. a

Dente, L., Satalino, G., Mattia, F., and Rinaldi, M.: Assimilation of leaf area index derived from ASAR and MERIS data into CERES-Wheat model to map wheat yield, Remote Sens. Environ., 112, 1395–1407, https://doi.org/10.1016/j.rse.2007.05.023, 2008. a

de Roos, S., De Lannoy, G. J. M., and Raes, D.: Performance analysis of regional AquaCrop (v6.1) biomass and surface soil moisture simulations using satellite and in situ observations, Geosci. Model Dev., 14, 7309–7328, https://doi.org/10.5194/gmd-14-7309-2021, 2021. a, b, c, d, e, f

de Roos, S., Bechtold, M., Busschaert, L., Lievens, H., and De Lannoy, G. J. M.: Assimilation of Sentinel-1 Backscatter to Update AquaCrop Estimates of Soil Moisture and Crop Biomass, J. Geophys. Res.-Biogeo., 129, e2024JG008231, https://doi.org/10.1029/2024JG008231, 2024. a, b, c, d, e

de Wit, A. and van Diepen, C.: Crop model data assimilation with the Ensemble Kalman filter for improving regional crop yield forecasts, Agr. Forest Meteorol., 146, 38–56, https://doi.org/10.1016/j.agrformet.2007.05.004, 2007. a, b

Evensen, G., Vossepoel, F., and van Leeuwen, P. J.: Data Assimilation Fundamentals: a Unified Formulation of the State and Parameter Estimation Problem, Springer, ISBN 978-3-030-96709-3, https://doi.org/10.1007/978-3-030-96709-3, 2022. a

FAO/IIASA/ISRIC/ISSCAS/JRC: Harmonized World Soil Database (version 1.2), https://www.fao.org/soils-portal/data-hub/soil-maps-and-databases/harmonized-world-soil-database-v12/en/ (last access: 21 March 2026), 2012. a

Franch, B., Cintas, J., Becker-Reshef, I., Sanchez-Torres, M. J., Roger, J., Skakun, S., Sobrino, J. A., Van Tricht, K., Degerickx, J., Gilliams, S., et al.: Global crop calendars of maize and wheat in the framework of the WorldCereal project, GIScience Remote Sens., 59, 885–913, 2022. a

Gaso, D. V., Paudel, D., de Wit, A., Puntel, L. A., Mullissa, A., and Kooistra, L.: Beyond assimilation of leaf area index: Leveraging additional spectral information using machine learning for site-specific soybean yield prediction, Agr. Forest Meteorol., 351, 110022, https://doi.org/10.1016/j.agrformet.2024.110022, 2024. a

Geoportale Piemonte: Uso del suolo agricolo su mosaicatura catastale di riferimento regionale 2021, https://www.geoportale.piemonte.it/geonetwork/srv/eng/catalog.search#/metadata/r_piemon:5f3b4327-41e2-4fa3-b7de-ccc66f9cf3ce (last access: 21 March 2026), 2025a. a

Geoportale Piemonte: Uso del suolo agricolo su mosaicatura catastale di riferimento regionale 2022, https://www.geoportale.piemonte.it/geonetwork/srv/eng/catalog.search#/metadata/r_piemon:3d164c06-6539-4298-ad56-f8c4161b659a (last access: 21 March 2026), 2025b. a

Geoportale Piemonte: Uso del suolo agricolo su mosaicatura catastale di riferimento regionale 2023, https://www.geoportale.piemonte.it/geonetwork/srv/eng/catalog.search#/metadata/r_piemon:7573bb81-0c2c-46d9-b3f6-609d4e64e34e (last access: 21 March 2026), 2025c. a

Geoportale Piemonte: Carta dei suoli 1:50.000, https://www.geoportale.piemonte.it/geonetwork/srv/eng/catalog.search#/metadata/r_piemon:37c6413b-b07f-4f4c-9344-f2e43ea52bbd (last access: 21 March 2026), 2025d. a

Gobin, A., Sallah, A.-H. M., Curnel, Y., Delvoye, C., Weiss, M., Wellens, J., Piccard, I., Planchon, V., Tychon, B., Goffart, J.-P., and Defourny, P.: Crop Phenology Modelling Using Proximal and Satellite Sensor Data, Remote Sens., 15, https://doi.org/10.3390/rs15082090, 2023. a

GYGA: Global Yield Gap and Water Productivity Atlas, https://www.yieldgap.org (last access: 21 March 2026), 2021. a

Hansen, M., DeFries, R., Townshend, J. R. G., and Sohlberg, R.: UMD global land cover classification derived from AVHRR, 1 km, 1.0, https://iridl.ldeo.columbia.edu/SOURCES/.UMD/.GLCF/.GLCDS/ (last access: 21 March 2026), 1998. a

Hersbach, H., Bell, B., Berrisford, P., Hirahara, S., Horányi, A., Muñoz-Sabater, J., Nicolas, J., Peubey, C., Radu, R., Schepers, D., Simmons, A., Soci, C., Abdalla, S., Abellan, X., Balsamo, G., Bechtold, P., Biavati, G., Bidlot, J., Bonavita, M., De Chiara, G., Dahlgren, P., Dee, D., Diamantakis, M., Dragani, R., Flemming, J., Forbes, R., Fuentes, M., Geer, A., Haimberger, L., Healy, S., Hogan, R. J., Hólm, E., Janisková, M., Keeley, S., Laloyaux, P., Lopez, P., Lupu, C., Radnoti, G., de Rosnay, P., Rozum, I., Vamborg, F., Villaume, S., and Thépaut, J.-N.: The ERA5 global reanalysis, Q. J. Roy. Meteor. Soc., 146, 1999–2049, 2020. a

Heyvaert, Z., Scherrer, S., Bechtold, M., Gruber, A., Dorigo, W., Kumar, S., and De Lannoy, G.: Impact of Design Factors for ESA CCI Satellite Soil Moisture Data Assimilation over Europe, J. Hydrometeorol., 24, 1193–1208, https://doi.org/10.1175/JHM-D-22-0141.1, 2023. a, b

Heyvaert, Z., Scherrer, S., Dorigo, W., Bechtold, M., and De Lannoy, G.: Joint assimilation of satellite-based surface soil moisture and vegetation conditions into the Noah-MP land surface model, Sci. Remote Sens., 9, 100129, https://doi.org/10.1016/j.srs.2024.100129, 2024. a

Heyvaert, Z., Bechtold, M., Dorigo, W., Mortelmans, J., Hagan, D., Santanello, J., and De Lannoy, G.: Land data assimilation of satellite-based surface soil moisture: Impact on atmospheric simulations over the contiguous United States, Q. J. Roy. Meteor. Soc., e70052, https://doi.org/10.1002/qj.70052, 2025. a

Hmimina, G., Dufrêne, E., Pontailler, J.-Y., Delpierre, N., Aubinet, M., Caquet, B., de Grandcourt, A., Burban, B., Flechard, C., Granier, A., Gross, P., Heinesch, B., Longdoz, B., Moureaux, C., Ourcival, J.-M., Rambal, S., Saint André, L., and Soudani, K.: Evaluation of the potential of MODIS satellite data to predict vegetation phenology in different biomes: An investigation using ground-based NDVI measurements, Remote Sens. Environ., 132, 145–158, https://doi.org/10.1016/j.rse.2013.01.010, 2013. a

Huang, H., Huang, J., Wu, Y., Zhuo, W., Song, J., Li, X., Li, L., Su, W., Ma, H., and Liang, S.: The Improved Winter Wheat Yield Estimation by Assimilating GLASS LAI Into a Crop Growth Model With the Proposed Bayesian Posterior-Based Ensemble Kalman Filter, IEEE T. Geosci. Remote Sens., 61, 1–18, https://doi.org/10.1109/TGRS.2023.3259742, 2023. a

Ines, A. V., Das, N. N., Hansen, J. W., and Njoku, E. G.: Assimilation of remotely sensed soil moisture and vegetation with a crop simulation model for maize yield prediction, Remote Sens. Environ., 138, 149–164, https://doi.org/10.1016/j.rse.2013.07.018, 2013. a, b

Jin, X., Kumar, L., Li, Z., Feng, H., Xu, X., Yang, G., and Wang, J.: A review of data assimilation of remote sensing and crop models, Eur. J. Agron., 92, 141–152, 2018. a

Jin, X., Li, Z., Feng, H., Ren, Z., and Li, S.: Estimation of maize yield by assimilating biomass and canopy cover derived from hyperspectral data into the AquaCrop model, Agr. Water Manage., 227, 105846, https://doi.org/10.1016/j.agwat.2019.105846, 2020. a

Kelly, T. and Foster, T.: AquaCrop-OSPy: Bridging the gap between research and practice in crop-water modeling, Agr. Water Manage., 254, 106976, https://doi.org/10.1016/j.agwat.2021.106976, 2021. a

Kumar, S. V., Peters-Lidard, C. D., Tian, Y., Houser, P. R., Geiger, J., Olden, S., Lighty, L., Eastman, J. L., Doty, B., Dirmeyer, P., Adams, J., Mitchell, K., Wood, E. F., and Sheffield, J.: Land information system: An interoperable framework for high resolution land surface modeling, Environ. Modell. Softw., 21, 1402–1415, https://doi.org/10.1016/j.envsoft.2005.07.004, 2006. a, b

Kumar, S. V., Reichle, R. H., Peters-Lidard, C. D., Koster, R. D., Zhan, X., Crow, W. T., Eylander, J. B., and Houser, P. R.: A land surface data assimilation framework using the land information system: Description and applications, Adv. Water Resour., 31, 1419–1432, https://doi.org/10.1016/j.advwatres.2008.01.013, 2008. a, b, c, d

Kumar, S. V., Jasinski, M., Mocko, D. M., Rodell, M., Borak, J., Li, B., Beaudoing, H. K., and Peters-Lidard, C. D.: NCA-LDAS Land Analysis: Development and Performance of a Multisensor, Multivariate Land Data Assimilation System for the National Climate Assessment, J. Hydrometeorol., 20, 1571–1593, https://doi.org/10.1175/JHM-D-17-0125.1, 2019. a

Lanfranco, N.: Satellite Data Assimilation to Improve Regional Crop Yield Estimates, Master thesis, Politecnico di Torino, KU Leuven, http://webthesis.biblio.polito.it/id/eprint/38056 (last access: 21 March 2026), 2025. a, b, c, d

Linker, R. and Ioslovich, I.: Assimilation of canopy cover and biomass measurements in the crop model AquaCrop, Biosyst. Eng., 162, 57–66, https://doi.org/10.1016/j.biosystemseng.2017.08.003, 2017. a

Lu, Y., Chibarabada, T. P., McCabe, M. F., De Lannoy, G. J., and Sheffield, J.: Global sensitivity analysis of crop yield and transpiration from the FAO-AquaCrop model for dryland environments, Field Crop. Res., 269, 108182, https://doi.org/10.1016/j.fcr.2021.108182, 2021a. a, b

Lu, Y., Chibarabada, T. P., Ziliani, M. G., Onema, J.-M. K., McCabe, M. F., and Sheffield, J.: Assimilation of soil moisture and canopy cover data improves maize simulation using an under-calibrated crop model, Agr. Water Manage., 252, 106884, https://doi.org/10.1016/j.agwat.2021.106884, 2021b. a, b, c

Lu, Y., Wei, C., McCabe, M. F., and Sheffield, J.: Multi-variable assimilation into a modified AquaCrop model for improved maize simulation without management or crop phenology information, Agr. Water Manage., 266, 107576, https://doi.org/10.1016/j.agwat.2022.107576, 2022. a, b, c, d

Massari, C., Modanesi, S., Dari, J., Gruber, A., De Lannoy, G. J. M., Girotto, M., Quintana-Seguí, P., Le Page, M., Jarlan, L., Zribi, M., Ouaadi, N., Vreugdenhil, M., Zappa, L., Dorigo, W., Wagner, W., Brombacher, J., Pelgrum, H., Jaquot, P., Freeman, V., Volden, E., Fernandez Prieto, D., Tarpanelli, A., Barbetta, S., and Brocca, L.: A Review of Irrigation Information Retrievals from Space and Their Utility for Users, Remote Sens., 13, https://doi.org/10.3390/rs13204112, 2021. a

Mialyk, O., Schyns, J., Booij, M., Su, H., Hogeboom, R., and Berger, M.: Water footprints and crop water use of 175 individual crops for 1990–2019 simulated with a global crop model, Sci. Data, 11, https://doi.org/10.1038/s41597-024-03051-3, 2024. a

Müller, C., Franke, J., Jägermeyr, J., Ruane, A. C., Elliott, J., Moyer, E., Heinke, J., Falloon, P. D., Folberth, C., and Francois, L.: Exploring uncertainties in global crop yield projections in a large ensemble of crop models and CMIP5 and CMIP6 climate scenarios, Environ. Res. Lett., 16, 034040, https://doi.org/10.1088/1748-9326/abd8fc, 2021. a

Nearing, G. S., Crow, W. T., Thorp, K. R., Moran, M. S., Reichle, R. H., and Gupta, H. V.: Assimilating remote sensing observations of leaf area index and soil moisture for wheat yield estimates: An observing system simulation experiment, Water Resour. Res., 48, https://doi.org/10.1029/2011WR011420, 2012. a

Paudel, D., Boogaard, H., de Wit, A., van der Velde, M., Claverie, M., Nisini, L., Janssen, S., Osinga, S., and Athanasiadis, I. N.: Machine learning for regional crop yield forecasting in Europe, Field Crop. Res., 276, 108377, https://doi.org/10.1016/j.fcr.2021.108377, 2022. a

Pauwels, V. R. N., Verhoest, N. E. C., De Lannoy, G. J. M., Guissard, V., Lucau, C., and Defourny, P.: Optimization of a coupled hydrology–crop growth model through the assimilation of observed soil moisture and leaf area index values using an ensemble Kalman filter, Water Resour. Res., 43, https://doi.org/10.1029/2006WR004942, 2007. a, b

Raes, D., Steduto, P., Hsiao, T. C., and Fereres, E.: AquaCrop—The FAO Crop Model to Simulate Yield Response to Water: II. Main Algorithms and Software Description, Agron. J., 101, 438–447, https://doi.org/10.2134/agronj2008.0140s, 2009. a, b

Raes, D., Fereres, E., De Lannoy, G., Vanuytrecht, E., Garcia Vila, M., and Steduto, P.: The AquaCrop model, in: Current crop models. Burleigh Dodds Series in Agricultural Science, edited by: Hoogenboom, G., Vol. 170, Burleigh Dodds Science Publishing, Cambridge, UK, https://doi.org/10.19103/AS.2025.0155.20, 2025. a

Raes, D., Steduto, P., Hsiao, T., Fereres, E., Busschaert, L., Bechtold, M., de Roos, S., Heyvaert, Z., Mortelmans, J., Scherrer, S., Van den Bossche, M., and De Lannoy, G.: Aquacrop version 7.2 – Reference Manual, Chapter 3 Calculation Procedures in: AquaCrop v7.2 Reference Manual, Zenodo, https://doi.org/10.5281/zenodo.18458272, 2026. a, b, c

Reichle, R., Liu, Q., Koster, R. D., Crow, W. T., De Lannoy, G. J. M., Kimball, J. S., Ardizzone, J. V., Bosch, D., Colliander, A., Cosh, M., Kolassa, J., Mahanama, S. P., McNairn, H., Prueger, J., Starks, P., and Walker, J. P.: Version 4 of the SMAP Level-4 Soil Moisture Algorithm and Data Product, J. Adv. Model. Earth Sy., 11, 3106–3130, https://doi.org/10.1029/2019MS001729, 2019. a

RICA: Italian survey data of winter wheat yield in Piemonte, https://rica.crea.gov.it/modulo_richiesta_dati.php (last access: 21 March 2026), 2025. a

Ryu, D., Crow, W., Zhan, X., and Jackson, T.: Correcting unintended perturbation biases in hydrologic data assimilation, J. Hydrometeorol., 10, 734–750, 2008. a

Scherrer, S., De Lannoy, G., Heyvaert, Z., Bechtold, M., Albergel, C., El-Madany, T. S., and Dorigo, W.: Bias-blind and bias-aware assimilation of leaf area index into the Noah-MP land surface model over Europe, Hydrol. Earth Syst. Sci., 27, 4087–4114, https://doi.org/10.5194/hess-27-4087-2023, 2023. a

Steduto, P., Hsiao, T. C., Raes, D., and Fereres, E.: AquaCrop – The FAO Crop Model to Simulate Yield Response to Water: I. Concepts and Underlying Principles, Agron. J., 101, 426–437, https://doi.org/10.2134/agronj2008.0139s, 2009. a, b

Thoning, K., Crotwell, A., and Mund, J.: Atmospheric Carbon Dioxide Dry Air Mole Fractions from continuous measurements at Mauna Loa, Hawaii, Barrow, Alaska, American Samoa and South Pole, 1973–present, https://doi.org/10.15138/yaf1-bk21, 2025. a

van Klompenburg, T., Kassahun, A., and Catal, C.: Crop yield prediction using machine learning: A systematic literature review, Comput. Electron. Agr., 177, 105709, https://doi.org/10.1016/j.compag.2020.105709, 2020. a

Vanuytrecht, E., Raes, D., and Willems, P.: Global sensitivity analysis of yield output from the water productivity model, Environ. Modell. Softw., 51, 323–332, https://doi.org/10.1016/j.envsoft.2013.10.017, 2014. a, b

Vazifedoust, M., van Dam, J. C., Bastiaanssen, W. G. M., and Feddes, R. A.: Assimilation of satellite data into agrohydrological models to improve crop yield forecasts, Int. J. Remote Sens., 30, 2523–2545, https://doi.org/10.1080/01431160802552769, 2009. a

Wellens, J., Raes, D., Fereres, E., Diels, J., Coppye, C., Adiele, J. G., Ezui, K. S. G., Becerra, L.-A., Selvaraj, M. G., Dercon, G., and Heng, L. K.: Calibration and validation of the FAO AquaCrop water productivity model for cassava (Manihot esculenta Crantz), Agr. Water Manage., 263, 107491, https://doi.org/10.1016/j.agwat.2022.107491, 2022. a

Yang, C. and Lei, H.: Evaluation of data assimilation strategies on improving the performance of crop modeling based on a novel evapotranspiration assimilation framework, Agr. Forest Meteorol., 346, 109882, https://doi.org/10.1016/j.agrformet.2023.109882, 2024. a

Zaks, D. P., Ramankutty, N., Barford, C. C., and Foley, J. A.: From Miami to Madison: Investigating the relationship between climate and terrestrial net primary production, Global Biogeochem. Cy., 21, GB3004, https://doi.org/10.1029/2006GB002705, 2007. a

Zare, H., Weber, T. K., Ingwersen, J., Nowak, W., Gayler, S., and Streck, T.: Within-season crop yield prediction by a multi-model ensemble with integrated data assimilation, Field Crop. Res., 308, 109293, https://doi.org/10.1016/j.fcr.2024.109293, 2024. a, b, c

Zhang, X., Liu, L., Liu, Y., Jayavelu, S., Wang, J., Moon, M., Henebry, G. M., Friedl, M. A., and Schaaf, C. B.: Generation and evaluation of the VIIRS land surface phenology product, Remote Sens. Environ., 216, 212–229, 2018. a, b, c

Zhang, X., Henebry, M., Schaaf, G., and Miura, T.: VIIRS Global Land Surface Phenology Product User Guide: collection 2.0s, Tech. rep., NASA Visible Infrared Imaging Radiometer Suite, USA, https://viirsland.gsfc.nasa.gov/PDF/VIIRS_GLSP_UserGuide_C2.pdf (last access: 21 March 2026), 2022. a

Short summary

The AquaCrop model has been incorporated into the NASA Land Information System to advance regional crop growth simulations at any spatial resolution, with a range of input sources for meteorology, soil, and crop parameters. This system also facilitates the assimilation of satellite data to update the crop and water conditions during model simulations. We present three exploratory applications to highlight pathways for future research on regional-scale crop estimation.