The importance of management information and soil moisture representation for simulating tillage effects on N2O emissions in LPJmL5.0-tillage

Femke Lutz1,2, Stephen DelGrosso3, Stephen Ogle4, Stephen Williams4, Sara Minoli1, Susanne Rolinski1, Jens Heinke1, Jetse J. Stoorvogel2, and Christoph Müller1
1Potsdam Institute for Climate Impact Research (PIK), member of the Leibniz Association, P.O. Box 60 12 03, 14412 Potsdam, Germany
2Wageningen University, Soil Geography and Landscape Group, P.O. Box 47, 6700 AA Wageningen, the Netherlands
3USDA-ARS, Soil Management and Sugar Beet Research Unit, 2150 Centre Ave. Bldg. D, Fort Collins, CO 80526, USA
4NREL, Colorado State University, Fort Collins, CO 80523, USA
Correspondence: Femke Lutz (femke.lutz@pik-potsdam.de)

Soils emit N2O through a series of processes involving denitrification and nitrification. These processes are driven by microbial activity and respond strongly to soil properties such as moisture, temperature, oxygen, mineral N, and organic carbon (Mosquera et al., 2005; Snyder et al., 2009; Van Kessel et al., 2013). These soil properties are affected by tillage (Lutz et al., 2019a, c) and other management practices (e.g., fertilizer application and residue treatment) (Van Kessel et al., 2013).
including missing processes and lack of process understanding. The parameterization of implemented processes, as well as the level of detail in the information on management aspects that the model explicitly addresses, can also introduce model deficiencies that could cause the mismatch between observations and simulations. For example, because detailed information about agricultural management practices is lacking for global-scale applications, global simulations require assumptions on agricultural management, e.g., on the type, amount and timing of fertilizer applications. Detailed information on fertilization can typically be dealt with in field-scale modeling experiments, whereas at the global scale there is only general information on fertilization (e.g. Mueller et al., 2012; Potter et al., 2010), which is characterized by gaps and uncertainties (Erb et al., 2017). These generalizations may be a significant contributor to the overall uncertainty of agricultural impact assessments. For instance, Folberth et al. (2019) found that differences in management assumptions (e.g., about growing season and fertilization) resulted in substantial differences in modeled crop yields using the same crop model. Second, the formation of N2O in soils is very sensitive to soil moisture (Butterbach-Bahl et al., 2013). How the effect of tillage on soil moisture is simulated is thus another source of uncertainty that could explain the inaccuracy in modeling tillage effects on N2O emissions.
In this study, we test the importance of management information as well as the representation of soil water dynamics for the ability to simulate N2O emissions under different tillage regimes with LPJmL5.0-tillage (Lutz et al., 2019a) for four different experimental sites across Europe and the USA. Simulation results are compared to measurements of N2O emissions from experimental studies under tillage and no-tillage in different simulation experiments, varying from using observed site-

Overview
In Lutz et al. (2019a), model results deviated from meta-analyses when comparing simulated tillage effects on N2O emissions.
First, we tested whether the deviations are due to a lack of detailed management information. Four experimental sites for which detailed information on management is available were identified. On those sites, LPJmL5.0-tillage was run using the management assumptions usually used in a global simulation experiment (LPJmL.G.Orig). To find out whether LPJmL5.0-tillage performed better with detailed information on management, we also applied LPJmL5.0-tillage using detailed site-specific management information to derive inputs (LPJmL.D.Orig).
The site-specific DayCent model was used as a benchmark to analyze the underlying mechanisms of the N2O-producing processes. For all the simulations of DayCent, detailed information on management was used. Except for the experimental site in Boigneville, DayCent has been used and calibrated for field-scale assessments at the chosen sites (i.e. Campbell et al., 2014; Del Grosso et al., 2009; Yang et al., 2017). Therefore, we expect it to perform better at simulating the effects of tillage on N2O emissions than LPJmL. We also expect to learn from the underlying mechanisms simulated by DayCent and to use this information for improving process representation and parameterization in LPJmL.
All model versions considered here require similar inputs (soil properties, vegetation type, land management information, latitude, daily precipitation, and daily minimum and maximum air temperature).
LPJmL5.0-tillage (in the following referred to as LPJmL) uses three litter pools, representing surface litter, incorporated litter and below-ground litter, as well as two soil organic matter (SOM) pools per soil layer, which are characterized by fast
plants, leached to lower layers (NO3− only) or transformed to N gas emissions (e.g. N2O) through nitrification or denitrification (Parton et al., 2001). N2O emissions from nitrification are calculated as a function of soil NH4+ concentration, temperature, pH, texture and soil moisture. N2O from denitrification is calculated as a function of soil NO3− concentration, soil moisture, texture and heterotrophic CO2 respiration rate. N2O emissions from denitrification increase exponentially when the water-filled pore space (WFPS) exceeds the texture-related threshold value and level off as the soil approaches saturation. The model can simulate different types of tillage (i.e. plowing, tandem disk and field cultivator). Depending on the type of tillage, the decomposition of the litter and SOM (active and slow) pools is increased by a specific factor for a period of one month, and a fraction of the above-ground residues is transferred to the surface litter and top soil layer. Tillage also affects soil temperature and water dynamics indirectly, because the model assumes that precipitation intercepted by surface litter and living biomass evaporates before entering the soil. The presence of surface litter insulates the soil from air temperature fluctuations.

If site-level measurements of the soil hydraulic properties required for DayCent are not available, they are calculated through the pedotransfer function (PTF) from Saxton et al. (1986) and are static throughout the simulations. The PTF uses soil texture to calculate FC, WP, bulk density and Ksat. The soil water model simulates unsaturated water flow using Darcy's equation, runoff, snow dynamics, and the effect of soil freezing on saturated water flow (Pannkuk et al., 1998). DayCent has been shown to reliably model soil water content, N mineralization and N2O emission rates from different soil types and management practices (Kelly et al., 2000; Parton et al., 2001). Del Grosso et al. (2002) provide an extensive overview of validation results for DayCent.

Experimental sites
Four experimental sites were selected in which the effects of tillage and no-tillage on N2O emissions were studied (Table 1 and Table 2). The sites were selected based on the availability of observational data and the treatment combination of tillage and no-tillage.

The first study site is located at the Agricultural Research Development and Education Center (ARDEC) near Fort Collins, CO (40°39'6" N, 104°59'57" W; 1555 m asl). It was initiated in 1999 on a clay loam soil (fine-loamy, mixed, mesic Aridic Haplustalfs) that was continuously cropped with maize (Zea mays L.) for six years. Shortly before sowing, fertilizers (67 kg N ha−1) were applied. The fields were sprinkler irrigated during the growing season. In the tillage treatment, fields were tilled shortly before sowing and at harvest, followed by tandem disking and then moldboard plowing to a depth of 25 to 30 cm. N2O emissions were measured three times per week during the growing season (2002-2006) with closed chambers. Soil moisture was measured two to three times per month during the growing season from 2003 to 2006. Soil organic carbon (SOC) was measured once in October 2005. A detailed description of the experimental site can be found in Halvorson et al. (2006).
The second study site is located at the University of Nebraska-Lincoln Agricultural Research and Development Center, Ithaca, NE (41°9'43.3" N, 96°24'41.4" W; 349 m asl). The experiment was established in 2002 on a silt loam soil that was previously cropped with rain-fed maize, soybean (Glycine max (L.) Merr.), oat (Avena sativa L.) and alfalfa (Medicago sativa L.). Maize was grown continuously on the field after 2000. During the experiment, N fertilizers were injected to a depth of 10-15 cm once during the growing season at various rates and compositions (Table 1). The soil in the tillage treatments was tilled before sowing and at harvest to a depth of 15-20 cm. The field was irrigated with varying irrigation amounts.
established in 1988 on an agricultural field that had been tilled for at least 100 years before the experiment. The crop rotation before 1995 consisted of maize followed by soybean. In 1995, wheat (Triticum aestivum L.) was planted after soybean, which resulted in a maize-soybean-wheat rotation. After the harvest of wheat, the fields stayed bare until they were cropped with maize again. This sequence was followed during the time span analyzed here. Different quantities of N fertilizers were applied at sowing and/or during the growing season for maize and during the growing season for wheat; soybean did not receive fertilizers (Table 1). The tillage treatment was tilled each year at sowing, then during the growing season and at harvest, to a depth of 20 cm. The fields were not irrigated during the experiment. N2O emissions were measured once or twice a month from June 1991 to October 2016 using closed chambers. Soil moisture was measured once per month during the growing season from 1989 until 2017. SOC was measured annually since 1989 at multiple sampling depths. More information regarding the experimental study site is provided by Grandy et al. (2006) and on the KBS LTER website (http://lter.kbs.msu.edu, accessed November 2018).
The last study site is located in Boigneville, France (48°33' N, 2°33' E; altitude unknown) on a silt loam soil (Haplic Luvisol) (FAO, 1998). The experiment started in 1970 on a field that had been tilled to 30 cm depth annually. During the experiment, the site was cropped with a maize-wheat rotation, with maize being sown in April, harvested in October and directly followed by tillage (20 cm for tillage treatments) and sowing of wheat. After harvest of wheat in April, the soil was left bare and was tilled (20 cm) in November, and left fallow until planting maize in the next growing season. This sequence was followed during the time span analyzed here (2003-2004). During the experiment, the maize received N fertilizers in May and wheat in February and April (Table 1). The fields were irrigated between the end of June and July. N2O emissions were measured on average every three weeks using closed chambers. Soil moisture was not measured. Soil organic carbon was measured twice in 2003 and once in 2004 at various depths. More information regarding the study site can be found in Oorts et al. (2007).
In the LPJmL.G.Orig scenario, all management information as well as soil C and N pools were used as in the default global simulation of LPJmL (Table 3). The amounts of mineral and organic fertilizers were provided by the Global Gridded Crop Model Intercomparison (Elliott et al., 2015) of the Agricultural Model Intercomparison and Improvement Project (AgMIP; Rosenzweig et al., 2013). They are based on global, gridded data sets for each crop (Mueller et al., 2012; Potter et al., 2010).
Fertilizer is assumed to consist of 50% NO3− and 50% NH4+. If fertilizer input is low (≤ 5.0 g N m−2), all is applied at sowing. Otherwise, only half of the fertilizer is applied at sowing and the remainder is applied when the phenological stage fraction (unitless) of the crop reaches 0.4 (Von . Irrigation events occur when the soil moisture, expressed as a fraction of the water holding capacity (unitless), is below an irrigation threshold value of 0.7 for maize (Jägermeyr et al., 2015).
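The default fertilization and irrigation rules described above can be sketched in a few lines of Python. This is an illustration only, not LPJmL code: the names `Dose`, `schedule_fertilizer` and `irrigation_needed` are hypothetical, and only the numerical thresholds (5.0 g N m−2, stage fraction 0.4, irrigation threshold 0.7 for maize, 50/50 NO3−/NH4+ split) are taken from the text.

```python
from dataclasses import dataclass

# Values taken from the text.
LOW_INPUT_LIMIT = 5.0       # g N m-2; at or below this, everything is applied at sowing
SECOND_DOSE_STAGE = 0.4     # phenological stage fraction that triggers the second dose
IRRIGATION_THRESHOLD = 0.7  # maize; soil moisture as a fraction of water holding capacity


@dataclass(frozen=True)
class Dose:
    """One fertilizer application (hypothetical container, not an LPJmL type)."""
    stage: float  # phenological stage fraction at application (0.0 = sowing)
    no3: float    # g N m-2 applied as nitrate
    nh4: float    # g N m-2 applied as ammonium


def schedule_fertilizer(total_n: float) -> list[Dose]:
    """Split a seasonal N input (g N m-2) into doses following the default rule."""
    if total_n < 0:
        raise ValueError("fertilizer input must be non-negative")
    if total_n <= LOW_INPUT_LIMIT:
        parts = [(0.0, total_n)]
    else:
        parts = [(0.0, total_n / 2), (SECOND_DOSE_STAGE, total_n / 2)]
    # Each dose is assumed to be half nitrate and half ammonium.
    return [Dose(stage, 0.5 * amount, 0.5 * amount) for stage, amount in parts]


def irrigation_needed(moisture_fraction: float,
                      threshold: float = IRRIGATION_THRESHOLD) -> bool:
    """True when soil moisture is below the irrigation threshold."""
    return moisture_fraction < threshold
```

For an input of 12 g N m−2, `schedule_fertilizer(12.0)` returns two doses of 6 g N m−2 each, one at sowing and one at stage fraction 0.4.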

In the experiments with tillage, tillage occurs twice a year: once at sowing and once on the day of harvest. Sowing dates are calculated internally following Waha et al. (2012), based on a set of rules that depend on crop-specific thresholds and climate. Here, the sowing date depends on a crop-specific temperature threshold (i.e. 14 °C for maize; Waha et al., 2012).
The sizes of the C and N pools are calculated internally during the spin-up (5000 years) of the natural vegetation and land-use history. The land-use history is simulated as with DayCent, in order to establish a comparable starting point when the simulations for the experiments are conducted. Thereby, the spin-up is followed by a simulation of historical land-use change to account for effects on the pools, based on the best available information on land management.

LPJmL detailed setup using observed input data
Site-specific observed information for all management inputs as well as soil C and N pools was prescribed for simulation LPJmL.D.Orig (Table 3). For practical reasons, irrigation water was added to precipitation to enable the specification of the amount and the timing of irrigation events. This mimics a sprinkler irrigation technique, as part of the irrigation water is intercepted by the canopy. As the current implementation of soil layers and tillage in LPJmL does not allow for distinguishing more detailed tillage types other than conventional tillage and no tillage, we ignored tillage activities that were less intensive (e.g. "shredding"). In order to specify the growing season, phenological heat unit requirements and base temperatures were parameterized so that the simulated harvest dates matched the reported harvest dates.
The soil C and organic N pools from the simulations were scaled to the observed values. This was done twice, once at the introduction of land use during spin-up and once at the start of the treatment at the experimental site. If observations were not available for the start of the experiment, the first available observation was taken, assuming that pool sizes remained stable over that time period. The pools (P) at each site were scaled as in equation 1:

$$P_{(cor),l} = P_{(sim),l} \cdot \frac{Total_{(obs)}}{Total_{(sim)}} \qquad (1)$$

where P(cor),l is the scaled carbon or nitrogen content of the soil pools (g C or N m−2) in layer l of the experimental site and P(sim),l is the simulated amount of C or N contained in the soil and litter pools of layer l of the experimental site. Total(obs) and Total(sim) are the totals of C or N contained in the soil and litter pools, summed over the layers l for which observational data of soil organic C and N were available (in g C or g N m−2, respectively) at the experimental site.
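The proportional scaling of the pools can be illustrated with a short Python sketch. It assumes that one common factor, the ratio of the observed to the simulated total over the observed layers, is applied to every layer; the function name `scale_pools` and its arguments are hypothetical and do not correspond to LPJmL routines.

```python
def scale_pools(simulated, total_observed, observed_layers):
    """Scale simulated per-layer pools so that their total matches an observation.

    simulated       -- simulated C or N per layer (g m-2), one value per layer
    total_observed  -- observed total (g m-2) over the observed layers
    observed_layers -- indices of the layers covered by the observation
    """
    total_simulated = sum(simulated[l] for l in observed_layers)
    if total_simulated <= 0:
        raise ValueError("simulated total over observed layers must be positive")
    factor = total_observed / total_simulated
    # The same factor is applied to every layer (assumption of this sketch).
    return [pool * factor for pool in simulated]
```

With simulated pools of 1000, 500 and 500 g C m−2 and an observed total of 3000 g C m−2 for the two upper layers, the factor is 2 and all three layers are doubled.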

The differences between simulated and observed input data are depicted in Table 3.

LPJmL experimental simulations
Agricultural management consists of several practices. To analyze the importance of individual management aspects, we conducted a set of simulations as in LPJmL.D.Orig, but in each of them ignored one site-specific management practice and replaced it with the global assumption as in LPJmL.G.Orig (Table 3). The naming of the simulations consists of three parts: 1) the model used (LPJmL), 2) the experiment conducted (e.g. I, GS or PS) and 3) whether it includes modifications ("Mod"; see 2.7) or not ("Orig"). We modified the model with respect to the treatment of the residue cover of the soil in no-tillage systems and with respect to the soil parameterization.
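The three-part naming scheme can be made explicit with a small parser. This is only a convenience sketch: it assumes the parts are separated by dots, as in the names LPJmL.G.Orig and LPJmL.D.Orig used in the text, and the identifiers `SimulationName` and `parse_simulation_name` are hypothetical.

```python
from typing import NamedTuple


class SimulationName(NamedTuple):
    model: str       # e.g. "LPJmL"
    experiment: str  # e.g. "G", "D", "I", "GS" or "PS"
    version: str     # "Orig" or "Mod"


def parse_simulation_name(name: str) -> SimulationName:
    """Split a simulation name such as 'LPJmL.D.Orig' into its three parts."""
    parts = name.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected three dot-separated parts, got {name!r}")
    model, experiment, version = parts
    if version not in ("Orig", "Mod"):
        raise ValueError(f"unknown version {version!r} in {name!r}")
    return SimulationName(model, experiment, version)
```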

Model modifications
As the soil cover by residues under no-tillage practices in LPJmL simulations is very high and thus leads to high soil moisture levels throughout the year (as soil evaporation is reduced and infiltration is enhanced), we tested modifications of the relevant functions. To this end, we tested modifications of the parameters that translate litter amounts into soil cover (Gregory, 1982) and of those that determine how long the soil is covered with residues. Rather than changing well-established functions on litter decomposition (Schlüter et al., 2018), we modified the parameter on bioturbation that was introduced by Lutz et al. (2019a) and tested its effects on the reduction of the residue cover of the soil. Lutz et al. (2019a) used an average value of 0.006 m2 g−1 (falsely described as 0.004 in their publication, but used as 0.006 in the code: https://doi.org/10.5281/zenodo.2652136) to translate litter biomass into a fraction of soil being covered with residues, which was applied to all litter, neglecting variations in surface litter for different materials. The bioturbation rate, i.e. the fraction of surface litter transferred to the incorporated litter pool per day, was increased from 0.19% day−1 to 0.63% day−1 (equivalent to an annual bioturbation rate of 90%, versus 50% as assumed previously).
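The correspondence between the daily and annual bioturbation rates can be checked with a short calculation. The sketch below assumes that a constant fraction of the remaining surface litter is transferred each day over a 365-day year and that no new litter is added; the function names are hypothetical.

```python
def annual_fraction(daily_rate: float, days: int = 365) -> float:
    """Fraction of surface litter transferred after `days` of constant daily transfer."""
    return 1.0 - (1.0 - daily_rate) ** days


def daily_rate(annual: float, days: int = 365) -> float:
    """Daily transfer rate that yields a given annual transferred fraction."""
    return 1.0 - (1.0 - annual) ** (1.0 / days)


print(f"{annual_fraction(0.0019):.2f}")  # previous setting, 0.19% per day
print(f"{annual_fraction(0.0063):.2f}")  # modified setting, 0.63% per day
```

Under these assumptions, 0.19% day−1 compounds to about 50% per year and 0.63% day−1 to about 90% per year, in line with the values stated in the text.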
High N2O emissions can also result from biases in the parameterization of hydraulic properties. For example, small differences between FC and WSAT lead to frequent triggering of denitrification. To study the role of soil moisture for causing
Where N2O_day,notill and N2O_day,till are the daily N2O emissions in g N ha−1 d−1 for all the days in the year, and n_notill and n_till are the number of days with N2O emissions simulated or observed in the year for no-tillage and tillage, respectively.
The differences in N2O emissions for individual days were calculated as in equation 4: Where N2O_notill and N2O_till are the daily emissions in all years.

The relative difference (RD in %) of no-tillage to conventional tillage was calculated as in equation 5:
where W is the volumetric soil water content (mm). WFPC_FC (fraction) and WFPC_WP (fraction) are the field capacity and wilting point values normalized to WFPS as in equations 7 and 8:
W_FC and W_WP are the water content at field capacity and wilting point, respectively.
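As the equations themselves are not reproduced in this text, the following Python sketch shows one common way to compute the two quantities named above. Both forms are assumptions: the relative difference is taken relative to the tillage value, and WFPS is taken as the water content divided by the water content at saturation (here called `w_sat`, a hypothetical name). The function names are likewise illustrative.

```python
def relative_difference(notill: float, till: float) -> float:
    """Relative difference (%) of no-tillage to tillage, relative to tillage (assumed form)."""
    if till == 0:
        raise ValueError("tillage value must be non-zero")
    return 100.0 * (notill - till) / till


def wfps(w: float, w_sat: float) -> float:
    """Water-filled pore space (fraction): water content over saturated water content.

    Both arguments must use the same unit (e.g. mm); the same normalization
    can be applied to the water content at field capacity or wilting point.
    """
    if w_sat <= 0:
        raise ValueError("saturated water content must be positive")
    return w / w_sat
```

For example, emissions of 12 g N ha−1 d−1 under no-tillage and 10 g N ha−1 d−1 under tillage give a relative difference of 20%.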

Evaluation metrics
To quantify the performance of simulated N2O emissions, we conducted an analysis of coincidence (equation 9) and an analysis of association (equation 10), following Smith and Smith (2007). We calculated the deviation between simulated and observed values as the root mean squared deviation (RMSD in g N ha−1 d−1) for the different sites as in equation 9: To describe how well the dynamics in the observations were captured in the simulations, we calculated the degree of association (r) as in equation 10:

Where O and S are the average observed and average simulated values, respectively, over all years (in g N ha−1 d−1). The significance of r was tested against the null hypothesis r = 0.
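The two metrics can be computed as in the following Python sketch. It assumes the standard, unnormalized definitions (RMSD in the units of the data, and the Pearson correlation coefficient for r); the exact formulation used in the study may differ, and the function names are hypothetical.

```python
import math


def rmsd(observed, simulated):
    """Root mean squared deviation between paired observed and simulated values."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((s - o) ** 2 for o, s in zip(observed, simulated))
    return math.sqrt(squared / len(observed))


def pearson_r(observed, simulated):
    """Pearson correlation coefficient between observed and simulated values."""
    n = len(observed)
    if n != len(simulated) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    o_mean = sum(observed) / n
    s_mean = sum(simulated) / n
    cov = sum((o - o_mean) * (s - s_mean) for o, s in zip(observed, simulated))
    var_o = sum((o - o_mean) ** 2 for o in observed)
    var_s = sum((s - s_mean) ** 2 for s in simulated)
    if var_o == 0 or var_s == 0:
        raise ValueError("correlation is undefined for constant series")
    return cov / math.sqrt(var_o * var_s)
```

A simulation that is offset from the observations by a constant has a non-zero RMSD but r = 1, which is why the two metrics are reported together: one measures coincidence, the other association.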
For soil moisture, the RMSD and r were calculated as well. However, there we focused on one site and calculated the average RMSD and r over all the years, as not much variation in soil moisture is expected between the years.
The N2O emissions were overestimated in the LPJmL.G.Orig experiment when analyzing yearly averages of the different sites (Fig. 1A). This effect was stronger for simulated emissions under no-tillage (RMSD = 36.2 g N ha−1 d−1, r = −0.07) than under tillage (RMSD = 23.6 g N ha−1 d−1, r = −0.31). DayCent was closer to the observed values for both tillage (RMSD = 7.60 g N ha−1 d−1, r = 0.67) and no-tillage (RMSD = 4.61 g N ha−1 d−1, r = 0.66). For the full statistical analyses, we refer to Table A1 in the Appendix.

Using detailed site-specific management information in LPJmL (LPJmL.D.Orig) improved the correlation between the observed and simulated values (Fig. 1B).
When analyzing the effect of tillage (difference between no-tillage and tillage), LPJmL.G.Orig showed an increase in emissions with no-tillage (Fig. 2A), and LPJmL.D.Orig showed both an increase and decrease with no-tillage (Fig. 2B).
The simulations with different management information showed that this information is relevant for the simulated tillage effects on N2O emissions on individual days (Fig. 3). On average, more accurate information on management improved the simulations of differences between conventional and no-tillage systems in LPJmL, except for the site in Colorado. However, there was
For all sites, LPJmL showed a high variability in N2O emissions between days (Fig. 3 and Table A1 in Appendix A). The interquartile ranges from LPJmL simulations were often much wider compared to observations and DayCent simulations. Hence, the variability of no-tillage effects on daily N2O emissions was overestimated. DayCent tended to underestimate the variability of N2O emissions between days (Table A1 in Appendix A).

In LPJmL, the N2O emissions from no-tillage were entirely caused by changes in denitrification, whereas no-tillage mainly caused decreases in N2O emissions from nitrification (Fig. A2 in Appendix A). This can be explained by higher soil
(Fig. 4). These combined effects showed the best performance for both tillage (RMSD = 0.12 (unitless), r = 0.33) and no-tillage (RMSD = 0.14 (unitless), r = 0.48), compared to implementing the modifications separately (
Although the simulation of soil moisture was improved with the modified settings, LPJmL simulations still overestimated soil moisture in comparison to observations.
Table A1).
The modifications did not improve the simulation of N2O emissions after shifting to no-tillage (Fig. 6). Although the deviations of the absolute differences between tillage systems decreased, the correlation with observations was less well captured
6). However, the observations showed both increases and decreases in N2O emissions after shifting to no-tillage for all sites at the yearly aggregation.
indicates that the increase in decomposition rate of (soil) organic matter due to tillage is dominant in comparison to the effect of the increased soil-moisture-driven denitrification rate.

The overall better performance of DayCent likely reflects the years of model development and testing at this scale and its previous application at these sites (except the site in Boigneville) (Campbell et al., 2014; Del Grosso et al., 2008; Yang et al., 2017), which enabled a more accurate reproduction of observed N2O emissions. The testing of the model performance as well as improvements to reproduce observed N2O emissions have been conducted in several studies (Necpálová et al., 2015; Fitton et al., 2014; Del Grosso et al., 2010). For example, model calibration has been conducted to test the model performance based on contributing parameters and key processes that affect N2O emissions. For instance, the maximum amount of N2O emissions produced during nitrification as well as the proportion of nitrified N that is lost as N2O can be specified. LPJmL, by contrast, is developed for global-scale applications and is therefore usually not calibrated, as suitable calibration targets are typically not available at that scale.
The application of LPJmL at the experimental sites provided much insight into the deviations of the tillage effects on N2O emissions from observations. It enabled the use of site-specific information on agricultural management, whereas missing information at the global scale has to be supplemented with assumptions. As detailed information improved the simulation of tillage effects on N2O emissions, advancing the current state of information on agricultural management at the global scale could improve global estimates of tillage effects on N2O emissions. The study also highlighted the potential of improving the simulation of N2O emissions by improving soil moisture dynamics. Any modification to improve LPJmL5.0-tillage needs to be evaluated at the global scale, as LPJmL is typically applied at that scale (e.g. Heinke et al., 2019; Rolinski et al., 2018; Schaphoff et al., 2018). A first recommendation is to revisit the PTF used in LPJmL5.0-tillage. We saw in this exercise that LPJmL overestimated soil moisture independent of the tillage system. Although the modifications in residue cover improved the results on soil moisture, the most important modification was in the hydraulic properties resulting from the PTF. The modifications still resulted in relatively high soil moisture contents, and therefore possibly still in overestimations of N2O emissions. A reason for this could be the relatively inefficient percolation of soil moisture to lower soil layers as soon as soil moisture is higher than FC.
N2O emissions from denitrification increase exponentially when the WFPS exceeds a certain threshold value in LPJmL. This threshold value (which is around 0.8 of WFPS) is a proxy for assuming anaerobic conditions, and is static for all soil texture types. However, finer-textured soils have lower gas diffusivity at a given WFPS than coarser-textured soils (e.g. Del Grosso et al., 2000). In soils with lower gas diffusivity, denitrification is assumed to occur at lower levels of WFPS, because atmospheric O2 may not diffuse into the soil fast enough to fully satisfy microbial demand (Parton et al., 1996). Threshold values for anoxic conditions that are specific to soil texture type are currently not accounted for in LPJmL. In DayCent, the effect of gas diffusivity of different soil texture types is taken into account. An index of gas diffusivity is calculated based on the WFPS, bulk density and FC, which is a proxy for pore size distribution and air-filled pore space. This index influences the denitrification rate (i.e. lower diffusivity increases denitrification) and the N2 to N2O and NOx to N2O ratios. Including such processes in LPJmL might improve simulated N2O emissions. However, this would require suitable reference data in order to parameterize these processes well.

Conclusions
Previous