<p>Agricultural nitrous oxide (N<sub>2</sub>O) emissions account for a non-trivial fraction of the global greenhouse gas (GHG) budget. Estimating N<sub>2</sub>O fluxes from cropland remains challenging because the underlying microbial processes (e.g., nitrification and denitrification) are controlled by complex interactions among climate, soil, plants and human activities. Existing process-based (PB) models have well-known limitations arising from insufficient representation of these processes and poorly constrained model parameters. Machine learning (ML) offers an alternative, but new methods are needed to open its “black box” and overcome its low interpretability, out-of-sample failure and massive data demand. In this study, we developed a first-of-its-kind knowledge-guided machine learning model for agroecosystems (KGML-ag) by incorporating biogeophysical/chemical domain knowledge from an advanced PB model, <em>ecosys</em>, and tested it by simulating daily N<sub>2</sub>O fluxes against observed data from mesocosm experiments. The model structure is built on the Gated Recurrent Unit (GRU). To optimize model performance, we investigated a range of ideas: 1) using initial values of intermediate variables (IMVs), rather than their time series, as model input to reduce data demand; 2) building hierarchical structures that explicitly estimate IMVs and then use them for N<sub>2</sub>O prediction; 3) using multitask learning to balance simultaneous training on multiple variables; and 4) pretraining with millions of synthetic data points generated from <em>ecosys</em> and fine-tuning with mesocosm observations. Six pure ML models were also developed using the same mesocosm data to serve as benchmarks for the KGML-ag model. 
Results show that KGML-ag reproduced the mesocosm N<sub>2</sub>O fluxes well (overall r<sup>2</sup> = 0.81 and RMSE = 3.6 mg N m<sup>−2</sup> day<sup>−1</sup> from cross-validation). Importantly, KGML-ag consistently outperformed the PB model and the pure ML models in predicting N<sub>2</sub>O fluxes, especially for complex temporal dynamics and emission peaks. KGML-ag also goes beyond the pure ML models by providing more interpretable predictions and by pinpointing the new knowledge and data needed to further improve it. We believe the KGML-ag development in this study will stimulate a new body of research on interpretable ML for biogeochemistry and other related geoscience processes.</p>
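<p>The hierarchical, multitask idea described in the abstract can be illustrated with a minimal sketch. It is not the KGML-ag implementation: the class names (<code>GRUCell</code>, <code>HierarchicalModel</code>), the layer sizes, the loss weights and the toy driver values are all hypothetical. The weights are random and untrained, biases are omitted, and the pretraining and fine-tuning steps are not shown. The sketch only shows how a first GRU stage might estimate IMVs from daily drivers plus initial IMV values, how a second stage might use those estimates to predict the N<sub>2</sub>O flux, and how a weighted loss could combine both targets.</p>

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Minimal GRU cell without biases (a simplification for illustration)."""

    def __init__(self, n_in, n_hidden, rng):
        # One weight matrix per gate, applied to the concatenated [input, hidden].
        self.Wz = rand_matrix(n_hidden, n_in + n_hidden, rng)  # update gate
        self.Wr = rand_matrix(n_hidden, n_in + n_hidden, rng)  # reset gate
        self.Wh = rand_matrix(n_hidden, n_in + n_hidden, rng)  # candidate state

    def step(self, x, h):
        xh = x + h  # list concatenation
        z = [sigmoid(a) for a in matvec(self.Wz, xh)]
        r = [sigmoid(a) for a in matvec(self.Wr, xh)]
        xrh = x + [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a) for a in matvec(self.Wh, xrh)]
        # One common convention: z blends the old state with the candidate.
        return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


class HierarchicalModel:
    """Stage 1 estimates IMVs; stage 2 uses them to predict the flux."""

    def __init__(self, n_drivers, n_imv, n_hidden, seed=0):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        self.gru_imv = GRUCell(n_drivers + n_imv, n_hidden, rng)
        self.out_imv = rand_matrix(n_imv, n_hidden, rng)
        self.gru_flux = GRUCell(n_drivers + n_imv, n_hidden, rng)
        self.out_flux = rand_matrix(1, n_hidden, rng)

    def forward(self, drivers, imv_init):
        h1 = [0.0] * self.n_hidden
        h2 = [0.0] * self.n_hidden
        imv_series, flux_series = [], []
        for x in drivers:  # one entry per day
            # Only the initial IMV values are fed in, not an IMV time series.
            h1 = self.gru_imv.step(x + imv_init, h1)
            imv = matvec(self.out_imv, h1)
            # The estimated IMVs become inputs to the flux stage.
            h2 = self.gru_flux.step(x + imv, h2)
            imv_series.append(imv)
            flux_series.append(matvec(self.out_flux, h2)[0])
        return imv_series, flux_series


def multitask_loss(imv_pred, imv_obs, flux_pred, flux_obs, w_imv=0.5, w_flux=1.0):
    """Weighted sum of mean squared errors for the IMVs and the flux."""
    n = len(flux_obs)
    flux_mse = sum((p - o) ** 2 for p, o in zip(flux_pred, flux_obs)) / n
    imv_mse = sum(
        (p - o) ** 2
        for ps, obs in zip(imv_pred, imv_obs)
        for p, o in zip(ps, obs)
    ) / (n * len(imv_obs[0]))
    return w_flux * flux_mse + w_imv * imv_mse


# Toy usage: 5 days, 3 hypothetical drivers per day, 2 hypothetical IMVs.
model = HierarchicalModel(n_drivers=3, n_imv=2, n_hidden=4)
drivers = [[0.1 * t, 0.5, 1.0] for t in range(5)]
imv_series, flux_series = model.forward(drivers, imv_init=[0.2, 0.3])
```

<p>In a real workflow the weights would be learned by minimizing a loss such as <code>multitask_loss</code>, first on synthetic data and then on observations. The relative weights <code>w_imv</code> and <code>w_flux</code> are placeholders for whatever balancing scheme is actually used.</p>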