Budget analysis of a tendency equation is widely utilized in numerical studies to quantify different physical processes in a simulated system. While such analysis is often post-processed when the output is made available, it is well acknowledged that the closure of a budget is difficult to achieve without temporal and/or spatial averaging. Nevertheless, the development of errors in such calculations has not been systematically investigated. In this study, an inline budget retrieval method is first developed in the WRF v3.8.1 model and tested on a 2D idealized slantwise convection case with a focus on the momentum equations. This method extracts all the budget terms following the model solver, which gives a high accuracy, with a residual term always less than 0.1 % of the tendency term. Then, taking the inline values as truth, several offline budget analyses with different commonly used simplifications are performed to investigate how they may affect the accuracy of the estimation of individual terms and the resultant residual. These assumptions include using a lower-order advection operator than the one used in the model, neglecting grid staggering, or following a mathematically equivalent but transformed format of the governing equations. Errors in these post-processed analyses are found mostly over the area where the dynamics are the most active, thus impairing the subsequent physical interpretation. A maximum 99th percentile residual can reach

The atmosphere is a complex system with different scales of motion. Its dynamics are governed by a set of fluid equations based on the fundamental laws of physics. Although the equation set cannot be solved analytically, numerical models can be used to simulate the observed weather and climate systems to improve our understanding of the atmosphere. Due to the complexity and nonlinearity of the numerical models, budget analysis is often employed to interpret the results by quantifying the contribution of each term (i.e., physical process) in a tendency equation that governs the evolution of a certain quantity in the simulated system. The accuracy of a given budget analysis can be estimated from the residual term, defined as the difference between the tendency term on the left-hand side (lhs) of the equation and the summation of all the forcing terms on its right-hand side (rhs). Budget analysis has been performed on diverse properties (e.g., momentum, temperature, water vapor, vorticity) of many systems on various scales, including the Madden–Julian oscillation (MJO; e.g., Kiranmayi and Maloney, 2011; Andersen and Kuang, 2012), tropical cyclones (e.g., Zhang et al., 2000; Rios-Berrios et al., 2016; Huang et al., 2018), squall lines (e.g., Sanders and Emanuel, 1977; Gallus and Johnson, 1992; Trier et al., 1998), supercell thunderstorms (e.g., Lilly and Jewett, 1990), and so on.
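As a minimal sketch of the residual definition given above, the following Python snippet computes the residual at each grid point as the lhs tendency minus the sum of the rhs forcing terms, and then summarizes its magnitude with a 99th percentile. The function names, the nearest-rank percentile rule, and all numerical values are illustrative assumptions and are not taken from the model or from the results of this study.

```python
import math


def budget_residual(tendency, forcings):
    """Residual per grid point: lhs tendency minus the sum of all rhs forcings."""
    return [lhs - sum(terms) for lhs, terms in zip(tendency, zip(*forcings))]


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile of a non-empty list (0 < pct <= 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical values (m s^-2) at four grid points; not taken from the paper.
tendency = [1.0e-4, -2.0e-4, 3.0e-4, 0.5e-4]
pgf = [5.0e-4, -6.0e-4, 1.0e-4, 2.0e-4]    # pressure gradient force
adv = [-4.0e-4, 4.1e-4, 2.0e-4, -1.5e-4]   # advection

residual = budget_residual(tendency, [pgf, adv])
p99 = percentile_nearest_rank([abs(r) for r in residual], 99)
print(residual, p99)
```

In this toy data set the budget closes at three points, and only the second point carries a residual, which therefore sets the 99th percentile.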

Despite the popularity of the budget analysis, it is generally acknowledged that, in model post-processing analysis, obtaining a closed budget with a negligible residual is difficult (e.g., Kanamitsu and Saha, 1996) and has been accomplished mostly in time- or domain-averaged budget calculations (e.g., Lilly and Jewett, 1990; Balasubramanian and Yau, 1994; Arnault et al., 2016; Kirshbaum et al., 2018; Duran and Molinari, 2019). Even in the case of averaged budgets, the residual term that contains non-explicitly diagnosed physics can be larger than the tendency term (e.g., Liu et al., 2016), and many studies simply do not display the residual, making the proper interpretation of the budget analysis difficult.

The “residual analysis method” is sometimes utilized to obtain an indirect estimation of the physical processes that are hard to diagnose or are unresolved in a set of analysis or observational data. In such cases, a non-negligible residual is sometimes used to gain insight into such processes. However, as just discussed, the residual term also contains the inaccuracies associated with the calculations within the budget analysis (e.g., Kornegay and Vincent, 1976; Abarca and Montgomery, 2013). It is thus unclear whether the unresolved physics in such data sets do indeed comprise the main component of the residual without considering the contributions of other sources of errors in the budget calculation (Kuo and Anthes, 1984). Whereas it is almost impossible to separate the subgrid-scale, unresolved processes from other errors in reanalysis or observational data (e.g., Hodur and Fein, 1977; Lee, 1984), the focus of this study is on numerical model data where the local tendency and all the associated resolved and parameterized physics can be obtained from the model. Thus, the residual term in this study specifically refers to errors in the budget calculation.

To reduce the residual, an inline budget analysis that extracts all the terms of a prognostic equation directly from the model during its integration is generally the most accurate. However, the procedure has been reported only in a few studies (e.g., Zhang et al., 2000; Lehner, 2012; Moisseeva, 2014; Moisseeva and Steyn, 2014; Potter et al., 2018; see Appendix A for a summary and comparison among these works). Most other studies still conduct the offline or post-processing budget analysis when the output is made available after the model integration. Some specific suggestions have been given in the past regarding how to reduce the error of post-processed budget analysis. For example, Lilly and Jewett (1990) emphasized the importance of evaluating terms using the same differencing scheme, grid stretching, and grid staggering as those used in the simulation model. However, it is uncertain whether these rules have been widely followed, and how much of a reduction in residual can be obtained with this approach.
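To illustrate why the choice of staggering matters for a post-processed derivative, the sketch below compares two second-order estimates of a gradient on a periodic one-dimensional grid: a staggered difference across one grid interval, and an unstaggered centered difference across two grid intervals. The sinusoidal test field, the grid size, and all names are assumptions made for this example only. The errors are measured against the analytic derivative rather than against any model output, so the snippet demonstrates only the difference in truncation error and not the residuals reported in this study.

```python
import math

n = 32                      # number of grid cells (assumed)
dx = 2.0 * math.pi / n      # grid spacing on a periodic domain of length 2*pi
k = 4                       # integer wavenumber of the test field (assumed)


def p(x):
    """Test field, standing in for a scalar stored at mass points."""
    return math.sin(k * x)


def dpdx(x):
    """Analytic derivative of the test field."""
    return k * math.cos(k * x)


mass_x = [(i + 0.5) * dx for i in range(n)]   # cell centers
face_x = [i * dx for i in range(n)]           # cell faces (staggered points)
p_mass = [p(x) for x in mass_x]

# Staggered estimate: difference of the two mass points adjacent to each face.
# Index -1 wraps to the last cell, which is correct for a periodic domain.
grad_stag = [(p_mass[i] - p_mass[i - 1]) / dx for i in range(n)]

# Unstaggered estimate: centered difference over 2*dx at each mass point.
grad_unstag = [(p_mass[(i + 1) % n] - p_mass[i - 1]) / (2.0 * dx) for i in range(n)]

err_stag = max(abs(g - dpdx(x)) for g, x in zip(grad_stag, face_x))
err_unstag = max(abs(g - dpdx(x)) for g, x in zip(grad_unstag, mass_x))
print(err_stag, err_unstag)
```

Both operators are formally second-order accurate, but the unstaggered one spans twice the distance and therefore has a larger truncation error for the same field.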

In some post-processed budget analyses, transformed equations with different assumptions from those in the model are used and naturally lead to errors in the budget results. On the other hand, even when the same form of the equations is followed, errors can still arise from multiple sources during the post-processing. Some errors are inherent in the time discretization scheme of the model, some are traced to the numerical methods in solving the temporal or spatial derivatives with finite differencing (e.g., Kuo and Anthes, 1984), and others might emerge during the interpolation or extrapolation from model grids to analysis grids (e.g., Lilly and Jewett, 1990). Because the tendency term is often the small net result of cancellation among competing forcing terms, the seemingly non-dominant terms may be as important as the large forcing terms in determining the sign and the value of the tendency. Thus, an incorrect estimation of even a small term may result in a residual with magnitude comparable to the tendency term, hindering the subsequent physical interpretation.
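The sensitivity to cancellation can be made concrete with a short arithmetic sketch. The magnitudes below are hypothetical and chosen only so that two forcing terms nearly cancel; they do not come from the simulations in this study.

```python
# Hypothetical forcing magnitudes (m s^-2); two large terms nearly cancel.
pgf_true = 1.00e-3
adv_true = -0.95e-3
tendency = pgf_true + adv_true            # small net tendency, 5e-5

# Suppose the post-processed advection is overestimated in magnitude by 5 %.
adv_estimated = adv_true * 1.05

residual = tendency - (pgf_true + adv_estimated)
relative_residual = abs(residual) / abs(tendency)
print(f"residual = {residual:.3e}, relative to tendency = {relative_residual:.2f}")
```

Under these assumed values, a 5 % error in a single forcing term produces a residual nearly as large as the tendency itself.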

A few models, such as the Cloud Model 1 (CM1; Bryan and Fritsch, 2002) and the High Resolution Limited Area Model (HIRLAM; Undén et al., 2002), include inline budget diagnoses that users can choose to include in the model output. However, many other commonly used models (e.g., Fifth-Generation NCAR/Penn State Mesoscale Model (MM5; Grell et al., 1994), Weather Research and Forecasting Model (WRF; Skamarock et al., 2008), the Advanced Regional Prediction System (ARPS; Xue et al., 2000, 2001), and the Regional Atmospheric Modeling System (RAMS; Pielke et al., 1992)) do not have this capability. In this study, we develop an inline momentum budget retrieval tool in the Advanced Research WRF model, one of the most widely used numerical weather prediction models. During the period 2011–2015, there were on average 510 peer-reviewed journal publications involving WRF per year (Powers et al., 2017). Given the widespread use of WRF for both real-case and idealized modeling, such a budget tool may prove useful in numerous applications. In our budget diagnosis, each contributing term is extracted during the model integration and stored as a standard output. In so doing, we essentially solve the prognostic variables as done in the model so that the two sides of the tendency equation are always in balance regardless of the output time interval. By taking the results from the inline budget analysis as truth, we then perform several different post-processing budget analyses with commonly made simplifications or a different format of equation. Comparisons between the post-processed budgets and the inline/true values are made to investigate the potentially large errors in each forcing term and the resultant residuals.
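The idea of extracting each term during the integration can be sketched with a toy model. In the snippet below, each forcing routine adds its contribution to a shared accumulator, and the budget term is recovered as the difference in the accumulator before and after the call. The routine names, the coefficients, and the single forward time step are assumptions for illustration; this is not the WRF code, and it omits the RK3 and small-step structure of the actual solver.

```python
def add_damping(state, rhs):
    """Hypothetical forcing routine: linear damping added to the accumulator."""
    rhs["u"] += -1.0e-4 * state["u"]


def add_constant_push(state, rhs):
    """Hypothetical forcing routine: constant acceleration added to the accumulator."""
    rhs["u"] += 2.0e-4


FORCINGS = [("damping", add_damping), ("push", add_constant_push)]


def step_with_budget(state, dt):
    """Advance one forward step and return the budget terms acting over it."""
    rhs = {"u": 0.0}
    budget = {}
    for name, routine in FORCINGS:
        before = rhs["u"]
        routine(state, rhs)
        budget[name] = rhs["u"] - before     # contribution of this routine alone
    u_old = state["u"]
    state["u"] = u_old + dt * rhs["u"]
    budget["tendency"] = (state["u"] - u_old) / dt
    budget["residual"] = budget["tendency"] - sum(budget[n] for n, _ in FORCINGS)
    return budget


state = {"u": 1.0}
for _ in range(10):
    budget = step_with_budget(state, dt=60.0)
print(budget)
```

Because the terms are taken from the same quantities that advance the state, the residual in this sketch stays at the level of floating-point rounding at every step, independent of how often the budget is written out.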

The WRF configuration used in this study is a two-dimensional [(

To ensure conservation properties, the model equations are formulated in flux form, with the prognostic variables coupled with

To develop an inline budget retrieval tool, it is important to understand how these prognostic variables are advanced in the WRF model. Governing equations are first recast to perturbation forms with respect to a dry hydrostatically balanced reference state that is a function of height only (defined at initialization) to reduce truncation errors and machine rounding errors. Specifically, variables of

Based on Skamarock et al. (2008), Fig. 1 summarizes the WRF integration strategy. The integration is wrapped by a third-order Runge–Kutta (RK3) scheme, in which the prognostic variables (generalized as

The time integration strategy for advancing a state variable (generalized as

The main discussion of this study will focus on a 2D (

Figure 2 shows the 48 h evolution of the 99th percentiles of

Evolution of the 99th percentiles of

For the inline budget analysis, all the terms are retrieved directly from the model for all the integration time steps, and therefore they represent the “instantaneous” terms that act over the specified short integration time window. For the large-step forcing, the WRF model accumulates all forcing terms at the beginning of each RK3 step. To separate them, we simply take the difference before and after WRF calls the subroutine for each large-step forcing, store their values separately, and output only the values at the third RK3 step (the total forcing is

Figures 3 and 4 present the results of the inline budget analysis for horizontal momentum and vertical momentum, respectively, at three selected times (6, 12, and 16 h). To demonstrate the momentum changes in a common physical unit (velocities; meter per second), every term of the flux-form budget equation shown in this paper is divided by the dry-air mass,

Inline budget analysis of horizontal momentum,

Inline budget analysis of vertical momentum,

In contrast to extracting terms directly from the model during its integration, most of the studies in which the momentum budget analysis is conducted use the model output files after the completion of the integration. Note that since the sub-output time-step information is not available between successive outputs, only the large-step forcing terms can be estimated in these post-processed budget analyses. Generally, the neglect of the acoustic or small-step modes is expected to have little impact on the results as the high-frequency modes are often considered meteorologically insignificant. However, it is mentioned in Klemp et al. (2007) and Skamarock et al. (2008) that the WRF small-step integration scheme includes not only the acoustic-wave but also some gravity-wave modes, which may not be insignificant. These gravity-wave modes form during the small-step integration due to the designated terms that are required for acoustic-wave propagation and “Consequently, in this vertical coordinate (i.e., terrain-following hydrostatic pressure coordinate), the terms governing the acoustic and gravity wave modes are intermingled to the extent that it does not appear feasible to evaluate any of the gravity wave terms on the large time steps, even if one desired to do so” (Klemp et al., 2007).

Most of the studies did not reveal the complete details about how their analysis was done, so we cannot presume their methodologies and the possible errors. However, a few simplifications commonly made in the post-processed budget analyses may introduce errors that result in deviations from the simulated results and thus a significant residual. Below we revisit the relevant features of the WRF model that should be considered and discuss how they might affect the post-processed budget if they are ignored. Then, the results are shown for different post-processed budget analyses with different simplifications (Table 1). The aim herein is to identify these potential errors hidden in the budget calculation and show how severely they affect the resulting interpretation.

A summary of all different approaches for the post-processed horizontal momentum budget analysis that are applied to the model output after the integration finishes.

In a post-processed budget analysis, the tendency term of a given variable is approximated by the difference between the value of this variable at two successive output times divided by the output time interval. Thus, the accuracy may be sensitive to the output time interval. The value at the predicted state has a form of

For computational efficiency and accuracy, WRF utilizes a C-grid staggering system (Arakawa and Lamb, 1977). This staggering system is pertinent to the numerical solution for spatial derivatives. For most of the spatial derivatives other than advection (e.g., the pressure gradient force), the second-order finite difference operator is used in the WRF model. For example, the

Evolution of the 99th percentile of the residual magnitude (meters per second squared) of the horizontal momentum

If the C-grid staggering is not considered during the post-processing analysis, i.e., all the variables have been interpolated on the universal grids before carrying out the budget calculation, in addition to the potential errors brought on by the interpolation method, the term

For advection, higher-order operators for finite differencing are provided as the default WRF setup. Taking the

Conceptually, the WRF model can be considered more of a forward scheme, i.e., using the known variables from the current state to calculate the forcing and then advancing the variables forward until reaching the prediction time. However, there are a few implicit components during the integration. For example, as discussed in Sect. 2.1, the large-step forcings are updated using a predictor–corrector method in the second and third RK3 steps. In addition, the

In numerical analysis for solving ordinary differential equations, the (explicit) forward Euler method approximates the change of a system from

Schematic plot showing the explicit (forward) and implicit (backward) solvers for the rhs forcing terms, as well as the diagnosed and the true (calculated inline during the integration of the model) lhs tendency term defined in this study.

If

The above two diagnostic methods estimate the forcing terms using instantaneous states. However, as mentioned in Sect. 3.2.1(a), the diagnosed lhs tendency depends on two successive model output times. Thus, an average of the forcings diagnosed explicitly and implicitly is often considered. For a post-processed analysis, this translates into estimating the forcings using both predicted states and the most recent prior available current states:

While the momentum equations solved in the WRF model are in flux form, their corresponding advective forms can be derived and are often used for post-processed budget analyses for convenience. To derive the advective form, the flux-form

Table 1 summarizes all the post-processed budget analyses tested in this study. In the present section, we first present the results one by one, and then a qualitative intercomparison among them and the inline retrieval method is discussed. The first post-processed method (POST10min-E) for

The difference between the post-processed (POST10min-E; with an explicit or forward method on 10 min output) and inline budget analysis for the horizontal momentum,

The second post-processed analysis (POST1min-E) is done following the same approach but applied to the 1 min (same as the integration time step for this simulation) output data, and the results show strongly reduced errors in all terms (Fig. 9). The errors that remain are mostly in the PGF term and likely stem from the fact that the small-step modes and the RK3 integration scheme are not considered in the post-processed budget. These inherent errors result in a small residual term with a general order of

Same as Fig. 8, but the post-processed budget analysis is applied to the data with an output time interval of 1 min (POST1min-E).

Given that computational cost is often a major consideration, we also test whether the implicit or backward Euler method (POST10min-I) can improve the estimation of instantaneous forcing terms relative to the explicit method for the same 10 min output data (POST10min-E). POST10min-I follows the same strategy as POST10min-E except that all the rhs terms, following Eq. (13), are diagnosed with the predicted states instead of the previous output states. As depicted in Fig. 10, POST10min-I does indeed better capture the true model estimated forcing values as errors in all the rhs forcing terms diminish greatly to an accuracy similar to POST1min-E. However, as these forcings are calculated at a given instant, the imbalance of the budget would remain if the diagnosed tendency term is not calculated instantaneously (the second column from the right in Fig. 10). Therefore, if budget analysis at an instant of time is desired, we recommend adding the tendency calculation within the model as a standard output and diagnosing the forcing terms implicitly, which yields a residual term on a similar order to the one obtained in POST1min-E (the rightmost column in Fig. 10 and Fig. 5a).

Same as Fig. 8, but the post-processed rhs terms are diagnosed using the implicit or backward method (POST10min-I), and an extra column is added on the far right showing the residual from the true tendency (i.e., the instantaneous value obtained from the model).

For the more common situation, the post-processed analyses diagnose rhs terms using two successive outputs over an output time interval, i.e., taking the averages of the explicitly and implicitly calculated forcings using Eq. (15) on the 10 min output (POST10min-(E+I)/2). Comparing the averaged rhs forcings with the analogously diagnosed lhs momentum tendency (Eq. 8) gives a small residual to a similar accuracy level as POST1min-E and POST10min-I (the rightmost column in Figs. 11 and 5b).
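The benefit of averaging the explicitly and implicitly diagnosed forcings can be illustrated with a toy problem. The sketch below uses a linearly damped variable whose exact solution stands in for two successive model outputs; the damping rate, the 10 min interval, and all names are assumptions for this example and do not represent the WRF momentum equations.

```python
import math

k = 1.0e-3        # hypothetical damping rate (s^-1)
dt_out = 600.0    # output interval (s), i.e., 10 min


def forcing(u):
    """Rhs forcing of the toy equation du/dt = -k * u."""
    return -k * u


# Two successive "outputs", taken from the exact solution of the toy equation.
u_prev = 1.0
u_next = u_prev * math.exp(-k * dt_out)

# Diagnosed lhs tendency from the two outputs.
tendency_diag = (u_next - u_prev) / dt_out

# Forcing diagnosed from the earlier state (explicit), the later state
# (implicit), and their average.
forcing_explicit = forcing(u_prev)
forcing_implicit = forcing(u_next)
forcing_average = 0.5 * (forcing_explicit + forcing_implicit)

res_explicit = tendency_diag - forcing_explicit
res_implicit = tendency_diag - forcing_implicit
res_average = tendency_diag - forcing_average
print(res_explicit, res_implicit, res_average)
```

In this toy case the averaged forcing gives a residual roughly an order of magnitude smaller than either instantaneous estimate, because the diagnosed tendency represents the mean change over the interval rather than the rate at either end.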

Same as Fig. 8, but the forcing terms diagnosed in the post-processed budget analysis are the averages of explicit and implicit methods (POST10min-(E+I)/2). To represent the same time window as the post-processed analysis, the inline budget results used here for the difference calculation are the 10 min averages (corresponding to the output interval) instead of the instantaneous values.

We now investigate the impact of other common simplifications on top of the reference experiment, POST10min-(E+I)/2. The first such simplification is to approximate the flux-form advection term using the second-order operator (Eq. 9) for both vertical and horizontal components (POST2oadv-(E+I)/2) instead of the third- and fifth-order operators as used in the model setup. In our simulation, such inconsistency of advection operators introduced errors in the ADV term with a maximum value

Same as Fig. 11, but the post-processed analysis uses a second-order operator for advection calculation (POST2oadv-(E+I)/2).

Finally, a different format of the

Same as Fig. 11, but the post-processed analysis does not consider the C-grid staggering (POSTnonstag-(E+I)/2).

Same as Fig. 11, but the post-processed analysis is applied using the advective-form equation (POSTadvF-(E+I)/2).

The difference between the post-processed (POST1min-E) and the inline budget analysis for vertical momentum

A quantitative comparison of the 99th percentile of the magnitude of the residual term in the domain (excluding the boundaries) among different analysis methods is shown in Fig. 5. The residuals between the instantaneously diagnosed forcings and the true model tendency term (calculated inline) are shown in Fig. 5a while the ones between the averaged forcings of two consecutive outputs and the diagnosed tendency term are shown in Fig. 5b. The evolution of the 99th percentile residual shows generally larger magnitudes when the momentum tendency is larger (Fig. 2b), suggesting that these errors may amplify in stronger convection cases. While the post-processed budget analysis in POST1min-E, POST10min-I, and POST10min-(E+I)/2 can achieve a relatively small 99th percentile residual (peak at

For the

The application of POST1min-E for the

The growth of the residual as the convection intensifies (Fig. 5) motivates a test for a different case with stronger momentum tendencies. A WRF idealized 2D squall line test case (em_squall2d_y; Skamarock et al., 2008) is selected with a horizontal resolution of 250 m and 3 s integration time step, and the simulation is integrated for 1 h. A subgrid turbulence scheme based on the prognostic turbulent kinetic energy equation is activated (diff_opt=2 and km_opt=2; Skamarock et al., 2008, chap. 4.2.4). The simulated

Upper row shows the inline budget analysis of horizontal momentum,

While an increase in spatial resolution often requires a shorter integration time step for numerical stability and may result in stronger simulated convection, it is almost impossible to separate all these factors. We can, however, conduct the same slantwise convection simulation with a higher resolution of 2 km (and a shorter integration time step of 10 s) to exclude the effect of different physical processes in different systems and discuss the changes in the accuracy of the budget analysis when spatial resolution is increased from 10 km. As shown in Fig. 2b, in the 2 km simulation the maximum of the simulated 99th percentile

The results presented above suggest that the relative magnitude of errors in budget analysis varies with different systems or cases. Furthermore, while the absolute errors in the inline momentum budget analyses do indeed increase with increasing horizontal resolution, the relative magnitude with respect to the simulated tendency does not increase substantially. The accuracy of the post-processed budget analysis using the averages of two consecutive model outputs is highly dependent on the ratio of the output interval to the integration time step. A ratio of 10, as used in POST10min-(E+I)/2, results in an acceptable accuracy (99th percentile residual of about 7 % of the tendency), while a lower value of 6 is required for high-resolution simulations (e.g., the 2 km case) to reach a similar accuracy. For cases with more complex physical processes, like the squall line test case, the inline budget retrieval appears necessary for adequate budget closure.

Budget analysis is a commonly employed tool in numerical studies to understand the underlying mechanisms for certain simulated features of interest. However, many studies still have difficulties in achieving a balanced or closed budget, especially when a full-physics model is used and when the budget is calculated instantaneously over a local area. Aside from the complexity of various (some implicit) parameterization schemes, the main challenge in closing the budget involves the analysis of post-processed data using algorithms that are inconsistent with the model solver. In this study, an inline momentum budget retrieval tool is developed for the WRF model, and its advantages for momentum budget analysis are demonstrated. The 99th percentile residual obtained from this inline retrieval is always smaller than or about 0.1 % of the actual tendency term in all the tested cases, which include idealized, 2D simulations of slantwise convection and squall lines. Taking the results from the inline retrieval as “truth”, we investigate the potential errors in each term and the resultant residual for post-processed budget analyses under different assumptions.

The comparison among different post-processed diagnoses is focused on the horizontal momentum (

For the rhs forcing terms in the

Instead of performing the calculation using model output at one given instant, a more general post-processed budget analysis can use two successive model outputs (POST10min-(E+I)/2). This method seems to work well with the 99th percentile residual being about 7 % of the 99th percentile

Three other common assumptions in post-processing analysis are made on top of the POST10min-(E+I)/2 to examine their potential impacts on the accuracy of the horizontal momentum budget analysis. First, utilizing an advection operator with a lower order than the one used in the model setup degrades the accuracy of the advection term with up to 50 % error over the area where the advection is the strongest (POST2oadv-(E+I)/2). Second, the neglect of the staggering grids would negatively impact the estimation of all the spatial differential terms, leading to a widespread residual of at least 30 % of the local tendency (POSTnonstag-(E+I)/2). Last, when the advective form of the momentum equation is used for post-diagnosis rather than the flux form, although it is mathematically equivalent to the flux form solved in the model solver, a strong negatively biased residual results (POSTadvF-(E+I)/2). Both POSTnonstag-(E+I)/2 and POSTadvF-(E+I)/2 give a peak 99th percentile residual of about

While the post-processed

In summary, different assumptions or simplifications made in a post-processed budget analysis may severely impact the estimation of each forcing term and result in a large imbalance of the budget. Based on our experiments, we conclude that the inline retrieval method like that developed herein is the most reliable one for budget analysis in numerical studies. While the budget analyses shown in this study are only for

The standard version of WRF v3.8.1 is publicly available at

TCC designed and performed the numerical experiments under the supervision of MKY and DJK. MKY proposed the idea of comparing the inline and post-processed budget analyses. TCC developed the code of the inline budget retrieval tool in the WRF v3.8.1 model and the post-processed analyses. DJK provided useful suggestions to improve the work. TCC prepared the paper and all co-authors contributed to the writing and editing of the paper.

The authors declare that they have no conflict of interest.

We thank Patrick Hawbecker, Ian Dragaud, and one anonymous reviewer for their valuable comments and suggestions that helped to improve this study.

This research has been supported by the NSERC/Hydro-Quebec Industrial Research Chair program and the Fonds de recherche du Québec – Nature et technologies (FRQNT) doctoral research scholarship grant.

This paper was edited by Chiel van Heerwaarden and reviewed by Patrick Hawbecker and one anonymous referee.