Articles | Volume 14, issue 9
Model description paper
24 Sep 2021
Model description paper |  | 24 Sep 2021

UBER v1.0: a universal kinetic equation solver for radiation belts

Liheng Zheng, Lunjin Chen, Anthony A. Chan, Peng Wang, Zhiyang Xia, and Xu Liu

Recent proceedings in radiation belt studies have proposed new requirements for numerical methods to solve the kinetic equations involved. In this article, we present a numerical solver that can solve the general form of the radiation belt Fokker–Planck equation and Boltzmann equation in arbitrarily provided coordinate systems and with user-specified boundary geometry, boundary conditions, and equation terms. The solver is based upon the mathematical theory of stochastic differential equations, whose computational accuracy and efficiency are greatly enhanced by specially designed adaptive algorithms and a variance reduction technique. The versatility and robustness of the solver are exhibited in four example problems. The solver applies to a wide spectrum of radiation belt modeling problems, including the ones featuring non-diffusive particle transport such as that arising from nonlinear wave–particle interactions.

1 Introduction

In the space plasma environment, radiation belts refer to torus-shaped regions surrounding Earth and other magnetized planets that are filled with highly energetic charged particles trapped in the planetary magnetic field. Since their discovery (Van Allen and Frank1959; Vernov et al.1959), radiation belts have been the focus of intense research due to the innumerable unknowns concerning their extremely dynamic behavior and their damaging effects on spacecraft (e.g., Baker2000; Welling2010). During slowly changing conditions, radiation belt particles undergo three types of periodic motion: gyration about field lines, bounce along field lines, and drift about the planet. With each periodic motion there is a corresponding adiabatic invariant, defined through the Hamiltonian action integral (e.g., Goldstein1980, chap. 10), that is only violated when the conditions are undergoing change on timescales shorter than the period. A widely adopted method to study the dynamics of radiation belts is to solve a kinetic equation describing the evolution of particle phase space density. In quasi-linear theory, this kinetic equation is usually a Fokker–Planck equation that takes the general covariant form (Schulz1991)

(1) f ¯ t = 1 G Q α G D α β f ¯ Q β - 1 G Q α G h α f ¯ + S f ¯ + v ,

where f¯ is the phase-averaged phase space density, G=det(JαQβ) is the Jacobian determinant for the transformation from canonical action variables Jα (α=1,2,3) to the generalized coordinates Qα, and Dαβ, hα, S, and v are coefficients of the equation. Summation on repeated Greek indices is implied throughout this paper. In different radiation belts, the number of terms emerging on the right-hand side of Eq. (1) and their respective physical backgrounds may be different. For the Earth's outer radiation belt, the second and the fourth terms are usually missing; the first term represents diffusion caused by wave–particle interactions, and the third term is often a loss characterized by the particle lifetime (e.g., Li and Hudson2019, and the reference therein). In the low-altitude inner radiation belt where wave–particle interactions are not as significant, the first and second terms are often provided by the diffusion and dynamic friction caused by inter-particle Coulomb collisions (e.g., Selesnick2012), and the fourth term may be a source from cosmic ray albedo neutron decay (CRAND; e.g., Selesnick2015; Li et al.2017). For radiation belts of the gas giants, all terms could be present (e.g., Horne et al.2008; Lorenzato et al.2012). The first two terms may be attributable to both wave–particle interactions and inter-particle collisions and, in addition, synchrotron radiation, which is negligible in Earth's radiation belts, bleeds energy for the ultra-relativistic electrons, and thus also contributes to the second term (e.g., Bolton et al.2002, 2004). The third term could represent the moon-sweeping loss, and the fourth term may come from moon volcanic activities as a plasma source (e.g., Nénon et al.2017).

In some circumstances, the dependence of phase space density on certain phases φι could be discerned, and the radiation belt kinetic equation takes the form

(2) f ¯ t + φ ˙ ι f ¯ φ ι = 1 G Q μ G D μ ν f ¯ Q ν - 1 G Q μ G h μ f ¯ + S f ¯ + v ,

where a dot over φι indicates its time derivative. The phase space density f¯ here is only averaged over the phases varying faster than φι, and the mutually exclusive indices ι and μ together form the complete set of α. The most common situation is perhaps the dependence of f¯ on the drift phase φ3, which in the Earth's outer radiation belt may be caused by the wave activity dependence on magnetic local time (e.g., Shprits et al.2009) and in the inner belt by the longitudinal variation of drift shell altitude (e.g., Tu et al.2010; Xiang et al.2019). With the spatial derivative term on the left-hand side, Eq. (2) appears as a Boltzmann equation for f¯, but by Hamiltonian mechanics, the conjugating term Q˙ι(f¯/Qι) should have also appeared on the left-hand side. Its absence is due to the fact that, for particles in the radiation belt energy range, the drift-phase-dependent electric potential energy is usually negligible in the unperturbed particle Hamiltonian so that φ3 becomes a cyclic variable. Appendix A provides a more comprehensive explanation of this equation, and we will return to the general case in which φ3 is not cyclic in the “Conclusions and discussion” section.

Various numerical models have been built to solve a specific form of either Eq. (1) or (2) (e.g., Beutier et al.1995; Selesnick et al.2003; Tao et al.2008; Albert et al.2009; Subbotin et al.2010; Tu et al.2013; Wang et al.2017; Xiang et al.2020, to name a few). Though their underlying numerical schemes might not restrict, these existing models have in practice been implemented with hard-coded choices of coordinates, relatively simple boundary geometry, and a roughly fixed number of equation terms; therefore, each model is applicable to a specific set of problems. This situation could become quite inconvenient when adiabatic invariants of particle motion are used as coordinates of phase space to model radiation belt dynamics, as promoted by Schulz (1996). The reasons are twofold: first, due to their vast range of magnitude and dramatically varying resolution, adiabatic invariant coordinates often require some kind of rescaling and transformation (e.g., Zheng et al.2014), specific to the problem, to be computationally efficient; second, boundary geometry becomes complicated in adiabatic invariant coordinates, which could be challenging for finite-difference methods and led Subbotin and Shprits (2012) to seek new coordinates from combinations of the adiabatic invariants. However, the use of adiabatic invariant coordinates is crucial for some compelling problems in the radiation belts: for example, the mechanisms of storm-time electron loss in which adiabatic modulations due to magnetic field configuration change must be separated from non-adiabatic processes (Kim and Chan1997; Turner and Ukhorskiy2020) and the relative significance of Earthward diffusion versus CRAND as a possible inner belt electron source with the drift shell splitting effect contributing (Cunningham et al.2018). It is the purpose of this article to present a numerical code, named UBER (for “universal Boltzmann equation solver”), that solves Eqs. (1) and (2) in an arbitrarily user-specified coordinate system up to three dimensions, with great freedom in specifying boundary geometry and boundary conditions and with various combinations of equation terms. Therefore, it is expected that the solver can be applied to a wide spectrum of radiation belt modeling problems. More importantly, the freedom of specifying equation terms implies that, in an asymptotic manner, UBER can even solve the integro-differential kinetic equations arising from non-diffusive particle transport, such as that formulated in Artemyev et al. (2018) for nonlinear wave–particle interactions, and thereby provides a viable means to incorporate non-diffusive transport into global radiation belt modeling.

The underlying mathematical theory of the solver is stochastic differential equation (SDE) theory. The SDE method had been utilized by Tao et al. (2008), Selesnick et al. (2013), and Zheng et al. (2014) in their modeling of radiation belts. The method is grid-free and enjoys unparalleled advantages in dealing with cross-diffusion components and complicated boundary geometry (e.g., Zheng et al.2016), but it is meanwhile notorious for low efficiency ascribed to its Monte Carlo nature. In this article, we also describe specially designed numerical techniques that have enhanced the computational speed of the SDE method by an order of magnitude, thus making the solver much more affordable for large-scale simulations. Four example problems with distinct physical backgrounds are provided in this article to demonstrate the abilities and versatility of the solver.

2 Mathematical theory

The kinetic equations given as Eqs. (1) and (2) above are parabolic partial differential equations (PDEs). Written in the Kolmogorov backward form (see below), a parabolic PDE corresponds to a multidimensional SDE that describes the motion of an Itô stochastic process whose certain functional expectation satisfies the PDE; the PDE can then be solved by calculating path integrals of the corresponding stochastic process (e.g., Freidlin1985; Øksendal1998).

Let us consider the following partial differential problem composed of a Kolmogorov backward equation and a general set of initial and boundary conditions:


where t and α are shorthand for the partial differentials with respect to t and the αth coordinate, respectively. In Eqs. (4)–(6), Ω¯ denotes the closure of the domain and ∂Ω its boundary. In particular, 1Ω represents the boundary pieces of the first type (Dirichlet) boundary condition, and Ω\1Ω indicate the boundary pieces excluding those in 1Ω, which are of the second (Neumann, λ≡0) or the third type (Robin, λ≠0) boundary conditions. The unit vector γ^ points into Ω¯ and is not tangent to the local boundary.

The mathematical theory of SDEs establishes a relation between Eqs. (3)–(6) and the Itô stochastic process, whose spatial positions are denoted by the random variable Xs in Ω that obeys the reflected SDE:

(7) d X s = b ( t - s , X s ) d s + σ ( t - s , X s ) d W s + γ ^ ( X s ) d k s ,

where the dot product on the right-hand side is between a rank-2 tensor and a vector, and the parameter s runs from 0 to t so that the stochastic process retrogrades in time from t to 0. The first term on the right-hand side describes the ballistic part of its motion. The second term describes the stochastic part, with the coefficient tensor σ satisfying σσT=a (whose components are aαβ). Note that this condition does not uniquely determine σ, but all satisfying σ tensors are equivalent (Levi's theorem; Freidlin1985; Zheng et al.2014). Ws is a vector Wiener process of the same dimensions as Xs, with each dimension an independent Gaussian stochastic variable that has zero mean and variance s. The third term describes reflection of the stochastic process in the direction given by γ^ on the boundary Ω\1Ω, and ks is a monotonic stochastic variable that only increases when the stochastic process is on that boundary to force Xs to stay in Ω¯. ks can thus be considered a measure of the time that the stochastic process spent on Ω\1Ω and hence has the name local time. The Itô process stops either in Ω¯ when s=t or on 1Ω at s=τ<t.

A formal solution of the problem in Eqs. (3)–(6) is given by the Feynman–Kac formula (e.g., Kac1949; Øksendal1998; Klebaner2005):

(8) f ( t , x ) = E [ F t , x [ X s ] ] ,

in which 𝔼 is the expectation operator, and t,x[Xs] is a functional of the stochastic path Xs started from t and x that has the expression

(9) F t , x [ X s ] = I τ t g 0 ( X t ) exp 0 t c ( t - s , X s ) d s - 0 t λ ( t - s , X s ) d k s + I τ < t g 1 ( t - τ , X τ ) exp 0 τ c ( t - s , X s ) d s - 0 τ λ ( t - s , X s ) d k s + 0 t τ u ( t - s , X s ) exp 0 s c ( t - r , X r ) d r - 0 s λ ( t - r , X r ) d k r d s ,

where the symbol 𝕀τt is equal to 1 when τt, which means the stochastic process has stopped in Ω¯ before it had a chance to reach 1Ω, and zero otherwise; tτ means the smaller between the two. Physically, the functional t,x[Xs] is a propagator of contribution carried along the stochastic path from either the initial condition or the first type boundary condition to the point of solution, and the exponential functions indicate how this contribution enhances or decays along this path.

To formally solve the Fokker–Planck equation (Eq. 1) by the Feynman–Kac formula (Eq. 8), it remains to transform the equation together with its proper initial and boundary conditions into the form of Eqs. (3)–(6). To this end, directly expanding Eq. (1) and collecting terms with the same differentiation order yields its Kolmogorov backward form:

(10) t f ¯ = D α β α β f ¯ + ( β D α β + D α β β ln G ) - h α α f ¯ + S - ( α h α + h α α ln G ) f ¯ + v .

Comparing Eq. (10) with Eq. (3) and taking Qα equivalent to xα, we thus have the correspondences of coefficients:

(11) a = 2 D , b = D - h , c = S - h , u = v ,

where in curvilinear coordinates, the divergence operator on a tensor field Γ is

(12) Γ = α Γ α + Γ α α ln G ,

in which the dots stand for all other indices irrelevant to the operation, and the terms αln G come from summation of the Christoffel symbols in a covariant derivative (e.g., Mathews and Walker1970, chap. 15). It is worth remarking that h appears in the expression for b so that the Itô process travels against the advection velocity. This is indeed the case since it is time-backwards. Also, from the expression for c, divergence of the advection serves as a loss of phase space density.

Initial and boundary conditions to Eq. (1) are transformed as follows. For the initial condition and the first type boundary condition, values of f¯(t,x) are specified just as in Eqs. (4) and (5). For a flux boundary condition of the form Φ=g2(t,x)f¯, we note that the outward flux Φ across a boundary is given by (n^Df¯-n^hf¯), with n^ the unit inward normal vector of Ω\1Ω. Therefore, the corresponding boundary condition is

(13) n ^ D f ¯ - ( n ^ h + g 2 ) f ¯ = 0 .

Comparing Eq. (13) with Eq. (6), we determine that

(14) γ ^ = n ^ D | n ^ D | , λ = n ^ h + g 2 | n ^ D | .

Although the SDE (Eq. 7) does not prevent σ, and hence D, from being zero, the expressions in Eq. (14) do become singular for vanishing D on Ω\1Ω. In the region where D vanishes, Eq. (3) is no longer parabolic but degenerates to an advection equation (a first-order PDE), for which imposing a Neumann or Robin boundary condition is over-determinant. In this case, we invoke on the boundary minimal diffusion in the eigen-direction of n^ so that γ^=n^, and let g2-n^h so that λ=0, which means the advective flow is free to cross the boundary. The situation that D is finite but |n^D| vanishes is considered pathological to our problem.

Up to this point, we have transformed the Fokker–Planck equation (Eq. 1) and its initial and boundary conditions to the problem in Eqs. (3)–(6) and gathered all expressions in Eqs. (11) and (14) for the constructing components of the SDE (Eq. 7) as well as the functional (Eq. 9). In order to solve the Boltzmann equation (Eq. 2), it suffices for us to just transform the equation into the form of Eq. (1). To this end, we expand the phase space by concatenating the coordinates Qμ and φι so that xα={Qμ,φι} (recall that α={μ,ι}) and introduce the new coefficients D̃αβ, h̃α, S̃, and ṽ that satisfy the following conditions:

(15) D ̃ μ ν = D μ ν , D ̃ α ι = 0 , h ̃ α = { h μ , φ ˙ ι } , S ̃ = S + ι φ ˙ ι , v ̃ = v .

It can be verified that Eq. (1) in the new xα coordinates with the new coefficients given by Eq. (15) transforms into Eq. (2) after replacing xα by Qμ and φι. The transformation (Eq. 15) essentially treats φι as new dimensions of the stochastic motion, except that the stochastic part of the motion in these dimensions is identically zero. A new type of boundary condition might emerge for problems involving Eq. (2), which is the periodic boundary condition for the phases φι. From the viewpoint of stochastic motion, though, such periodicity is not really a boundary but rather a topology of Ω. The treatment of periodic boundary condition will be exemplified in the third problem in Sect. 4 below.

Table 1User input items to the UBER code.

A set of the boundary-related items for each piece of boundary.

Download Print Version | Download XLSX

To summarize this section, the above mathematical theory allows us to fully define a PDE problem involving Eq. (1) or (2) in an arbitrary coordinate system given the input functions and equations as listed in Table 1, which can be either analytical or numerical in the UBER code. The equation terms may be freely turned off by setting their corresponding coefficients to zero. The number of boundary pieces is totally up to choice, which can even be zero to put the boundary at infinity. The boundary geometry may be time-variable for boundary pieces in 1Ω in the UBER code, but it must be fixed for those in Ω\1Ω. Solutions of this problem are obtained once we find a way to evaluate the functional in Eq. (9) for a realization of a stochastic path and to estimate the expectation of the functional. These numerical techniques are the subject of the next section.

3 Numerical techniques

We give an outline of the algorithms used by the UBER code in this section, with an emphasis on the techniques that improve both its accuracy and efficiency. Lower-level numerical techniques, such as the generation of pseudo-random variables, linear algebraic operations, and parallelized computation, are based on the works presented in Zheng (2015). The general idea for numerically implementing the SDE method is as follows: (i) for a given spatiotemporal position (t,x) for which an equation solution is wanted, a number of stochastic paths starting from this common position are simulated; (ii) for each stochastic path, its functional value is evaluated by the path integrals as in Eq. (9); and (iii) from these sampled functional values, their expectation is estimated, and this gives the solution at (t,x). Therefore, the SDE method is essentially a Monte Carlo method. It does not rely on a computational grid and is able to solve the problem locally. However, on many occasions it is still worth obtaining global solutions on a grid so that the solutions at time stamp Ti may be used as the initial condition for the solutions at Ti+1, analogous to the idea of layer methods (e.g., Tao et al.2009). In this way, the stochastic processes need only to be simulated for a short duration of t=Ti+1-Ti to obtain the new solutions, for which the calculation of functional expectation would converge much faster than those simulated for the full length t=Ti+1. The only operation on this grid would be interpolation and possibly extrapolation; therefore, unlike in layer methods, the grid can be irregular and thus allows for the use of sophisticated interpolation libraries on irregular grids, although in the present version only a nonuniform Cartesian grid (in the given coordinate system) is implemented with user-specified nodes.

Integration of the SDE (Eq. 7) employs the Euler–Maruyama scheme that is order 1 for weak convergence problems such as ours, meaning that when only the statistical distribution of stochastic paths matters but not the individual path, the expectation of the schematic error is proportional to the first power of the time step size (Kloeden and Platen1992). To further reduce the schematic error, an adaptive time step size is used in UBER. It can be shown that (e.g., Zheng2015) the root mean square (rms) distance an Itô stochastic process travels in infinitesimal time ds is

(16) d X ¯ s = tr ( a ) d s .

Numerically, the first-order contribution from |bΔs| cannot be neglected due to the finite Δs. Therefore, we prescribe a desired rms spatial step size ΔX¯s, which is sufficiently small compared to the size of Ω and any scale length of the equation coefficients, and then choose the smaller Δs inferred from either Eq. (16) or ΔX¯s=|bΔs| at every step of integration as the adaptive step size. This scheme evidently reduces to a simple adaptive Euler scheme for integrating ordinary differential equations when a approaches zero.

Oblique reflection of the stochastic process on Ω\1Ω and the calculation of dks follow the projected half-space algorithm presented in Gobet (2001), which is also order 1 in the weak convergence sense. The idea is that, for an exact half-space boundary, ks can be proven to share the same probabilistic distribution with a composite stochastic variable involving Ws, coefficients of the SDE, the normal vector n^, and an independent exponential random variable with parameter (2s)−1 (Lépingle1995); therefore, ks can be explicitly calculated by these known quantities. For general smooth boundary geometry, an additional contribution to dks may also come from possible projection along the γ^ direction needed to keep the stochastic process within domain. With dks obtained and the SDE (Eq. 7) integrated, the functional (Eq. 9) can be readily evaluated by an ordinary numerical integration technique implemented along the realized stochastic path.

Expectation of the functionals can be estimated, in principle, from an arithmetic mean of a number N of sampled stochastic path integrals. The error of this estimation, ϵ=|E[Ft,x]-F̃t,x|, where a tilde is used to indicate a numerical realization in this section and 〈⋯〉 indicates averaging over samples, can be estimated by dividing the simulation of stochastic processes into batches (Zheng2015). Although the probabilistic distribution of individual F̃t,x is generally far from normal and largely unknown, that of the batch-wise mean of F̃t,x approaches a Gaussian for a large enough sample number per batch due to the central limit theorem, and a confidence interval can thereby be calculated for the batch-wise means using the Student's t distribution (e.g., Kloeden and Platen1992). We use this confidence interval as an approximation to ϵ. In this way, UBER adaptively stops simulating more batches of stochastic processes when the estimated error meets a prescribed tolerance.

In typical radiation belt problems, the functional values from various stochastic paths may differ by orders of magnitude, and hence their contributions to the arithmetic mean also differ by orders of magnitude, whereas their computational efforts are of the same order. Therefore, straightforward calculation of their arithmetic mean could result in extremely slow convergence with N and squander computational power. To reduce statistical variance in this procedure, a process-splitting technique is developed based on the idea of importance sampling, i.e., to make “denser” sampling in more important “regions”. In conventional Monte Carlo methods, the “region” is an “area” in a parameter space, and importance sampling effectively splits one sample point therein that would have made a huge contribution to the calculation into many sample points nearby, while weights of these samples are reduced accordingly to keep the probabilistic distribution of samples unbiased (e.g., Press et al.1992). But unlike conventional Monte Carlo methods, the samples in the SDE method are paths which belong to a functional space. To still implement this idea, we split the stochastic path when it is projected to contribute a large functional value.

Figure 1Schematic illustration of process splittings in a t⊗ℝ2 space. A stochastic process travels backward in time from point P and splits into two at point Q, where its projected functional value is found to be sufficiently large (see text for exact meaning). One child process splits again at point R, where its projected functional value is found to be even larger. The independent child processes would eventually stop either in Ω¯ as at points A and B or on 1Ω as at point C.


Figure 1 gives an illustration of this technique in a t⊗ℝ2 space. As a stochastic path being integrated from point P, the functional value of the entire path (from s=0 to s=t) is continuously predicted based on the partial path that has been realized. This projected functional value is compared to the value of some quantile (e.g., the 80th percentile) statistically derived from all previously completed stochastic paths starting from the same position. When at some place Q the projected functional value falls above this quantile, the stochastic process is deemed to make a significant contribution to the arithmetic mean. It is then split into a number of child processes at Q, and each child process traces down an independent path thereafter. These child paths, together with their common parent path segment PQ, hence constitute “nearby samples” in the functional space. This procedure can be further iterated if the projected functional value later falls into an even higher quantile (e.g., the 90th percentile), as shown at R. After all procedures finish, the eventual result is a tree structure of stochastic paths rooted at P. For the illustration in Fig. 1, the actual functional value of the path PQA will be weighted by 1/2, and those of PQRB and PQRC will be weighted by 1/4, when calculating their contributions to the mean. In the UBER code, a practical choice for the number of children at each splitting is 4, and that for the upper limit of offspring generations is 3, so that a stochastic process can be split into a maximum of 43=64 processes. Effects of the process-splitting technique are studied in the first problem in the next section.

It still remains to find a method to project the functional value of a stochastic path when it is only partially realized. For this purpose, we insert a break point at s=s(0,t) to the integrations in Eq. (9) and see how it transforms. We simplify the situation by only considering the stochastic processes stopping in Ω¯ for the moment and denote the following functional integrals:


in which the integrand functions c, λ, and u are as those in Eq. (9). Then, the functional t,x with the above presumptions and notations is transformed as

(19) F t , x = g 0 ( X t ) U 0 t + V 0 t = g 0 ( X t ) U 0 s U s t + V 0 s + U 0 s V s t = g 0 ( X t ) U s t + V s t U 0 s + V 0 s = F t - s , X s U 0 s + V 0 s ,

where Ft-s,Xs is the functional for a stochastic process that starts from the break point (t-s,Xs) and continues until s=t.

Suppose that a partial path has been realized up to s=s. From it we can readily evaluate U0s and V0s in Eq. (19) and therefore need an estimated F¯t-s,Xs to project the functional value F¯t,x, where a bar is put over all unrealized entities. Specifically, we would need these three estimates: X¯t, U¯st, and V¯st. In principle, a good estimation of X¯t is given by integrating along the streamline of the b(t-s,x) field through Xs until s=t, which means projecting for X¯t along the ballistic trajectory of motion while ignoring all the stochasticity since the Wiener process has zero mean. However, this integration is not much cheaper than the realization of Ft-s,Xs itself and is thus unaffordable. In anticipation that the total time length t would not be too large, especially when using a solution grid, and that b(t-s,x) would not vary drastically in this time interval, mapping X¯t along the constant vector b(t-s,Xs) is a good enough but much cheaper approximation. If X¯t is mapped out of Ω¯ so that g0(X¯t) is unable to be evaluated, the particular stochastic process is then disabled from splitting.

The functional values U¯st and V¯st are estimated by assuming that, for all possible stochastic paths belonging to the same solution point, there are mean functions c¯, λ¯, and u¯ that are independent of time and that the mean local time is proportional to the total time length of the stochastic process so that k¯s=k¯s with k¯ the proportionality constant. Under these assumptions, U¯st and V¯st can be expressed by

(20) U ¯ s t = exp c ¯ s t d s - λ ¯ s t d k s = exp ( c ¯ t - λ ¯ k ¯ t ) - ( c ¯ s - λ ¯ k ¯ s ) = U ¯ 0 t exp s t ln U ¯ 0 t ,


(21) V ¯ s t = V ¯ 0 t U ¯ 0 t - 1 U ¯ 0 t - exp s t ln U ¯ 0 t ,

if U¯0t1, or by


if U¯0t=1. The values of U¯0t and V¯0t can be well estimated, respectively, by the medians of Ũ0t and Ṽ0t that are obtained from all previously completed stochastic paths. Medians are preferred to means here because the probabilistic distributions of these functionals are usually very skewed and heavy-tailed. This projection mechanism would become statistically more accurate with more stochastic processes having been simulated.

4 Example problems

Four example problems are provided in this section. In the first problem, we solve a Fokker–Planck equation with two source terms, one proportional to the unknown function and the other independent of the unknown function, in both a spherical coordinate system and a Cartesian coordinate system. Effects of the process-splitting technique are analyzed in this example. In the second problem, an advection-dominated Fokker–Planck equation is considered. We further show that, even for a pure advection equation, the UBER code still gives the correct solutions, although it is not designed for such an equation and may not be the most efficient method. Code behavior with advection equations is further studied in the third problem in which the treatment of periodic boundary condition is also illustrated. In the last problem, we simulate the Earth's inner radiation belt by solving its Boltzmann equation involving realistic pitch-angle diffusion and CRAND source.

4.1 Problem 1: neutron generation and diffusion in nuclear material

In this problem, we consider the diffusion and generation of neutrons in a spherical nuclear material at detonation, with an initially injected Gaussian neutron distribution from a small source at the center and a neutron-reflecting coat that allows only one-half of the surface neutrons to escape. In a spherical coordinate system, the equation, initial condition, and boundary conditions are (Serber1992):


where f is neutron density, the constant diffusion coefficient D=0.1, the constant rate of neutron generation from chain reaction S=2.5, and v(r)=10-6/(1+r) characterizes a weak source of neutrons spontaneously emitted in the material. The values and functional forms of these coefficients are solely designed for demonstration purposes and are not meant to be experimentally accurate.

Figure 2(a) UBER and finite-difference solutions (dashed line) to the problem in Eqs. (24)–(27). The UBER 1D solutions (circles) are obtained in a one-dimensional spherical coordinate system, and the UBER 3D solutions (triangles) are obtained in three-dimensional Cartesian coordinates along a sphere radius. (b) Left y axis: the relative errors of the UBER 3D solutions at T=0.05, respectively obtained with the same total number of stochastic processes (2048 per batch) but different upper limits of offspring generations (ν) in the process-splitting technique. ν=0 means the process splitting is turned off. Right y axis: the percentage of stochastic processes that have undergone splitting for ν=4. (c) The reduction of relative errors with an increasing number of stochastic processes at the slowest-converging solution point (r=0.76) for ν=0 (dashed line and squares) and 4 (solid line and triangles). Colors denote different numerical experiments.


UBER solutions are obtained at four time stamps and are compared with those from a staggered-grid finite-difference method (e.g., Ames2014), as shown in Fig. 2a. A turning point is observed in the solutions at T=0.05, which marks the transition of the dominating neutron source from chain reaction at high background density to spontaneous emission at low density. As time goes by, the effect of the spontaneous emission is overwhelmed by the fast-growing chain reaction. Even though the solutions span 8 orders of magnitude, the UBER results are virtually identical to the finite-difference ones, and statistical fluctuation, which is a typical feature in Monte Carlo methods, is not observed in these solutions due to the adaptive algorithms and the variance reduction technique.

To demonstrate UBER's ability in multiple dimensions with a complicated boundary geometry, the same problem is also solved in a three-dimensional Cartesian coordinate system along a sphere radius. In this coordinate system, the diffusion coefficient becomes a 3 × 3 matrix with each diagonal component equal to D, and the boundary condition in Eq. (27) is applied to the only boundary that is a sphere with unit radius. The solutions are over-plotted in Fig. 2a. Consistency between the one-dimensional and the three-dimensional results is quite evident.

To analyze the effects of the process-splitting technique, we repeated the three-dimensional solutions at T=0.05, but with a fixed number of stochastic processes (2048 samples per batch, 200 batches) for each solution point and with various upper limits of the offspring generations ν; ν=0 indicates that the process-splitting technique is disabled. For a fixed number of samples, the relative error of a solution is proportional to the square root of the variance of sampled functional values and determines how fast the calculation of expectation converges. The relative errors as functions of r are plotted against the left y axis of Fig. 2b, and each curve is in fact formed by the medians from eight independent and identical numerical experiments to be more statistically representative. In the range 0.4<r<0.9, the relative errors are consistently reduced with higher offspring generations. At the slowest-converging point r=0.76, the process-splitting technique with a maximum of four offspring generations could reduce the relative error by an order of magnitude compared to that without splitting. For this curve (ν=4), the percentages of stochastic processes that have undergone splitting are plotted as a shaded area against the right y axis. For r<0.4, the relative errors are small and computational convergence is fast enough, so process splitting is automatically suppressed by the code to achieve an optimal speed. If the relative errors are large, usually a small fraction of split stochastic processes can be rather effective.

To further reveal the behavior of the process-splitting technique, Fig. 2c plots how the relative error reduces with an increasing number of samples (N) in the Monte Carlo procedure for the solution point at r=0.76. There are eight independent and identical numerical experiments, respectively, for ν=0 and 4, and each line represents the results from one numerical experiment. The general trend is that the relative error reduces linearly in a log–log scale plot, resembling its dependence on N-1/2. However, without process splitting, the relative error often jumps up sharply due to the occurrence of a very low-probability sample that made a very large contribution, which severely slows down the computational convergence. With process splitting, such jumps are largely avoided, and on average, the code uses just a little more than 1/100 of the samples without process splitting to achieve the same relative error of 0.1.

Table 2Normalized wall clock time versus maximum offspring generations (ν) for the UBER 3D solutions at T=0.05.

* Median value from eight independent numerical tests.

Download Print Version | Download XLSX

In practical UBER usage, solutions are achieved with a prescribed tolerance of relative error and an adaptive number of samples. Therefore, fast convergence with the process-splitting technique could save a significant amount of computational effort even with its extra computational burden. Table 2 lists the normalized wall clock time consumed by UBER for obtaining the solution curve in three dimensions at T=0.05 with a relative error tolerance of 0.1 and a range of maximum offspring generations in process splitting. Again, each of these numbers is the median from eight independent and identical numerical experiments. With ν=3 and 4, the code is nearly an order of magnitude faster than without process splitting. The same wall clock time in these two cases indicates that the faster convergence with more offspring generations starts to be traded off by the computational overhead associated with more complicated splitting, and therefore further increasing ν would not be optimal.

Figure 3(a) UBER (circles) and staggered-grid finite-difference (dashed line) solutions to the problem in Eqs. (28)–(31). (b) UBER (circles) and Lax–Wendroff (dashed line) solutions to the same problem but with zero diffusion.


4.2 Problem 2: magnetized plasma evolution under instability

In the second problem, we consider a Fokker–Planck equation for the pitch-angle distribution of a magnetized plasma (e.g., Dendy1990). Suppose that the electrons are initially in a sin 2(x) background pitch-angle distribution with x as the pitch angle. An electron beam is injected into the system centered at pitch angle x=0.4. In addition to pitch-angle diffusion, the injected beam excites some kind of plasma instability that kinetically transports the distribution toward the π/2 pitch angle. The equation, initial condition, and boundary conditions are written as


where f is the electron distribution function, the Jacobian determinant G=sin (x), the diffusion coefficient D(t,x)=(1/2π2)erf(t/2)[1+sin2(2x)], and the advection coefficient h(x)=cos (x). Note that, in most of the x range, the advection coefficient is about an order of magnitude larger in value than the diffusion coefficient. Equation (30) indicates a loss cone at pitch angle x=0.05. UBER solutions for this problem are plotted in Fig. 3a as circles and are in excellent agreement with those from the staggered-grid finite-difference method. In these solutions, the beam evolves toward x=π/2 because of the kinetic advection. As the system relaxes, the beam eventually merges into the background, and a final stable distribution is then approached.

Equation (28) degenerates to a continuity equation if pitch-angle diffusion is turned off by setting D(t,x) to zero. Even for such a pure advection problem, UBER can still obtain accurate and robust solutions compared to the widely used Lax–Wendroff method (e.g., Ames2014), as shown in Fig. 3b. Before T=2, an advection of the beam toward x=π/2 is seen in the solutions without dispersion, and UBER results are almost identical to the Lax–Wendroff ones. The system, however, is unstable due to the positive advection velocity at x<π/2 and the zero advection velocity at x=π/2 so that the electron distribution will be piled up near x=π/2 and ultimately evolve into a singularity. For this reason, the Lax–Wendroff method begins to fail at T=2 by generating unphysical negative solutions near x=1.3 and will be divergent henceforth; this problem would require careful choices of additional flux limiters to overcome. UBER nonetheless gives the correct results that still resolve the peak height and position of the beam.

Figure 4(a) Initial condition (solid black line) and UBER solutions of Eqs. (32) and (33) for a constant drift frequency after 1 round (T=1, blue dashed line) and 20 rounds (T=20, red dotted line) of drift around Earth. (b) Initial condition and UBER solutions of the same problem but with an accelerating drift frequency after 1 round (T=2) and 20 rounds (T=210) of drift around Earth.


4.3 Problem 3: particle adiabatic drift around Earth

In this problem, we study the UBER code behavior when solving the advection equation resulting from Eq. (2) but without any non-adiabatic process or source and loss, which describes the adiabatic drift of radiation belt particles around Earth. To be specific, the equation and boundary condition are


and an idealized initial condition g0(φ) consisting of a step function, a triangle, and a semicircle is given as that in Fig. 4a, which can induce unphysical negative solutions to finite-difference methods due to its discontinuities and infinite gradients (e.g., Aseev et al.2016). Equation (33) specifies the periodic boundary condition for the drift phase φ. In the UBER code, the periodic boundary condition is not really considered a boundary condition; rather, it is dealt with by extending the computational domain to include multiple periods so that the Itô stochastic processes would not move out of the domain within the given time duration, except for stopping on other first type boundaries. In this specific problem, the time stamp for obtaining solutions is every one-half drift period; therefore, the computational domain is extended for one extra period of φ from 0 to −2π since the stochastic processes retrograde in time. However, solutions are only sought in the right half of the domain for φ between 0 and 2π at each time stamp, and after that, they are copied to the left half to form the entire initial condition for the next time stamp.

By the method of characteristics, Eq. (32) is trivially solved for a constant drift frequency φ˙=2π, and the solution f¯(t,φ)=g0(φ-2πt) preserves its shape and returns to its initial position after each period of t=1. As shown in Fig. 4a, this is exactly what the UBER solutions do. Unlike finite-difference methods which often introduce numerical dispersion to advection equations, the UBER solutions can be exact because, without the stochastic terms, the solver of the SDE (Eq. 7) actually integrates along the characteristic curves of the advection equation, and for constant advection velocity, these characteristic curves are straight lines for which the Euler scheme is exact.

The Euler scheme is well known to overshoot when integrating along a curve that consistently curls to one side, and this would cause error in UBER solutions. To show this, we replace the drift frequency by φ˙=2πt in Eq. (32), for which the characteristic curves are now a family of parabolas whose equations are φ-φ0=πt2, with φ0 the parameter. Note that the drift is no longer periodic but forever accelerating. UBER solutions after the first round (T=2) and the 20th round (T=210) of drift are plotted in Fig. 4b. Compared with the initial condition, the UBER solutions are slightly displaced due to the overshoot, and the solution shapes are very slightly distorted, too. For radiation belt applications, though, any frequency φ˙ι must meanwhile be an approximate periodic function of φι, and hence the corresponding characteristic curves would at most wobble rather than curling, and the Euler scheme overshoot largely cancels rather than accumulating. Therefore, we would not expect the UBER solution errors in those applications to be worse than that illustrated in this case.

4.4 Problem 4: Earth's inner radiation belt simulation

In the last problem, we demonstrate UBER's ability to solve a radiation belt Boltzmann equation by performing an inner radiation belt simulation involving both the stably trapped (out of the drift loss cone) and the quasi-trapped (in the drift loss cone) electron populations. Inspired by Xiang et al. (2020), we consider the 304 keV electrons at McIlwain's LM=1.25, which are subject to pitch-angle scattering caused by Coulomb collisions with upper-atmospheric neutrals and ionospheric ions and electrons. The equation, initial condition, and boundary conditions are


Figure 5(a) Bounce-averaged pitch-angle diffusion coefficient Dαα (s−1) for 304 keV electrons. The blank area is in the bounce loss cone. (b) CRAND electron source rate Se/p2 (c3cm-3MeV-3s-1) for 304 keV electrons. The black line plots the variation of dipole L-shell versus geomagnetic longitude against the right y axis, corresponding to the McIlwain's LM=1.25. (c) Calculated electron fluxes (cm-2s-1sr-1MeV-1) at T= 2 h. (d) Calculated electron fluxes (cm-2s-1sr-1MeV-1) at T= 10 h.


In Eq. (34), drift frequency,

(39) φ ˙ = 3 c L R E e μ E p 2 m e D ( sin α 0 ) T ( sin α 0 ) ,

is evaluated using dipole field approximation (Schulz1991), in which c is the speed of light in a vacuum, L is dipole L-shell, RE is the radius of Earth, e is the elementary charge, μE is the magnetic moment of Earth's intrinsic dipole field, me is electron mass, p is electron momentum, α0 is electron equatorial pitch angle, and the functions D(sin α0) and T(sin α0) are bounce motion integrals in the dipole field that are given in Schulz (1991, pp. 205–210). For simplicity, we ignore the dependence of drift frequency on φ so that the drift phase becomes equivalent to geomagnetic longitude. The Jacobian determinant G=T(sin α0)sin (2α0). The bounce-averaged pitch-angle diffusion rate is empirically given by

(40) D α α = 10 - 5 exp 92.55 cos 4 α 0 - cos 4 α L ( φ ) + 10 - 9 ( s - 1 ) ,

which features quantitative resemblance to that calculated by realistic atmosphere and ionosphere models in Xiang et al. (2020). In this expression, αL(φ) is the bounce loss cone angle dependent on geomagnetic longitude that is determined by drift shell tracing in the International Geomagnetic Reference Field (IGRF; Finlay et al.2010). Dαα as a function of φ and α0 is plotted in Fig. 5a: it is only significant near the bounce loss cone and in the South Atlantic Anomaly (SAA) centered at about 20 of geomagnetic longitude due to the closer proximity of the drift shell to the upper atmosphere in these regions. The CRAND source rate Se/p2 is approximated by (Lenchek et al.1961; Selesnick2015)

(41) S e p 2 1.7 × 10 - 12 ( E max - E ) 2 L 2.7 sin α 0 ( c 3 cm - 3 MeV - 3 s - 1 ) ,

where Emax is the maximum kinetic energy (782 keV) available to electrons from neutron β decay and E is the electron kinetic energy in question, with both being measured in units of the electron rest energy (511 keV), and L is the dipole L-shell, which is a variable dependent on geomagnetic longitude due to multipoles of the Earth's magnetic field. Figure 5b plots the CRAND source rate and the dipole L-shell values corresponding to McIlwain's LM=1.25 obtained from drift shell tracing in IGRF, which vary from less than 1.2 in the SAA to above 1.3 near 180 of geomagnetic longitude.

The simulation is performed with an initially empty radiation belt as indicated by Eq. (35), and electrons are gradually generated by the CRAND source and meanwhile lost to the bounce loss cone. Figure 5c and d show the solution electron fluxes calculated from j¯=f¯p2 after 2 h (about one drift period) and 10 h, respectively. The characteristic west–east electron flux gradient is formed for the quasi-trapped population (α0<60) within the first 2 h and changes very little over time because the SAA sweeps these electrons out every drift period. Weak pitch-angle diffusion of electron fluxes from the quasi-trapped population toward the stably trapped population can be observed at T= 2 h when the stably trapped fluxes are still low due to the stronger source rate in the quasi-trapped region. At T= 10 h, the direction of the pitch-angle diffusion is reversed. Even with atmospheric loss, the CRAND source is strong enough to continuously contribute to the trapped electron fluxes, which are increased by 1 order of magnitude in 8 h.

5 Conclusions and discussion

In conclusion, we have built a numerical solver for the general form of kinetic equations that appear in radiation belt studies. Based on the SDE method, the solver is coded to work in arbitrarily provided coordinate systems of up to three dimensions with user-specified boundary geometry, boundary conditions, and equation terms. We have also designed adaptive algorithms and a variance reduction technique for the SDE method, which enhanced its computational speed by 1 order of magnitude in our test. The example problems in this article demonstrated the solver's versatility and robustness in dealing with a range of problems that might each require a different solver in other methods. The solver, named UBER, has been programmed into a Fortran library that can be easily incorporated with other more complicated space physics models.

Figure 6Left y axis: UBER code speedup, defined as the ratio between wall clock time of serial and parallel executions, as a function of parallel threads for solving Problem 1 in spherical coordinates. Each data point (cross) is the median from eight independent and identical numerical experiments, with the associated error bar covering the range of those eight experiments. The dashed line gives a fit to the data points using Amdahl's law with parameter p=0.85 (see text for its meaning). The right y axis marks the actual wall clock time in seconds for the numerical experiments on a Linux server with a maximum of 86 CPUs (the rightmost data point) at 2.1 GHz of floating-point operation frequency per processor.


The strengths of the SDE method lie in its abilities to solve problems that are difficult for other methods, but not in its speed. Even with the presented improvements, the SDE method is in general nowhere near the finite-difference counterparts in terms of speed due to its Monte Carlo nature. However, Monte Carlo methods are perfectly parallelized for no communication among parallel tasks; therefore, the UBER code gains handsome speedup from parallel computation. Figure 6 displays the UBER code speedup, measured as the ratio between wall clock time of serial and parallel executions, against different numbers of parallel threads employed in solving Problem 1 in spherical coordinates. For a fixed amount of computation, speedup is limited by the portion of work that cannot be parallelized; this is described by Amdahl's law (inset formula in Fig. 6Amdahl1967), which fits the numerical experiment data with parameter p=0.85, meaning that effectively 85 % of the total work has been parallelized. With 86 threads, the code approaches its theoretical maximum speedup of 6.7 for this problem, which uses about 620 s of wall clock time on a Linux server. The referenced finite-difference code uses 65 s in serial execution to solve the problem on the same server. In other words, the UBER code, in its ideal parallelization, is an order of magnitude slower than serial finite-difference methods, and it is also roughly the case for Problem 4 (Zheng Xiang, personal communication, 2021) for which the UBER code spends 22 min to simulate one drift period. This conclusion, however, is rather qualitative because in these comparisons different methods have used quite different grids. Finite-difference methods generally require a much finer grid than the SDE method in order to deal with short-wavelength components of the solution, and time spacing is restricted accordingly by the Courant–Friedrichs–Lewy condition (e.g., Press et al.1992). On the other hand, the SDE method affords a grid that is only concentrated in interesting areas as long as global interpolation can be warranted, and the time stamps for dumping solutions are up to user choice. Moreover, if only local solutions are sought over a short time period, the SDE method could totally disregard the uninteresting solutions and might be more efficient. For radiation belt simulations in particular, finite-difference models usually rely on the operator splitting technique (e.g., Press et al.1992) to solve tensorial diffusion in three dimensions, which involves frequent interpolation between two sets of grids and hence greatly hinders speed, whereas problem dimensionality has little effect on the SDE method.

Several other forms of radiation belt kinetic equations should also be solvable by the method presented in this article. In formulating the Boltzmann equation (Eq. 2), we have assumed that the unperturbed particle Hamiltonian H0 is independent of phases of particle motion. For lower-energy ring current particles, the convective electric field potential energy is not negligible in their Hamiltonian, and therefore H0 would be dependent on the drift phase. As such, expanding the Poisson bracket [f¯,H0] on the left-hand side of the Boltzmann equation will result in additional terms involving partial differentials with respect to the generalized momenta Qι (see Appendix A). For a radiation belt model including ring current particles, the general form of the Boltzmann equation will be

(42) f ¯ t + φ ˙ ι f ¯ φ ι + Q ˙ ι f ¯ Q ι = ,

in which the omitted right-hand side is exactly the same as that of Eq. (2). The Boltzmann equations of the so-called four-dimensional radiation belt models, such as the CIMI model (Fok et al.2014), the VERB-4D model (Aseev et al.2016), and the K2 MHD-particle model (Elkington et al.2019), are of this type. Similar to the treatment of Eq. (2), Eq. (42) can be obtained from Eq. (1) by introducing the new coordinates xξ={Qμ,Qι,φι}{xμ,xκ}, which enlarges the index set from α to ξ, and performing the following transformation of equation coefficients:

(43) D ̃ μ ν = D μ ν , D ̃ ξ κ = 0 , h ̃ ξ = { h μ , x ˙ κ } , S ̃ = S + κ x ˙ κ + Q ˙ ι ι ln G , v ̃ = v .

Therefore, the Boltzmann equation (Eq. 42) can also be solved by the method presented in this article in principle. However, such four-dimensional simulations are beyond the current scope of the UBER code since it is only coded for up to three dimensions in space.

Nonlinear evolution of phase space density occurs when the particle scatterings are not only small-scale but also large-scale, usually as a result of trapping by intense plasma waves (e.g., Bortnik et al.2008; Albert et al.2013). In this case, the right-hand side of the kinetic equation must include terms of nonlocal transport of phase space density by these large-scale scatterings, and the equation is formulated as (Artemyev et al.2016; Zheng et al.2019)

(44) f ¯ t = 1 G Q α G D α β f ¯ Q β - 1 G Q α G h α f ¯ - P Q Q ̃ G ̃ d Q ̃ α f ¯ + P Q ̃ Q f ̃ G ̃ d Q ̃ α ,

in which f̃ is a shorthand for the function f¯(t,Q̃α), and G̃ is the Jacobian determinant evaluated at Q̃α. With nonlinear wave–particle interactions, the phase bunching effect gives rise to advection characterized by the coefficients hα. The function PQQ̃ is the trapping probability density per unit time from Qα to Q̃α; that is, particles are trapped by the wave field at Qα and subsequently escape from trapping at Q̃α. This is considered a known function which can be evaluated from single particle behaviors by either perturbation theory of Hamiltonian mechanics (e.g., Artemyev et al.2016) or test-particle simulations (e.g., Vainchtein et al.2018). Note that, since the unknown function is contained in the last integral term, Eq. (44) is an integro-differential equation. However, formal similarity between Eq. (44) and the Fokker–Planck equation (Eq. 1) suggests that an asymptotic solution of Eq. (44) may be achieved by Taylor expanding f̃ as

(45) f ̃ = f ̃ 0 + f ̃ 0 t + ,

where f̃0=f¯(0,Q̃α), and f̃0 indicates its time-derivative function evaluated at t=0. When applying the SDE method with a solution grid, the functions f̃0 and f̃0 can be obtained from solutions of previous time stamps. Then, by defining the following coefficients


which are now known functions, Eq. (44) is transformed into the form of Eq. (1) and is readily solvable by the UBER code. In this way, simulations of nonlinear wave–particle interactions in radiation belts could hence be unified with well-developed simulations in quasi-linear theory.

Appendix A: A formal derivation of radiation belt kinetic equations

We consider, for simplicity, a hypothetical radiation belt whose particle motion has two well-separated periods which define two pairs of action-angle variables {φ1,φ2,J1,J2}. We assume that the phase angle φ1 changes much faster than φ2 and hence call {φ1,J1} the fast variables and {φ2,J2} the slow variables. We further assume, for a moment, that the Hamiltonian of particle motion,

(A1) H ( φ 1 , φ 2 , J 1 , J 2 , t ) = H 0 ( φ 2 , J 1 , J 2 , t ) + δ H ( φ 1 , J 1 , J 2 , t ) ,

is constituted of an unperturbed part H0 that depends on the slow phase and a perturbation δH that is caused by electromagnetic forces whose variation timescale is shorter than the periodicity 2π/φ˙1. Apparently, δH is a periodic function of φ1. Upon averaging over φ1, the perturbation cancels out so that

(A2) δ H 1 2 π 0 2 π δ H d φ 1 = 0 .

These presumptions allow the particle phase space density,

(A3) f ( φ 1 , φ 2 , J 1 , J 2 , t ) = f ¯ ( φ 2 , J 1 , J 2 , t ) + δ f ( φ 1 , J 1 , J 2 , t )

to be so decomposed into a fast-phase-averaged part f¯f and a perturbation δf, also periodic in φ1, which has δf〉=0 by definition.

For a collisionless plasma, evolution of f is governed by the Vlasov equation:

(A4) f t + φ ˙ α f φ α + J ˙ α f J α = 0 ,

in which the index α=1,2. Expressing φ˙α and J˙α by Hamilton's canonical equations, the Vlasov equation can be expanded in light of Eqs. (A1) and (A3). When averaging the expanded equation over φ1, all terms to the first order in perturbation vanish due to either their null phase average or periodicity in φ1, and the remaining terms form the equation

(A5') f ¯ t + H 0 J 2 f ¯ φ 2 - H 0 φ 2 f ¯ J 2 = - δ H J 1 δ f φ 1 - δ H φ 1 δ f J 1 ,

organized into Poisson brackets with respect to the canonical coordinates {φα,Jα},

(A5) f ¯ t + [ f ¯ , H 0 ] = - [ δ f , δ H ] .

In fact, this equation form is more neatly derived from Liouville's theorem (e.g., Goldstein1980, chap. 9), which says

(A6) d f d t = f t + [ f , H ] = 0 .

Phase averaging Eq. (A6) over φ1 and noting that [δf,H0]=[f¯,δH]=0 directly gives Eq. (A5).

Equation (A5) appears in the form of a Boltzmann equation for the phase-averaged phase space density: the left-hand side describes evolution of the unperturbed system in the slow variables, whereas the right-hand side, involving only the perturbed quantities and the fast variables, serves the role of a collision integral. Indeed, it can be viewed as “collisions” between particles and the perturbing electromagnetic field. In this regard, we symbolically denote the right-hand side (ft)w in analogy to that caused by real collisions, with the subscript designating wave–particle interaction.

Equation (A5) is closed when its right-hand side can be expressed in terms of f¯ under certain approximations. If both δH and δf are small compared to their unperturbed counterparts, δf can be directly solved from the linearized Vlasov equation retaining only the first-order terms in expansion and, after mathematical transformations, gives the expression (e.g., Lerche1968; Kaufman1972)

(A7) f t w = J 1 D w f ¯ J 1 ,

where the coefficient Dw is a functional of δH. The corresponding theory is called the quasi-linear theory. However, when the perturbing electromagnetic wave is sufficiently coherent, δf may become large even if δH remains small. In this situation, (ft)w is estimated by considering particle phase trajectories near the resonance point (e.g., Artemyev et al.2016). The result would then contain corrections to Eq. (A7), which are due to particles trapped in phase with the wave, whose formulation in the current setup could be inferred from that of Eq. (44) in the body; we hereby do not elaborate.

Taking account of collisions in the plasma would introduce to Eq. (A4) a collision term not describable by the single-particle Hamiltonian so that the transport equation becomes the Boltzmann equation:

(A8) f t + φ ˙ α f φ α + J ˙ α f J α = f t c .

Following the same treatment from Eqs. (A4) to (A5), Eq. (A8) leads to the fast-phase-averaged equation

(A9) f ¯ t + [ f ¯ , H 0 ] = f t w + f t c .

For Coulomb collisions, small-angle scatterings at large impact parameters dominate due to the long range of Coulomb force, and consequently the phase-averaged collision integral can be expanded into a Fokker–Planck form in the generalized momenta that are changed by the collisions (Lifshitz and Pitaevskii1981, chaps. 2 and 4), i.e.,

(A10) f t c = J 1 D c f ¯ J 1 - J 1 h c f ¯ ,

which usually only involves the fast momentum J1 on timescales much shorter than 2π/φ˙2. The transport coefficients Dc and hc are determined from the particle species and their collision cross sections. We note again that, in the phase-averaged kinetic equation (Eq. A9), slow and fast variables are separated onto each side of the equation.

Neglecting the source and loss terms, the quasi-linear kinetic equations in the body of this paper could all be recovered from Eqs. (A7), (A9), and (A10), which are already in the same form as Eq. (42). If there are no slow variables, the Poisson bracket on the left-hand side of Eq. (A9) vanishes, and the equation reduces to the Fokker–Planck equation (Eq. 1). If there are slow variables but φ2 is cyclic to the unperturbed Hamiltonian, the left-hand side of Eq. (A9) would then contain the first two terms shown in Eq. (A5'), which is in the form of Eq. (2). In this case, the dependence of f¯ on φ2 is introduced by means other than the Hamiltonian, such as the φ2-dependent boundary geometry, boundary conditions, or collision terms.

Code and data availability

The UBER library is free and open-source. The current version of UBER is available from the GitHub repository at (last access: 14 April 2021) under the MIT license. The exact version of the UBER library used to produce the results used in this paper is archived on Zenodo (, as are input data and scripts to run the library and produce the plots and tables for all the simulations presented in this paper ( et al.2021).

Author contributions

LZ developed the UBER library, conducted the benchmark experiments, and wrote the paper. LC and AAC contributed to the conceptualization of the UBER library. PW, ZX, and XL provided ancillary code and data for the benchmark experiments.

Competing interests

The contact author has declared that neither they nor their co-authors have any competing interests.


Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


The authors would like to express their gratitude to Michael Schulz for the very enlightening discussions on this work and to Zheng Xiang for exchanging code performance data. Liheng Zheng and Anthony A. Chan acknowledge NASA for financial support of this work.

Financial support

This research has been supported by the National Aeronautics and Space Administration (grant nos. 80NSSC18K1224 and NNX15AI93G).

Review statement

This paper was edited by Josef Koller and reviewed by three anonymous referees.


Albert, J. M., Meredith, N. P., and Horne, R. B.: Three-dimensional diffusion simulation of outer radiation belt electrons during the 9 October 1990 magnetic storm, J. Geophys. Res.-Space, 114, A09214,, 2009. a

Albert, J. M., Tao, X., and Bortnik, J.: Aspects of Nonlinear Wave-Particle Interactions, American Geophysical Union,, 255–264, 2013. a

Amdahl, G. M.: Validity of the single processor approach to achieving large scale computing capabilities, in: Proceedings of the April 18–20, 1967, spring joint computer conference, 483–485, 1967. a

Ames, W. F.: Numerical methods for partial differential equations, Academic Press, San Diego, California, 2014. a, b

Artemyev, A. V., Neishtadt, A. I., Vasiliev, A. A., and Mourenas, D.: Kinetic equation for nonlinear resonant wave-particle interaction, Phys. Plasmas, 23, 090701,, 2016. a, b, c

Artemyev, A. V., Neishtadt, A. I., Vainchtein, D. L., Vasiliev, A. A., Vasko, I. Y., and Zelenyi, L. M.: Trapping (capture) into resonance and scattering on resonance: Summary of results for space plasma systems, Commun. Nonlinear Sci., 65, 111–160,, 2018. a

Aseev, N. A., Shprits, Y. Y., Drozdov, A. Y., and Kellerman, A. C.: Numerical applications of the advective-diffusive codes for the inner magnetosphere, Space Weather, 14, 993–1010,, 2016. a, b

Baker, D. N.: The occurrence of operational anomalies in spacecraft and their relationship to space weather, IEEE T. Plasma Sci., 28, 2007–2016,, 2000. a

Beutier, T., Boscher, D., and France, M.: Salammbô: A three-dimensional simulation of the proton radiation belt, J. Geophys. Res., 100, 17181–17188, 1995. a

Bolton, S. J., Janssen, M., Thorne, R., Levin, S., Klein, M., Gulkis, S., Bastian, T., Sault, R., Elachi, C., Hofstadter, M.,Bunker, A., Dulk, G., Gudim, E., Hamilton, G., Johnson, W. T. K., Leblanc, Y., Liepack, O., McLeod, R., Roller, J., Roth, L., and West, R.: Ultra-relativistic electrons in Jupiter's radiation belts, Nature, 415, 987–991,, 2002. a

Bolton, S. J., Thorne, R. M., Bourdarie, S., de Pater, I., and Mauk, B.: Jupiter's inner radiation belts, in: Jupiter: The Planet, Satellites, and Magnetosphere, p. 671, 2004. a

Bortnik, J., Thorne, R. M., and Inan, U. S.: Nonlinear interaction of energetic electrons with large amplitude chorus, Geophys. Res. Lett., 35, L21102,, 2008. a

Cunningham, G. S., Loridan, V., Ripoll, J.-F., and Schulz, M.: Neoclassical Diffusion of Radiation-Belt Electrons Across Very Low L-Shells, J. Geophys. Res.-Space, 123, 2884–2901,, 2018. a

Dendy, R. O.: Plasma Dynamics, Clarendon Press, Oxford, 1990. a

Elkington, S. R., Chan, A. A., Jaynes, A. N., Malaspina, D., and Albert, J.: K2: Towards a Comprehensive Simulation Framework of the Van Allen Radiation Belts, AGU Fall Meeting, 2019, SM44B–01, 2019. a

Finlay, C. C., Maus, S., Beggan, C. D., Bondar, T. N., Chambodut, A., Chernova, T. A., Chulliat, A., Golovkov, V. P., Hamilton, B., Hamoudi, M., Holme, R., Hulot, G., Kuang, W., Langlais, B., Lesur, V., Lowes, F. J., Lühr, H., Macmillan, S., Mandea, M., McLean, S., Manoj, C., Menvielle, M., Michaelis, I., Olsen, N., Rauberg, J., Rother, M., Sabaka, T. J., Tangborn, A., Tøffner-Clausen, L., Thébault, E., Thomson, A. W. P., Wardinski, I., Wei, Z., and Zvereva, T. I.: International Geomagnetic Reference Field: the eleventh generation, Geophys. J. Int., 183, 1216–1230,, 2010. a

Fok, M.-C., Buzulukova, N. Y., Chen, S.-H., Glocer, A., Nagai, T., Valek, P., and Perez, J. D.: The Comprehensive Inner Magnetosphere-Ionosphere Model, J. Geophys. Res.-Space, 119, 7522–7540,, 2014. a

Freidlin, M.: Functional Integration and Partial Differential Equations, no. 109 in Annals of Mathematics Studies, Princeton Univ. Press, Princeton, New Jersey, 1985. a, b

Gobet, E.: Euler schemes and half-space approximation for the simulation of diffusion in a domain, ESAIM-Probab. Stat., 5, 261–297,, 2001. a

Goldstein, H.: Classical Mechanics, 2nd edn., Addison-Wesley, Reading, Mass., 1980. a, b

Horne, R. B., Thorne, R. M., Glauert, S. A., Menietti, J. D., Shprits, Y. Y., and Gurnett, D. A.: Gyro-resonant electron acceleration at Jupiter, Nat. Phys., 4, 301–304, 2008. a

Kac, M.: On distributions of certain Wiener functionals, T. Am. Math. Soc., 65, 1–13, 1949. a

Kaufman, A. N.: Quasilinear diffusion of an axisymmetric toroidal plasma, Phys. Fluids, 15, 1063–1069, 1972. a

Kim, H.-J. and Chan, A. A.: Fully-Adiabatic Changes in Storm-Time Relativistic Electron Fluxes, J. Geophys. Res., 102, 22107–22116, 1997. a

Klebaner, F. C.: Introduction to stochastic calculus with applications, Imperial College Press, London, 2005. a

Kloeden, P. E. and Platen, E.: Numerical Solution of Stochastic Differential Equations, vol. 23 of Applications of Mathematics, Springer-Verlag, Berlin, 1992. a, b

Lenchek, A. M., Singer, S. F., and Wentworth, R. C.: Geomagnetically trapped electrons from cosmic ray albedo neutrons, J. Geophys. Res., 66, 4027–4046,, 1961. a

Lépingle, D.: Euler scheme for reflected stochastic differential equations, Math. Comput. Simulat., 38, 119–126, 1995. a

Lerche, I.: Quasilinear theory of resonant diffusion in a magneto-active, relativistic plasma, Phys. Fluids, 11, 1720–1727, 1968. a

Li, W. and Hudson, M. K.: Earth's Van Allen Radiation Belts: From Discovery to the Van Allen Probes Era, J. Geophys. Res.-Space, 124, 8319–8351,, 2019. a

Li, X., Selesnick, R., Schiller, Q., Zhang, K., Zhao, H., Baker, D. N., and Temerin, M. A.: Measurement of electrons from albedo neutron decay and neutron density in near-Earth space, Nature, 552, 382–385,, 2017. a

Lifshitz, E. M. and Pitaevskii, L. P.: Physical Kinetics, Pergamon Press, Oxford, 1981. a

Lorenzato, L., Sicard, A., and Bourdarie, S.: A physical model for electron radiation belts of Saturn, J. Geophys. Res.-Space, 117, A08214,, 2012. a

Mathews, J. and Walker, R. L.: Mathematical Methods of Physics, 2nd Edn., Addison-Wesley, Reading, Mass., 1970. a

Nénon, Q., Sicard, A., and Bourdarie, S.: A new physical model of the electron radiation belts of Jupiter inside Europa's orbit, J. Geophys. Res.-Space, 122, 5148–5167,, 2017. a

Øksendal, B.: Stochastic Differential Equations, 5th Edn., Springer, Berlin, 1998. a, b

Press, W. H., Teukolsky, S. A., Vetterling, W. T., and Flannery, B. P.: Numerical Recipes in FORTRAN: The Art of Scientific Computing, Cambridge Univ. Press, New York, 1992. a, b, c

Schulz, M.: The Magnetosphere, in: Geomagnetism, Vol. 4, edited by: Jacobs, J. A., Academic Press, London, 87–293, 1991. a, b, c

Schulz, M.: Canonical Coordinates for Radiation-Belt Modeling, in: Radiation Belts: Models and Standards, edited by: Lemaire, J. F., Heynderickx, D., and Baker, D. N., American Geophysical Union, Washington, DC, 153–160, 1996. a

Selesnick, R. S.: Atmospheric scattering and decay of inner radiation belt electrons, J. Geophys. Res.-Space, 117, A08218,, 2012. a

Selesnick, R. S.: High-energy radiation belt electrons from CRAND, J. Geophys. Res.-Space, 120, 2912–2917,, 2015. a, b

Selesnick, R. S., Blake, J. B., and Mewaldt, R. A.: Atmospheric losses of radiation belt electrons, J. Geophys. Res.-Space, 108, 1468,, 2003. a

Selesnick, R. S., Hudson, M. K., and Kress, B. T.: Direct observation of the CRAND proton radiation belt source, J. Geophys. Res.-Space, 118, 7532–7537,, 2013. a

Serber, R.: The Los Alamos primer: The first lectures on how to build an atomic bomb, Univ of California Press, Berkeley, California, 1992. a

Shprits, Y. Y., Chen, L., and Thorne, R. M.: Simulations of pitch angle scattering of relativistic electrons with MLT-dependent diffusion coefficients, J. Geophys. Res.-Space, 114, A03219,, 2009. a

Subbotin, D. A. and Shprits, Y. Y.: Three-dimensional radiation belt simulations in terms of adiabatic invariants using a single numerical grid, J. Geophys. Res.-Space, 117, A05205,, 2012. a

Subbotin, D. A., Shprits, Y. Y., and Ni, B.: Three-dimensional VERB radiation belt simulations including mixed diffusion, J. Geophys. Res.-Space, 115, A03205,, 2010. a

Tao, X., Chan, A. A., Albert, J. M., and Miller, J. A.: Stochastic modeling of multidimensional diffusion in the radiation belts, J. Geophys. Res.-Space, 113, A07212,, 2008. a, b

Tao, X., Albert, J. M., and Chan, A. A.: Numerical modeling of multidimensional diffusion in the radiation belts using layer methods, J. Geophys. Res.-Space, 114, A02215,, 2009. a

Tu, W., Selesnick, R., Li, X., and Looper, M.: Quantification of the precipitation loss of radiation belt electrons observed by SAMPEX, J. Geophys. Res.-Space, 115, A07210,, 2010. a

Tu, W., Cunningham, G. S., Chen, Y., Henderson, M. G., Camporeale, E., and Reeves, G. D.: Modeling radiation belt electron dynamics during GEM challenge intervals with the DREAM3D diffusion model, J. Geophys. Res.-Space, 118, 6197–6211,, 2013. a

Turner, D. L. and Ukhorskiy, A. Y.: Outer radiation belt losses by magnetopause incursions and outward radial transport: new insight and outstanding questions from the Van Allen Probes era, in: The dynamic loss of Earth's radiation belts, Elsevier, Amsterdam, 1–28, 2020. a

Vainchtein, D. L., Zhang, X.-J., Artemyev, A. V., Mourenas, D., Angelopoulos, V., and Thorne, R. M.: Evolution of Electron Distribution Driven by Nonlinear Resonances With Intense Field-Aligned Chorus Waves, J. Geophys. Res.-Space, 123, 8149–8169,, 2018. a

Van Allen, J. A. and Frank, L. A.: Radiation around the Earth to a radial distance of 107,400 km, Nature, 183, 430–434,, 1959. a

Vernov, S., Chudakov, A., Gorchakov, E., Logachev, J., and Vakulov, P.: Study of the cosmic-ray soft component by the 3rd Soviet Earth satellite, Planet. Space Sci., 1, 86–93,, 1959. a

Wang, C., Ma, Q., Tao, X., Zhang, Y., Teng, S., Albert, J. M., Chan, A. A., Li, W., Ni, B., Lu, Q., and Wang, S.: Modeling radiation belt dynamics using a 3-D layer method code, J. Geophys. Res.-Space, 122, 8642–8658,, 2017. a

Welling, D. T.: The long-term effects of space weather on satellite operations, Ann. Geophys., 28, 1361–1367,, 2010. a

Xiang, Z., Li, X., Selesnick, R., Temerin, M. A., Ni, B., Zhao, H., Zhang, K., and Khoo, L. Y.: Modeling the Quasi-Trapped Electron Fluxes From Cosmic Ray Albedo Neutron Decay (CRAND), Geophys. Res. Lett., 46, 1919–1928,, 2019. a

Xiang, Z., Li, X., Temerin, M. A., Ni, B., Zhao, H., Zhang, K., and Khoo, L. Y.: On Energetic Electron Dynamics During Geomagnetic Quiet Times in Earth's Inner Radiation Belt due to Atmospheric Collisional Loss and CRAND as a Source, J. Geophys. Res.-Space, 125, e2019JA027678,, 2020.  a, b, c

Zheng, L.: Development and Application of Stochastic Methods for Radiation Belt Simulations, PhD thesis, Rice University, Houston, TX, 2015. a, b, c

Zheng, L.: zheng-lh/UBER: First release of UBER [code], Zenodo,, 2021. a

Zheng, L., Chan, A. A., Albert, J. M., Elkington, S. R., Koller, J., Horne, R. B., Glauert, S. A., and Meredith, N. P.: Three-dimensional stochastic modeling of radiation belts in adiabatic invariant coordinates, J. Geophys. Res.-Space, 119, 7615–7635,, 2014. a, b, c

Zheng, L., Chan, A. A., O'Brien, T. P., Tu, W., Cunningham, G. S., Albert, J. M., and Elkington, S. R.: Effects of magnetic drift shell splitting on electron diffusion in the radiation belts, J. Geophys. Res.-Space, 121, 11,985–12,000,, 2016. a

Zheng, L., Chen, L., and Zhu, H.: Modeling Energetic Electron Nonlinear Wave-Particle Interactions With Electromagnetic Ion Cyclotron Waves, J. Geophys. Res.-Space, 124, 3436–3453,, 2019. a

Zheng, L., Chen, L., Chan, A. A., Wang, P., Xia, Z., and Liu, X.: UBER v1.0: A universal kinetic equation solver for radiation belts [data set], Zenodo,, 2021. a

Short summary
Earth’s Van Allen belts are studied by solving particular kinds of equations that could be notoriously difficult when different physical processes are acting together. In this article, we describe a numerical code that can solve these equations with unprecedented freedom from the numerous restrictions of existing models, even the ones that no other can solve. The abilities of our code could mean a breakthrough in Van Allen belt studies from the diffusive into the non-diffusive transport regime.