UBER v1.0: a universal kinetic equation solver for radiation belts

. Recent proceedings in radiation belt studies have proposed new requirements for numerical methods to solve the kinetic equations involved. In this article, we present a numerical solver that can solve the general form of the radiation belt Fokker–Planck equation and Boltzmann equation in arbitrarily provided coordinate systems and with user-speciﬁed boundary geometry, boundary conditions, and equation terms. The solver is based upon the mathematical theory of stochastic differential equations, whose computational accuracy and efﬁciency are greatly enhanced by specially designed adaptive algorithms and a variance reduction technique. The versatility and robustness of the solver are exhibited in four example problems. The solver applies to a wide spectrum of radiation belt modeling problems, including the ones featuring non-diffusive particle transport such as that arising from nonlinear wave–particle interactions.


Introduction
In the space plasma environment, radiation belts refer to torus-shaped regions surrounding Earth and other magnetized planets that are filled with highly energetic charged particles trapped in the planetary magnetic field. Since their discovery (Van Allen and Frank, 1959;Vernov et al., 1959), radiation belts have been the focus of intense research due to the innumerable unknowns concerning their extremely dynamic behavior and their damaging effects on spacecraft (e.g., Baker, 2000;Welling, 2010). During slowly changing conditions, radiation belt particles undergo three types of periodic motion: gyration about field lines, bounce along field lines, and drift about the planet. With each periodic motion there is a corresponding adiabatic invariant, defined through the Hamiltonian action integral (e.g., Goldstein, 1980, chap. 10), that is only violated when the conditions are undergoing change on timescales shorter than the period. A widely adopted method to study the dynamics of radiation belts is to solve a kinetic equation describing the evolution of particle phase space density. In quasi-linear theory, this kinetic equation is usually a Fokker-Planck equation that takes the general covariant form (Schulz, 1991) wheref is the phase-averaged phase space density, G = det( ∂J α ∂Q β ) is the Jacobian determinant for the transformation from canonical action variables J α (α = 1, 2, 3) to the generalized coordinates Q α , and D αβ , h α , S, and v are coefficients of the equation. Summation on repeated Greek indices is implied throughout this paper. In different radiation belts, the number of terms emerging on the right-hand side of Eq. (1) and their respective physical backgrounds may be different. For the Earth's outer radiation belt, the second and the fourth terms are usually missing; the first term represents diffusion caused by wave-particle interactions, and the third term is often a loss characterized by the particle lifetime (e.g., Li and Hudson, 2019, and the reference therein). In the lowaltitude inner radiation belt where wave-particle interactions are not as significant, the first and second terms are often provided by the diffusion and dynamic friction caused by interparticle Coulomb collisions (e.g., Selesnick, 2012), and the fourth term may be a source from cosmic ray albedo neutron decay (CRAND; e.g., Selesnick, 2015;Li et al., 2017). For radiation belts of the gas giants, all terms could be present (e.g., Horne et al., 2008;Lorenzato et al., 2012). The first two terms may be attributable to both wave-particle interactions and inter-particle collisions and, in addition, synchrotron radiation, which is negligible in Earth's radiation belts, bleeds energy for the ultra-relativistic electrons, and thus also contributes to the second term (e.g., Bolton et al., 2002Bolton et al., , 2004. The third term could represent the moon-sweeping loss, and the fourth term may come from moon volcanic activities as a plasma source (e.g., Nénon et al., 2017).
In some circumstances, the dependence of phase space density on certain phases ϕ ι could be discerned, and the radiation belt kinetic equation takes the form where a dot over ϕ ι indicates its time derivative. The phase space densityf here is only averaged over the phases varying faster than ϕ ι , and the mutually exclusive indices ι and µ together form the complete set of α. The most common situation is perhaps the dependence off on the drift phase ϕ 3 , which in the Earth's outer radiation belt may be caused by the wave activity dependence on magnetic local time (e.g., Shprits et al., 2009) and in the inner belt by the longitudinal variation of drift shell altitude (e.g., Tu et al., 2010;Xiang et al., 2019). With the spatial derivative term on the left-hand side, Eq.
(2) appears as a Boltzmann equation forf , but by Hamiltonian mechanics, the conjugating termQ ι (∂f/∂Q ι ) should have also appeared on the left-hand side. Its absence is due to the fact that, for particles in the radiation belt energy range, the drift-phase-dependent electric potential energy is usually negligible in the unperturbed particle Hamiltonian so that ϕ 3 becomes a cyclic variable. Appendix A provides a more comprehensive explanation of this equation, and we will return to the general case in which ϕ 3 is not cyclic in the "Conclusions and discussion" section. Various numerical models have been built to solve a specific form of either Eq. (1) or (2) (e.g., Beutier et al., 1995;Selesnick et al., 2003;Tao et al., 2008;Albert et al., 2009;Subbotin et al., 2010;Tu et al., 2013;Wang et al., 2017;Xiang et al., 2020, to name a few). Though their underlying numerical schemes might not restrict, these existing models have in practice been implemented with hard-coded choices of coordinates, relatively simple boundary geometry, and a roughly fixed number of equation terms; therefore, each model is applicable to a specific set of problems. This situation could become quite inconvenient when adiabatic invariants of particle motion are used as coordinates of phase space to model radiation belt dynamics, as promoted by Schulz (1996). The reasons are twofold: first, due to their vast range of magnitude and dramatically varying resolution, adiabatic invariant coordinates often require some kind of rescaling and transformation (e.g., Zheng et al., 2014), specific to the problem, to be computationally efficient; second, boundary geometry becomes complicated in adiabatic invariant coordinates, which could be challenging for finite-difference methods and led Subbotin and Shprits (2012) to seek new coordinates from combinations of the adiabatic invariants. However, the use of adiabatic invariant coordinates is crucial for some compelling problems in the radiation belts: for example, the mechanisms of storm-time electron loss in which adiabatic modulations due to magnetic field configuration change must be separated from non-adiabatic processes (Kim and Chan, 1997;Turner and Ukhorskiy, 2020) and the relative significance of Earthward diffusion versus CRAND as a possible inner belt electron source with the drift shell splitting effect contributing (Cunningham et al., 2018). It is the purpose of this article to present a numerical code, named UBER (for "universal Boltzmann equation solver"), that solves Eqs. (1) and (2) in an arbitrarily user-specified coordinate system up to three dimensions, with great freedom in specifying boundary geometry and boundary conditions and with various combinations of equation terms. Therefore, it is expected that the solver can be applied to a wide spectrum of radiation belt modeling problems. More importantly, the freedom of specifying equation terms implies that, in an asymptotic manner, UBER can even solve the integrodifferential kinetic equations arising from non-diffusive particle transport, such as that formulated in Artemyev et al. (2018) for nonlinear wave-particle interactions, and thereby provides a viable means to incorporate non-diffusive transport into global radiation belt modeling.
The underlying mathematical theory of the solver is stochastic differential equation (SDE) theory. The SDE method had been utilized by Tao et al. (2008), Selesnick et al. (2013), andZheng et al. (2014) in their modeling of radiation belts. The method is grid-free and enjoys unparalleled advantages in dealing with cross-diffusion components and complicated boundary geometry (e.g., Zheng et al., 2016), but it is meanwhile notorious for low efficiency ascribed to its Monte Carlo nature. In this article, we also describe specially designed numerical techniques that have enhanced the computational speed of the SDE method by an order of magnitude, thus making the solver much more affordable for largescale simulations. Four example problems with distinct physical backgrounds are provided in this article to demonstrate the abilities and versatility of the solver.

Mathematical theory
The kinetic equations given as Eqs. (1) and (2) above are parabolic partial differential equations (PDEs). Written in the Kolmogorov backward form (see below), a parabolic PDE corresponds to a multidimensional SDE that describes the motion of an Itô stochastic process whose certain functional expectation satisfies the PDE; the PDE can then be solved by calculating path integrals of the corresponding stochastic process (e.g., Freidlin, 1985;Øksendal, 1998).
Let us consider the following partial differential problem composed of a Kolmogorov backward equation and a general set of initial and boundary conditions: where ∂ t and ∂ α are shorthand for the partial differentials with respect to t and the αth coordinate, respectively. In Eqs. (4)-(6),¯ denotes the closure of the domain and ∂ its boundary. In particular, ∂ 1 represents the boundary pieces of the first type (Dirichlet) boundary condition, and ∂ \∂ 1 indicate the boundary pieces excluding those in ∂ 1 , which are of the second (Neumann, λ ≡ 0) or the third type (Robin, λ = 0) boundary conditions. The unit vectorγ points into¯ and is not tangent to the local boundary. The mathematical theory of SDEs establishes a relation between Eqs. (3)-(6) and the Itô stochastic process, whose spatial positions are denoted by the random variable X s in that obeys the reflected SDE: (7) where the dot product on the right-hand side is between a rank-2 tensor and a vector, and the parameter s runs from 0 to t so that the stochastic process retrogrades in time from t to 0. The first term on the right-hand side describes the ballistic part of its motion. The second term describes the stochastic part, with the coefficient tensor σ satisfying σ · σ T = a (whose components are a αβ ). Note that this condition does not uniquely determine σ , but all satisfying σ tensors are equivalent (Levi's theorem; Freidlin, 1985;Zheng et al., 2014). W s is a vector Wiener process of the same dimensions as X s , with each dimension an independent Gaussian stochastic variable that has zero mean and variance s. The third term describes reflection of the stochastic process in the direction given byγ on the boundary ∂ \∂ 1 , and k s is a monotonic stochastic variable that only increases when the stochastic process is on that boundary to force X s to stay in¯ . k s can thus be considered a measure of the time that the stochastic process spent on ∂ \∂ 1 and hence has the name local time. The Itô process stops either in¯ when s = t or on ∂ 1 at s = τ < t.
A formal solution of the problem in Eqs.
(3)-(6) is given by the Feynman-Kac formula (e.g., Kac, 1949;Øksendal, 1998;Klebaner, 2005): in which E is the expectation operator, and F t,x [X s ] is a functional of the stochastic path X s started from t and x that has the expression where the symbol I τ ≥t is equal to 1 when τ ≥ t, which means the stochastic process has stopped in¯ before it had a chance to reach ∂ 1 , and zero otherwise; t ∧τ means the smaller between the two. Physically, the functional F t,x [X s ] is a propagator of contribution carried along the stochastic path from either the initial condition or the first type boundary condition to the point of solution, and the exponential functions indicate how this contribution enhances or decays along this path.
To formally solve the Fokker-Planck equation (Eq. 1) by the Feynman-Kac formula (Eq. 8), it remains to transform the equation together with its proper initial and boundary conditions into the form of Eqs. (3)-(6). To this end, directly expanding Eq. (1) and collecting terms with the same differentiation order yields its Kolmogorov backward form: Comparing Eq. (10) with Eq. (3) and taking Q α equivalent to x α , we thus have the correspondences of coefficients: where in curvilinear coordinates, the divergence operator on a tensor field is in which the dots stand for all other indices irrelevant to the operation, and the terms ∂ α ln G come from summation of the Christoffel symbols in a covariant derivative (e.g., Mathews and Walker, 1970, chap. 15). It is worth remarking that −h appears in the expression for b so that the Itô process travels against the advection velocity. This is indeed the case since it is time-backwards. Also, from the expression for c, divergence of the advection serves as a loss of phase space density.
L. Zheng et al.: UBER v1.0 Initial and boundary conditions to Eq. (1) are transformed as follows. For the initial condition and the first type boundary condition, values off (t, x) are specified just as in Eqs. (4) and (5). For a flux boundary condition of the form = g 2 (t, x)f , we note that the outward flux across a boundary is given by (n · D · ∇f −n · hf ), withn the unit inward normal vector of ∂ \∂ 1 . Therefore, the corresponding boundary condition iŝ n · D · ∇f − (n · h + g 2 )f = 0. (13) Comparing Eq. (13) with Eq. (6), we determine that Although the SDE (Eq. 7) does not prevent σ , and hence D, from being zero, the expressions in Eq. (14) do become singular for vanishing D on ∂ \∂ 1 . In the region where D vanishes, Eq.
(3) is no longer parabolic but degenerates to an advection equation (a first-order PDE), for which imposing a Neumann or Robin boundary condition is over-determinant.
In this case, we invoke on the boundary minimal diffusion in the eigen-direction ofn so thatγ =n, and let g 2 ≡ −n · h so that λ = 0, which means the advective flow is free to cross the boundary. The situation that D is finite but |n·D| vanishes is considered pathological to our problem. Up to this point, we have transformed the Fokker-Planck equation (Eq. 1) and its initial and boundary conditions to the problem in Eqs. (3)-(6) and gathered all expressions in Eqs. (11) and (14) for the constructing components of the SDE (Eq. 7) as well as the functional (Eq. 9). In order to solve the Boltzmann equation (Eq. 2), it suffices for us to just transform the equation into the form of Eq. (1). To this end, we expand the phase space by concatenating the coordinates Q µ and ϕ ι so that x α = {Q µ , ϕ ι } (recall that α = {µ, ι}) and introduce the new coefficients D αβ , h α , S, and v that satisfy the following conditions: It can be verified that Eq. (1) in the new x α coordinates with the new coefficients given by Eq. (15) transforms into Eq.
(2) after replacing x α by Q µ and ϕ ι . The transformation (Eq. 15) essentially treats ϕ ι as new dimensions of the stochastic motion, except that the stochastic part of the motion in these dimensions is identically zero. A new type of boundary condition might emerge for problems involving Eq. (2), which is the periodic boundary condition for the phases ϕ ι . From the viewpoint of stochastic motion, though, such periodicity is

Input items Comments
∂ α ln G Vector field to specify the coordinate system D αβ , h α , S, v Coefficients to define the PDE g 0 (x α ) Function to provide the initial condition ψ(t, x α ) = 0 Equation to define a boundary piece's geometry g * (t, x α ) Function to provide the boundary condition, * = 1 or 2 depending on the type of the boundarŷ n(x α ) Inward unit normal vector only for ∂ \∂ 1 A set of the boundary-related items for each piece of boundary.
not really a boundary but rather a topology of . The treatment of periodic boundary condition will be exemplified in the third problem in Sect. 4 below.
To summarize this section, the above mathematical theory allows us to fully define a PDE problem involving Eq. (1) or (2) in an arbitrary coordinate system given the input functions and equations as listed in Table 1, which can be either analytical or numerical in the UBER code. The equation terms may be freely turned off by setting their corresponding coefficients to zero. The number of boundary pieces is totally up to choice, which can even be zero to put the boundary at infinity. The boundary geometry may be time-variable for boundary pieces in ∂ 1 in the UBER code, but it must be fixed for those in ∂ \∂ 1 . Solutions of this problem are obtained once we find a way to evaluate the functional in Eq. (9) for a realization of a stochastic path and to estimate the expectation of the functional. These numerical techniques are the subject of the next section.

Numerical techniques
We give an outline of the algorithms used by the UBER code in this section, with an emphasis on the techniques that improve both its accuracy and efficiency. Lower-level numerical techniques, such as the generation of pseudo-random variables, linear algebraic operations, and parallelized computation, are based on the works presented in Zheng (2015). The general idea for numerically implementing the SDE method is as follows: (i) for a given spatiotemporal position (t, x) for which an equation solution is wanted, a number of stochastic paths starting from this common position are simulated; (ii) for each stochastic path, its functional value is evaluated by the path integrals as in Eq. (9); and (iii) from these sampled functional values, their expectation is estimated, and this gives the solution at (t, x). Therefore, the SDE method is essentially a Monte Carlo method. It does not rely on a computational grid and is able to solve the problem locally. However, on many occasions it is still worth obtaining global solutions on a grid so that the solutions at time stamp T i may be used as the initial condition for the solutions at T i+1 , analogous to the idea of layer methods (e.g., Tao et al., 2009).
In this way, the stochastic processes need only to be simulated for a short duration of t = T i+1 − T i to obtain the new solutions, for which the calculation of functional expectation would converge much faster than those simulated for the full length t = T i+1 . The only operation on this grid would be interpolation and possibly extrapolation; therefore, unlike in layer methods, the grid can be irregular and thus allows for the use of sophisticated interpolation libraries on irregular grids, although in the present version only a nonuniform Cartesian grid (in the given coordinate system) is implemented with user-specified nodes.
Integration of the SDE (Eq. 7) employs the Euler-Maruyama scheme that is order 1 for weak convergence problems such as ours, meaning that when only the statistical distribution of stochastic paths matters but not the individual path, the expectation of the schematic error is proportional to the first power of the time step size (Kloeden and Platen, 1992). To further reduce the schematic error, an adaptive time step size is used in UBER. It can be shown that (e.g., Zheng, 2015) the root mean square (rms) distance an Itô stochastic process travels in infinitesimal time ds is dX s = tr(a)ds.
Numerically, the first-order contribution from |b s| cannot be neglected due to the finite s. Therefore, we prescribe a desired rms spatial step size X s , which is sufficiently small compared to the size of and any scale length of the equation coefficients, and then choose the smaller s inferred from either Eq. (16) or X s = |b s| at every step of integration as the adaptive step size. This scheme evidently reduces to a simple adaptive Euler scheme for integrating ordinary differential equations when a approaches zero. Oblique reflection of the stochastic process on ∂ \∂ 1 and the calculation of dk s follow the projected half-space algorithm presented in Gobet (2001), which is also order 1 in the weak convergence sense. The idea is that, for an exact half-space boundary, k s can be proven to share the same probabilistic distribution with a composite stochastic variable involving W s , coefficients of the SDE, the normal vector n, and an independent exponential random variable with parameter (2s) −1 (Lépingle, 1995); therefore, k s can be explicitly calculated by these known quantities. For general smooth boundary geometry, an additional contribution to dk s may also come from possible projection along theγ direction needed to keep the stochastic process within domain. With dk s obtained and the SDE (Eq. 7) integrated, the functional (Eq. 9) can be readily evaluated by an ordinary numerical integration technique implemented along the realized stochastic path.
Expectation of the functionals can be estimated, in principle, from an arithmetic mean of a number N of sampled stochastic path integrals. The error of this estimation, where a tilde is used to indicate a numerical realization in this section and · · · indicates averaging over samples, can be estimated by dividing the sim- Figure 1. Schematic illustration of process splittings in a t ⊗ R 2 space. A stochastic process travels backward in time from point P and splits into two at point Q, where its projected functional value is found to be sufficiently large (see text for exact meaning). One child process splits again at point R, where its projected functional value is found to be even larger. The independent child processes would eventually stop either in¯ as at points A and B or on ∂ 1 as at point C.
ulation of stochastic processes into batches (Zheng, 2015). Although the probabilistic distribution of individual F t,x is generally far from normal and largely unknown, that of the batch-wise mean of F t,x approaches a Gaussian for a large enough sample number per batch due to the central limit theorem, and a confidence interval can thereby be calculated for the batch-wise means using the Student's t distribution (e.g., Kloeden and Platen, 1992). We use this confidence interval as an approximation to . In this way, UBER adaptively stops simulating more batches of stochastic processes when the estimated error meets a prescribed tolerance.
In typical radiation belt problems, the functional values from various stochastic paths may differ by orders of magnitude, and hence their contributions to the arithmetic mean also differ by orders of magnitude, whereas their computational efforts are of the same order. Therefore, straightforward calculation of their arithmetic mean could result in extremely slow convergence with N and squander computational power. To reduce statistical variance in this procedure, a process-splitting technique is developed based on the idea of importance sampling, i.e., to make "denser" sampling in more important "regions". In conventional Monte Carlo methods, the "region" is an "area" in a parameter space, and importance sampling effectively splits one sample point therein that would have made a huge contribution to the calculation into many sample points nearby, while weights of these samples are reduced accordingly to keep the probabilistic distribution of samples unbiased (e.g., Press et al., 1992). But unlike conventional Monte Carlo methods, the samples in the SDE method are paths which belong to a functional space. To still implement this idea, we split the stochastic path when it is projected to contribute a large functional value. Figure 1 gives an illustration of this technique in a t ⊗ R 2 space. As a stochastic path being integrated from point P , the functional value of the entire path (from s = 0 to s = t) is continuously predicted based on the partial path that has been realized. This projected functional value is compared to the value of some quantile (e.g., the 80th percentile) statistically derived from all previously completed stochastic paths starting from the same position. When at some place Q the projected functional value falls above this quantile, the stochastic process is deemed to make a significant contribution to the arithmetic mean. It is then split into a number of child processes at Q, and each child process traces down an independent path thereafter. These child paths, together with their common parent path segment PQ, hence constitute "nearby samples" in the functional space. This procedure can be further iterated if the projected functional value later falls into an even higher quantile (e.g., the 90th percentile), as shown at R. After all procedures finish, the eventual result is a tree structure of stochastic paths rooted at P . For the illustration in Fig. 1, the actual functional value of the path PQA will be weighted by 1/2, and those of PQRB and PQRC will be weighted by 1/4, when calculating their contributions to the mean. In the UBER code, a practical choice for the number of children at each splitting is 4, and that for the upper limit of offspring generations is 3, so that a stochastic process can be split into a maximum of 4 3 = 64 processes. Effects of the process-splitting technique are studied in the first problem in the next section.
It still remains to find a method to project the functional value of a stochastic path when it is only partially realized. For this purpose, we insert a break point at s = s ∈ (0, t) to the integrations in Eq. (9) and see how it transforms. We simplify the situation by only considering the stochastic processes stopping in¯ for the moment and denote the following functional integrals: in which the integrand functions c, λ, and u are as those in Eq. (9). Then, the functional F t,x with the above presumptions and notations is transformed as where F t−s ,X s is the functional for a stochastic process that starts from the break point (t − s , X s ) and continues until s = t.
Suppose that a partial path has been realized up to s = s . From it we can readily evaluate U s 0 and V s 0 in Eq. (19) and therefore need an estimatedF t−s ,X s to project the functional valueF t,x , where a bar is put over all unrealized entities. Specifically, we would need these three estimates: X t ,Ū t s , andV t s . In principle, a good estimation ofX t is given by integrating along the streamline of the b(t − s, x) field through X s until s = t, which means projecting forX t along the ballistic trajectory of motion while ignoring all the stochasticity since the Wiener process has zero mean. However, this integration is not much cheaper than the realization of F t−s ,X s itself and is thus unaffordable. In anticipation that the total time length t would not be too large, especially when using a solution grid, and that b(t −s, x) would not vary drastically in this time interval, mappingX t along the constant vector b(t −s , X s ) is a good enough but much cheaper approximation. IfX t is mapped out of¯ so that g 0 (X t ) is unable to be evaluated, the particular stochastic process is then disabled from splitting.
The functional valuesŪ t s andV t s are estimated by assuming that, for all possible stochastic paths belonging to the same solution point, there are mean functionsc,λ, andū that are independent of time and that the mean local time is proportional to the total time length of the stochastic process so thatk s =ks withk the proportionality constant. Under these assumptions,Ū t s andV t s can be expressed bȳ and ifŪ t 0 = 1. The values ofŪ t 0 andV t 0 can be well estimated, respectively, by the medians of U t 0 and V t 0 that are obtained from all previously completed stochastic paths. Medians are preferred to means here because the probabilistic distributions of these functionals are usually very skewed and heavytailed. This projection mechanism would become statistically more accurate with more stochastic processes having been simulated.

Example problems
Four example problems are provided in this section. In the first problem, we solve a Fokker-Planck equation with two source terms, one proportional to the unknown function and the other independent of the unknown function, in both a spherical coordinate system and a Cartesian coordinate system. Effects of the process-splitting technique are analyzed in this example. In the second problem, an advection-dominated Fokker-Planck equation is considered. We further show that, even for a pure advection equation, the UBER code still gives the correct solutions, although it is not designed for such an equation and may not be the most efficient method. Code behavior with advection equations is further studied in the third problem in which the treatment of periodic boundary condition is also illustrated. In the last problem, we simulate the Earth's inner radiation belt by solving its Boltzmann equation involving realistic pitch-angle diffusion and CRAND source.

Problem 1: neutron generation and diffusion in nuclear material
In this problem, we consider the diffusion and generation of neutrons in a spherical nuclear material at detonation, with an initially injected Gaussian neutron distribution from a small source at the center and a neutron-reflecting coat that allows only one-half of the surface neutrons to escape. In a spherical coordinate system, the equation, initial condition, and boundary conditions are (Serber, 1992): where f is neutron density, the constant diffusion coefficient D = 0.1, the constant rate of neutron generation from chain reaction S = 2.5, and v(r) = 10 −6 /(1 + r) characterizes a weak source of neutrons spontaneously emitted in the material. The values and functional forms of these coefficients are solely designed for demonstration purposes and are not meant to be experimentally accurate. UBER solutions are obtained at four time stamps and are compared with those from a staggered-grid finite-difference method (e.g., Ames, 2014), as shown in Fig. 2a. A turning point is observed in the solutions at T = 0.05, which marks the transition of the dominating neutron source from chain reaction at high background density to spontaneous emission at low density. As time goes by, the effect of the spontaneous emission is overwhelmed by the fast-growing chain reaction. Even though the solutions span 8 orders of magnitude, the UBER results are virtually identical to the finite-difference ones, and statistical fluctuation, which is a typical feature in Monte Carlo methods, is not observed in these solutions due to the adaptive algorithms and the variance reduction technique.
To demonstrate UBER's ability in multiple dimensions with a complicated boundary geometry, the same problem is also solved in a three-dimensional Cartesian coordinate system along a sphere radius. In this coordinate system, the diffusion coefficient becomes a 3 × 3 matrix with each diagonal component equal to D, and the boundary condition in Eq. (27) is applied to the only boundary that is a sphere with unit radius. The solutions are over-plotted in Fig. 2a. Consistency between the one-dimensional and the three-dimensional results is quite evident.
To analyze the effects of the process-splitting technique, we repeated the three-dimensional solutions at T = 0.05, but with a fixed number of stochastic processes (2048 samples per batch, 200 batches) for each solution point and with various upper limits of the offspring generations ν; ν = 0 indicates that the process-splitting technique is disabled. For a fixed number of samples, the relative error of a solution is proportional to the square root of the variance of sampled functional values and determines how fast the calculation of expectation converges. The relative errors as functions of r are plotted against the left y axis of Fig. 2b, and each curve is in fact formed by the medians from eight independent and identical numerical experiments to be more statistically representative. In the range 0.4 < r < 0.9, the relative errors are consistently reduced with higher offspring generations. At the slowest-converging point r = 0.76, the processsplitting technique with a maximum of four offspring generations could reduce the relative error by an order of magnitude compared to that without splitting. For this curve (ν = 4), the percentages of stochastic processes that have undergone splitting are plotted as a shaded area against the right y axis. For r < 0.4, the relative errors are small and computational convergence is fast enough, so process splitting is automatically suppressed by the code to achieve an optimal speed. If the relative errors are large, usually a small fraction of split stochastic processes can be rather effective.
To further reveal the behavior of the process-splitting technique, Fig. 2c plots how the relative error reduces with an increasing number of samples (N ) in the Monte Carlo procedure for the solution point at r = 0.76. There are eight independent and identical numerical experiments, respectively, for ν = 0 and 4, and each line represents the results from one numerical experiment. The general trend is that the relative error reduces linearly in a log-log scale plot, resembling its dependence on N −1/2 . However, without process splitting,  the relative error often jumps up sharply due to the occurrence of a very low-probability sample that made a very large contribution, which severely slows down the computational convergence. With process splitting, such jumps are largely avoided, and on average, the code uses just a little more than 1/100 of the samples without process splitting to achieve the same relative error of 0.1.
In practical UBER usage, solutions are achieved with a prescribed tolerance of relative error and an adaptive number of samples. Therefore, fast convergence with the processsplitting technique could save a significant amount of computational effort even with its extra computational burden. Table 2 lists the normalized wall clock time consumed by UBER for obtaining the solution curve in three dimensions at T = 0.05 with a relative error tolerance of 0.1 and a range of maximum offspring generations in process splitting. Again, each of these numbers is the median from eight independent and identical numerical experiments. With ν = 3 and 4, the code is nearly an order of magnitude faster than without process splitting. The same wall clock time in these two cases indicates that the faster convergence with more offspring gen-erations starts to be traded off by the computational overhead associated with more complicated splitting, and therefore further increasing ν would not be optimal.

Problem 2: magnetized plasma evolution under instability
In the second problem, we consider a Fokker-Planck equation for the pitch-angle distribution of a magnetized plasma (e.g., Dendy, 1990). Suppose that the electrons are initially in a sin 2 (x) background pitch-angle distribution with x as the pitch angle. An electron beam is injected into the system centered at pitch angle x = 0.4. In addition to pitch-angle diffusion, the injected beam excites some kind of plasma instability that kinetically transports the distribution toward the π/2 pitch angle. The equation, initial condition, and boundary conditions are written as where f is the electron distribution function, the Jacobian determinant G = sin(x), the diffusion coefficient D(t, x) = (1/2π 2 )erf(t/2)[1 + sin 2 (2x)], and the advection coefficient h(x) = cos(x). Note that, in most of the x range, the advection coefficient is about an order of magnitude larger in value than the diffusion coefficient. Equation (30) indicates a loss cone at pitch angle x = 0.05. UBER solutions for this problem are plotted in Fig. 3a as circles and are in excellent agreement with those from the staggered-grid finite-difference method. In these solutions, the beam evolves toward x = π/2 because of the kinetic advection. As the system relaxes, the beam eventually merges into the background, and a final stable distribution is then approached. Equation (28) degenerates to a continuity equation if pitchangle diffusion is turned off by setting D(t, x) to zero. Even for such a pure advection problem, UBER can still obtain accurate and robust solutions compared to the widely used Lax-Wendroff method (e.g., Ames, 2014), as shown in Fig. 3b. Before T = 2, an advection of the beam toward x = π/2 is seen in the solutions without dispersion, and UBER results are almost identical to the Lax-Wendroff ones. The system, however, is unstable due to the positive advection velocity at x < π/2 and the zero advection velocity at x = π/2 so that the electron distribution will be piled up near x = π/2 and ultimately evolve into a singularity. For this reason, the Lax-Wendroff method begins to fail at T = 2 by generating unphysical negative solutions near x = 1.3 and will be divergent henceforth; this problem would require careful choices of additional flux limiters to overcome. UBER nonetheless gives the correct results that still resolve the peak height and position of the beam.

Problem 3: particle adiabatic drift around Earth
In this problem, we study the UBER code behavior when solving the advection equation resulting from Eq. (2) but without any non-adiabatic process or source and loss, which describes the adiabatic drift of radiation belt particles around Earth. To be specific, the equation and boundary condition are and an idealized initial condition g 0 (ϕ) consisting of a step function, a triangle, and a semicircle is given as that in Fig. 4a, which can induce unphysical negative solutions to finite-difference methods due to its discontinuities and infinite gradients (e.g., Aseev et al., 2016). Equation (33) specifies the periodic boundary condition for the drift phase ϕ. In the UBER code, the periodic boundary condition is not really considered a boundary condition; rather, it is dealt with by extending the computational domain to include multiple periods so that the Itô stochastic processes would not move out of the domain within the given time duration, except for stopping on other first type boundaries. In this specific problem, the time stamp for obtaining solutions is every one-half drift period; therefore, the computational domain is extended for one extra period of ϕ from 0 to −2π since the stochastic processes retrograde in time. However, solutions are only sought in the right half of the domain for ϕ between 0 and 2π at each time stamp, and after that, they are copied to the left half to form the entire initial condition for the next time stamp.
By the method of characteristics, Eq. (32) is trivially solved for a constant drift frequencyφ = 2π, and the solutionf (t, ϕ) = g 0 (ϕ − 2π t) preserves its shape and returns to its initial position after each period of t = 1. As shown in Fig. 4a, this is exactly what the UBER solutions do. Unlike finite-difference methods which often introduce numerical dispersion to advection equations, the UBER solutions can be exact because, without the stochastic terms, the solver of the SDE (Eq. 7) actually integrates along the characteristic curves of the advection equation, and for constant advection velocity, these characteristic curves are straight lines for which the Euler scheme is exact.
The Euler scheme is well known to overshoot when integrating along a curve that consistently curls to one side, and this would cause error in UBER solutions. To show this, we replace the drift frequency byφ = 2π t in Eq. (32), for which the characteristic curves are now a family of parabolas whose equations are ϕ − ϕ 0 = π t 2 , with ϕ 0 the parameter. Note that the drift is no longer periodic but forever accelerating. UBER solutions after the first round (T = √ 2) and the 20th round (T = 2 √ 10) of drift are plotted in Fig. 4b. Compared with the initial condition, the UBER solutions are slightly displaced due to the overshoot, and the solution shapes are very slightly distorted, too. For radiation belt applications, though, any frequencyφ ι must meanwhile be an approximate periodic function of ϕ ι , and hence the corresponding characteristic curves would at most wobble rather than curling, and the Euler scheme overshoot largely cancels rather than accumulating. Therefore, we would not expect the UBER solution errors in those applications to be worse than that illustrated in this case.

Problem 4: Earth's inner radiation belt simulation
In the last problem, we demonstrate UBER's ability to solve a radiation belt Boltzmann equation by performing an inner radiation belt simulation involving both the stably trapped (out of the drift loss cone) and the quasi-trapped (in the drift loss cone) electron populations. Inspired by Xiang et al. (2020), we consider the 304 keV electrons at McIlwain's L M = 1.25, which are subject to pitch-angle scattering caused by Coulomb collisions with upper-atmospheric neutrals and ionospheric ions and electrons. The equation, initial condition, and boundary conditions are In Eq. (34), drift frequency, is evaluated using dipole field approximation (Schulz, 1991), in which c is the speed of light in a vacuum, L is dipole Lshell, R E is the radius of Earth, e is the elementary charge, µ E is the magnetic moment of Earth's intrinsic dipole field, m e is electron mass, p is electron momentum, α 0 is electron equatorial pitch angle, and the functions D(sin α 0 ) and T (sin α 0 ) are bounce motion integrals in the dipole field that are given in Schulz (1991, pp. 205-210). For simplicity, we ignore the dependence of drift frequency on ϕ so that the drift phase becomes equivalent to geomagnetic longitude. The Jacobian determinant G = T (sin α 0 ) sin(2α 0 ). The bounceaveraged pitch-angle diffusion rate is empirically given by D αα = 10 −5 exp 92.55 cos 4 α 0 − cos 4 α L (ϕ) which features quantitative resemblance to that calculated by realistic atmosphere and ionosphere models in Xiang et al. (2020). In this expression, α L (ϕ) is the bounce loss cone angle dependent on geomagnetic longitude that is determined by drift shell tracing in the International Geomagnetic Reference Field (IGRF; Finlay et al., 2010). D αα as a function of ϕ and α 0 is plotted in Fig. 5a: it is only significant near the bounce loss cone and in the South Atlantic Anomaly (SAA) centered at about 20 • of geomagnetic longitude due to the closer proximity of the drift shell to the upper atmosphere in these regions. The CRAND source rate S e /p 2 is approximated by (Lenchek et al., 1961;Selesnick, 2015) S e p 2 ≈ 1.7 × 10 −12 (E max − E) 2 L 2.7 sin α 0 where E max is the maximum kinetic energy (782 keV) available to electrons from neutron β decay and E is the electron kinetic energy in question, with both being measured in units of the electron rest energy (511 keV), and L is the dipole L-shell, which is a variable dependent on geomagnetic longitude due to multipoles of the Earth's magnetic field. Figure 5b plots the CRAND source rate and the dipole L-shell values corresponding to McIlwain's L M = 1.25 obtained from drift shell tracing in IGRF, which vary from less than 1.2 in the SAA to above 1.3 near 180 • of geomagnetic longitude. The simulation is performed with an initially empty radiation belt as indicated by Eq. (35), and electrons are gradually generated by the CRAND source and meanwhile lost to the bounce loss cone. Figure 5c and d show the solution electron fluxes calculated fromj =f p 2 after 2 h (about one drift period) and 10 h, respectively. The characteristic westeast electron flux gradient is formed for the quasi-trapped population (α 0 < 60 • ) within the first 2 h and changes very little over time because the SAA sweeps these electrons out every drift period. Weak pitch-angle diffusion of electron fluxes from the quasi-trapped population toward the stably trapped population can be observed at T = 2 h when the stably trapped fluxes are still low due to the stronger source rate in the quasi-trapped region. At T = 10 h, the direction of the pitch-angle diffusion is reversed. Even with atmospheric loss, the CRAND source is strong enough to continuously contribute to the trapped electron fluxes, which are increased by 1 order of magnitude in 8 h.

Conclusions and discussion
In conclusion, we have built a numerical solver for the general form of kinetic equations that appear in radiation belt studies. Based on the SDE method, the solver is coded to work in arbitrarily provided coordinate systems of up to three dimensions with user-specified boundary geometry, boundary conditions, and equation terms. We have also designed adaptive algorithms and a variance reduction technique for the SDE method, which enhanced its computational speed by 1 order of magnitude in our test. The example problems in this article demonstrated the solver's versatility and robustness in dealing with a range of problems that might each require a different solver in other methods. The solver, named UBER, has been programmed into a Fortran library that can be easily incorporated with other more complicated space physics models.
The strengths of the SDE method lie in its abilities to solve problems that are difficult for other methods, but not in its speed. Even with the presented improvements, the SDE method is in general nowhere near the finite-difference counterparts in terms of speed due to its Monte Carlo nature. However, Monte Carlo methods are perfectly parallelized for no communication among parallel tasks; therefore, the UBER code gains handsome speedup from parallel computation. Figure 6 displays the UBER code speedup, measured as the ratio between wall clock time of serial and parallel executions, against different numbers of parallel threads employed in solving Problem 1 in spherical coordinates. For a fixed amount of computation, speedup is limited by the portion of work that cannot be parallelized; this is described by Amdahl's law (inset formula in Fig. 6;Amdahl, 1967), which fits the numerical experiment data with parameter p = 0.85, meaning that effectively 85 % of the total work has been parallelized. With 86 threads, the code approaches its theoretical maximum speedup of 6.7 for this problem, which uses about 620 s of wall clock time on a Linux server. The referenced finite-difference code uses 65 s in serial execution to solve the problem on the same server. In other words, the UBER code, in its ideal parallelization, is an order of magnitude slower than serial finite-difference methods, and it is also roughly the case for Problem 4 (Zheng Xiang, personal communication, 2021) for which the UBER code spends 22 min to simulate one drift period. This conclusion, however, is rather qualitative because in these comparisons different methods have used quite different grids. Finite-difference methods generally require a much finer grid than the SDE method in order to deal with short-wavelength components of the solution, and time spacing is restricted accordingly by the Courant-Friedrichs-Lewy condition (e.g., Press et al., 1992). On the other hand, the SDE method affords a grid that is only concentrated in interesting areas as long as global interpolation can be warranted, and the time stamps for dump-ing solutions are up to user choice. Moreover, if only local solutions are sought over a short time period, the SDE method could totally disregard the uninteresting solutions and might be more efficient. For radiation belt simulations in particular, finite-difference models usually rely on the operator splitting technique (e.g., Press et al., 1992) to solve tensorial diffusion in three dimensions, which involves frequent interpolation between two sets of grids and hence greatly hinders speed, whereas problem dimensionality has little effect on the SDE method.
Several other forms of radiation belt kinetic equations should also be solvable by the method presented in this article. In formulating the Boltzmann equation (Eq. 2), we have assumed that the unperturbed particle Hamiltonian H 0 is independent of phases of particle motion. For lower-energy ring current particles, the convective electric field potential energy is not negligible in their Hamiltonian, and therefore H 0 would be dependent on the drift phase. As such, expanding the Poisson bracket [f, H 0 ] on the left-hand side of the Boltzmann equation will result in additional terms involving partial differentials with respect to the generalized momenta Q ι (see Appendix A). For a radiation belt model including ring current particles, the general form of the Boltzmann equation will be ∂f ∂t +φ ι ∂f ∂ϕ ι +Q ι ∂f ∂Q ι = . . ., in which the omitted right-hand side is exactly the same as that of Eq.
(2). The Boltzmann equations of the so-called four-dimensional radiation belt models, such as the CIMI model (Fok et al., 2014), the VERB-4D model (Aseev et al., 2016), and the K2 MHD-particle model (Elkington et al., 2019), are of this type. Similar to the treatment of Eq.
(2), Eq. (42) can be obtained from Eq. (1) by introducing the new coordinates x ξ = {Q µ , Q ι , ϕ ι } ≡ {x µ , x κ }, which enlarges the index set from α to ξ , and performing the following transformation of equation coefficients: Therefore, the Boltzmann equation (Eq. 42) can also be solved by the method presented in this article in principle. However, such four-dimensional simulations are beyond the current scope of the UBER code since it is only coded for up to three dimensions in space.
Nonlinear evolution of phase space density occurs when the particle scatterings are not only small-scale but also largescale, usually as a result of trapping by intense plasma waves (e.g., Bortnik et al., 2008;Albert et al., 2013). In this case, the right-hand side of the kinetic equation must include terms of nonlocal transport of phase space density by these large-scale scatterings, and the equation is formulated as (Artemyev et al., 2016;Zheng et al., 2019) in which f is a shorthand for the functionf (t, Q α ), and G is the Jacobian determinant evaluated at Q α . With nonlinear wave-particle interactions, the phase bunching effect gives rise to advection characterized by the coefficients h α . The function P Q→ Q is the trapping probability density per unit time from Q α to Q α ; that is, particles are trapped by the wave field at Q α and subsequently escape from trapping at Q α . This is considered a known function which can be evaluated from single particle behaviors by either perturbation theory of Hamiltonian mechanics (e.g., Artemyev et al., 2016) or test-particle simulations (e.g., Vainchtein et al., 2018). Note that, since the unknown function is contained in the last integral term, Eq. (44) is an integro-differential equation. However, formal similarity between Eq. (44) and the Fokker-Planck equation (Eq. 1) suggests that an asymptotic solution of Eq. (44) may be achieved by Taylor expanding f as where f 0 =f (0, Q α ), and f 0 indicates its time-derivative function evaluated at t = 0. When applying the SDE method with a solution grid, the functions f 0 and f 0 can be obtained from solutions of previous time stamps. Then, by defining the following coefficients which are now known functions, Eq. (44) is transformed into the form of Eq. (1) and is readily solvable by the UBER code. In this way, simulations of nonlinear wave-particle interactions in radiation belts could hence be unified with welldeveloped simulations in quasi-linear theory.
L. Zheng et al.: UBER v1.0 Appendix A: A formal derivation of radiation belt kinetic equations We consider, for simplicity, a hypothetical radiation belt whose particle motion has two well-separated periods which define two pairs of action-angle variables {ϕ 1 , ϕ 2 , J 1 , J 2 }. We assume that the phase angle ϕ 1 changes much faster than ϕ 2 and hence call {ϕ 1 , J 1 } the fast variables and {ϕ 2 , J 2 } the slow variables. We further assume, for a moment, that the Hamiltonian of particle motion, H (ϕ 1 , ϕ 2 , J 1 , J 2 , t) = H 0 (ϕ 2 , J 1 , J 2 , t) is constituted of an unperturbed part H 0 that depends on the slow phase and a perturbation δH that is caused by electromagnetic forces whose variation timescale is shorter than the periodicity 2π/φ 1 . Apparently, δH is a periodic function of ϕ 1 . Upon averaging over ϕ 1 , the perturbation cancels out so that These presumptions allow the particle phase space density, to be so decomposed into a fast-phase-averaged partf ≡ f and a perturbation δf , also periodic in ϕ 1 , which has δf = 0 by definition. For a collisionless plasma, evolution of f is governed by the Vlasov equation: In fact, this equation form is more neatly derived from Liouville's theorem (e.g., Goldstein, 1980, chap. 9 . Equation (A5) appears in the form of a Boltzmann equation for the phase-averaged phase space density: the left-hand side describes evolution of the unperturbed system in the slow variables, whereas the right-hand side, involving only the perturbed quantities and the fast variables, serves the role of a collision integral. Indeed, it can be viewed as "collisions" between particles and the perturbing electromagnetic field. In this regard, we symbolically denote the right-hand side ∂f ∂t w in analogy to that caused by real collisions, with the subscript designating wave-particle interaction. Equation (A5) is closed when its right-hand side can be expressed in terms off under certain approximations. If both δH and δf are small compared to their unperturbed counterparts, δf can be directly solved from the linearized Vlasov equation retaining only the first-order terms in expansion and, after mathematical transformations, gives the expression (e.g., Lerche, 1968;Kaufman, 1972) where the coefficient D w is a functional of δH . The corresponding theory is called the quasi-linear theory. However, when the perturbing electromagnetic wave is sufficiently coherent, δf may become large even if δH remains small. In this situation, ∂f ∂t w is estimated by considering particle phase trajectories near the resonance point (e.g., Artemyev et al., 2016). The result would then contain corrections to Eq. (A7), which are due to particles trapped in phase with the wave, whose formulation in the current setup could be inferred from that of Eq. (44) in the body; we hereby do not elaborate.
Taking account of collisions in the plasma would introduce to Eq. (A4) a collision term not describable by the singleparticle Hamiltonian so that the transport equation becomes the Boltzmann equation: For Coulomb collisions, small-angle scatterings at large impact parameters dominate due to the long range of Coulomb force, and consequently the phase-averaged collision integral can be expanded into a Fokker-Planck form in the generalized momenta that are changed by the collisions (Lifshitz and Pitaevskii, 1981, chaps. 2 and 4), i.e., which usually only involves the fast momentum J 1 on timescales much shorter than 2π/φ 2 . The transport coefficients D c and h c are determined from the particle species and their collision cross sections. We note again that, in the phase-averaged kinetic equation (Eq. A9), slow and fast variables are separated onto each side of the equation.
Neglecting the source and loss terms, the quasi-linear kinetic equations in the body of this paper could all be recovered from Eqs. (A7), (A9), and (A10), which are already in the same form as Eq. (42). If there are no slow variables, the Poisson bracket on the left-hand side of Eq. (A9) vanishes, and the equation reduces to the Fokker-Planck equation (Eq. 1). If there are slow variables but ϕ 2 is cyclic to the unperturbed Hamiltonian, the left-hand side of Eq. (A9) would then contain the first two terms shown in Eq. (A5'), which is in the form of Eq. (2). In this case, the dependence off on ϕ 2 is introduced by means other than the Hamiltonian, such as the ϕ 2 -dependent boundary geometry, boundary conditions, or collision terms.
Code and data availability. The UBER library is free and opensource. The current version of UBER is available from the GitHub repository at https://github.com/zheng-lh/UBER (last access: 14 April 2021) under the MIT license. The exact version of the UBER library used to produce the results used in this paper is archived on Zenodo (https://doi.org/10.5281/zenodo.4671646, Zheng, 2021), as are input data and scripts to run the library and produce the plots and tables for all the simulations presented in this paper (https://doi.org/10.5281/zenodo.5221599, Zheng et al., 2021).
Author contributions. LZ developed the UBER library, conducted the benchmark experiments, and wrote the paper. LC and AAC contributed to the conceptualization of the UBER library. PW, ZX, and XL provided ancillary code and data for the benchmark experiments.
Competing interests. The contact author has declared that neither they nor their co-authors have any competing interests.
Disclaimer. Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.