Knowledge about muon tomography has spread in recent years in the geoscientific community and several collaborations between geologists and physicists have been founded. As the data analysis is still mostly done by particle physicists, much of the know-how is concentrated in particle physics and specialised geophysics institutes. SMAUG (Simulation for Muons and their Applications UnderGround), a toolbox consisting of several modules that cover the various aspects of data analysis in a muon tomographic experiment, aims at providing access to a structured data analysis framework. The goal of this contribution is to make muon tomography more accessible to a broader geoscientific audience. In this study, we show how a comprehensive geophysical model can be built from basic physics equations. The emerging uncertainties are dealt with by a probabilistic formulation of the inverse problem, which is then solved by a Markov chain Monte Carlo (MCMC) algorithm. Finally, we benchmark the SMAUG results against those of a recent study, which, however, have been established with an approach that is not easily accessible to the geoscientific community. We show that both approaches reach identical results with the same level of accuracy and precision.

Among the manifold geophysical imaging techniques, muon tomography has increasingly gained the interest of geoscientists during the course of the past years. Before its application in Earth sciences, it was initially used for archaeological purposes. Alvarez et al. (1970) used this method to search for hidden chambers in the pyramids of Giza, Egypt, an experiment that was recently repeated by Morishima et al. (2017) with improved technology. Civil engineering applications include the monitoring of nuclear power plant operations (Takamatsu et al., 2015) and the search for nuclear waste repositories (Jonkmans et al., 2013), as well as the investigation of underground tunnels (e.g. Thompson et al., 2020; Guardincerri et al., 2017). A serious deployment of muon tomography in Earth sciences has only begun in the past decades. These undertakings mainly encompass the study of the interior of volcanoes in France (Ambrosino et al., 2015; Jourde et al., 2016; Noli et al., 2017; Rosas-Carbajal et al., 2017), Italy (Ambrosino et al., 2014; Lo Presti et al., 2018; Tioukov et al., 2017) and Japan (Kusagaya and Tanaka, 2015; Nishiyama et al., 2014; Oláh et al., 2018; Tanaka, 2016). Other experiments have been performed in order to explore the geometry of karst cavities in Hungary (Barnaföldi et al., 2012) and Italy (Saracino et al., 2017). Further studies (see also review article by Lechmann et al., 2021a) have been conducted by our group to recover the ice–bedrock interface of Alpine glaciers in central Switzerland (Nishiyama et al., 2017, 2019).

The core component of every geophysical exploration experiment is formed by the inversion, which might be better known to other communities as fitting or modelling. This is where the model parameters are found which best fit the observed data. Up until now, this central part has mostly been built specifically to meet the needs of the experimental campaign at hand. On the one hand, this approach has the advantage of allowing the consideration of the peculiarities of particle detectors, their data processing chain and other models involved (e.g. the cosmic-ray flux model). On the other hand, when every group develops a separate inversion algorithm, the reconstruction of the precise calculations performed in the data analysis procedure becomes a challenge. For a researcher who is not familiar with the intricacies of inversion, this might be even tougher. We thus see the need for a lightweight programme that incorporates a structured and modular approach to inversion, that also allows users with little inversion experience to familiarise themselves with this rather involved topic. This programme can be used to directly analyse experimental data in a stand-alone working environment, and the modules and theoretical foundations can be adapted, customised and integrated into new programmes. For this reason, the code is built in the Python programming language in order to facilitate the exchange between researchers and to enhance modifiability. Moreover, the source code is freely available online (Lechmann et al., 2021b).

To facilitate the further reading of our code, we introduce the reader at this point to our benchmark experiment, to which we will refer on multiple occasions throughout this work. The experimental campaign is explained in detail in Nishiyama et al. (2017), and thus we restrict ourselves here to a brief description of the experimental design. In the Nishiyama et al. (2017) study, we aimed at recovering the ice–bedrock interface of an Alpine glacier in central Switzerland. Figure 1 shows that we had access to the railway tunnel of the Jungfrau railway company, where we installed three detectors. In our measurement, we recorded muons from directions that consisted purely of rock and others where we knew that the muons must have crossed rock and ice (see the two cones in Fig. 1). From the former, it was possible, together with laboratory measurements, to determine the physical parameters of the rock more precisely. Subsequently, we utilised the directional measurements of the latter to infer the 3-D structure of the interface between rock and ice underneath the glacier. Finally, we will also use the results of that experiment (Nishiyama et al., 2017) to verify our new algorithm in the present study.

The goal of every muon tomography study is essentially to extract information on the physical parameters (usually density and/or the thickness of a part of the material) of the radiographed object through a measurement of the cosmic-ray muon flux and an assessment of its absorption as the muons cross that object. In geological applications, these objects are almost always lithological underground structures such as magma chambers, cavities or other interfaces with a high-density contrast. The reconstruction of the geometry of such structures can only be achieved if the measured muon data are compared to the results of a muon flux simulation. As stated earlier, this is the basic principle of the inversion procedure. However, the aforementioned “muon flux simulation” is not just a simple programme, but it consists of several physically independent models that act together. Taking a modular view, we will call these models “modules” from here on, as they will inevitably be part of a larger inversion code. We have visualised the components that are necessary to build an inversion and how they interact with each other in Fig. 2.

A schematic flowchart showing the different models involved in a muon tomographic experiment. The muon simulation consists of a model for rocks, detectors, the cosmic-ray flux and a particle physical model on how muons lose energy upon travelling through rocks. These models allow for a synthetic dataset to be computed. The systematic comparison between synthetic data and actual measured data, and the subsequent change of the model parameters to find the set of parameters that best reproduces the measured data, is an (often iterative) optimisation problem. This procedure is usually termed “inversion” by geophysicists.

The first of the modules is the input module for the experiment results, which also considers the detectors that were used in the experiment. Typical detector setups include nuclear emulsion films (e.g. Ariga et al., 2018), cathode chambers (e.g. Oláh et al., 2013), scintillators (e.g. Anghel et al., 2015) or other hardware solutions. Although the detailed data processing chain may be comprehensive, the related output almost always comes in the form of a measured directional (i.e. from various incident angles) muon flux or equivalently the measured directional number of muons, which will be the input to the inversion scheme. Here, we primarily work with the premise that the muon flux data and the associated errors are given. The corresponding errors can then be furnished to the code by means of an interface.

The simulation module, on the other hand, consists of two parts, each containing two modules (see Fig. 2). The model parameterisation is needed in order to abstract the (geo)physical reality as a mathematical model. Subsequently, the forward model uses the model parameterisation and simulates an artificial dataset based on the chosen parameter values.

As we want to draw inferences on the physical parameters of the involved materials, we need a “rock model” first. The latter term is used in an earlier publication (Lechmann et al., 2018) where we split a rock into its mineral constituents and compute a mean composition and a mean density needed for further calculations. Even though this is called a “rock model” (Fig. 2), the approach can be used for other materials (e.g. ice) as well. It is also possible to infuse laboratory measurements of compositions and density into this family of models.

Once the materials have been described, it is necessary to model the experimental situation spatially. This means that materials as well as detectors have to be attributed a location in space. The central choice in the “model parameterisation” is usually to select how the material parameters are discretised spatially (this is hinted at in Fig. 2 with the term “binning”). We refer the reader to Sect. 2.1 for further information on that topic. In order to conduct this parameterisation, we may employ pre-existing software solutions that mainly comprise GIS and geological 3-D modelling applications that excel at capturing geological information from various sources, e.g. digital elevation models (DEMs), and that also allow field observations (maps, etc.) to be compiled into a spatially organised database. Once the structure of the spatial model is determined, we need a physical model that allows us to calculate a synthetic dataset, based on the parameter values that we set up. We note that the parameterisation structure remains fixed for the time of the calculations and that changes are only performed on the parameter values.

Incident muons, on their way from the atmosphere to the detector, lose energy while traversing matter. The first step in the muon simulation is then to determine how great the initial energy of the muon has to be in order to be able to penetrate all the material up to the detector. This is done by means of a muon transportation model which calculates all physical processes by which a muon loses kinetic energy while travelling through matter. The particle physics community has a great variety of particle simulators, the most prominent being GEometry ANd Tracking (GEANT) 4 (Agostinelli et al., 2003), a Monte Carlo-based simulator. These have the advantage that stochastic processes resulting in energy loss are simulated according to their probabilistic occurrence – an upside that has to be traded off for longer computation times. In contrast to obtaining the full energy loss distribution, lightweight alternatives often resort to the calculation of only the mean energy loss. The solution of the resulting differential equation can even be tabulated, as has been done by Groom et al. (2001).
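The calculation of the minimum (cut-off) energy from a mean energy loss can be sketched as follows. This is a minimal illustration and not the SMAUG implementation: it assumes the common simplification that the mean energy loss per unit opacity has the form a + bE with constant coefficients, and the values of `A_ION` and `B_RAD` are rough, illustrative numbers for standard rock rather than the tabulated, energy-dependent values of Groom et al. (2001). All function names are hypothetical.

```python
import math

# Illustrative constants (assumed, not taken from SMAUG): ionisation term a
# and radiative term b of the mean energy loss -dE/dX = a + b*E.
A_ION = 2.0e-3  # GeV per (g cm^-2)
B_RAD = 4.0e-6  # per (g cm^-2)


def energy_loss(energy_gev):
    """Mean energy loss in GeV per (g cm^-2) for constant a and b."""
    return A_ION + B_RAD * energy_gev


def cutoff_energy_rk4(thickness_m, density_g_cm3, n_steps=1000):
    """Integrate backwards from the detector (E = 0) to the surface with RK4."""
    opacity = thickness_m * 100.0 * density_g_cm3  # path length in g cm^-2
    h = opacity / n_steps
    energy = 0.0
    for _ in range(n_steps):
        k1 = energy_loss(energy)
        k2 = energy_loss(energy + 0.5 * h * k1)
        k3 = energy_loss(energy + 0.5 * h * k2)
        k4 = energy_loss(energy + h * k3)
        energy += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return energy


def cutoff_energy_closed_form(thickness_m, density_g_cm3):
    """Closed-form solution (a/b) * (exp(b*X) - 1), valid only for constant a, b."""
    opacity = thickness_m * 100.0 * density_g_cm3
    return (A_ION / B_RAD) * math.expm1(B_RAD * opacity)


if __name__ == "__main__":
    print(cutoff_energy_rk4(100.0, 2.65))
    print(cutoff_energy_closed_form(100.0, 2.65))
```

Under these assumed constants, 100 m of rock with a density of 2.65 g cm⁻³ yields a cut-off energy of about 56 GeV. The numerical integrator is included because, with energy-dependent coefficients, no closed form is available and only the step-wise solution remains.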

Lastly, based on that minimum energy, one may calculate the portion of the atmospheric muon flux that is fast enough to reach the detector. For this part, a cosmic-ray muon flux model is needed, which describes the muon abundance in the atmosphere, and which is generally dependent on the muon energy, its incidence angle and the altitude of the detector location. Lesparre et al. (2010) list and compare various muon flux models that may be incorporated into an extensive simulation.
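The step from a cut-off energy to an integrated flux can be illustrated with a short sketch. It assumes the widely quoted sea-level parameterisation of Gaisser, which is only one of the models compared by Lesparre et al. (2010), neglects the altitude dependence, and is only adequate at high energies and moderate zenith angles; it is not necessarily the flux model used in SMAUG. The function names and the conversion to an expected count (flux assumed constant over the bin) are hypothetical simplifications.

```python
import math


def gaisser_flux(energy_gev, cos_theta):
    """Differential muon flux in cm^-2 s^-1 sr^-1 GeV^-1 (Gaisser parameterisation)."""
    pion = 1.0 / (1.0 + 1.1 * energy_gev * cos_theta / 115.0)
    kaon = 0.054 / (1.0 + 1.1 * energy_gev * cos_theta / 850.0)
    return 0.14 * energy_gev ** -2.7 * (pion + kaon)


def integrated_flux(e_cut_gev, cos_theta, e_max_gev=1.0e6, n_points=2001):
    """Integrate the differential flux from e_cut to e_max (trapezoid rule in ln E)."""
    log_lo = math.log(e_cut_gev)
    log_hi = math.log(e_max_gev)
    step = (log_hi - log_lo) / (n_points - 1)
    total = 0.0
    previous = None
    for i in range(n_points):
        energy = math.exp(log_lo + i * step)
        value = gaisser_flux(energy, cos_theta) * energy  # dE = E d(ln E)
        if previous is not None:
            total += 0.5 * (previous + value) * step
        previous = value
    return total


def expected_counts(flux, area_cm2, solid_angle_sr, time_s):
    """Expected number of muons in one bin, assuming a constant flux over the bin."""
    return flux * area_cm2 * solid_angle_sr * time_s


if __name__ == "__main__":
    flux = integrated_flux(50.0, 1.0)
    print(flux, expected_counts(flux, 100.0, 0.01, 86400.0))
```

The integration is carried out in the logarithm of the energy because the spectrum falls steeply, so a uniform grid in ln E resolves the region just above the cut-off energy, which dominates the integral.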

The interplay of these four modules (schematically shown in Fig. 2) then allows a dataset to be simulated. It is then possible to compare the measured data with a synthetic dataset and to quantify this difference using a specific metric (usually the squared sum of residuals, which is also termed the misfit in Fig. 2). The process of changing the model parameters (here density of the materials and thicknesses of the segments in each cone), comparing the synthetic dataset with the measured dataset and iteratively tweaking the parameters in such a way that the misfit is minimised is called “inversion” in Fig. 2. The solution of the inversion depends on the inversion method used and is either the full statistical distribution of the model parameters or an estimate thereof (e.g. the maximum likelihood estimate).
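The notion of a misfit and of its minimisation can be made concrete with a deliberately simple sketch. It assumes a one-parameter model and a brute-force scan over candidate values; the forward model is passed in as a function, and the linear toy model in the usage example is purely hypothetical and has no physical meaning. This is not the inversion scheme used in this work, which is probabilistic.

```python
def sum_of_squared_residuals(measured, simulated):
    """Misfit between a measured and a synthetic dataset of equal length."""
    if len(measured) != len(simulated):
        raise ValueError("datasets must have the same number of bins")
    return sum((m - s) ** 2 for m, s in zip(measured, simulated))


def grid_search(forward, measured, candidates):
    """Return the candidate parameter value with the smallest misfit."""
    return min(
        candidates,
        key=lambda p: sum_of_squared_residuals(measured, forward(p)),
    )


if __name__ == "__main__":
    # Hypothetical toy forward model: data scale linearly with the parameter.
    toy_forward = lambda p: [p * x for x in (1.0, 2.0, 3.0)]
    print(grid_search(toy_forward, [2.0, 4.0, 6.0], [1.0, 1.5, 2.0, 2.5]))
```

A scan of this kind only works for very few parameters; it is shown here to fix ideas about what "minimising the misfit" means before the probabilistic treatment is introduced.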

The reconstruction of material parameters from muon flux data has already been performed in a variety of ways, and different methods and codes have already been published. Bonechi et al. (2015), for example, used a back-projection method such that size and location of underground objects can be determined. Jourde et al. (2015), on the other hand, describe the resolving kernel approach, where they show how muon flux data and gravimetric data can be combined to improve the resolution of the finally imaged 3-D density structure. This is a useful approach especially in the planning stages of an experiment. Barnoud et al. (2019) provide a perspective on how muon flux data and gravimetric data can be jointly inverted in a Bayesian framework, whereas Lelièvre et al. (2019) investigate different methods of joining these datasets using unstructured grids.

Existing frameworks that are especially used in physics communities are GEANT4 (Agostinelli et al., 2003) or MUSIC (Kudryavtsev, 2009). These are Monte Carlo simulators that excel at modelling how a particle (e.g. a muon) interacts with matter and propagates in space and time. The Monte Carlo aspect describes the fact that many particles are simulated to get a statistically viable distribution of different particle trajectories. This might be a very time-consuming process, as for each material distribution such a Monte Carlo simulation has to be performed, and only a fraction of all simulated muons actually hit the detector. In order to speed this calculation up, Niess et al. (2018) devised a backward Monte Carlo, which only simulates that portion of the muons that are actually observed.

In their area of use, the above-mentioned sources prove very valuable.
Unfortunately, these tools and approaches have in common that a rather good
understanding of either inversion, nuclear physics processes or programming
is required. Even more so, if one wants to tackle our problem of interface
detection, coupled with density inversion, then various parts of the above
codes need to be linked together. The construction of a specialised
programme is then a time-consuming process, as (a) programmes might not be
freely available, (b) different codes might be written in different and
specialised programming languages (such as C

We thus see the need for a versatile, user-friendly simulator, which allows users not only to quickly perform the necessary calculations without the need for additional coding but also to tailor the individual models to custom needs. A new simulator can be more useful if an inversion functionality is already included. As can be seen in Fig. 2, the inversion compares the simulated flux data with the measured data. This problem is solved by finding the set of parameters (material density and the thicknesses of the overlying materials) that adhere to the constraints of the available a priori information and minimise the aforementioned discrepancy between measurement and simulation. This results in a density or structural rock model which best reproduces the measured data. As the energy loss equation in general is nonlinear, the mathematical optimisation in muon tomography is also nonlinear. This is classically solved by either a linearisation of the physical equations or by employing nonlinear solvers. A further difficulty is introduced when working in 3-D. Monte Carlo techniques are, however, versatile enough to tackle these challenges, which is our main motivation for working with them. This circumstance encourages us to work with a lightweight version of a muon transport simulator, because a nonlinear inversion of Monte Carlo simulations, although mathematically preferable, is computationally prohibitive. This allows us to make use of methods from the Bayesian realm that thrive when measurements from different sources (i.e. muon flux measurement, laboratory, geological field measurements, maps, etc.) have to be combined into a single comprehensive model. With the code presented in this paper, we aspire to make muon tomography accessible to a broader geoscientific community, as the know-how in this field is mainly concentrated in particle physics laboratories. We want to provide the tools for Earth scientists, or users that are mainly focused on the application of the method, so that they can perform their own analyses.

In this contribution, we present our new code, SMAUG (Simulation for Muons and their Applications UnderGround), which allows a broader scientific community to plan and analyse muon tomographic experiments more easily, by providing them with data analysis and inversion tools. Specifically, we describe the governing equations of the physical models and the mathematical techniques that were used. Section 2 describes the submodules of the muon flux simulation and how they act together to perform such a simulation. Section 3 then dives into the inversion module and explains how the parameters of the inferred density/rock model can be estimated based on measured data. This section includes a description of the model and data errors and an explanation of how a subsurface material boundary can be constructed. Section 4 discusses the model's performance based on the data that we collected in the framework of an earlier experimental campaign (see the supplement of Nishiyama et al., 2017). Section 5 then concludes this study by outlining how this code can be developed further to fit the needs of the muon tomography and geology community.

In order to provide the reader with quick access to information about the large number of variables that are used in this work, we refer to Table 1, where all the parameters are listed and explained.

List of variables used in this work. Parameters are grouped into sections where they are introduced first. Parameters that are sought for muon tomography (in our case and also in general) are highlighted in bold font.


In geophysical communities, this part is generally known as the forward model, i.e. a mathematical model which calculates synthetic data for given “model” parameters. In muon tomography experiments, this forward model consists of different physical models which are serially connected.

In geophysical problems, there are many ways to parameterise a given problem.
One frequently used approach is to partition the space into voxels (i.e.
volume pixels of the same size) and describe them by the material parameters
only. This has the advantage of imposing a fixed geometry (i.e.
“thickness” does not enter as a parameter). Unfortunately, the vast number
of parameters to be determined requires very good data coverage of the
voxelised region. Another drawback might be the use of smoothing techniques
(such that neighbouring voxels are forced to yield similar material
parameters) that might blur any sharp interface. Because of these reasons,
we employed in our code another parameterisation that mimics the actual
measurement process. We refer the reader again to Fig. 1a, where two
different cones (

The nature of the data used in muon tomography generally consists of several
counts within a directional bin, defined by two polar and two azimuthal
angles. Additionally, the measurement is taken over a defined period of
time, as well as over a given extent within the detector area. The simulated
number of muons, in the

One final tweak can be made to render Eq. (2) more accessible for future
uses. We may rewrite the right-hand side of Eq. (2) in terms of a flux:

Since muons permanently lose energy when travelling through matter, they
also need a certain amount of energy to enter the detector. If the detector
is now positioned underground, the muons have to traverse more matter to
reach the detector and consequently need a higher initial energy to reach
the target. For the goal of studying the interactions between particles and
matter, physicists regularly use energy loss models. We base our
calculations in large parts on the equations of Groom et al. (2001), where
the energy loss of a muon along its path is described by an ordinary
differential equation of first order,

Because Eq. (8) describes the energy loss in response to the interaction
with a single-element material, certain modifications have to be made to
make it also valid for rocks, which in this context represent a mixture of
minerals and elements. In this case, the modified equation takes an
equivalent form to Eq. (8) when replacing

By applying a change of variables to Eq. (11), i.e.

Equation (13) shows that for the calculation of the cut-off energy two types
of material parameters are required, which are the material density

In addition to the above explained physical models, we may also utilise available spatial data for our purposes. In this context, the use of a DEM of the surface allows the visualisation of the position of the detectors relative to the surface, as well as the spatial extent of the bins. Additionally, it allows us to determine the location where these bins intersect with the topographic surface. As a first deliverable, we can draw conclusions on which bins consist of how many parameters. For example, if we know that the detector is located underground and that there is ice at the surface, we can already infer the existence of at least two materials (rock and ice). For this purpose, we wrote the script “modelbuilder.py”, which allows the user to attach geographic and physical information to the selected bins. This process of building a coherent geophysical model is needed for the subsequent employment of the inversion algorithm to process all the data.

As stated in the Introduction, we solve the inversion by using Bayesian methods. An explanation is needed as to why we chose this way and not another. First, the equations in Sect. 2 enable us to calculate a synthetic dataset for fixed parameter values. There, one can see that the governing equations constitute a nonlinear relationship between parameter values and measured data. Despite this being of no particular interest in the forward model, the estimation of the parameters from measured data is rendered more complicated. Among muon tomographers, linearised versions have been extensively used with deterministic approaches (e.g. Nishiyama et al., 2014; Rosas-Carbajal et al., 2017), which are successfully applicable when the density or the intersection boundaries are the only variables. When deterministic approaches are viable, they efficiently produce good results. Descent algorithms or, generally speaking, locally optimising algorithms, offer a valid alternative, as they could cope with the nonlinearity of the forward model, while including all desired parameters. One difficulty of such algorithms is that in the case of non-unique solutions (which occur when there are local minima that might be a solution to the optimisation) the user has no constraints to infer if a local or the desired global minimum has been reached. A further problem of descent methods is the calculation of the derivatives of the forward model with respect to the parameter values. The analytical calculation of the derivatives is enormously tedious because the cut-off energy results from a numerical solver of a differential equation, as can be seen in Eqs. (12) and (13). Unfortunately, numerical derivatives do not produce better results, because they might easily produce artefacts, which are hard to track down. This is especially true if the derivative has to be taken from a numerical result, which is always slightly noisy. 
In that case, the differentiation amplifies the “noise”, resulting in unreliable gradient estimates. A good overview of deterministic inversion methods can be found in Tarantola (2005).

The reasons stated above and our goal to include as much information on the parameters as possible nudge us towards employing probabilistic methods. Those approaches are also known as Bayesian methods. The main feature that distinguishes them from the deterministic methods described above is the consistent formulation of the equations and additional information in a probabilistic way, i.e. as probability density functions (PDFs). This allows us to (i) incorporate, for example, density values that were measured in the lab (including their errors), (ii) set bounds on the location of the material interface, or (iii) define a plausible range for the composition of the rock. All these changes act on the PDF of the respective parameter and naturally integrate into the Bayesian inversion. We have to add that Bayesian methods do not solve the non-uniqueness problem, but they provide the user with enough information to spot these local solutions of the optimisation. Readers may find the book of Tarantola (2005) a useful resource for the explanation and illustration of probabilistic inversion. Several studies in the muon tomography community have already employed such methods with success (e.g. Lesparre et al., 2012; Barnoud et al., 2019).

The flexibility of being able to include as much information on the parameters as we consider useful comes at the price of having to solve the inversion in a probabilistic way. This can be done either by using Bayes' theorem and solving for the PDFs of the parameters of interest or, if the analytical way is not possible, by employing Monte Carlo techniques. As the presence of a numerical solver renders the analytical solution impossible, we resort to the Monte Carlo approaches. In the following sections, we guide the reader through the various stages of how such a probabilistic model can be set up, how probabilities may be assigned, and how the inversion can finally be solved.

The starting point for a probabilistic formulation is denoted by the
equations that were elaborated on in Sect. 2. These deterministic equations
need to be upgraded into a probabilistic framework, where their attributed
model and/or parameter uncertainties are inherently described. In the
following paragraphs, we describe how each model component can be expressed
by a PDF before the entire model is composed at the end of this section. The
model is best visualised by a directed acyclic graph (DAG) (see
Kjaerulff and Madsen, 2008) that depicts which variables enter the
calculation at what point. For our muon tomography experiment, this is
visualised in Fig. 3. In the following, the PDFs are denoted with the bold
Greek letter

Directed acyclic graph (DAG) for the problem of muon tomography.
Variables in a square (

The data in muon tomography experiments are usually count data, i.e. a
certain number of measured tracks within a directional bin, which have been
collected over a certain exposure time and detector area. As the measured
number of muons is always an integer, we may model such data by a Poisson
distribution:
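A minimal sketch of such a Poisson data model is given here, assuming that the expected number of muons in a bin has already been computed by the forward model. The function name and arguments are hypothetical; the logarithm of the probability is returned because it is the quantity that a sampler typically works with.

```python
import math


def poisson_log_pmf(n_observed, expected):
    """Log-probability of observing n_observed counts for a given expected count."""
    if expected <= 0.0:
        raise ValueError("the expected number of muons must be positive")
    # log( expected^n * exp(-expected) / n! ), with lgamma(n + 1) = log(n!)
    return (
        n_observed * math.log(expected)
        - expected
        - math.lgamma(n_observed + 1)
    )


if __name__ == "__main__":
    # The log-probability peaks where the expected count matches the observation.
    for expected in (80.0, 100.0, 120.0):
        print(expected, poisson_log_pmf(100, expected))
```

Using `math.lgamma` avoids the overflow that an explicit factorial would cause for bins with many recorded muons.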

The next step is to set up a probabilistic model for the muon flux. First,
we observe that “flux” is a purely positive parameter, i.e.

The energy loss model has multiple sources of errors that have to be taken
into account. Most notably, the relative errors on the different physical
energy loss processes are given by Groom et al. (2001) as

The calculated energy loss depends also on the material parameters and
subsequently on their uncertainties. However, these will be explained in
detail in Sect. 3.1.4. A last uncertainty enters by the numerical solution
of the ordinary differential equation, Eq. (12). We decided not to model
this error, as its magnitude is directly controlled by the user (by setting
a small enough step length in the Runge–Kutta algorithm) and thus can be
made arbitrarily small. Lastly, we assume that all the errors in the energy
loss model are explained by uncertainties in the energy loss terms as well
as in the material parameters. Although this assumption is rather strong,
since it excludes the possibility of a wrong model, we argue that this
approach works as long as the variation in these parameters can explain the
variation in the calculated cut-off energy. If this requirement is met, we
may model the PDF for the energy loss model as a delta function,

The density model can take different forms of probability densities (see
Appendix B1), such as normal, log normal, uniform, etc. For either form, it
is possible to describe it by a generic function

The situation for the thicknesses of the segments,

With the help of the DAG, introduced in Fig. 3, it is now straightforward to
factorise the joint probability distribution for the whole problem, as their
structure is equal. This results in

Equation (30) depicts the full joint PDF. However, the relations between the parameters, as shown by the DAG (see Fig. 3), classify this model as a hierarchical model (Betancourt and Girolami, 2013). The key characteristic of such models is their tree-like parameter structure; i.e. the measured number of muons is related to the thickness or the density of the material by the flux parameter only, which “relays” the information. A central problem of such models is the presence of a hierarchical “funnel” (see Figs. 2 and 3 of Betancourt and Girolami, 2013), which renders it very difficult for standard Monte Carlo methods to adequately sample the model space. In high-dimensional parameter spaces, this problem is exacerbated even more.

Our aim to provide a simple and easy-to-use programme somewhat contradicts this necessity of a sophisticated method (which inevitably requires the user to possess a strong statistical background). As the main problem is the rising number of parameters, it should be possible to mend the joint PDF by imposing thought-out simplifications.

We first get rid of the flux parameter, as for our problem it merely is a
nuisance parameter. This is the established statistical term for a parameter in the
inversion which is of no particular interest but still has to be accounted
for. Specifically, we mean that even though the calculation of the muon flux
is important, we do not want to treat it as an explicit parameter that is
simulated by the code. To achieve this, we integrate over all possible
values of the muon flux,
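How a nuisance parameter is integrated out can be illustrated numerically. The sketch below is not the marginalisation derived in this work: it assumes, for illustration only, a Poisson data model and a log-normal PDF for the flux (chosen because the flux is strictly positive), and it evaluates the integral with a simple trapezoid rule in the standardised logarithm of the flux. All names and the `exposure` factor (detector area times solid angle times time) are hypothetical.

```python
import math


def poisson_pmf(n_observed, expected):
    """Poisson probability of n_observed counts for a positive expected count."""
    return math.exp(
        n_observed * math.log(expected)
        - expected
        - math.lgamma(n_observed + 1)
    )


def marginal_likelihood(n_observed, exposure, log_flux_mean, log_flux_sigma,
                        n_points=801, z_max=8.0):
    """Integrate Poisson(n | flux * exposure) over an assumed log-normal flux PDF."""
    step = 2.0 * z_max / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        z = -z_max + i * step  # standardised log-flux
        flux = math.exp(log_flux_mean + log_flux_sigma * z)
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        weight = 0.5 if i in (0, n_points - 1) else 1.0  # trapezoid end points
        total += weight * poisson_pmf(n_observed, flux * exposure) * density * step
    return total


if __name__ == "__main__":
    # Hypothetical numbers: median flux 2e-5, 30 % log-scale uncertainty.
    print(marginal_likelihood(20, 1.0e6, math.log(2.0e-5), 0.3))
```

After this integration the flux no longer appears as a sampled parameter; its uncertainty is instead absorbed into a broader distribution for the observed counts.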

This marginalisation roughly halves the number of parameters, but there is
still another simplification, which we may use. Many muon tomography
applications deal with a two-material problem, while there may also be
measurement directions where only one material is present. If we
conceptually split those two problems and solve them independently, it is
possible to further reduce the number of simultaneously modelled parameters.
In the study of Nishiyama et al. (2017), the results of which we will use
later, these two cases encompass bins where we measured only rock and others
where we know there is ice and rock. The joint PDF for rock bins
subsequently is

For the second problem, we can interpret

Usually in Bayesian inference, the goal is to calculate the posterior PDF,
given the measured data, i.e. the quantity

The basic MCMC algorithm, which we also use in this study, is the Metropolis–Hastings (MH) algorithm (Hastings, 1970; Metropolis et al., 1953), which allows for the sampling of the joint PDF to obtain a quantitative sample. We note, however, that many different MCMC algorithms exist for various purposes and that the MH algorithm has no special status except for being comparatively simple to use and implement. An example of another MCMC algorithm in muon tomography can be found in Lesparre et al. (2017). The authors used a simulated annealing technique on the posterior PDF in order to extract the maximum a posteriori (MAP) model. As every simulated annealing algorithm has some type of MH algorithm at its core, we directly use the MH algorithm in its original form such that we retrieve not only a point estimate but also a PDF for the posterior parameter distribution. The algorithm is explained in detail by Gelman et al. (2013), such that we only provide a short pseudo-code description.

Draw a starting model,

Until convergence:

Propose a new model according to

Evaluate log-PDF value of

Evaluate the acceptance probability,

If

else: sample
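The steps listed above can be illustrated with a minimal sketch, assuming a one-dimensional parameter and a symmetric Gaussian proposal distribution. The function names, the target density and all numerical values are hypothetical and chosen for illustration only; they are not taken from the SMAUG code base.

```python
import math
import random


def metropolis_hastings(log_pdf, start, step_size, n_draws, seed=0):
    """Random-walk Metropolis-Hastings sampler for a scalar parameter.

    log_pdf   : function returning the log of the (unnormalised) target PDF
    start     : starting model (hypothetical scalar parameter)
    step_size : standard deviation of the Gaussian proposal (assumption)
    """
    rng = random.Random(seed)
    current = start
    current_logp = log_pdf(current)
    chain = []
    n_accepted = 0
    for _ in range(n_draws):
        # Propose a new model from a symmetric Gaussian centred on the current one.
        proposal = current + rng.gauss(0.0, step_size)
        proposal_logp = log_pdf(proposal)
        # Log acceptance ratio; the symmetric proposal density cancels out.
        log_alpha = proposal_logp - current_logp
        # Accept directly if the ratio is >= 1, otherwise with probability exp(log_alpha).
        if log_alpha >= 0.0 or rng.random() < math.exp(log_alpha):
            current, current_logp = proposal, proposal_logp
            n_accepted += 1
        # A rejected proposal repeats the current model in the chain.
        chain.append(current)
    return chain, n_accepted / n_draws


def log_normal_pdf(x, mean=2.7, std=0.1):
    """Illustrative target: log-density of a normal PDF (made-up values)."""
    return -0.5 * ((x - mean) / std) ** 2


chain, acceptance_rate = metropolis_hastings(
    log_normal_pdf, start=2.0, step_size=0.2, n_draws=5000
)
```

The returned acceptance rate gives a quick indication of whether the proposal step is too large (few accepted proposals) or too small (nearly all proposals accepted, but slow movement through the model space).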

The advantage of this algorithm, compared to a “normal” sampling, lies in its efficiency. It is often not possible, or even reasonable, to probe the whole model space, as the largest part of the model space is “empty”, where the PDF value of the posterior is uninterestingly small. The fact that regions of high probability are scarce, and this becomes worse in high-dimensional model spaces, is known as the “curse of dimensionality” (Bellman, 2016). MCMC algorithms (including the MH algorithm presented here) allow one to focus on regions of high probability, and therefore we are able to construct a reliable and representative sample of the posterior PDF. We again refer to Gelman et al. (2013) for a discussion of why the MH algorithm converges to the correct distribution and why we may use samples that were gained this way to estimate the posterior probability density.

The above-stated advantages, however, come at a price. First and foremost,
we must ensure that the algorithm advances fast enough, but not too fast,
through the model space. This is mainly controlled by the proposal
distribution

A second crucial point is the presence of a warm-up period. The starting point, which usually lies in a region of high prior probability, does not necessarily lie in a region of high posterior probability. The time it takes to move from the former to the latter is exactly this warm-up. This can usually be visualised by a trace plot, e.g. Fig. 4, in which the value of a parameter is plotted against the number of iterations. After this warm-up phase, the algorithm can be run in operational mode and “true” samples can be collected.

Example of a trace plot (two independent chains; blue and orange) of
a MH run with 500 draws. This plot shows the parameter value (here material
density) vs. number of steps of a collection of cones in which we (Nishiyama
et al., 2017) knew that only rock is present. This is a calculation that is
included in the code base. The warm-up phase of this MH algorithm takes
roughly 150 simulations indicated by the subsequent oscillation around a
parameter value of

Because each sample in a Markov chain depends on the previous one, we need a criterion to argue that the samples created in that way really represent “independent” samples. Qualitatively, we may say that the faster the Markov chain forgets its past samples, the sooner we may treat them as independent from each other. Gelman et al. (2013) suggest that in order to assess this quantitatively, multiple MH chains could be run in parallel, and statistical quantities within and between the chains are analysed. For a detailed discussion thereof, we refer the reader to Appendix C.
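As an illustration of such a within- and between-chain comparison, the following sketch computes a split-chain potential scale reduction factor for a single scalar parameter, following the general recipe of Gelman et al. (2013). The function name and the synthetic chains are hypothetical; values close to 1 suggest that the chains have mixed, whereas values clearly above 1 indicate that they have not.

```python
import random
import statistics


def potential_scale_reduction(chains):
    """Split-chain potential scale reduction factor for one scalar parameter.

    chains : list of equally long lists of draws (warm-up already removed).
    """
    # Split every chain into two halves so that drift within a chain is detected too.
    half = len(chains[0]) // 2
    split = []
    for chain in chains:
        split.append(chain[:half])
        split.append(chain[half:2 * half])
    n = half
    chain_means = [statistics.fmean(c) for c in split]
    # Between-chain variance B and mean within-chain variance W.
    between = n * statistics.variance(chain_means)
    within = statistics.fmean([statistics.variance(c) for c in split])
    # Pooled estimate of the marginal posterior variance.
    var_plus = (n - 1) / n * within + between / n
    return (var_plus / within) ** 0.5


# Synthetic example chains (made-up values, not SMAUG output).
rng = random.Random(1)
mixed = [[rng.gauss(2.7, 0.1) for _ in range(500)] for _ in range(2)]
stuck = [
    [rng.gauss(2.7, 0.1) for _ in range(500)],
    [rng.gauss(3.2, 0.1) for _ in range(500)],
]
r_hat_mixed = potential_scale_reduction(mixed)
r_hat_stuck = potential_scale_reduction(stuck)
```

In this synthetic example, the two chains drawn from the same distribution yield a factor near 1, while the two chains centred on different values yield a factor well above 1.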

Once a satisfactory number of samples has been drawn from the posterior PDF, the nuisance parameters can be marginalised out by looking at the parameters of interest only. These samples may then be treated like counts in a histogram, i.e. distributional estimates, or simply summarised by the statistical moments of interest, such as mean and variance.
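A minimal sketch of this sample-based marginalisation is given below. It assumes that each draw is stored as a dictionary holding one parameter of interest and one nuisance parameter; the key names and all values are hypothetical and do not reflect the actual SMAUG file layout.

```python
import statistics

# Hypothetical full-chain output (made-up values): one parameter of interest
# ("thickness_m") and one nuisance parameter ("rock_density") per draw.
draws = [
    {"thickness_m": 41.2, "rock_density": 2.68},
    {"thickness_m": 39.8, "rock_density": 2.71},
    {"thickness_m": 40.5, "rock_density": 2.66},
    {"thickness_m": 42.1, "rock_density": 2.70},
    {"thickness_m": 40.9, "rock_density": 2.69},
    {"thickness_m": 41.5, "rock_density": 2.67},
]


def summarise(samples, name, n_bins=5):
    """Mean, variance and histogram counts of one parameter of the chain."""
    # Reading only one key ignores all other parameters, which marginalises them out.
    values = [s[name] for s in samples]
    mean = statistics.fmean(values)
    variance = statistics.variance(values)
    lo, hi = min(values), max(values)
    width = ((hi - lo) / n_bins) or 1.0  # guard against identical values
    counts = [0] * n_bins
    for v in values:
        # The maximum value is assigned to the last bin.
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return mean, variance, counts


mean_thickness, var_thickness, hist_counts = summarise(draws, "thickness_m")
```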

The main analysis programme allows us to export all parameters either as a
full-chain dataset, where every single draw is recorded, or as a statistical
summary (i.e. mean and variance). Both are then converted to point data,
i.e. (

The “modelviewer.py” routine is able to read datasets from different
detectors (which are saved as JSON files) and computes for each cone the
statistic, which the user is interested in (see “sigma” entry in programme).
Thus, it is possible to use the mean or, for example, the

Two-dimensional stencil, used to summarise the bilinear
interpolation of interface positions within cones
(

As a second step, the programme interpolates this point cloud in a bilinear
way to a rectangular grid with a user specified cell size,

We could also have fitted a surface through the resulting point cloud.
However, by formulating this surface as a matrix, we gain access to the whole
machinery of linear algebra. Moreover,

In order to calculate the height at a grid point,

The concept of damping usually revolves around the idea of forcing parameters towards a certain value (e.g. in deterministic inversion by introducing a penalty term in the misfit function for deviations from that value). From a Bayesian viewpoint, this would be accomplished by setting the prior mean to a specific value. In our code, we implemented this idea by allowing the user to supply a DEM and a “damping weight” to the code (see “fixed length group” in code). The programme effectively computes a weighted average between the bedrock positions within the cones and a user-defined DEM. The higher the chosen damping weight, the more closely the resulting interface will match the DEM where pixels overlap.
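The weighted average described above can be sketched as follows. This is one plausible form only: it assumes that every cone-derived height in a pixel carries unit weight and that the DEM height carries the damping weight. The function name is hypothetical, and the exact weighting used in the SMAUG code may differ.

```python
def damp_height(cone_heights, dem_height, damping_weight):
    """Weighted average of cone-derived interface heights and a DEM height.

    cone_heights   : interface heights (m) of the cones falling into one pixel
    dem_height     : DEM height (m) of the same pixel
    damping_weight : weight of the DEM relative to a single cone (assumption)
    """
    if not cone_heights:
        # Without muon data, the pixel falls back to the DEM.
        return dem_height
    total_weight = len(cone_heights) + damping_weight
    return (sum(cone_heights) + damping_weight * dem_height) / total_weight


undamped = damp_height([100.0, 102.0], dem_height=110.0, damping_weight=0.0)
damped = damp_height([100.0, 102.0], dem_height=110.0, damping_weight=8.0)
```

With a weight of zero, the result is the plain mean of the cone heights (101 m in this made-up example); with a weight of 8, the result (108.2 m) is pulled strongly towards the DEM value of 110 m.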

The matrix formulation also enables us to use a further data processing
technique without much tinkering. As geophysical data are often quite noisy,
a standard procedure in nearly every geophysical inversion is a smoothing
constraint. This effectively introduces a correlation between the parameters
and forces them to be similar to each other. From a Bayesian perspective, we
could have achieved this correlation by defining a prior covariance matrix
of the thickness parameters, such that neighbouring cones should have
similar thicknesses (which makes sense as we do expect the bedrock–ice
interface to be relatively continuous; Fig. 1). As we work with independent
cones in this study, we leave the exploration of this aspect open for a
future study. Nevertheless, we offer the possibility in our code to use a
smoothing on the final interpolated grid. This is achieved by a convolution
of a smoothing kernel,

Finally, we added a checkbox to our code that allows the user to change the order of the damping and smoothing operations. When a strong damping is necessary, this may result in rather unsmooth features at DEM boundaries, such that it makes sense to perform the smoothing only afterwards.
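The smoothing of the interpolated grid can be illustrated with a short sketch. It assumes a uniform (box) kernel that extends a user-defined number of pixels to each side and that is renormalised at the grid edges and next to empty cells. The kernel shape and the function name are assumptions made for illustration, not a description of the kernel implemented in SMAUG.

```python
def smooth_grid(grid, n_pixels):
    """Box-kernel smoothing of a 2-D grid given as a list of rows.

    Cells without data are marked by None and stay empty.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    smoothed = [[None] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            if grid[i][j] is None:
                continue
            # Collect all valid cells within n_pixels to each side; cells outside
            # the grid or without data are skipped, which renormalises the kernel.
            window = [
                grid[k][m]
                for k in range(max(0, i - n_pixels), min(n_rows, i + n_pixels + 1))
                for m in range(max(0, j - n_pixels), min(n_cols, j + n_pixels + 1))
                if grid[k][m] is not None
            ]
            smoothed[i][j] = sum(window) / len(window)
    return smoothed


# Made-up example: a single spike on an otherwise flat 3 x 3 grid.
spike = [[0.0, 0.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 0.0]]
smoothed_spike = smooth_grid(spike, n_pixels=1)
```

The example shows how a sharp feature is spread over its neighbours, which is the same mechanism that rounds off abrupt changes in slope of a reconstructed interface.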

In this part, we test the presented reconstruction algorithm on previously published data. For this purpose, we compare our calculations to the ones already published in the study by Nishiyama et al. (2017), where the goal was to measure the interface between the glacier and the rock, in order to determine the spatial distribution of the rock surface (also below the glacier). This study was conducted in the central Swiss Alps in a railway tunnel that featured a glacier (part of the Great Aletsch Glacier) above. A situation sketch is shown in Fig. 1. For a detailed verification of the energy loss calculations, we refer the reader to Appendix E.

The results shown below (Figs. 6–8) represent the bedrock–ice interface
interpolated to an 8 m grid, which was first damped (weight 8) and then
smoothed (two grid pixels, i.e.

Western cross section. The brown and dashed blue lines indicate the
ice–bedrock interface solutions of this study and the one from Nishiyama et
al. (2017), respectively. The

Central cross section. The brown and dashed blue lines indicate the
ice–bedrock interface solutions of this study and the one from Nishiyama et
al. (2017), respectively. The

Eastern cross section. The brown and dashed blue lines indicate the
ice–bedrock interface solutions of this study and the one from Nishiyama et
al. (2017), respectively. The

Figure 6 shows the western profile, where our bedrock–ice interface and the one from the previous study agree well and both lie within the given error margins. The lack of fit in areas where the steepness changes rapidly (i.e. around 40 and 80 m) can be explained as a smoothing artefact. Towards the end of the profile, the decreasing data coverage becomes evident as the uncertainties rise. This effect can also be seen in the jagged behaviour of the interface curves around 100 to 120 m, which hints that the interpolation there relied on few data.

Figure 7 presents the central profile. Similar to the western profile (Fig. 6),
the fits match quite well and are within the error margins. The point
where the actual bedrock begins may lie further
down (i.e.

The eastern profile is shown in Fig. 8. One sees that the results from this
study are internally consistent. The surface from the previous study plunges
down earlier with respect to the surfaces calculated here. This may in fact
be a damping effect, as the bedrock–ice interface from Nishiyama et al. (2017)
has not been constrained to the bedrock (via damping) and thus
plunges down before the damping mark at

In all three results (Figs. 6, 7 and 8), it can be seen that the reconstructed surfaces in the bedrock region are following the DEM within 5–10 m. The reason for this deviation may be explained by the smoothing. At the beginning of this section, we explained that the reconstructed interface has been smoothed by two grid pixels, which corresponds on an 8 m grid to a smoothing of 16 m to each side. This is also valid for the direction perpendicular to the cross sections shown in Figs. 6–8. Thus, the over-/underestimation can very well be a smoothing effect. The behaviour of the reconstruction in the western cross section (Fig. 6) further supports this explanation, as we see at 40 m an underestimation and at 80 m an overestimation of the height. This is a typical behaviour of smoothing around “sharper” edges.

The over-/underestimation might also be due to heterogeneities of the rock density caused by uneven fracturing and/or weathering. However, during our fieldwork (see Mair et al., 2018), when we also inspected the train tunnel from within, we did not see any signs of such a heterogeneous behaviour. These observations are still only superficial. For an in-depth study of this effect, one would need, for example, a much longer muon flux exposure, such that the density of the rock could be better resolved. Alternatively, a borehole or another geophysical study could be performed. As we are not in possession of such information, we will not draw a definite conclusion here. Nevertheless, the whole workflow shown in this study produces results which are similar to the ones published in the previous study (Nishiyama et al., 2017). We use the results of this comparison to validate the base of our code.

In this study, we have presented an inversion scheme that allows us to integrate geological information into a muon tomography framework. The inherent problem of parameter estimation has been formulated in a probabilistic way and solved accordingly. The propagation of uncertainties thus occurs automatically within this formalism, providing uncertainty estimates on all parameters of interest. We also considered approaches including DAGs or the simplex subspace of compositions which could be helpful to the muon tomography community while tackling their own research. We condensed these approaches in a modular toolbox. This assortment of Python programmes allows the user to address the subproblems during the data analysis of a muon tomography experiment. The programmes are modular in the sense that the user can always access the intermediate results, as the files are mostly in a portable format (JSON). Thus, it is perfectly possible to use only one submodule of the toolbox while working with one's own codebase. As every “tool” is embedded in a GUI, the programme is made accessible without the need to first read and consider several thousand code lines. Furthermore, we have shown that the results we obtain with our code are largely in good agreement with an earlier, already published experiment. The small deviations may be attributed to data analysis subtleties.

In its current state, SMAUG may be of help to researchers who (a) plan to use muon tomography in their own research, such that the feasibility of the use of this technology can be evaluated in a virtual experiment, (b) want to use a submodule for the analysis of their own muon tomography data or (c) plan to perform a subsurface interface reconstruction similar to our study. We would like to stress that this work is merely a foundation upon which many extensions can be built when it is used in other applications as well. Future content might, for example, include a realistic treatment of multiple scattering and the inclusion of compositional uncertainties in the inversion, for which we laid out the basis in this study.

Like many empirical muon flux models, the one that we employed has at its core an
energy spectrum for vertically incident muons at sea level. An
accepted instance is the energy spectrum of Bugaev et al. (1998) that takes
the form

The density distribution of a lithology can be determined through various methods. In our work, we estimated the density of the lithology by analysing various rock samples from our study area in the laboratory. Two experimental setups were employed to gain insight into the grain, skeletal as well as the bulk density of the rocks. Grain and skeletal density were measured by means of the AccuPyc 1340 He pycnometer, which is a standardised method that yields information on the volume. Bulk density values were then determined based on Archimedes' principle, where paraffin-coated samples were suspended into water (ASTM C914-09, 2015; Blake and Hartge, 1986).

Every sample

Example output of “subsample_analysis.py” for a bulk density measurement of subsample JT-27-1 (see the Supplement for data). Green bars represent the histogram of 10 000 Monte Carlo simulation draws. The orange curve indicates the fitted normal probability density function.

Example output of “materializer.py”. Here, a set of subsample mean values (red crosses) is processed in a kernel density estimate (solid blue line). Finally, a normal distribution is fitted to the kernel density estimate (dashed yellow line).

We have found 10 000 draws per subsample to be sufficient to retrieve a solid final distribution. However, this parameter can easily be changed in the script, depending on the user's preference of precision and/or speed. From this point onwards, we may work with a Gaussian distribution as Fig. B1 assures us that a normal PDF describes the results of the Monte Carlo simulation rather well.
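The Monte Carlo propagation of measurement uncertainties into a bulk density distribution can be sketched as follows. The sketch assumes the usual Archimedes relation for paraffin-coated samples, Gaussian errors on the three mass readings only, and fixed densities for water and paraffin. All names and numerical values are hypothetical and are not taken from “subsample_analysis.py” or from the measured data.

```python
import random
import statistics

RHO_WATER = 0.998    # g/cm^3, assumed density of water near 20 degrees Celsius
RHO_PARAFFIN = 0.90  # g/cm^3, assumed density of the paraffin coating


def bulk_density(m_dry, m_coated, m_submerged):
    """Bulk density (g/cm^3) of a paraffin-coated sample from three masses (g)."""
    v_coated = (m_coated - m_submerged) / RHO_WATER  # volume of sample plus coating
    v_paraffin = (m_coated - m_dry) / RHO_PARAFFIN   # volume of the coating alone
    return m_dry / (v_coated - v_paraffin)


def simulate_bulk_density(masses, sigma_mass, n_draws=10_000, seed=0):
    """Propagate Gaussian mass errors and fit a normal PDF by its moments."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # Perturb every mass reading with the assumed measurement error.
        noisy = [rng.gauss(m, sigma_mass) for m in masses]
        draws.append(bulk_density(*noisy))
    return statistics.fmean(draws), statistics.stdev(draws)


# Made-up mass readings in grams: dry, paraffin-coated, coated and submerged.
mean_rho, std_rho = simulate_bulk_density((100.0, 102.0, 61.0), sigma_mass=0.01)
```

The number of draws is exposed as a parameter so that precision can be traded against run time.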

The determination of the grain and skeletal densities is simpler than the
bulk density measurements because the corresponding method consists of a
mass and a volume measurement, respectively. The density formula reads then
simply

The kernel density estimation has the advantage that only the mean values of the subsamples have to be processed as the bandwidth is determined from the spread of the subsample means. Following the methodology of Vermeesch (2012), we end up with a PDF like the one visualised in Fig. B2. We could at this point use the kernel density estimated (KDE) PDF for further calculations. However, for simplicity we approximate the KDE with a normal distribution and intend to add support for the KDE in a later code version.
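To illustrate the idea, the following sketch builds a fixed-bandwidth Gaussian KDE from a handful of subsample means and approximates it by a normal distribution with the same mean and variance. Silverman's rule of thumb is used here as a stand-in for the bandwidth selection and moment matching as a stand-in for the fit; both are assumptions, and the subsample means are made-up values.

```python
import math
import statistics


def gaussian_kde(means, bandwidth=None):
    """Return a Gaussian KDE (as a function) and the bandwidth that was used."""
    n = len(means)
    if bandwidth is None:
        # Silverman's rule of thumb, derived from the spread of the means.
        bandwidth = 1.06 * statistics.stdev(means) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def pdf(x):
        # Sum of identical Gaussian kernels, one centred on each subsample mean.
        return norm * sum(math.exp(-0.5 * ((x - m) / bandwidth) ** 2) for m in means)

    return pdf, bandwidth


def normal_approximation(means, bandwidth):
    """Normal PDF parameters matching the mean and variance of the Gaussian KDE."""
    mu = statistics.fmean(means)
    # Variance of a Gaussian-kernel mixture: spread of the centres plus kernel variance.
    sigma = math.sqrt(statistics.pvariance(means) + bandwidth ** 2)
    return mu, sigma


subsample_means = [2.65, 2.68, 2.70, 2.71, 2.74]  # g/cm^3, hypothetical values
kde_pdf, kde_bandwidth = gaussian_kde(subsample_means)
mu_fit, sigma_fit = normal_approximation(subsample_means, kde_bandwidth)
```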

Visual test for multivariate normality of the log-ratio data from Table B3 (this plot shows the full dataset, of which Table B3 is only an excerpt). Each subplot checks for marginal normality. Oxygen is the denominator variable (arbitrarily chosen) and does thus not appear in the plot.

One word of warning has to be made here. The measured densities of rock might be affected by a systematic error. Namely, the rock samples that are analysed were all gathered from near-surface locations (in our case inside the tunnel or outside, i.e. where rocks are accessible). This means that they could have been subject to weathering processes that alter the density of the rock in such a way that the samples are no longer representative of the whole rock body. A possible countermeasure would be to compare drilled samples from deeper within the rock body with the surface samples.

We have seen in Eq. (12) that the material density parameter enters the
energy loss calculations rather directly. Contrariwise, the compositional
model affects the energy loss equations much more subtly through the average

Although a modal mineral analysis (i.e. the quantitative determination of mineral volumes) is preferable and can be treated according to Lechmann et al. (2018), its execution is a rather time-consuming effort. This is the reason why compositional data in muon tomography experiments predominantly consist of XRF data, which show the abundance of major oxides within the rock. We describe here a method to incorporate this type of information in a probabilistic way, thereby following Aitchison (1986). Compositional data are usually available in the form of Table B1, which presents an excerpt of four samples for illustration purposes. We refer to the Excel sheet in the Supplement of the present work for the full data.

Excerpt of XRF data for four samples. Data in column denote weight percentages of major oxides within the rock samples.

There are several challenges to these kinds of data. First, the parameters (i.e. the oxide percentages) can only take values between 0 and 1. This means that normal as well as log-normal distributions are not suitable to describe these parameters. Second, the requirement that the sum of all parameters ideally has to equal 1 poses a constraint on this parameter space, which effectively reduces the number of independent parameters by 1. Third, due to measurement uncertainties, this sum is never exactly 1.

Spaces which have this unit sum condition can be viewed as a simplex; e.g.
if we had three compositional parameters, the simplex would be a
two-dimensional surface (i.e. a subspace) in this three-dimensional parameter
space. The last issue, of not summing up exactly to 1, can be remedied by
projecting each sample dataset back to the simplex (Aitchison, 1986, pp. 257–261).
This works only if the measurement imprecisions are not too large,
which works well for the examples in Table B1. With respect to the energy
loss calculation, it is preferable to decompose the oxides into

Element weight percent data, transformed from oxide weight percent data with the use of Eq. (B5). All data have additionally been scaled to satisfy the unit sum constraint.

In order for the data to be in a statistically convenient form, Aitchison (1986) suggests further transforming the data in Table B2 by first forming a ratio with an arbitrary element (in the list) and then taking the logarithm. For the exemplary dataset, this is shown in Table B3.
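The rescaling to unit sum and the log-ratio transformation, together with its inverse, can be sketched as follows. The function names and the four-part composition are hypothetical; the transformation itself is the additive log-ratio transformation of Aitchison (1986). Note that parts equal to zero cannot be log-transformed and would need separate treatment.

```python
import math


def close_composition(parts):
    """Rescale the parts so that they sum to exactly 1."""
    total = sum(parts)
    return [p / total for p in parts]


def alr(composition, denominator_index):
    """Additive log-ratio transform with respect to one chosen part."""
    denom = composition[denominator_index]
    return [
        math.log(p / denom)
        for i, p in enumerate(composition)
        if i != denominator_index
    ]


def alr_inverse(log_ratios, denominator_index):
    """Map log ratios back to a composition that sums to 1."""
    exps = [math.exp(y) for y in log_ratios]
    total = 1.0 + sum(exps)
    composition = [e / total for e in exps]
    # The denominator part corresponds to a log ratio of zero, i.e. exp(0) = 1.
    composition.insert(denominator_index, 1.0 / total)
    return composition


# Made-up weight fractions that do not sum exactly to 1; index 2 plays the
# role of the denominator element (e.g. oxygen).
raw = [0.30, 0.08, 0.47, 0.149]
closed = close_composition(raw)
log_ratios = alr(closed, denominator_index=2)
recovered = alr_inverse(log_ratios, denominator_index=2)
```

Because the log ratios are unconstrained real numbers, random samples can be drawn in log-ratio space and mapped back with the inverse function, which always returns a valid composition.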

Log ratio of element weight percentages, with respect to oxygen wt %.

The rationale behind this transformation is as follows. The division by an
arbitrarily present element effectively transforms the space into an (

With a graph like that in Fig. B3, it is possible to check if the multivariate normal
distribution is an appropriate model to describe the elemental composition
data. For the example shown in Fig. B3, this looks acceptable, with only
slight deviations for silicon, aluminium, manganese and sodium. Once the
normality has been verified, it is possible to generate random samples from
this distribution. For every drawn sample, it is then possible to calculate
the weight percentages of the single elements by using the inverse formula
to the log-ratio transformations:

As stated in Eq. (11), the energy loss equation for rocks needs parameters
that differ from the ones for pure elements. First, the expression for
density can directly be exchanged according to the density model (see
Appendix B1). Second, it is possible to generate an expression for the
average ionisation loss within a rock by exchanging three parameters.
Density values that also enter within

This Appendix is a short summary of Gelman et al. (2013, pp. 284–287) and we
refer to these pages for a detailed discussion of the calculations. This
work presents a concept of how to assess the quality of an MCMC run. In
particular, the aforementioned authors propose to analyse two quantities,
the potential scale reduction factor

One problem that arises in MCMC algorithms is the inherent dependence of one
simulation on the one before (this is the definition of a Markov chain). One
considers that such a dependency does not introduce a bias if enough samples
are drawn. However, this also means that the effective, independent sample
size is much smaller than the number of simulations. Therefore, Gelman et al. (2013) propose
to calculate the effective number of simulation draws,

As stated in the main text, the user specifies the number of neighbouring
pixels

The energy loss model that we use in our code generally reproduces the literature values well (the relative error is generally below 1 %) across the different energy loss processes and relevant energies. In Fig. E1, we present the energy loss calculations for each energy loss process (i.e. ionisation, bremsstrahlung, pair production and photonuclear interactions) across energies from 10 MeV to 100 TeV for silicon.

The overall characteristics between the different elements are the same with minor differences regarding the position of the critical energy and the 1 % radiative point. In Fig. E2, we show the relative error of our calculations to the tabulated values from Groom et al. (2001) for the whole energy range.

We note that the energy losses by ionisation are reproduced very well over the entire energy range. We also note that the relative error on the radiative energy losses is rather large below 10 GeV. This does not, however, introduce a major bias, because below this energy, radiative energy losses are negligible compared to ionisation losses, as can be seen in Fig. E1. Furthermore, the related errors are in an acceptable range at the energy level at which radiative losses begin to become noticeable (i.e. around 100 GeV). This can be seen in Fig. E2, in the sense that the total relative error remains well bounded within 0.5 %. In the ionisation domain (i.e. below 100 GeV), the total relative error is dominated by the ionisation relative error, whereas above this energy level the relative errors on radiative losses start to prevail. A close-up of this energy range is given in Fig. E3.

There are different sources and circumstances that contribute to the error in the different energy loss processes. The scatter of the relative ionisation-loss error around 0 with a rather small deviation can be viewed as simple rounding errors. The errors on the radiative processes, however, seem to be of a more systematic nature. We explain this behaviour through a different numerical integration scheme in Eq. (10), which tends to systematically under-/overestimate the true value, especially when the integrand comprises exponential functions. Whereas we used a double exponential integration scheme (see Takahasi and Mori, 1974), the integration scheme from Groom et al. (2001) is not discernible. However, as the relative errors on the processes of energy loss remain well within the theoretical uncertainties (see Sect. 3.1.3), we consider that our calculation accurately reproduces the literature values for elements.
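For readers unfamiliar with double exponential integration, a minimal sketch of the tanh-sinh rule on a finite interval is given below. The step size, the number of nodes and the test integrand are illustrative choices; the sketch is not the integration routine of the SMAUG code and does not evaluate the integrand of Eq. (10).

```python
import math


def tanh_sinh(f, a, b, h=0.1, n_steps=35):
    """Double exponential (tanh-sinh) quadrature of f over [a, b].

    h       : step size of the trapezoidal rule in the transformed variable
    n_steps : number of nodes on each side of the centre (illustrative choice)
    """
    half_width = 0.5 * (b - a)
    midpoint = 0.5 * (a + b)
    total = 0.0
    for k in range(-n_steps, n_steps + 1):
        t = k * h
        u = 0.5 * math.pi * math.sinh(t)
        x = math.tanh(u)  # node in [-1, 1]
        # Weight: derivative of the variable transformation x(t).
        w = 0.5 * math.pi * math.cosh(t) / math.cosh(u) ** 2
        # Outermost nodes round to the interval ends, so f must be finite there.
        total += w * f(midpoint + half_width * x)
    return half_width * h * total


integral_exp = tanh_sinh(math.exp, 0.0, 1.0)      # exact value: e - 1
integral_sin = tanh_sinh(math.sin, 0.0, math.pi)  # exact value: 2
```

The variable transformation makes the weights decay double exponentially towards the interval ends, which is why a modest number of nodes suffices for smooth integrands.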

The above calculations were performed for pure silicon. The respective
figures for four other important elements in the Earth's crust (Al, Fe, Ca
and O) can be found in Appendix E2. Those elements are, however, not
representative of any real material encountered in geological applications.
For this reason, we compiled the same computations for four selected,
geologically important compounds (SiO

Log–log plot of the stopping power of the different energy loss
processes for silicon. At

Relative error of our energy loss calculations compared to the tabulated values from Groom et al. (2001) for silicon. Ionisation losses agree very well with the literature values (within 0.025 %). At low energies, the relative errors of the radiative processes are large and converge to a value close to 0 towards higher energies, resulting in a relative error on the total energy loss of around 0.5 % compared to the literature.

Relative error of our energy loss calculations for silicon compared to the tabulated values from Groom et al. (2001) at higher energies (100 GeV–100 TeV). The relative errors remain bounded within their theoretical uncertainties (see Sect. 3.1.3).

Relative error of our energy loss calculations compared to the
tabulated values from Groom et al. (2001) for standard rock
(

Relative error of our energy loss calculations compared to the
tabulated values from Groom et al. (2001) for standard rock
(

Relative error of our energy loss calculations compared to the
tabulated values from Groom et al. (2001) for aluminium in the energy
ranges

Relative error of our energy loss calculations compared to the
tabulated values from Groom et al. (2001) for calcium in the energy ranges

Relative error of our energy loss calculations compared to the
tabulated values from Groom et al. (2001) for iron in the energy ranges

Relative error of our energy loss calculations compared to the
tabulated values from Groom et al. (2001) for oxygen in the energy ranges

Relative error of our energy loss calculations compared to the
tabulated values from Groom et al. (2001) for calcium carbonate (calcite) in
the energy ranges

Relative error of our energy loss calculations compared to the
tabulated values from Groom et al. (2001) for silicon dioxide (quartz) in
the energy ranges

Relative error of our energy loss calculations compared to the
tabulated values from Groom et al. (2001) for ice in the energy ranges

Our toolbox, SMAUG, contains several subprogrammes which are executed
separately. This allows the user to inspect intermediate results without any
difficulty. We also tried to keep the intermediate results as portable as
possible by using JSON files wherever feasible. Here, we explain, in
logical order, the rationale of the submodules (a detailed user manual is
available separately):

The source code of SMAUG 1.0 is publicly and freely available at

The data of the density and XRF measurements are included (i) in the files
that can be downloaded from

The supplement related to this article is available online at:

AL, FS and AE designed the study. AL developed the code with contributions by MV, CP and RN. AL performed the numerical experiments with support by RN. DM and AL compiled geological data. AA, TA, PS, RN and CP verified the outcome of the numerical experiments. AL wrote the text with contributions from all co-authors. AL designed the figures with contributions by DM. All co-authors contributed to the discussion and finally approved the manuscript.

The contact author has declared that neither they nor their co-authors have any competing interests.

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

We thank the Swiss National Science Foundation (project no. 159299 awarded to Fritz Schlunegger and Antonio Ereditato) for their financial support of this research project. Further, we want to thank the Jungfrau Railway Company for their continuing logistic support during our fieldwork in the central Swiss Alps. Finally, we want also to thank the High-Altitude Research Stations Jungfraujoch and Gornergrat for providing us with access to their research facilities and accommodation.

This research has been supported by the Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung (grant no. 159299).

This paper was edited by Thomas Poulet and reviewed by Nolwenn Lesparre and one anonymous referee.